The First Year at School: An International Perspective (International Perspectives on Early Childhood Education and Development, 39) 3031285883, 9783031285882

This book explores an under-researched but vital part of education: the first year at primary/elementary school.


English Pages 367 [342] Year 2023


Table of contents :
Foreword
Introduction
Contents
Editors and Contributors
About the Editors
Contributors
Abbreviations
List of Figures
List of Tables
Part I: Overview of the Book and Background to the Research on Which it is Based
Chapter 1: A Reflection on Three Decades of Development
1.1 Introduction
First Steps
Development
Computerisation
1.2 Expansion and Transfer to Durham
Development and Decisions
Feedback to Teachers
Interventions
1.3 Other Countries
1.4 Developmental Issues
1.5 Quality Assurance
1.6 Opposition to Baseline Assessment
1.7 A New International Approach: iPIPS
Appendix: Details from Brazil, Russia and Australia
Brazil
Russia
Australia
References
Chapter 2: The First Year at School: A Perspective from a Personal Standpoint
References
Part II: Introduction: The Challenge of Assessing Young Children’s Progress Fairly and Making Comparisons
Introduction
References
Chapter 3: Educational Assessment of Young Children
3.1 Introduction
3.2 The Escalation in Assessment of Younger Children
3.3 Rationale for Assessment of Young Children
3.4 Functions of the Assessment of Children
3.5 Conclusion
References
Chapter 4: International Comparative Assessments in Education
4.1 Introduction
4.2 Purposes and Functions of International Comparative Assessments
Benefits to Emerging Countries
Benchmarking
Monitoring Education Systems
Explanatory Factors and Frameworks
4.3 Cross-National and Regional Validity of International Comparative Assessments: Methodological Considerations
Design of International Comparative Assessments
Research Questions and Conceptual Framework
Research Design
Target Population
Sampling
Instrument Development
Data Collection and Preparation
Data Analysis and Reporting
Other Non-methodological Considerations
References
Chapter 5: International Comparative Assessments of Young Children: Debates and Methodology
5.1 Introduction
5.2 International Comparative Assessments of Young Children
The IEA Preprimary Study, 1987–1997
International Early Learning Study (IELS)
Critique and Issues Related to IELS
5.3 Methodological Issues Regarding International Assessments of Young Children
Strategies Used to Assess Young Children outside of International Assessments
Reflection on Methodological Challenges and Solutions Addressed by the iPIPS Team
References
Chapter 6: Teachers’ Roles in the Assessment of Young Children
6.1 Introduction
6.2 Teacher Assessments
Teachers’ Conceptions of Assessment
Assessment Literacy of Teachers
Teachers’ Experiences of Implementing Assessment Systems and Accountability
Teacher Assessment in Research Studies
6.3 Validity and Reliability of Teachers’ Assessments and Ratings of Young Children
Factors Influencing the Validity and Reliability of Teachers’ Assessments and Ratings of Young Children
6.4 iPIPS Methodology for Ensuring Validity and Reliability of the Assessments
References
Part III: Growth of the International Performance in Primary School Indicators Project
Introduction
References
Chapter 7: Packing 200,000 Years of Evolution in One Year of Individual Development: Developmental Constraints on Learning in the First Primary School Year
7.1 Introduction
7.2 Changing Developmental Priorities from Preschool to Primary School
Episodic Representations
Realistic Representational Thought
Rule-Based Thought
7.3 Learning in Language and Mathematics
Language
Mathematics
Dyslexia and Dyscalculia: Understanding Developmental Learning Difficulties
7.4 Outline for the FOSTIR Programme
Early Learning in Language and Mathematics
7.5 Conclusions
References
Chapter 8: Children’s Developmental Levels at the Start of School
8.1 Introduction
8.2 The iPIPS Assessment
8.3 Cognitive Development
8.4 Personal, Social and Emotional Development
8.5 Developmental Levels at the Start of School
South Africa
Context
iPIPS Sample
Cognitive Development
Personal, Social and Emotional Development
Lesotho
Context
iPIPS Sample
Cognitive Development
Brazil
Context
iPIPS Sample
Cognitive Development
Personal, Social and Emotional Development
Australia, England and Scotland
Context
Cognitive Development
Personal, Social and Emotional Development
Russia
Context
Cognitive Development
Personal, Social and Emotional Development (Fig. 8.13)
8.6 Discussion
References
Chapter 9: Progress Made During the First Year at School
9.1 Introduction
9.2 The First Year of School in England and the Western Cape of South Africa
England
Western Cape of South Africa
9.3 Validity of Comparisons
9.4 The Importance of the First Year at School
9.5 Summary
References
Part IV: The First Year at School: Education Inequality, Poverty and Children’s Cognitive Development
Introduction
References
Chapter 10: Measures of Family Background in the iPIPS Project – Possibilities and Limits of Comparative Studies Across Countries
10.1 Introduction
10.2 The Measures of Family Background in iPIPS Studies
The United Kingdom (England and Scotland)
Australia (Western Australia)
Russian Federation (Tartar Republic)
South Africa (Western Cape)
Brazil (Rio de Janeiro – RJ and Sobral – CE)
10.3 Preliminary Answers: Possibilities and Limits of Comparative SES
10.4 Final Remarks
Appendix
References
Chapter 11: The Association Between Adverse Socio-economic Circumstances and Cognitive Development Within an International Context
11.1 Introduction
11.2 Assessing Socio-economic Status in the Context of the iPIPS Project
What Has the iPIPS Project Contributed to Our Understanding of Early Socio-economic Impact?
Composite Measures of Socio-economic Status
Disadvantaged Group Status
Parental Education
Wider Background Measures
11.3 Summary
11.4 Challenges and Future Research Directions for Early Socio-economic Status Researchers
11.5 Implications for Policy
11.6 Conclusion
References
Part V: Using iPIPS Data for Teaching and Informing Policy
Introduction
References
Chapter 12: Strategies to Enhance Pedagogical Use of iPIPS Data and to Support Local Government Decision-Making in Brazil
12.1 Introduction
12.2 Strategies to Provide Incentive for Pedagogical Use by Schools in Brazil
12.3 Strategies to Inform Policy Makers
12.4 Concluding Remarks
References
Chapter 13: Using Assessment Data to Inform Teaching: An Example from Lesotho
13.1 The Education System in Lesotho
13.2 A Feasibility Pilot of iPIPS
13.3 Using iPIPS in Schools
13.4 Next Steps in Lesotho
References
Chapter 14: The Use of iPIPS Data for Policy Assessment, Government Evidence-Based Decision Making and Pedagogical Use by Schools in Russia
14.1 Introduction
14.2 Use of Assessment Results by Teachers, Head Teachers and Parents
14.3 Informing Policy Decision-Making
Chapter 15: Using Data to Inform Teaching: An Example from the Western Cape, South Africa
15.1 The Education System in the Western Cape and the iPIPS Project
15.2 Using iPIPS in School
References
Chapter 16: iPIPS Research Evidence: Case Studies to Promote Data Use
16.1 Introduction
16.2 Promoting Evidence into Use: Data Formats and Transfer
16.3 Strategies to Disseminate Data and to Motivate Data Use in Brazil, Lesotho, Russia and South Africa
16.4 Strategies to Promote Data-Use: Stakeholders, Type of Evidence and Data Transfer
16.5 Results and Impacts of the Data-Use Strategies
16.6 Concluding Remarks and Ways Forward
References
Part VI: Novel and Unexpected Findings from iPIPS
Introduction
References
Chapter 17: Phonological Processing and Learning Difficulties for Russian First-Graders
17.1 Introduction
17.2 Method
Participants
Instruments and Measures
Analysis Plan
17.3 Results
17.4 Discussion
References
Chapter 18: Name Writing Ability and Its Effect on Children’s Future Academic Attainment
18.1 Introduction
18.2 The Hypothesis and Study
18.3 Measures
18.4 The Analyses
18.5 Discussion
References
Chapter 19: The Early Physical-Motor Development Predictors of Young Children’s Mathematics Achievements
19.1 Introduction
19.2 Methods
19.3 Results and Discussion
19.4 Conclusion
References
Chapter 20: Inattentiveness Predicts Reading Achievement in Primary School, but with Hyperactivity/Impulsivity It’s More Complicated
20.1 Introduction
20.2 Method
Participants
Procedure
Instruments
20.3 Analysis and Results
20.4 Discussion
References
Chapter 21: The Effects of Class Composition on First-Graders’ Mathematics and Reading Results: Two Countries’ Cases
21.1 Introduction
21.2 The Case of the Netherlands
Participants
Measures
Educational Programmes
Analysis Plan
Results
21.3 The Case of Russia
Participants
Measures
Analysis Plan
Results
21.4 Discussion
References
Part VII: Conclusion
Tribute to Christine Merrell
Introduction
Chapter 22: Reflections and Recommendations
22.1 Contexts
22.2 Research Findings
Generally Agreed
Context Specific
Surprising Results
22.3 New Questions
22.4 Recommendations and Conclusion
Recommendations for Parents
Recommendations for Schools
Recommendations for Education Authorities
Recommendations for Researchers
22.5 Conclusion
References
Index


International Perspectives on Early Childhood Education and Development 39

Peter Tymms · Tiago Bartholo · Sarah J. Howie · Elena Kardanova · Mariane Campelo Koslinski · Christine Merrell · Helen Wildy
Editors

The First Year at School: An International Perspective

International Perspectives on Early Childhood Education and Development
Volume 39

Series Editors
Marilyn Fleer, Monash University, Frankston, Australia
Ingrid Pramling Samuelsson, Gothenburg University, Göteborg, Sweden

Editorial Board Members
Jane Bone, Monash University, Frankston, Australia
Anne Edwards, University of Oxford, Oxford, UK
Mariane Hedegaard, University of Copenhagen, Copenhagen, Denmark
Eva Johansson, University of Stavanger, Stavanger, Norway
Rebeca Mejía Arauz, ITESO, Jalisco, Mexico
Cecilia Wallerstedt, Gothenburg University, Göteborg, Sweden
Liang Li, Monash University, Frankston, Australia

Early childhood education in many countries has been built upon a strong tradition of a materially rich and active play-based pedagogy and environment. Yet what has become visible within the profession is essentially a Western view of childhood, preschool education and school education. It is timely that a series of books be published which present a broader view of early childhood education. This series seeks to provide an international perspective on early childhood education. In particular, the books published in this series will:
• Examine how learning is organized across a range of cultures, particularly Indigenous communities
• Make visible a range of ways in which early childhood pedagogy is framed and enacted across countries, including the majority poor countries
• Critique how particular forms of knowledge are constructed in curriculum within and across countries
• Explore policy imperatives which shape and have shaped how early childhood education is enacted across countries
• Examine how early childhood education is researched locally and globally
• Examine the theoretical informants driving pedagogy and practice, and seek to find alternative perspectives from those that dominate many Western heritage countries
• Critique assessment practices and consider a broader set of ways of measuring children’s learning
• Examine concept formation from within the context of country-specific pedagogy and learning outcomes
The series will cover theoretical works, evidence-based pedagogical research, and international research studies. The series will also cover a broad range of countries, including poor majority countries. Classical areas of interest, such as play, the images of childhood, and family studies will also be examined. However, the focus will be critical and international (not Western-centric). Please contact Astrid Noordermeer at [email protected] to submit a book proposal for the series.

Peter Tymms • Tiago Bartholo • Sarah J. Howie • Elena Kardanova • Mariane Campelo Koslinski • Christine Merrell • Helen Wildy
Editors

The First Year at School: An International Perspective

Editors
Peter Tymms, School of Education, Durham University, Durham, UK
Tiago Bartholo, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Sarah J. Howie, Africa Centre for Scholarship, Stellenbosch University, Stellenbosch, South Africa
Elena Kardanova, Institute of Education, National Research University Higher School of Economics, Moscow, Russia
Mariane Campelo Koslinski, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Christine Merrell (deceased), Durham, UK
Helen Wildy, The University of Western Australia, Crawley, WA, Australia

ISSN 2468-8746   ISSN 2468-8754 (electronic)
International Perspectives on Early Childhood Education and Development
ISBN 978-3-031-28588-2   ISBN 978-3-031-28589-9 (eBook)
https://doi.org/10.1007/978-3-031-28589-9

© Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

This book is dedicated to Christine Merrell, who was working on this edited book when she was suddenly taken from us after a short illness. Christine took her Master’s in Newcastle and was appointed as a Research Associate to work on PIPS in the CEM Centre. This was a most important appointment which led to an exceptionally productive career. The PIPS project, which was at a very early stage of development when Christine started work on it, went on to be used with more than 5 million children in numerous countries around the world and led to many articles, as well as associated investigations and grants. She rose through the ranks in CEM after it moved to Durham and became a Director before the Centre was sold by the university. The university then appointed her to faculty positions and she was offered the post of Pro Vice-Chancellor, but she turned it down as she wanted to retire and spend more time with her family. In her time as a senior member of the faculty, she was also instrumental in setting up the Durham Research Methods Centre, which will have a lasting impact.

Christine had a scientific mind which was dedicated to exploration, evidence and rigour. She also had a deep sense of integrity and a commitment to fairness; she wanted to make the world a better place and succeeded in doing so. She was compassionate and had a deep understanding of people. When these characteristics are found in one person, we have an exceptional individual.

Foreword

It is an honour for me to write about this book, which deals with a topic not yet covered in depth by the specialised literature: the first year at primary or elementary school and the transition to this stage from Early Childhood Education. The text is intended not merely as an instrument for academic purposes but also as part of a dialogue with policy makers, dealing with important issues connected to education policies such as child development, school effectiveness, and monitoring and evaluation.

The importance of the initial year at school lies in the opportunity to level the playing field when the preschool experience has not been sufficient to do so. But it is also connected to the quality of the introduction to literacy and numeracy, upon which much of the later learning will be built. As the book states, “It is perhaps the most important year in the educational career of learners. Although children start at different ages across the world, their first year is always a major transition involving all that it entails including stress, anxiety and excitement.”

What this interesting collective work shows is that effective learning during this period varies a great deal, that hints of promising practices can be found by looking deeper into these differences, and that indicators, such as those established in the iPIPS (International Performance Indicators in Primary Schools) project, can shed some light on the search for examples to be replicated and scaled. The iPIPS project has indeed played an important role, as authors in the various chapters have tried to understand the reasons for apparent successes in different contexts and the challenges that still remain in ensuring a healthy and effective first year of school.

I feel, in a personal capacity, connected to this work, since as Secretary of Education of Rio I had the opportunity to assess reading and numeracy at first grade, and this resulted in interesting findings that I had the opportunity to share with Peter Tymms, the organiser of the book, back in 2013. Although my observations were then based on external and standardised assessments, my comments were centred on my conversations with the ten first-grade teachers whose classes showed the biggest improvements, some of them in schools located in violent slums. I was also happy to see that among the chapters presented here are some based on the Early Childhood Development Centers that we established to build quality ECE and generate a smooth transition to first grade.

This book is worth reading and might prove to be an important resource for both academics and policy makers.

Claudia Costin
Professor at Fundação Getulio Vargas, Rio de Janeiro, Brazil
Senior Director of Education at the World Bank Group (2014–2016), Washington, DC, USA
Secretary of Education for Rio de Janeiro (2009–2014), Rio de Janeiro, Brazil

Introduction

The first year at school is one of the most important years in any individual’s education, and yet it has received relatively little attention in the research literature. This book brings together developmental work and investigations conducted over several decades in many countries to look at that first year at school, which, of course, is particular to every country, and yet there are unifying features which can be seen across the globe. The book is divided into seven parts. The first deals with the history of the project (Chap. 1) and the way in which the PIPS (Performance Indicators in Primary Schools) project was conceived and evolved over time. It looks at some of the initial ideas which were abandoned and some of the ideas which took off dramatically. This involved both the assessment itself and the feedback which was generated. PIPS, which aimed at providing feedback to schools and teachers, evolved into iPIPS (international PIPS), which aimed at policy makers, and that has had traction in a number of different countries leading up to the present time. In Part I there is also a chapter (Chap. 2) from a head teacher who was intimately involved with the original creation of the assessment. It indicates how she was able to influence and help take the project forward. She explores her role in the development of the PIPS baseline and her experience of its use. Part II looks at assessment in the early years from the perspective of large-scale international monitoring projects of education and draws lessons from the various studies. This reinforces the importance of early education, drawing on a particular UNESCO report. It describes the landscape of educational assessment conducted internationally on children and up to the end of their first year of schooling, including assessment undertaken in the formal early childhood setting (Chap. 3). In Chap. 4, the countries studied in this book are characterised in terms of their educational provision against the global background. The general pattern of higher attainment being associated with greater affluence is confirmed at the level of countries and the vital nature of development in the first few years of life is confirmed. This leads to the call for greater monitoring and a discussion of the tension between high stakes testing and the need for better data at all levels. Chapter 5 describes the
strategies that have been adopted in monitoring and assessing children in the early years of formal education. It also highlights the challenges associated with assessing young children which is seen as quite different from the assessment of older students. Finally, Chap. 6 comprises two foci. The first addresses the general issue of teacher assessments, and the second addresses the roles of teachers in research projects and specifically in the iPIPS project. The part concludes with a reflection on the methodological challenges and solutions found and addressed by the iPIPS team. The solutions adopted by iPIPS are seen as showing the way forward. The third part of this book is devoted to growth and the importance of a good foundation. It distinguishes between aspects of development which evolve naturally and those that are taught. It also differentiates between cognitive growth and social and emotional development in the context of different countries and curricula. Chapter 7 provides a unifying in-depth theoretical structure for understanding much of the content of the PIPS/iPIPS assessments, and it can also be seen as the basis for curriculum development as it may challenge the provision in some countries. It covers cognitive development between the ages of 3 and 5. This is when language really takes off as well as phonological awareness and basic arithmetic. The following chapter (Chap. 8) uses data from the iPIPS project to look at the variation in what children know and can do when they start school in Australia, Brazil, England, Lesotho, Russia, Scotland and South Africa. Although direct comparisons are not made between most of the countries, and are thought to be inappropriate given their very different circumstances, the broad situations are described using ‘Pedagogical ladders’ for early reading and mathematics. These show how children progress in these areas and the personal, social and emotional development is also addressed. These descriptions have important implications for practice and policy. The final chapter of Part III, Chap. 9, takes two contrasting countries, England and South Africa, and probes their contexts as well as children’s starting points and growth alongside changes in their personal, social and emotional development, and their behaviour. The chapter asks what difference schools make and how much they vary. After considering the validity of direct comparisons, it makes a strong case for not doing so because of the strikingly different contexts and histories. Part IV addresses the issue of inequality in education and its measurement in diverse situations highlighting findings from iPIPS. It provides scientific evidence for the impact of poverty on educational growth, and the authors are clear that such evidence is needed for designing policy interventions and for understanding the impact of any such interventions. The data produced in the iPIPS Project is unique compared to other international work and appropriate to understand the ‘poverty effect’. Two chapters approach the measurement of poverty and SES from different angles. The first (Chap. 10) gives a systematic review of the approach taken in the different countries using iPIPS which have involved national data sets linked to the children’s home areas, school-based data and questionnaires to parents or guardians. Similarities are found, but there are also important contextual differences due
to history, culture, the age of the children and varying quantities of missing data. The complex challenges of comparing data based on similar levels of poverty/SES are addressed and discussed. The second chapter (Chap. 11) also addresses the measurement issues but carries out a review of the link between cognitive development and deprivation. It concludes with the confirmation of the well-established link but points out that in all countries there are examples of success against the odds and notes that in order to study the impact of poverty on education levels one needs to have a high-quality baseline. It also addresses the issue of what research is now needed and notes that no studies have as much information as we would like from the home. Part V describes how the project has informed teaching and policy-making and the challenges involved. It does this through four country-focussed case studies and starts by pointing out that the potential for education to transform lives is enormous, and that this potential is greater the more deprived a community. It also notes the difficulties involved in using research to influence policy. The countries chosen are Brazil, Lesotho, Russia and South Africa. Chapter 12 recounts the systematic efforts to communicate iPIPS findings to teachers, headteachers and local governments in Brazil. This includes the use of reports, whole group meetings and face-to-face discussions. The ways in which the approaches have been adapted, as a result of experience, are described and future approaches are outlined. Chapter 13 describes the approach taken in Lesotho, one of the poorest countries in the world. An entirely paper-based approach has been adopted for assessment in Lesotho with meetings organised for teachers, headteachers and the Ministry of Education. Testimonials from schools are included as are plans for the expansion of the project. A much larger project across Russia is described in Chap. 14 which initially gave feedback only to teachers, headteachers and parents which was gradually refined as the result of feedback. Then, in one region high-quality data were provided to policy decision makers where the focus became more tightly based on where the action was needed. The South African position is taken up in a short Chap. 15 where the project, the feedback to schools and meetings with schools are described together with teachers’ encouragingly positive reactions to these meetings. Chapter 16 brings it all together in a wider discussion. The experiences of each country are described as they have provided information to teachers, schools and policy makers. This is all summarised and results in a discussion of the ways forward so far as influencing change is concerned. During the development of PIPS and iPIPS and the research associated with the project, there have been unexpected and interesting findings, and these are brought together in Part VI which also examines reasons for differences that are seen between students, classes, schools and districts. There are five chapters in the part. The first, Chap. 17, reports on phonological processing in relation to progress in mathematics and reading. It is well established that phonological processing is related to reading, but this chapter also notes a link to mathematics.
Chapter 18 reports the noteworthy link between the ability to write one’s name at the start of school and later progress. Contrary to expectation, the ability to write one’s name was not associated with the length of the name but is linked to the development of literacy and mathematics.

The next chapter in this part, Chap. 19, links non-aerobic physical fitness to cognitive development based on work carried out in Brazil. The scores on the well-known sitting and rising test were used in well-controlled multi-level models to show a link to progress in mathematics although not with reading, leaving much to discuss.

Chapter 20 examines the association of inattentiveness, hyperactivity and impulsiveness (ADHD characteristics) with reading scores at two follow-up points, following baseline assessment on entry to school. The large, detailed Russian study confirmed the long-term negative association of reading with inattentiveness, but it reported a much weaker and more complex link to hyperactivity/impulsiveness.

Class composition is an important topic, and Chap. 21 addresses the issue by comparing and contrasting two studies, one from the Netherlands and the other from Russia, which take different approaches to examining the class compositional effect in the first year at school. Both studies conclude that heterogeneity is good and the teacher is of central importance in mediating the effects of class composition.

The book’s final chapter, in Part VII, draws together its threads, reviewing the changing international contexts in which the PIPS assessments have been carried out; research findings that are generally agreed, those that are idiosyncratic to contexts, and the unexpected findings that have emerged; as well as new questions arising from our research, with recommendations for parents, schools, education authorities and further research directions. The chapter concludes with an acknowledgement of the contributions of researchers, scholars, teachers, school leaders, policy makers and educational authorities, all of whom have participated in the PIPS and iPIPS journey of the past 30 years.

Peter Tymms
Durham University, Durham, UK
[email protected]

Contents

Part I  Overview of the Book and Background to the Research on Which it is Based
1  A Reflection on Three Decades of Development (Peter Tymms)  3
2  The First Year at School: A Perspective from a Personal Standpoint (Pat Preedy)  25

Part II  Introduction: The Challenge of Assessing Young Children’s Progress Fairly and Making Comparisons
3  Educational Assessment of Young Children (Sarah J. Howie)  35
4  International Comparative Assessments in Education (Sarah J. Howie)  43
5  International Comparative Assessments of Young Children: Debates and Methodology (Sarah J. Howie)  65
6  Teachers’ Roles in the Assessment of Young Children (Sarah J. Howie)  83

Part III  Growth of the International Performance in Primary School Indicators Project
7  Packing 200,000 Years of Evolution in One Year of Individual Development: Developmental Constraints on Learning in the First Primary School Year (Andreas Demetriou)  107
8  Children’s Developmental Levels at the Start of School (Christine Merrell)  125
9  Progress Made During the First Year at School (Katharine Bailey)  149

Part IV  The First Year at School: Education Inequality, Poverty and Children’s Cognitive Development
10  Measures of Family Background in the iPIPS Project – Possibilities and Limits of Comparative Studies Across Countries (Maria Teresa Gonzaga Alves and Túlio Silva de Paula)  163
11  The Association Between Adverse Socio-economic Circumstances and Cognitive Development Within an International Context (Lee T. Copping)  185

Part V  Using iPIPS Data for Teaching and Informing Policy
12  Strategies to Enhance Pedagogical Use of iPIPS Data and to Support Local Government Decision-Making in Brazil (Tiago Bartholo, Mariane Koslinski, and Daniel Lopes de Castro)  203
13  Using Assessment Data to Inform Teaching: An Example from Lesotho (Ajayagosh Narayanan, Christine Merrell, and Davis Pasa)  213
14  The Use of iPIPS Data for Policy Assessment, Government Evidence-Based Decision Making and Pedagogical Use by Schools in Russia (Alina Ivanova)  221
15  Using Data to Inform Teaching: An Example from the Western Cape, South Africa (Christine Merrell)  229
16  iPIPS Research Evidence: Case Studies to Promote Data Use (Mariane Koslinski and Tiago Bartholo)  233

Part VI  Novel and Unexpected Findings from iPIPS
17  Phonological Processing and Learning Difficulties for Russian First-Graders (Yulia Kuzmina and Natalia Ilyushina)  249
18  Name Writing Ability and Its Effect on Children’s Future Academic Attainment (Lee T. Copping)  265
19  The Early Physical-Motor Development Predictors of Young Children’s Mathematics Achievements (Daniel Kreuger de Aguiar and Tiago Bartholo)  271
20  Inattentiveness Predicts Reading Achievement in Primary School, but with Hyperactivity/Impulsivity It’s More Complicated (Alena Kulikova, Alina Ivanova, Ekaterina Orel, and Anastasia Petrakova)  281
21  The Effects of Class Composition on First-Graders’ Mathematics and Reading Results: Two Countries’ Cases (Alina Ivanova and Yulia Kuzmina)  293

Part VII  Conclusion
22  Reflections and Recommendations (Helen Wildy)  309

Index  333

Editors and Contributors

About the Editors

Peter Tymms, after taking a degree from Cambridge University in Natural Sciences, taught in a wide variety of schools from Central Africa to the north-east of England before starting an academic career. He was “Lecturer in Performance Indicators” at Moray House, Edinburgh, before moving to Newcastle University and then to Durham University, where he was promoted to Professor of Education. He is now an Emeritus Professor. His main research interests include monitoring, assessment, performance indicators, ADHD, reading, and research methodology. He originated the PIPS project, which was designed to monitor the affective and cognitive progress of children through primary schools, starting with a computer-adaptive on-entry baseline assessment. It has been used with more than a million children worldwide. Peter Tymms was Director of the Centre for Evaluation and Monitoring (CEM) until 2011, when he took over as Head of Department and Chair of the Board of Studies in the School of Education. At present he is devoting some of his time to an international project designed to study children starting school around the world. The project is known as iPIPS.

Tiago  Bartholo is an Associate Professor at the Federal University of Rio de Janeiro. His research interests include the robust evaluation of education as a lifelong process, focused on issues of effectiveness, equity, and well-being. He has more than 16 years of experience teaching at schools and universities and working on research and professional projects in Brazil, England, and Spain. He worked as a consultant for the FNDE (National Fund for the Development of Education, 2009–2010), the Ministry of Education (Programa Segundo Tempo, 2010–2011), Carlos Chagas Foundation, São Paulo (2014–2020), the Inter-American Development Bank (2017–2018) and the United Nations Development Program (PNUD Brasil, 2022). He has published extensively in Brazil and internationally in peer-reviewed journals. He was a Visiting Researcher at Durham University,  
England (2014 and 2017), the University of Birmingham, England (2012 and 2013), and the University of León, Spain (2009). Tiago is currently one of the leading researchers in a national impact study of Covid-19 on students’ development and well-being in Brazil, helping design a national recovery plan for public education.

Sarah J. Howie has been the founding Director of the Africa Centre for Scholarship, Director of the Unit for International Credentialling, and Professor of Education at Stellenbosch University, South Africa, since 2017. Previously she was the founding Director of the Centre for Evaluation and Assessment (CEA) and Professor of Education, Faculty of Education, at the University of Pretoria for 17 years. Her career began at the Foundation for Research Development (now the National Research Foundation) after she had completed her undergraduate and honours degrees at Stellenbosch University and the University of Cape Town, respectively. She obtained her Master’s degree in Education from the University of the Witwatersrand whilst working full-time at the Human Sciences Research Council. She completed her PhD at the University of Twente, the Netherlands, and thereafter established the research Centre for Evaluation and Assessment at the University of Pretoria, where she was appointed as a Research Director in 2001 and Full Professor from 2007 to 2017. She has published extensively internationally in peer-reviewed journals and edited and co-edited books. She has served as the National Research Coordinator of the IEA’s Trends in Mathematics and Science Study (TIMSS) 1995 (Pop 3) and 1999; Progress in International Reading Literacy Study (PIRLS) 2006, 2011 and 2016; and Second Information Technology in Education Study (SITES) 1999 and 2006, focused on international assessments in primary and secondary schools. She was also the national coordinator for the Nuffield Foundation-funded International Performance Indicators in Primary Schools (iPIPS) study. She was inducted into the Academy of Science for South Africa in 2006. Internationally, she was a member of the UNESCO-Brookings Institute Learning Metrics Task Force for Post-primary (in preparation for Education for All 2015). She was also a member of five international technical research committees associated with the design and development of international large-scale assessments of the International Association for the Evaluation of Educational Achievement (IEA) (PIRLS 2011 & 2015 and TALIS) and the Organisation for Economic Co-operation and Development (OECD) (PISA 2015 & 2018). She is a member of a number of Editorial Boards of international journals, including those in the Taylor and Francis and Elsevier publishing houses. She currently serves on the IEA’s International Publications Committee.

Elena Kardanova, with a PhD in mathematics from Saint Petersburg State University, has worked as Director of the Centre for Psychometrics and Educational Measurement at the National Research University Higher School of Economics since 2013. She is a Tenured Professor at HSE and the Scientific Adviser of the Master’s programme Science of Learning and Assessment. She teaches psychometrics to Master’s and PhD students.
Her main research interests include psychometrics, test development, Item Response Theory, Rasch modelling, assessing complex constructs, and adjacent research areas. Elena Kardanova has authored many research papers and has led, and continues to lead, many research projects, including international ones. In particular, she initiated iPIPS in Russia in 2014 and supervised the development of its Russian-language version.

Mariane Koslinski is an Associate Professor in the Faculty of Education and the coordinator of the Educational Opportunities Research Laboratory at the Federal University of Rio de Janeiro. She is currently the Coordinator of the “Research Committee 17: Education and Society” of the Brazilian Society of Sociology. Her research areas include social inequality and education, educational policy and program implementation and evaluation, and residential segregation and educational inequalities.

Christine Merrell started working as a biological scientist before becoming a teacher and studying for a Master’s degree at Newcastle University, where she gained a distinction. She then took on the role of Research Associate for PIPS in the Centre for Evaluation and Monitoring (CEM). Her doctorate studied the attainment of pupils with ADHD characteristics in schools. That work led to a series of publications and grants as well as membership of the relevant NICE (National Institute for Health and Care Excellence) committee. Christine rose through the ranks in CEM to become a Director before moving to the Faculty of Social Sciences and Health as Deputy Pro-VC for Research. Christine was an important figure in the iPIPS project and in the editing of this book until she passed away in 2021, well before her time.

Helen Wildy grew up on a wheat and sheep farm in rural Western Australia and attended a one-teacher school before completing her secondary schooling at a boarding school in the capital city, Perth, and attending The University of Western Australia (UWA) to prepare to become a Mathematics teacher. After teaching in Western Australia and Victoria, Helen completed Master’s and Doctoral degrees, again at UWA. Helen worked closely with school systems in WA, developing school principals’ leadership capacity and skills to use national assessment program data to improve teaching and learning. Subsequently, she spent 30 years in academic positions in three of WA’s universities, including the final decade as Dean of Education at UWA, where she was responsible for teaching and research, and the preparation of teachers. Throughout that time, her particular interest was the teaching and learning of children in their first year of school, and she coordinated the PIPS project across Australia for 20 years. She is now an Emeritus Professor. As well as volunteer roles as a Guide at the Art Gallery of Western Australia and Director of a charity supporting the teaching and learning of children in Bhutan, Helen is currently Acting Deputy Vice-Chancellor of Murdoch University in WA.
Contributors

Maria Teresa Gonzaga Alves, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
Katharine Bailey, Cambridge University Press & Assessment, Cambridge, UK
Tiago Bartholo, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Lee T. Copping, Teesside University, Middlesbrough, UK
Daniel Kreuger de Aguiar, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Daniel Lopes de Castro, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Andreas Demetriou, Cyprus Academy of Sciences, Letters, and Arts, Nicosia, Cyprus; University of Cyprus, Nicosia, Cyprus; University of Nicosia, Nicosia, Cyprus
Túlio Silva de Paula, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
Sarah J. Howie, Stellenbosch University, Stellenbosch, South Africa
Natalia Ilyushina, National Research University Higher School of Economics, Moscow, Russia
Alina Ivanova, National Research University Higher School of Economics, Moscow, Russia
Mariane Koslinski, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Alena Kulikova, National Research University Higher School of Economics, Moscow, Russia
Yulia Kuzmina, National Research University Higher School of Economics, Moscow, Russia
Christine Merrell (deceased), Durham University, Durham, UK; iPIPS, Durham University, Durham, UK
Ajayagosh Narayanan, Seliba Sa Boithuto, Maseru, Lesotho
Ekaterina Orel, National Research University Higher School of Economics, Moscow, Russia
Davis Pasa, Seliba Sa Boithuto, Maseru, Lesotho
Anastasia Petrakova, National Research University Higher School of Economics, Moscow, Russia
Pat Preedy, Ohio Dominican University, Columbus, OH, USA
Peter Tymms, Durham University, Durham, UK
Helen Wildy, The University of Western Australia, Crawley, WA, Australia


Abbreviations

ABS  Australian Bureau of Statistics
ACARA  Australian Curriculum, Assessment and Reporting Authority
ADEC  Abu Dhabi Education Council
ADHD  Attention-Deficit/Hyperactivity Disorder
ALIS  A Level Information System
ANS  Approximate Number System
APA  American Psychiatric Association
ASQ  Ages & Stages Questionnaires
BIL  Batería de Inicio a la Lectura
BSID  Bayley Scales of Infant Development
CBCL  Achenbach Child Behavior Checklist
CCT  Conditional Cash Transfer
CE  Church of England
CEA  Centre for Evaluation and Assessment
CELF  Clinical Evaluation of Language Fundamentals–Preschool
CEM  Centre for Evaluation and Monitoring
CTOPP  Comprehensive Test of Phonological Processing
DAP  Developmentally Appropriate Practice
DBDM  Data-Based Decision-Making
DDDM  Data-Driven Decision-Making
DIBELS  Dynamic Indicators of Basic Early Literacy Skills
DIPF  Leibniz Institute for Research and Information in Education
DSM  Diagnostic and Statistical Manual of Mental Disorders
DWP  Department for Work and Pensions
EAL  English as an Additional Language
ECD  Early Childhood Development
ECDI  Early Child Development Index
ECE  Early Childhood Education
EGMA  Early Grade Mathematics Assessment
EGRA  Early Grade Reading Assessment
EPPE  Effective Provision of Pre-School Education
EPPSE  Effective Pre-School, Primary and Secondary Educational Project
ESL  English as a Second Language
EYFSP  Early Years Foundation Stage Profile
FAPERJ  Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro
FIPS  Fähigkeitsindikatoren Primarschule
FSM  Free School Meals
GCSE  General Certificate of Secondary Education
GEM  Global Education Monitoring
HMCI  Her Majesty’s Chief Inspector
HMRC  Her Majesty’s Revenue and Customs
HSE  Higher School of Economics
ICSEA  Index of Community Socio-Educational Advantage
IDACI  Income Deprivation Affecting Children Index
IDD  Income Deprivation Domain
IEA  International Association for the Evaluation of Educational Achievement
IELS  International Early Learning and Child Well-being Study
IMD  Index of Multiple Deprivation
IoD  Indices of Deprivation
iPIPS  International Performance Indicators in Primary Schools
IRT  Item Response Theory
ISCED  International Standard Classification of Education
IQ  Intelligence Quotient
KS1  Key Stage 1
KS2  Key Stage 2
LCE  Lesotho College of Education
LGM  Latent Growth Modelling
LSOA  Lower-layer Super Output Areas
MABC  Movement Assessment Battery for Children
MD  Mathematical Difficulties
MDRD  Mathematics and Reading Difficulties
MICS  Multiple Indicator Cluster Surveys
MODA  Multiple Overlapping Deprivation Analysis
MoET  Ministry of Education and Training
NAESP  National Association of Elementary School Principals
NAHT  National Association of Head Teachers
NAPLAN  National Assessment Program – Literacy and Numeracy
NICE  National Institute for Health and Care Excellence
NR  Number Recognition
NRS  National Records of Scotland
OBIS  Onderbouwinformatiesysteem (Lower School Information System)
OECD  Organisation for Economic Co-operation and Development
Ofsted  Office for Standards in Education
PCA  Principal Component Analysis
PIPS  Performance Indicators in Primary Schools
PIPS-BLA  PIPS Baseline Assessment
PIRLS  Progress in International Reading Literacy Study
PISA  Programme for International Student Assessment
POP  Problems of Position
PP  Phonological Processing
PSED  Personal, Social and Emotional Development
RA  Research Associate
RAN  Rapid Automatised Naming
RD  Reading Difficulties
SACMEQ  Southern and Eastern Africa Consortium for Monitoring Educational Quality
SAPE  Small Area Population Estimates
SAT  Standardised Assessment Test
SD  Standard Deviation
SDG  Sustainable Development Goal
SDQ  Strengths and Difficulties Questionnaire
SEND  Special Educational Needs and Disabilities
SES  Socio-Economic Status
SIMD  Scottish Index of Multiple Deprivation
SIMS  Second International Mathematics Study
SQA  Scottish Qualifications Authority
SRT  Sitting-Rising Test
Tamba  Twins and Multiple Births Association
TD  Typical Development
TES  Times Educational Supplement
TIMSS  Trends in International Mathematics and Science Study
ToM  Theory of Mind
UFRJ  Federal University of Rio de Janeiro
UK  United Kingdom
UNESCO  United Nations Educational, Scientific and Cultural Organization
UNICEF  United Nations Children’s Fund
USA  United States of America
WISC  Wechsler Intelligence Scale for Children
WM  Working Memory
WPPSI  Wechsler Preschool and Primary Scale of Intelligence

List of Figures

Fig. 1.1  Feedback in box-and-whisker plots  9
Fig. 1.2  Stacked graph: Reading (green), maths (red) and phonological awareness (blue)  10
Fig. 1.3  Scattergram  11
Fig. 1.4  Australian image  13
Fig. 1.5  Identifying special needs – PIPS Baseline (BLA) to outcomes 7 years later  15
Fig. 4.1  Example of a closed item from PISA 2018  54
Fig. 4.2  Example of released item: free-response/open-ended/extended response (Source: TIMSS 2019 Grade 4, http://research.acer.edu.au)  55
Fig. 4.3  Example of a rotated test design overview (Source: Wu, n.d.)  56
Fig. 4.4  Example of item that demonstrates possible cultural differences cross-nationally  57
Fig. 6.1  Elements of an Assessment Strategy (Source: Howie, 2007)  84
Fig. 6.2  Conceptions of assessment continuum (Source: Barnes et al., 2015)  86
Fig. 6.3  Three-dimensional model of assessment literacy (Source: Pastore & Andrade, 2019:135)  89
Fig. 8.1  Ideas about reading scene from Lesotho version  127
Fig. 8.2  Reading at the start of school, Western Cape Province, South Africa  131
Fig. 8.3  Mathematics at the start of school, Western Cape Province, South Africa  131
Fig. 8.4  Personal, social and emotional development mean scores, Western Cape Province, South Africa  132
Fig. 8.5  Vocabulary at the start of school, Lesotho  135
Fig. 8.6  Reading at the start of school, Lesotho  135
Fig. 8.7  Mathematics at the start of school, Lesotho  136
Fig. 8.8  Reading at the end of pre-school, Rio de Janeiro  138
Fig. 8.9  Mathematics at the end of pre-school, Rio de Janeiro  139
Fig. 8.10  Personal, social and emotional development mean scores, Rio de Janeiro  140
Fig. 8.11  Personal, social and emotional development mean scores, England and Scotland  142
Fig. 8.12  Reading at the start of school, Russia  143
Fig. 8.13  Personal, social and emotional development mean scores, Russia  144
Fig. 9.1  Summary of pedagogical ladders for reading and mathematics  151
Fig. 12.1  Classroom report tables showing information about pupils’ characteristics, number of children assessed and school complexity index (Source: Laboratório de Pesquisa em Oportunidades Educacionais (LaPOpE)/UFRJ)  205
Fig. 12.2  Graph and table with pedagogical interpretation showing results for Mathematics at the start (Mar) and end (Nov) of the school year (Source: Laboratório de Pesquisa em Oportunidades Educacionais (LaPOpE)/UFRJ)  206
Fig. 12.3  Graph and table with pedagogical interpretation showing results for Reading at the start (Mar) and end (Nov) of the school year (Source: Laboratório de Pesquisa em Oportunidades Educacionais (LaPOpE)/UFRJ)  207
Fig. 13.1  Example of score sheet for vocabulary  215
Fig. 14.1  Snippet from the teacher report of the vocabulary section (in Russian)  225
Fig. 14.2  One-page report for parents  226
Fig. 17.1  The effect of phonological processing on the probability of moving into the MDRD group at the end of Grade 1 for pupils with different group status at the beginning of Grade 1  256
Fig. 17.2  The effect of phonological processing on the probability of moving into the TD group at the end of Grade 1 for pupils with different group status at the beginning of Grade 1  258

List of Tables

Table II.1 Key indicators on pre-primary and primary education on selected countries participating in iPIPS�����������������������������������   32 Table 3.1 Comparing assessments and their purposes�����������������������������������   38 Table 4.1 Examples of research questions from SACMEQ 1������������������������   49 Table 4.2 Examples of research questions from TIMSS 1995�����������������������   49 Table 4.3 Examples of target populations across selected international studies and year of initiation�������������������������������������   51 Table 4.4 Examples of selected international studies’ sampling designs����������������������������������������������������������������������������   53 Table 5.1 Commonly found assessments internationally�������������������������������   74 Table 5.2 Ideal characteristics of an ECD assessment�����������������������������������   75 Table 6.1 Curriculum levels and their interpretation��������������������������������������   85 Table 7.1 Developmental priorities, school learning priorities, and learning across developmental cycles��������������������������������������  109 Table 10.1 Background variables in iPIPS studies�������������������������������������������  174 Table 16.1 Strategies to incentive data-use������������������������������������������������������  237 Table 16.2 Strategies to incentive data-use and their outcomes in Brazil, Lesotho, Russia and South Africa����������������������������������  238 Table 17.1 Descriptive statistics for mathematics, reading, phonological achievement and number recognition�����������������������  253 Table 17.2 Transitions between different groups from the beginning of grade 1 to the end of grade 1 (% from a group at the beginning of Grade 1)�������������������������������  254 Table 17.3 Level of phonological processing in different groups at the beginning and the end of Grade 1�����������������������������������������  254

xxix

xxx

List of Tables

Table 17.4  Results of multinomial logistic regression analysis for group status at the end of Grade 1 (TD is a reference group)  255
Table 17.5  The predicted probability of different group status at the end of Grade 1 depending on group status at the beginning of Grade 1  255
Table 17.6  Results of multinomial logistic regression analysis for group status at the end of Grade 1 (TD is a reference group) with interaction effects  256
Table 17.7  Predicted probability of transition into MD, RD and MDRD groups at the end of Grade 1 depending on status and level of phonological processing at the beginning of Grade 1  257
Table 18.1  Sample characteristics, mean age and distribution by country  267
Table 19.1  Variables used in the hierarchical linear models  274
Table 19.2  Descriptive statistics of the participants  275
Table 19.3  Bivariate correlations between all key variables  275
Table 19.4  Hierarchical linear regression models estimating 2nd wave cognitive measure of Mathematics (first year of preschool)  276

Table 20.1  Differences in reading achievements between pupils with high and low scores on the behaviour rating scale (in effect sizes)  285
Table 20.2  Results of quantile regression analysis for 'not inattentive' group  285
Table 20.3  Results of quantile regression analysis for 'inattentive' group  286
Table 20.4  Results of quantile regression analysis for the groups without strong hyperactive/impulsive behaviour  286
Table 20.5  Results of quantile regression analysis for 'hyperactive/impulsive' group  287

Part I

Overview of the Book and Background to the Research on Which it is Based

Chapter 1

A Reflection on Three Decades of Development

Peter Tymms

This book is based on developmental work spanning 30 years. This chapter outlines that learning journey, which involved key players across many institutions and countries.

1.1 Introduction

In 1992, Carol Fitz-Gibbon (https://en.wikipedia.org/wiki/Carol_Fitz-Gibbon) and I were mulling over a series of acronyms. A number of projects were being developed and we needed to give them names. One of the acronyms that we came up with was PIPS, Performance Indicators in Primary Schools, and with such a great name, we had to have a research project. So, the birth of PIPS (1992) was the result of an orphan acronym. PIPS expanded and went on to be used with more than 5 million students and was translated into 15 languages.¹

I took the project on and drew strongly on the almost perfectly formed and operational monitoring project, the A Level² Information System (ALIS) (Fitz-Gibbon, 1992, 1996), which Carol Fitz-Gibbon had created over several years and which remains exemplary to this day. ALIS sought to provide schools and colleges with feedback on the impact that they were having on their A-level students.

¹ Dutch, German, Slovene, Spanish, Thai, Cantonese, Putonghua, Portuguese, French, Arabic, Sesotho, Afrikaans, IsiXhosa, British Sign Language and Welsh.
² A Levels are the main pre-university qualifications in England.


It started when Carol Fitz-Gibbon was asked by a school head how good their results were. She replied, “How would I know, without knowing the starting point?” … and so ALIS collected baseline data (a measure of Developed Ability and prior attainment), measures of attitudes and aspirations, process measures (how the lessons were conducted) and the A-level results. The analyses, provided to schools and colleges, presented their results using codenames so that each department knew how it was doing in comparison with other departments with similar intakes. It did this in a comprehensive and detailed way, based on the literature, interviews, analyses and robust statistics; Carol was the co-author of a bestselling series on research methodology and statistics (for example, Fitz-Gibbon et al., 1987). The lessons learnt in developing monitoring systems, including ALIS and PIPS and systems for governments (England, Scotland and China³), can be found in Fitz-Gibbon and Tymms (2002).

³ We won a three-year project to advise the Ministry of Education in China on setting up a value-added system in 2005.

In 1992, there was not as much emphasis on early years education as there is now. Schools were more focused on the (publicly known) outcomes at the end of compulsory schooling. The PIPS project is significant because it provided tracking in the primary years. Increasingly, we understood that no period in the school journey could be fully examined without knowing what happened in the first year at school. And from there it was a small step, but a major change, to focus on what happened during that year.

These monitoring systems all ran within the Curriculum, Evaluation and Monitoring (CEM) Centre at Newcastle University in the north-east of England. In 1996, the Centre moved to Durham University and was later renamed the Centre for Evaluation and Monitoring. In 2020, CEM was sold to Cambridge Assessment for £16 million.

First Steps

In its original conception, PIPS covered the whole of primary/elementary school educational outcomes. We started by collecting outcome data, measures of things which were taught in school, alongside developed ability⁴ measures, something which is not explicitly taught, in Year 6 (age 11). The first step was to create tests. I had spent many years teaching science in schools and decided to begin with a science test. We found a school in Newcastle that was happy for us to collect initial data. The administration of the test went well and allowed the project to move to the next stage with more schools, more instruments and, after some analysis, refinement. At this time, 1992, in England, there was no national testing in primary schools, there was no Ofsted (an official high-stakes inspection system) and there was no national curriculum. After we had collected the data, which took half an hour, the teacher said that she did not think the children would sit still and quiet for half an hour; how far have we come since that time? How different schools are now!

Eventually we would need reading and mathematics tests alongside a measure of developed ability for each year, as well as attitude measures. A great help came from the well-known educational psychologist at Newcastle University, David Moseley, who allowed us to use a non-verbal ability test which he had developed, called POP (Problems of Position) (Moseley, 1976).

⁴ Developed ability refers to an underlying trait which enables an individual to do well educationally, given the right opportunity. Conceptually it is close to IQ, but that term has negative connotations, not the least being that it is fixed and innate.

Development

Shortly after this, Carol Fitz-Gibbon, Bruce Carrington and I were able to fund the appointment of a Research Associate (RA) to work on the project. Simon Gallagher was with us for about 18 months, working in the schools and preparing assessment materials. We were fortunate in being able to appoint Christine Merrell in 1994 as our next RA. By the time Christine was appointed, we had already started a baseline assessment project, conceived as a test for children starting school so that we could look at their relative progress (value-added) at later points.

We began the baseline assessment by reviewing the literature, thinking that we might be able to use a baseline assessment off the shelf. However, it became clear that none would suit our purpose. For example, the literature review indicated that phonological awareness was an important element in children's development towards literacy, which should be picked up in a baseline assessment, yet none of the available tests included phonological awareness. We therefore created a baseline assessment which included those factors which the literature said predicted reading, as well as those which would predict mathematics. The literature on mathematics was noticeably thin on this and many papers took a theoretical stance. In line with the literature, we included sections which could assess developed ability, such as the matching of shapes.

At this time, we enlisted the help of the National Association of Head Teachers (NAHT), who gave a grant towards the development of the PIPS baseline and set up a committee to advise. The NAHT representative was Liz Paver, who was the president of the NAHT and Head of Intake First School in Doncaster. We also had good links to Solihull, a Local Education Authority, through Chris Trinick, its Director of Education. We worked with them on the PIPS baseline development. Pat Preedy, who was a headteacher and had been a Reception⁵ Class teacher, was a key figure in the Solihull PIPS committee (see Chap. 2). These two groups, from the NAHT and Solihull, provided a solid professional basis for development, providing advice and credibility to the project.

⁵ The Reception Class in England is the year before compulsory school starts in Year 1. Well over 90% of parents send their children to Reception, although this is not a legal requirement.

As the project expanded, we started to amass a large amount of assessment data and to get feedback from teachers in the schools about the project. The teachers used a booklet for the baseline assessment, which they shared with the child, and a sheet on which they recorded the answers. This was entered into computer files,
analysed, and the results were presented in reports to schools. A key feature of the assessment has been the use of sequences with stopping rules, something that had been used to good effect in Hunter Diack's (1975) vocabulary tests and other assessments. It meant that one could include in the booklet a wide range of material which would not be seen by all children, because each section had items ordered by difficulty and each section was stopped when the pupil reached their limit.

At this stage we became aware of two factors that shaped our work's development. Firstly, teachers often said that they would like to conduct the assessment again at the end of the year, because they knew that their pupils made great progress during their time in Reception. By including more advanced material, we revamped the assessment so that it could be used at the end of the year. Many pupils learn to read during that first year at school, as well as making rapid progress in early mathematics. Secondly, it became clear that some of the material within the assessment was not predicting well and could be replaced with more promising sections. For example, counting did not prove to be a good indicator of how well pupils would do later in mathematics, contrary to some of the literature, so we expanded other areas in mathematics (Tymms, 1999a, b). There was one exception to this generality: children who were unable to count even four objects were more likely to struggle than the rest of the assessment suggested. This was an unexpected finding. During the PIPS project's ongoing development over 30 years in various countries, the researchers came across other surprises (see Part VI).

The emphasis on prediction was deliberate, as our aim was to help teachers by identifying a child's development along the pathways to literacy, numeracy and success at school, rather than to create an assessment which reflected the curriculum. A curriculum may or may not mention such factors but, across the world, all aim for literacy and numeracy. To this end, it was important to concentrate on the most pertinent factors in the little time that was available.
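The stopping-rule mechanism is easy to express in code. The sketch below is a minimal illustration in Python, not the actual PIPS implementation; the function names and the three-consecutive-failures criterion are assumptions chosen for the example. The key idea is that items within a section are ordered by difficulty and the section ends as soon as a child reaches their limit, so able children see harder material while others are never pushed far past what they can do.

```python
# Illustrative sketch of a "sequence with stopping rules" (hypothetical
# names and stopping criterion; not the actual PIPS code).

def run_section(items, administer_item, max_consecutive_failures=3):
    """Present items (ordered easy -> hard) until the child fails
    max_consecutive_failures in a row; return the number correct."""
    score = 0
    consecutive_failures = 0
    for item in items:
        if administer_item(item):  # callback returns True if answered correctly
            score += 1
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= max_consecutive_failures:
                break  # stopping rule reached: end this section early
    return score

# Example: a child who manages the first five letter-recognition items.
items = ["a", "s", "t", "m", "o", "th", "sh", "qu"]
responses = iter([True, True, True, True, True, False, False, False])
print(run_section(items, lambda item: next(responses)))  # -> 5
```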

Computerisation

As the project progressed, it became clear that the baseline assessment lent itself to computerisation. Schools were beginning to get computers, and the complex adaptive nature of the assessment meant that it was possible to make errors when using the paper-based version. We first went to an outside contractor, but the arrangements did not work well. A key Local Education Authority used the computer version produced by a contractor in all their Reception classes. To our horror, we discovered that as the teachers were collecting the data, a bug in the software became apparent. After a teacher had assessed 30 children, all data were lost. Teachers who discovered this problem then reassessed their children and, in some cases, teachers assessed their class three times. On each occasion, the data were completely wiped out. This was so horrific that we thought that the PIPS baseline assessment project was dead. I lost sleep over it and spoke to more than 100 schools eating humble pie. To my amazement, they were very understanding and more or less said, “Okay, that kind of thing happens. We're still with you. Keep going.”


Programming in-house proved to be the way forward. For many years, CDs were produced annually and sent out to schools; later, the software went online.

1.2 Expansion and Transfer to Durham

The PIPS project, as a whole, continued to expand, although it was not as large as CEM's secondary projects (see Tymms & Coe, 2003), which were starting to make a national and international mark. ALIS resulted in a consultancy for Scotland which produced a parallel national value-added system from Standard Grade, the Scottish leaving examination at the time, to Highers, the pre-university qualification. Also around this time, Carol approached Durham University to see if they would be interested in taking CEM, resulting in the wholesale move of CEM. In December 1996 we moved PIPS, as an advance party, to the School of Education at Durham University under David Galloway, then the Head of Department. The following year, the secondary projects followed, with almost all CEM staff transferring to Durham University.

We were now using Research Associates with programming skills to write the necessary software. First this was carried out by Brian Henderson, later Paul Jones and then Gideon Copestake. Shortly after arrival in Durham, we appointed a secretary, Kate Bailey, who became CEM's international development officer before being appointed as one of CEM's Directors and is now in overall charge of CEM.

Development and Decisions

During the development of the PIPS baseline, several issues needed resolving. Some of the solutions have stood the test of time and remain in place to this day; others were partially resolved and continue to be explored, whilst others have simply been left in abeyance.

The major issues, which were largely solved in the early stages of the project, relate to the cognitive part of the assessment and much of its content. Assessing young children is quite different from administering an examination to teenagers. A pupil starting school usually does not read or write, and has a small short-term memory, perhaps limited to recalling two or three items. The time for which a pupil can focus on a task is limited, and anything more than 20 minutes is likely to be problematic. Finally, there is no universal global curriculum at this age, so what should be asked and why?

The solution involved a series of mini-tests with the use of sequences with stopping rules. This allowed for an extensive bank of material in the assessment, covering very slow developers at the age of 4 through to fast developers at the age of 6: from children who cannot recognise any letters or numbers, cannot point to simple words in a picture and have difficulty counting to four, to children who can read and understand complex passages, do sophisticated mathematics and answer questions such as “what is


twice three doubled?” The pictures, interactions and the introduction of a series of new sections kept pupils engaged. The format also resulted in a remarkably high test-retest reliability (up to 0.99), ensuring that the results are independent of the administrator. It was also quick to administer, taking about 20 minutes.

As noted earlier, the content was created on the basis of what best predicts later reading and mathematics. At first, analysis concentrated on standardised test results a year later, but then extended to age 7 and beyond. That was a good starting point, and the content was refined after a few years, once outcome data were available. The result is an assessment with high predictive validity: 0.7 after 1 year and 0.5 up to the age of 16 (Tymms et al., 2018). More details of the construction of the assessment can be found in Merrell and Tymms (2016) (see also Chap. 8).

Our focus was on children starting school at whatever age that might be. In England, the youngest children are just 4 years old and the oldest are nearly five. In Scotland, they are generally 6 months older, whilst it is more common, internationally, for children to start at the age of 6 or 7 (The World Bank, 2020).

Issues which were partially dealt with include assessing children with special needs, which appear in a variety of forms, including hearing loss, visual impairment, autism and dyslexia. For some of these, we were able to adjust and provide suitable assessments. We were awarded a Nuffield grant to address the issue of assessing children with hearing loss. With the help of two specially appointed RAs, we produced a British Sign Language version of the PIPS baseline assessment and a publication (Tymms et al., 2003) which confirmed the reading and mathematics delays of (D)deaf children and showed what meaningful assessments can be carried out at the start and end of the first year at school. For children with visual impairment, we developed physical material which could be used alongside the computer.

A further unresolved challenge was that we wanted to collect data on more and more variables. For example, Pat Preedy had suggested that we collect data on physical development. We set up plans to collect information on height and weight as well as fine and gross motor coordination. This did not gain traction until Tiago Bartholo and Mariane Campelo Koslinski incorporated a measure in the Brazilian version of iPIPS. Pat was also keen that we collect data on twin status, because of the lack of educational research into the education of twins and the growing number of twins and multiples in society; up to one in 30 live births (Tymms & Preedy, 1998).

We also started to collect data on ADHD characteristics, which produced useful findings, especially following an Economic and Social Research Council funded randomised controlled trial (Tymms & Merrell, 2006a, b; Sayal et al., 2010). Christine Merrell completed her PhD in this area, making a mark internationally with her work. This resulted in our being part of the high-profile National Institute for Health and Care Excellence (NICE) investigation into ADHD (https://www.nice.org.uk/guidance/NG87), the first time that educationalists had been involved in NICE.

There are many other things that were left in abeyance or were simply not thought of all those years ago. These include executive function and personality, but not everything is possible. One has to consider how much time there is to collect data and how useful the data are going to be.


We tried to advance on various fronts, making things available as optional extras, such as personal, social and emotional development, behaviour and short-term memory. Each aspect of the assessment was important in its own right, but eventually we dropped some of them; for example, the assessment of matching shapes, although data from that assessment were used in a later publication (Demetriou et al., 2017).

Feedback to Teachers

Throughout the development of the PIPS baseline with follow-up, we were keen to communicate the assessment results to schools as clearly as we could. This involved looking at distributions, means, standard deviations and scattergrams, amongst other things. The feedback to teachers used these ideas in tables, charts and text. Two examples are given below.

The box-and-whisker plots in Fig. 1.1 show, for one class at the start of the year, each pupil's standardised T-score for early mathematics and reading. Teachers could readily see any clustering of scores and identify outliers who might need special attention. They could also see how their class compared with national data, which had a mean of 50 and a standard deviation of 10. Later, when feedback was provided electronically, it was possible to hover the mouse over a point and the pupil would be identified, with a line linking to the score on other measures (Fig. 1.2).

Fig. 1.1  Feedback in box-and-whisker plots (distributions of early reading and early maths scores for a class on a standardised score scale)

Fig. 1.2  Stacked graph: Reading (green), maths (red) and phonological awareness (blue); one bar per pupil

The chart in Fig. 1.2 shows the results of each pupil in standardised scores. The rank order is for reading and mathematics together. By doing this, teachers could easily see when a reading or mathematics score for a pupil was very different, and also when a phonological awareness score was low. This can be important: pupils who have early difficulty on that measure are likely to have some difficulty with learning to read, and spotting this enables appropriate action to be taken.

The scattergram in Fig. 1.3 shows the results for one class, with the start-of-year standardised scores plotted against the end-of-year scores. The line of best fit is for the full population, and the two lines parallel to it hold 95% of the population. Using the chart, teachers could see results from two perspectives. One concentrated on individual children and the progress that they had made compared with other children of similar starting points. The other concentrated on the class as a whole. In the example, the class had, on average, made more progress than expected, given the pattern across all classes. This simple presentation enabled detailed discussion at the level of the child, the class, the school and nationally.

Fig. 1.3  Scattergram

In designing the feedback and running in-service courses, we concentrated on graphical feedback in the belief that this was the easiest approach. If teachers could understand the charts, then almost nothing else was needed. There was, however, a problem, and signs started to appear. After attending an in-service training course, one headteacher described how she had gone back to school full of the new ideas she had just heard, but when she tried to explain value-added to the staff, she realised that she had not grasped some of the key concepts. We also discovered that there were some staff who could not plot points on a scattergram. The crunch came in 1996 when, as part of the Value-Added National Project, we randomly assigned primary schools to receive data in tables, in graphs, or both graphically and in tables. The data did not involve baseline measures but related to classes of 11-year-olds. When we checked the results the following year, the schools that had been given tables did better than those that had been given graphical feedback (Tymms, 1997). Graphs were simply not as effective as tables. I had been wrong, at least so far as the charts we were using were concerned.

We tried another approach to enhancing the impact of the data, which we believe was more successful, although it was never formally evaluated. This involved a dedicated seconded headteacher working directly with schools, talking with the head teacher and class teachers about their PIPS results and discussing, professional to professional, the implications for action which could be drawn from the data. That head teacher was Mike Kilyon, who worked in Bradford Local Education Authority. Further discussion of work with teachers and schools can be found in Part V.

Although the major aim of PIPS was to improve education through feedback to schools and teachers, the data which were collected lent themselves to academic research. Many educational questions could be explored through analysis of the data, and at least 70 publications, including four doctoral theses, have arisen from the project, some of which are outlined in this book, with a list given in the Appendix. Some of the research involved designing careful investigations as randomised controlled trials, that is, true experiments.
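The logic behind the scattergram in Fig. 1.3 is a regression of end-of-year scores on start-of-year scores: a pupil's value-added is the distance above or below the population line, and the parallel lines mark the band holding roughly 95% of pupils. The sketch below, in Python with simulated data, shows one way this kind of chart could be computed; the numbers and the 1.96-standard-deviation band are illustrative assumptions, not the PIPS models themselves.

```python
# Minimal sketch of the value-added logic behind Fig. 1.3 (simulated data).
import numpy as np

rng = np.random.default_rng(0)

# Simulated population of start- and end-of-year standardised scores.
start = rng.normal(50, 10, 5000)
end = 0.7 * (start - 50) + 50 + rng.normal(0, 7, 5000)

# Population line of best fit and the band holding ~95% of pupils.
slope, intercept = np.polyfit(start, end, 1)
residuals = end - (slope * start + intercept)
band = 1.96 * residuals.std()

# One class of 25: individual pupils vs expectation, and class value-added.
class_start, class_end = start[:25], end[:25]
class_resid = class_end - (slope * class_start + intercept)
print("class value-added:", round(class_resid.mean(), 2))
print("pupils outside the 95% band:", int((np.abs(class_resid) > band).sum()))
```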

Interventions

Carol Fitz-Gibbon had called her most successful venture, the ALIS project, her 'disreputable research', because it could only ever find associations. ALIS was not able to establish causal relationships because there was no random assignment to an intervention, and it was therefore not a true experiment. In PIPS we were always looking for ways in which we could try to improve education through interventions.

One study aimed to help children who were severely inattentive, hyperactive and impulsive to succeed in the classroom, and it made use of the optional behaviour section of PIPS in addition to the cognitive development parts. The study involved randomly assigning interventions to schools, one of which was an advice booklet for teachers (Merrell & Tymms, 2002). Another intervention was simply to pass on the names of pupils with high scores on the behaviour rating scale. Some schools received both interventions and some received neither. It turned out that the advice booklet helped all children, rather than just those with high inattention, hyperactivity and impulsivity, whilst the naming of children had no long-term impact (Sayal et al., 2010).

Another approach to intervention research is to capitalise on the opportunities presented by natural events. Covid-19 presented such an opportunity for a 'natural experiment' which can potentially assess the impact of not attending school, something which could not be randomly assigned in the usual course of events. Tiago Bartholo and Mariane Koslinski are pursuing this possibility in Brazil.
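The study just described is, in effect, a school-level 2×2 factorial design: booklet (yes/no) crossed with naming (yes/no). A minimal sketch of such a random assignment is given below; the school identifiers and the seed are hypothetical, and the real trial's allocation procedure may well have differed (for example, by using stratification).

```python
# Sketch of a school-level 2x2 factorial assignment (hypothetical schools;
# not the trial's actual allocation code).
import random

random.seed(42)  # hypothetical seed for reproducibility
schools = [f"school_{i:02d}" for i in range(20)]  # hypothetical identifiers
random.shuffle(schools)

# Four cells of the 2x2 design: booklet (yes/no) x naming (yes/no).
cells = ["booklet + naming", "booklet only", "naming only", "neither"]
assignment = {school: cells[i % 4] for i, school in enumerate(schools)}

for school in sorted(assignment)[:4]:  # show a few allocations
    print(school, "->", assignment[school])
```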

1.3 Other Countries

We started to work in other countries from 1999, and a big step forward was made in the Antipodes. A CEM Centre was set up in New Zealand to work on secondary school projects, and the PIPS project was set up a few years later with a special New Zealand version of the baseline and follow-up assessments. It never expanded after the initial setup, but it remains an important project to this day. This was followed by Western Australia and Helen Wildy, who was then at Edith Cowan University


and set up a unit to develop PIPS. Durham provided an Australian version of the baseline and the project expanded dramatically. At its height in 2009, it was operating in every territory and state in Australia, including complete coverage for Tasmania and the Australian Capital Territory. PIPS ebbs and flows according to government policies and other local factors. Wherever it has grown, it has largely been instigated through local decision making.

Cultural adaptations for Australia included a suitable voice-over for the software and some adjustment to the items and images. Discussion about a suitable Australian voice for the software revealed that a single recording would suffice to cover the whole of the enormous land mass of Australia. By contrast, not only did we need different accents in England and Scotland, but even travelling a few miles can reveal different accents and even a local vocabulary. We advised schools in the UK to try the recorded voice and, if the children were not happy, to turn the sound down so that the teacher could read out the questions. The local accent provided a surprising oddity. We used a Cantonese voice for children from Hong Kong who were just starting to acquire English, but when we tried the voice-over in Hong Kong, it was clear that Cantonese with a Geordie (Tyneside) accent would not work there!

Although most of the images worked well in Australia, there was one countryside picture which was simply too English for Australian eyes. We commissioned a new scene, which we shared with some trepidation as we thought it might be seen as being rude to Australians. But they loved it, and it is shown below (Fig. 1.4).

Fig. 1.4  Australian image

Other images were changed as we worked in southern Africa and elsewhere. The PIPS baseline began to be used in other countries. We started a project in Cantonese in Hong Kong under SM Tsui's leadership, and data were also collected

in Beijing in Putonghua. In the Netherlands, Anneke van der Hoeven did exemplary work on OBIS (Onderbouwinformatiesysteem; Lower School Information System), as PIPS is known in the Netherlands (Van der Hoeven-van Doornum, 2005). To our great sadness, Anneke passed away in December 2010. Monika Wylde adapted PIPS for use in the German school in London (Tymms & Wylde, 2003) and ran it there to good effect. In German, PIPS is known as FIPS. Later it was adopted by DIPF (Leibniz Institute for Research and Information in Education) and run by the publisher Hogrefe. It is now run from Universität Tübingen. The Abu Dhabi Education Council (ADEC) worked with us to develop an Arabic version which was used by all state-funded primary schools across the Emirate on a mandatory basis for 7 years.

In South Africa, we worked with the Centre for Evaluation and Assessment (CEA) at the University of Pretoria in Gauteng, under the direction of Professor Sarah Howie, from 2003 to 2012. The work involved 22 primary and secondary schools and the doctoral studies of Vanessa Scherman and Elizabeth Archer. In 2014, we won a Nuffield grant for work on the iPIPS project in the Western Cape, which was reported in 2017 (see Chap. 15). In all of this work, much time was devoted to the Africanisation of the assessment, which involved a series of translations and adaptations.

During this international expansion, our focus was on providing a culturally appropriate assessment and feedback to schools and teachers in their respective countries. Sometimes this involved adaptations into different languages. Our aim was not to compare across countries, but when the iPIPS project, which is discussed later, was set up, there was the possibility of making comparisons.

1.4 Developmental Issues

It has long been known that using baseline assessments to identify special needs at an early stage can be problematic (see, for example, Mercer et al., 1979). The difficulty arises because of the relatively high proportion of false positives and false negatives, something that we were able to see in our own data. That is to say, pupils can be wrongly identified as having special needs (false positives) or wrongly missed when they have special needs (false negatives). Figure 1.5 shows a scattergram of the baseline total scores against a composite outcome 7 years later, in Year 6, made up from mathematics, reading and science scores for just over 1000 students. Often a cut-off of two standard deviations below the mean of an assessment is used to identify children with special educational needs, and for PIPS this is shown in Fig. 1.5 with a vertical line at −2. The horizontal line shows a similar cut-off on the composite outcome. The bottom left-hand quadrant, created by the lines, shows the true positives, those below the cut score on both occasions. The top left and the bottom right quadrants respectively indicate false positives and false negatives. There were no true positives. This is a sobering finding and one that should make educationalists cautious about the early labelling of learning difficulties: prediction is possible, but only to a limited extent. The limit is to account for about 50% of the variance in tests a few years ahead, corresponding to a correlation of about 0.7 (Fig. 1.5).

Fig. 1.5  Identifying special needs – PIPS Baseline (BLA) to outcomes 7 years later

Success in a baseline assessment carried out on entry to school will have been influenced by all sorts of factors, such as the effort at home to teach letters, the degree to which the child has settled at school and the amount and quality of stimulation. None of these were measured. But by re-assessing at the end of the first year at school, the uncertainty introduced by some of these factors can be reduced and a better prediction can be made. If nothing else, a second assessment will reduce the uncertainties of measurement. Predicting the composite outcome from the total scores at the start and end of the year produces a multiple R of 0.71. This is better than the 0.63 based on the start of year alone. It still does not allow for anything like a firm identification of special needs or giftedness. However, the baseline assessment could start the process of identifying special needs. It alerts teachers to children at risk at an early stage of their schooling, and their progress can be carefully monitored. We need to be cautious before labelling anyone as having a particular problem. However, our purpose with the baseline was not to identify special needs. It was to form a base for looking at value-added measures and, in particular, to provide feedback to teachers.
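The quadrant analysis in Fig. 1.5 can be reproduced in outline with simulated data. The sketch below assumes standardised scores correlating at about 0.7, as reported above, and applies the −2 standard deviation cut-offs; all other details are illustrative. Even with a correlation this high, the counts make clear how few children flagged at baseline remain below the cut-off years later.

```python
# Sketch of the false-positive/false-negative analysis behind Fig. 1.5
# (simulated scores; the real data correlate at about 0.7, as in the text).
import numpy as np

rng = np.random.default_rng(1)
n, r = 1000, 0.7

# Standardised baseline and Year 6 composite scores correlating at r.
baseline = rng.standard_normal(n)
composite = r * baseline + np.sqrt(1 - r**2) * rng.standard_normal(n)

cut = -2.0                      # two standard deviations below the mean
flagged = baseline < cut        # identified at the start of school
low_outcome = composite < cut   # low composite outcome 7 years later

print("true positives: ", int(np.sum(flagged & low_outcome)))
print("false positives:", int(np.sum(flagged & ~low_outcome)))
print("false negatives:", int(np.sum(~flagged & low_outcome)))
```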


Around the mid-1990s, there was a resurgence of national and international interest in performance indicators, which was picked up and promoted by the English government. The Department for Education set up a project to examine the possibility of implementing value-added measures nationally, and CEM won the contract. After 2 years, a clear way to set up value-added measures for secondary schools on a national basis was outlined. Had the final report (Fitz-Gibbon, 1997) been implemented, it would have been the end of CEM. The DfE contract was extended to look at baselines in primary schools and their potential for creating value-added measures. The final report (Tymms & Williams, 1996) was clear that good baselines could form the basis for value-added measures, and it was decided that schools would have to administer one baseline assessment, chosen from a list of 'approved' tests.

Strategically, there is a difficulty with government involvement in baseline assessment. It is difficult for some people in power to see a baseline as something that would only give information to teachers and schools. Their mindset can be to hold schools to account, be that by target setting or through inspectors looking at data. Part II discusses accountability and high-stakes behaviours in large-scale assessments and monitoring. We were strongly opposed to that and saw PIPS as providing information for teachers; once one started to use it in other ways, the data would be corrupted: “where there is fear you get the wrong figures” (W. Edwards Deming; https://quotes.deming.org/). This is an issue which has come up repeatedly, especially where a project starts to be large scale and successful. The issue creates tensions and ultimately is almost unresolvable.

1.5 Quality Assurance

During the development and later work on the PIPS baseline with follow-up, we were always mindful of quality assurance. That was approached in three ways. The first involved listening to teachers' opinions and making adjustments in the light of their feedback. Sometimes that came from committees such as the NAHT group and the parallel committee in Solihull, but it also came from the enormous amount of in-service work we did in schools around the UK.

The second approach involved statistical analysis and concentration on the internal reliability of the scales, using classical test theory to start with and later Rasch measurement. It also involved test-retest checks, in which an independent researcher reassessed children whom teachers had already assessed. From these approaches, which have been employed across all of the PIPS projects, we knew that we had highly reliable data; it did not matter which teacher carried out the assessment for the objective parts, although the teachers' ratings in the optional sections did vary somewhat from teacher to teacher. To assess the extent of that variation, we asked teachers and their assistants working in the same classes to rate all children on the same scales without discussion.

A final and crucial check on quality involved ascertaining the extent to which the assessment was predictive of later success, as discussed earlier.
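The two statistical checks described above can be sketched briefly. The code below, with synthetic data, illustrates a test-retest correlation between a teacher's scores and an independent researcher's re-assessment, and Cronbach's alpha as a classical-test-theory index of internal reliability; the scales, noise levels and item counts are made up for the example and are not the PIPS analyses themselves.

```python
# Illustrative reliability checks on synthetic data (assumed parameters).
import numpy as np

rng = np.random.default_rng(2)

# Test-retest: teacher scores vs an independent researcher's re-assessment.
true_ability = rng.standard_normal(200)
teacher = true_ability + rng.normal(0, 0.1, 200)
researcher = true_ability + rng.normal(0, 0.1, 200)
print("test-retest r:", round(np.corrcoef(teacher, researcher)[0, 1], 3))

# Internal consistency (classical test theory): Cronbach's alpha over items.
items = (true_ability[:, None] + rng.normal(0, 0.8, (200, 20)) > 0).astype(float)
k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                       / items.sum(axis=1).var(ddof=1))
print("Cronbach's alpha:", round(alpha, 2))
```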


1.6 Opposition to Baseline Assessment

Although teachers generally liked PIPS, and although schools bought into it on a large scale, with at least 5 million children assessed since the start, there was a growing movement which opposed the assessment of young children. The origin appeared to be a fear that young children's education was moving in the direction of the 'Gradgrind' approach seen in the later years, and a fear that educational performance indicators and holding schools to account would become part of pre-school – a not ungrounded fear. There was also apprehension about assessing young children at all. Very strong views are held in this area (Guardian, 2019).

Our view is that it would be wrong not to assess children. It is their right to have high-quality development checks throughout childhood. If this does not happen, key issues that could be dealt with at a point during development will be missed. People are also worried about labelling but, as noted earlier, we developed evidence to the contrary.

Despite the arguments and evidence, the movement against baseline assessment, and indeed against testing more generally, grew in strength, and PIPS lost some of its widespread appeal and use (see, for example, https://www.morethanascore.org.uk/). It also meant that observational systems gained strength, in the mistaken belief that they somehow produced more valid data for all outcomes. Jan Dubiel was asked in an interview whether he would consider creating a non-observational assessment and he replied: “Well, no, because it would be unreliable” (Murray, 2018). Our position has always been that some variables are best assessed by observation, some by asking questions, and some by both teacher observations and assessments using highly reliable processes as offered by PIPS.

1.7 A New International Approach: iPIPS

By 2013, it became clear that there was another possible use for the PIPS baseline: as an international study to parallel well-known studies such as PISA, PIRLS and TIMSS, which are discussed in Part II. The new project, called iPIPS (International PIPS), was set up. We had financial backing from Durham University to create a world-wide project. We had initial success in the UK with the help of David Hawker, who had been Head of the Department of Children, Education, Lifelong Learning and Skills, responsible to the Welsh Cabinet for the management of the Welsh education system, and whom we knew from the Value-Added National Project. Based on existing data, we produced iPIPS reports for England (Tymms et al., 2014) and Scotland (Tymms et al., 2015) and forged good relations with the OECD, who were also aiming to set up an international project for young children. Together with the OECD, we found it hard to make headway against the opposition to assessing young children, although we did get some new projects going. The OECD eventually set up the International Early Learning and Child


Well-being Study (IELS) with just three countries: England, the United States and Estonia. We were able to link up with academics in Russia, South Africa, Lesotho and Mexico, and set up iPIPS projects. In Brazil, work was started with the Instituto Alfa Beto, and later a second active group was formed in the Universidade Federal do Rio de Janeiro.

In addition to working directly with schools, a major aim of the iPIPS work was to provide useful information for policy makers – something which all the participating countries have attempted to do (Part II). During this work we had extensive discussions about comparisons both within and between countries. This discussion covered feasibility: does it make sense to compare the mathematics level of children starting school in Scotland with similar data from South Africa, when the age of the children, the cultural background and the language(s) spoken at home may be different? Even before considering the feasibility of comparison, we have to address the issue of adapting the assessments for use in differing contexts. The other major concern was impact: would any attempt at comparison have a positive impact? (see Parts II and III).

Appendix: Details from Brazil, Russia and Australia

Brazil

In the past 3 years (2017, 2018 and 2019), researchers from the Federal University of Rio de Janeiro (UFRJ) have analysed children's development in their first 2 years at school – ages 4 to 6 in the Brazilian context – within the scope of the iPIPS project. Three projects were conducted with different samples in two cities (public and private schools), with longitudinal data collected for around 4800 children enrolled in a total of 123 schools. The projects provided an opportunity to describe what children know and can do when they start school and what they learn in the first 2 years at school. They also aimed to understand the patterns of educational inequality and to identify the impact of education policies/programmes and the school characteristics associated with children's learning and development during the first years of compulsory schooling.

The first project was developed during 2017–2018 and included a random sample of 46 public schools and 2716 children from Rio de Janeiro's local education system. The project included three waves of data collection: at the start of the first year of pre-school/compulsory education (March/April 2017), at the end of the first year of pre-school (November/December 2017), and at the end of the second year of pre-school (November/December 2018). The study tracked children who were at the 46 schools in the first wave of data collection but moved to other public schools during 2017 or 2018. It also included the children who entered the 46 sample schools during the study (after the first wave). Overall, therefore, the project assessed 4302 children.


The second and third projects were carried out during the academic year of 2019. They included a non-random sample of 36 private schools and 1407 children from Rio de Janeiro, and a random sample of 670 children enrolled in 41 public schools in the city of Sobral. Both projects collected data at the beginning (March/April 2019) and at the end of the academic year (October/November 2019), with children enrolled in the first and second years of compulsory education (children aged 4–6) simultaneously.

Projects | Start date | End date | Children assessed | N of schools | Data collected
Rio de Janeiro – public schools | Mar/2017 | Dec/2018 | 4302 | 46 | O1: PIPS, SRT; O2: PIPS, SRT, PSED, height/weight; O3: PIPS, SRT, PSED, CLASS, contextual questionnaires
Rio de Janeiro – private schools | Mar/2019 | Nov/2019 | 1407 | 36 | O1: PIPS, SRT, family questionnaire; O2: PIPS, SRT, PSED, school questionnaires
Sobral – public schools | Mar/2019 | Oct/2019 | 670 | 41 | O1: PIPS, SRT, family questionnaire; O2: PIPS, SRT, PSED, school questionnaires
Total | | | 6379 | 123 |

The data have been used for research purposes as well as to give feedback to local governments and to provide reports and workshops to schools (teachers and headteachers). We are currently planning a third wave of data collection with the children enrolled in the samples of private schools in Rio de Janeiro and public schools in Sobral, to investigate the impact of school closures and quarantine, due to the COVID-19 pandemic, on children's development.

Russia

Development of the Russian version of the iPIPS baseline and follow-up assessment was initiated by the Institute of Education, National Research University Higher School of Economics, in 2013. Since then, every year at least one regional sample of Russian first-graders has been assessed using the iPIPS instrument. During this period, the project has developed from the “paper booklet and app” version to a computer adaptive testing version (based on sequences with stopping rules). A number of regions in Russia received comprehensive feedback on what children know and can do at the start of schooling and what progress they make during their first year at school. Several sets of large longitudinal data were used for research purposes, including grant work for the Russian Science Foundation.

The first pilot assessment, serving as a feasibility study, was conducted in autumn 2013. The Russian sample consisted of 310 children recruited from


21 classes in 21 schools in the Velikiy Novgorod region, located in the central part of Russia. This region was selected because its socio-economic characteristics were similar to those of the country as a whole, based on the 2010 census (Social and Demographic Portrait of Russia, 2010). For example, the distribution of the region's population by educational level (62% college and above; 30% high school; 8% below high school) was similar to the national figures (65% college and above; 29% high school; 6% below high school), as was the ratio of urban to rural students in the region (72% urban; 28% rural). The sample was randomly selected after stratification on two parameters: (i) the school location (rural or urban area) and (ii) the status of the school (there are three main types of schools in Russia: comprehensive (general regular) schools, schools specialising in a certain subject, and gymnasia). All the chosen schools consented to participate. Parental consent was obtained for children to participate in the study (the majority of parents, around 90%, gave permission for their children to participate).

The assessment cycle included two waves of data collection: the baseline assessment was conducted at the beginning of October (children in Russia start the first grade on 1st September) and the follow-up assessment was conducted at the end of April. The average age of children in the sample was around 7.4 years in October. The next cycles used more or less the same scheme of assessment procedures (including the dates of assessments) and sample construction, and covered several central, southern and eastern (Siberian) regions of Russia. The only exception was a series of assessment cycles in Moscow, which included a non-random sample of around 10–15 schools, as well as private schools in separate cases. The table below summarises the main assessment cycles across years and regions.

Year | Projects | Children assessed | N of schools | Data collected
2013–2014 | Velikiy Novgorod | 310 | 21 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, contextual questionnaires
2014–2015 | Krasnoyarsk | 1440 | 27 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2014–2015 | Tatar Republic (Kazan) | 1568 | 38 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2014–2015 | Moscow | 1104 | 10 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2015–2016 | Tatar Republic (14 jurisdictions) | 5265 | 152 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2015–2016 | Sevastopol | 1283 | 21 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2015–2016 | Tambov | 1458 | 9 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2015–2016 | Ekaterinburg | 129 | 1 | Wave 1: PIPS, PSED; Wave 2: PIPS, PSED
2016–2017 | Moscow | 3162 | 16 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2016–2017 | Tatar Republic (Kazan) | 1289 | 30 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, Behaviour Rating Scale, contextual questionnaires
2016–2017 | Petrozavodsk | 10 | 1 | Wave 1: PIPS, PSED; Wave 2: PIPS, PSED
2016–2017 | Ekaterinburg | 124 | 1 | Wave 1: PIPS, PSED; Wave 2: PIPS, PSED
2017–2018 | Moscow | 2189 | 13 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, contextual questionnaires
2017–2018 | Tatar Republic (14 jurisdictions) | 4929 | 148 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, contextual questionnaires
2017–2018 | Ekaterinburg | 108 | 1 | Wave 1: PIPS, PSED; Wave 2: PIPS, PSED
2017–2018 | — | 378 | 3 | Wave 1: PIPS, PSED, contextual questionnaires; Wave 2: PIPS, PSED, contextual questionnaires
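The two-way stratification described above (school location crossed with school status) can be sketched as follows; the school list, stratum sizes and sampling function are illustrative assumptions, not the study's actual sampling frame.

```python
# Sketch of two-parameter stratified random sampling (hypothetical frame).
import random

random.seed(0)

# Hypothetical sampling frame of schools with the two stratification variables.
schools = [{"id": i,
            "location": random.choice(["urban", "rural"]),
            "status": random.choice(["comprehensive", "specialised", "gymnasium"])}
           for i in range(400)]

def stratified_sample(frame, n_per_stratum=2):
    """Group schools by (location, status), then sample within each stratum."""
    strata = {}
    for school in frame:
        strata.setdefault((school["location"], school["status"]), []).append(school)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        sample.extend(random.sample(group, min(n_per_stratum, len(group))))
    return sample

print(len(stratified_sample(schools)))  # up to 2 schools from each of 6 strata
```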


In 2021, we launched a process of “internalising” the iPIPS instrument into a more nationally oriented, “iPIPS-motivated” instrument, which we call START and which accounts for the specific local pedagogical requests of Russian teachers and regional educational authorities.

Australia PIPS Baseline Assessment (PIPS-BLA) has always been offered in Australia as a university research project, introduced on a trial basis to Australia by Dr. Helen Wildy in 2001 and then implemented from 2002 to 2021. Initially the PIPS-BLA was taken up voluntarily by a small number of individual schools across Australia. Increasingly, education authorities introduced and mandated PIPS-BLA with student enrolments peaking in 2009, testing 27,000 students in 813 schools. Among the participating schools were those from nearly all of the 24 different education authorities in Australia  – government schools, Catholic Education schools and Independent Schools in the 6 states and 2 territories. The government education authorities of Tasmania and the Australian Capital Territory (ACT) each mandated PIPS-BLA from 2002. Throughout the two decades of its Australian life, various changes were made. For example, additional items were added to the Australian version of PIPS after a collaboration with Charles Darwin University to expand the range of questions at the lower end of the scale. By 2020 when BASE replaced the PIPS-BLA, only 371 schools took part, testing approximately 16,000 students. When Durham University’s CEM was sold to Cambridge Assessment in 2020, the Australian project continued to be run from The University of Western Australia until the end of 2021 until it was found to be financially unviable, when the remaining 300 schools were passed to the Cambridge Assessment team to manage. During the years Australian data were collected by CEM, allowing for a number of longitudinal studies to be conducted and published. The sample used for the comparative iPIPS research includes data from government schools in Tasmania and ACT.

References

Demetriou, A., Merrell, C., & Tymms, P. (2017). Mapping and predicting literacy and reasoning skills from early to later primary school. Learning and Individual Differences, 54, 217–225.
Diack, H. (1975). Test your own wordpower. http://www.robwaring.org/papers/various/diack.html
Fitz-Gibbon, C. T. (1992). School effects at A-level – genesis of an information system. In D. Reynolds & P. Cuttance (Eds.), School effectiveness: Research, policy and practice (pp. 96–120). Cassell.
Fitz-Gibbon, C. T. (1996). Monitoring education. A & C Black.
Fitz-Gibbon, C. T. (1997). The value added national project: Final report: Feasibility studies for a national system of value-added indicators. School Curriculum and Assessment Authority.


Fitz-Gibbon, C. T., & Tymms, P. (2002). Technical and ethical issues in indicator systems. Education Policy Analysis Archives, 10, 6.
Fitz-Gibbon, C. T., Taylor, F. G. C., Morris, L. L., & Lyons, M. L. (1987). How to analyze data (Vol. 8). Sage.
Guardian. (2019). Testing of under-fives goes ahead despite teaching union objections. https://www.theguardian.com/education/2019/feb/27/testing-of-under-fives-goes-ahead-despite-teaching-union-objections. Downloaded 15/6/20.
Mercer, C. D., Algozzine, B., & Trifiletti, J. J. (1979). Early identification: Issues and considerations. Sage, 46, 52–54.
Merrell, C., & Tymms, P. (2002). Working with difficult children in years 1 and 2: A research-based guide for teachers. Curriculum, Evaluation and Management Centre.
Merrell, C., & Tymms, P. (2016). Assessing young children: Problems and solutions. UNESCO Institute for Statistics (UIS).
Moseley, D. (1976). Helping with learning difficulties, course E201 OU: Block 10. Open University, Milton Keynes.
Murray, C. (2018). Feature: Jan Dubiel. Schools Week, 23rd Jan 2018. https://schoolsweek.co.uk/jan-dubiel-head-of-national-and-international-development-early-excellence/. Downloaded 16/6/20.
Sayal, K., Owen, V., White, K., Merrell, C., Tymms, P., & Taylor, E. (2010). Impact of early school-based screening and intervention programs for ADHD on children's outcomes and access to services: Follow-up of a school-based trial at age 10 years. Archives of Pediatrics & Adolescent Medicine, 164(5), 462–469.
The World Bank. (2020). https://data.worldbank.org/indicator/SE.PRM.AGES. Accessed 12/6/20.
Tymms, P. (1997). The Value Added National Project: Technical report primary 3. School Curriculum and Assessment Authority.
Tymms, P. B. (1999a). Baseline assessment, value-added and the prediction of reading. Journal of Research in Reading, 22(1), 27–36.
Tymms, P. (1999b). Baseline assessment and monitoring in primary schools: Achievements. David Fulton Publishers.
Tymms, P., & Coe, R. (2003). Celebration of the success of distributed research with schools: The CEM Centre, Durham. British Educational Research Journal, 29(5), 639–667.
Tymms, P., & Merrell, C. (2006a). The impact of screening and advice on inattentive, hyperactive and impulsive children. European Journal of Special Educational Needs, 21(3), 321–337.
Tymms, P., & Merrell, C. (2006b). The impact of screening and advice on inattentive, hyperactive and impulsive children. European Journal of Special Needs Education, 21(3), 321–337.
Tymms, P. B., & Preedy, P. (1998). The attainment and progress of twins at the start of school. Educational Research, 40(2), 243–249.
Tymms, P., & Williams, D. (1996). Baseline assessment and value-added (REF: COM/96/578). School Curriculum and Assessment Authority.
Tymms, P., & Wylde, M. (2003). Baseline assessment and monitoring in primary schools. Bamberg.
Tymms, P., Brien, D., Merrell, C., Collins, J., & Jones, P. (2003). Young deaf children and the prediction of reading and mathematics. Journal of Early Childhood Research, 1(2), 197–212.
Tymms, P., Merrell, C., Hawker, D., & Nicholson, F. (2014). Performance indicators in primary schools: A comparison of performance on entry to school and the progress made in the first year in England and four other jurisdictions: Research report. Department for Education. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/318052/RR344_-_Performance_Indicators_in_Primary_Schools.pdf
Tymms, P., Merrell, C., & Buckley, H. (2015). Children's development at the start of school in Scotland and the progress made during their first school year: An analysis of PIPS baseline and follow-up assessment data. School of Education, Centre for Evaluation & Monitoring. The Scottish Government. https://www.gov.scot/binaries/content/documents/govscot/publications/research-and-analysis/2016/01/childrens-development-start-school-scotland-progress-made-during-first-school-year-analysis-pips-baseline-follow-up-assessment-data/documents/childrens-development-start-school-scotland-progress-made-during-first-school-year-analysis-pips-baseline-follow-up-assessment-data/childrens-development-start-school-scotland-progress-made-during-first-school-year-analysis-pips-baseline-follow-up-assessment-data/govscot%3Adocument/00490869.pdf
Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261.
Van der Hoeven-van Doornum, A. (2005). Development on scale, instruction at measure: OBIS, a system of value-added indicators in primary education. ITS/Radboud Universiteit Nijmegen.

Chapter 2

The First Year at School: A Perspective from a Personal Standpoint

Pat Preedy

This chapter reflects on the PIPS baseline assessment from the perspective of a head teacher who was involved in its development and who implemented it in her school.

When I was appointed head teacher of Knowle Church of England (CE) Infant School (Solihull), in the West Midlands of England, in 1989, I was thrilled to work with a team of staff, governors and a local authority committed to Early Childhood Education. Although not required to, Solihull provided nurseries across the borough (local area). I was fortunate that mine was one of the schools allocated a brand-new nursery, in whose design we had a considerable say; it included easy access to an attractive and natural outdoor learning area. There was an energetic local authority early years inspector and much collaborative work.

We were delighted when the Start Right Report (Ball, 1994), underpinned by the Effective Provision of Pre-School Education (EPPE) project, was published. This report acknowledged the importance of Early Childhood Education and led to major changes in policy and funding. Our work at Knowle CE Infant School was acknowledged by the award of a government Charter Mark for excellence in public services. We were also one of the first schools to be awarded Beacon status in order to share our excellent practice.

When nine sets of twins started at Knowle school, making ten sets of twins in one small school, I began a study of the educational needs of multiple-birth children for my PhD, supported by Tamba (the Twins and Multiple Births Association, now Twins Trust). It is strange how all the elements that I was passionate about came together through my work with the PIPS team.

P. Preedy (*) Ohio Dominican University, Columbus, OH, USA


Although these were exciting and positive times, we were also extremely anxious about the introduction of Ofsted (Office for Standards in Education) in 1992. Chris Woodhead, HMCI (Her Majesty's Chief Inspector), left us in no doubt that the focus was on public accountability, with extensive testing from Key Stage One (age 6/7 years) and the publication of league tables and inspection reports. The staff, governors and I were concerned that a focus on national Standardised Assessment Tests (SATs) and inspection would greatly narrow the curriculum and over-formalise pedagogy, with a great deal of practising for external tests. Together we supported the belief that excellent SAT results and a well-rounded education could go hand-in-hand, but we would need to prove this!

Chris Trinick, Solihull's Director of Education, made a point of regularly visiting every school in the borough with a focus on delivering educational quality. He demanded high standards, gave high praise where it was due, and demanded action if things were found to be wanting. He introduced us to the concept of 'value-added' assessment, developed through the CEM Centre, then located at Newcastle University and later moved to the University of Durham (a simple illustration of the calculation appears below). Chris argued strongly that we needed to present accurate and reliable data in order to enhance children's learning and manage the new inspection regime. Supported by Councillor Geoffrey Wright, Chair of the Education Committee, Solihull entered into an agreement to develop a baseline assessment in order to provide an all-through value-added system of assessment from Reception (the start of school) to sixth form.

Right from the start, Peter Tymms and the team were able to present statistical data in a format that was straightforward and made sense. We could see immediately that these data could be used to support teacher assessment, to highlight areas for intervention and for accountability. Many people had said that it was not possible to produce a reliable baseline assessment for Reception-age children (many still do!). It was also argued that such assessments could be harmful and undermine the Early Years principles described in the Start Right Report. Nevertheless, we were invited to work with the PIPS team to produce assessments in early reading and mathematics, based on the principle of being effective predictors of future performance and combining reliable objective measures with observations by teachers.

The project was extremely attractive as it was designed to:

• Profile strengths and weaknesses for planning appropriate learning experiences;
• Identify special educational needs, including the more able and children with English as an Additional Language (EAL); and
• Monitor the progress of pupils of the same age (not grade) and of cohorts over time.

The information was to be confidential and consisted of the following layers:

• Diagnostics at the pupil level for teachers;
• Group and class trends for teachers and schools; and
• School-level information for schools comparing themselves with other, anonymised, similar schools.

The data analysis produced by the PIPS team would enable school leaders and teachers to make the following comparisons:


• Children within a class/age group;
• Groups such as male/female, those with special educational needs (SEND), the more able, and those with English as an additional language (EAL);
• Classes within a year group;
• Current cohorts with previous ones; and
• Other schools in the local authority, nationally and internationally.

Above all, this was not a static, externally imposed model. Input from teachers was welcomed and used, with on-going research, development, training and contact with Newcastle University. The PIPS assessments and principles are now embedded in education, although some still argue that any assessment in the early years is harmful. When I later became Chief Academic Officer for an international education provider, I was delighted that PIPS had been developed further in order to enable international comparisons. A key strength of PIPS is that it enables the data to be used for research projects, such as the study of the impact of inattention, hyperactivity and impulsivity in young children on their academic attainment and progress, conducted by Professor Christine Merrell (Merrell & Tymms, 2001).

In Early Childhood Education Redefined (Ball et al., 2019), I argue that the Early Years Foundation Stage Profile (EYFSP: https://www.gov.uk/government/publications/early-years-foundation-stage-profile-handbook), used to assess young children purely by teacher assessment, is unreliable and masks underachievement, particularly in physical development. Independent schools can disapply from the EYFSP, but state schools do not have this option. Once again (with the support of Sir Christopher Ball, author of the Start Right Report), I find myself challenging the system and calling for the EYFSP, with its unreliable analysis of qualitative data, to be stopped in favour of teacher assessment combined with standardised testing in the key areas of language, mathematics and physical development – full circle?

Initially the PIPS baseline was to include early reading, phonological awareness, vocabulary, mathematics, personal, social and emotional development, motor development, behaviour and attitudes. The motor aspects of the assessment were not universally continued. I still believe this is an important (and predictive) part of children's development, as demonstrated by the Movement for Learning Project (Preedy et al., 2019). I am pleased that physical development is to be incorporated in future assessments and that it is active in Brazil.

Having listened to the presentation regarding the proposed PIPS project, I wondered if Peter Tymms would be open to identifying the multiple-birth sample of children as part of my PhD research. He was (as always) incredibly open to ideas and agreed with the minimum of fuss. This enabled the analysis of data from 1,847 twins and 81 triplets compared with 95,001 singletons. On all scores, twins were behind singletons, including the language-related measures of vocabulary and rhyming. However, the discrepancies were small and not educationally important. This led me to focus upon the personal, social and emotional development of multiples and to develop a unique model to assess the impact of the multiple relationship on such issues as separation. This work is still being used today by Twins Trust.
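The value-added calculation promised earlier can be shown in outline: end-of-Reception scores are regressed on baseline scores, and each child's residual (observed minus predicted) is read as the value added. The sketch below is a minimal illustration of the general principle only, with hypothetical data and variable names; it is not PIPS data, nor the CEM Centre's actual statistical model, which was considerably more sophisticated.

# Minimal, illustrative value-added calculation. Hypothetical data and names;
# not PIPS data or the CEM Centre's production model.
import numpy as np

baseline = np.array([45.0, 52.0, 38.0, 60.0, 49.0])   # start-of-Reception scores
follow_up = np.array([58.0, 70.0, 47.0, 73.0, 55.0])  # end-of-Reception scores

# Simple linear regression predicting follow-up from baseline
slope, intercept = np.polyfit(baseline, follow_up, 1)
predicted = intercept + slope * baseline

# Value added = observed minus predicted (the regression residual):
# positive values indicate more progress than expected from the starting point
value_added = follow_up - predicted
for b, f, v in zip(baseline, follow_up, value_added):
    print(f"baseline={b:4.0f}  follow-up={f:4.0f}  value added={v:+5.1f}")

In practice, residuals of this kind would be aggregated to class or school level and compared with those of similar schools, as described above.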


The Solihull group worked with colleagues from the PIPS team to produce a test to be completed at the start and end of Reception. There was much refinement and checking of reliability by the team. It was important to conduct the first test as early as possible in the school year in order to capture the children's starting point. Concerns that this would eat into settling-in time did not materialise: we organised the school and staff to support the assessment so that it blended into the daily routine. Early years staff also appreciated the opportunity to assess a child on a one-to-one basis. Later, when the tests went online, the task became less time-consuming and even easier to administer.

It was with great interest that we examined our box-and-whisker plots and noted the wide spread of ability in our school. We were delighted (along with many others) when the data at the end of Reception indicated that this was the time of greatest impact. The deputy and I organised an assessment team to analyse the data and discuss the outcomes with class teachers. We were indeed able to look at individuals, groups and the whole school, both currently and over time. We were also able to evaluate the impact of interventions, for example the use of a computer program to enhance phonics skills.

In 1996, I cheekily wrote to Chris Woodhead, HMCI, in response to an article he had written in the Times Educational Supplement (TES), in which he had asked whether the purpose of Early Years education should be to ensure that children have the opportunity to exercise choice so that they become independent learners. I wrote a five-page letter ending:

It is too simplistic to discuss Early Years Education in terms of process versus knowledge and child-centredness versus didactic teaching. Professional qualitative observations and quantitative data such as PIPS can provide a framework for challenging debate that is not based upon our own memories, sound bites or our own comfortable ideologies.

Chris Woodhead responded by scribbling over my letter, agreeing with much and questioning other aspects (I have kept the letter, including the coffee stain!). I wrote back to clarify the points he had raised and invited him to visit the school – which he did! I am grateful to PIPS that we were able to prove the excellent progress that all of our children were making. He went on to write a further article stating that I ran one of the most successful schools in the country, that we had bridged the traditional/progressive divide and that our inspection report was probably the best he could remember reading.

After I became head of Knowle CE Primary School (not just the infant school), we were awarded Beacon status for the whole school in order to continue sharing and developing our initiatives. We continued to use PIPS throughout the school, and Ofsted continued to acknowledge our high standards:

Self-evaluation and improvement are well entrenched in the school. There is an excellent programme for the monitoring of standards, teaching and the curriculum. The rigorous analysis of performance data has helped the school to raise standards. For example, results are unpicked and analysed to identify areas where improvement is needed; detailed courses of action are then planned and regularly evaluated to ensure progress occurs as intended. (Ofsted inspection, February 2001 – featured in the HMCI Report of 2002 as an excellent example)


I wrote at the time:

It is a good feeling to be able to provide an accurate and complete picture to those to whom we are accountable. It is an even better feeling to be able to use data to improve children's learning – which is, after all, why we became teachers.

I stand by my words and believe them still to be true after 30 years.

References

Ball, C. (1994). Start right: The importance of early learning. RSA.
Ball, C., Preedy, P., & Sanderson, K. (2019). Early childhood education redefined. Routledge.
Merrell, C., & Tymms, P. B. (2001). Inattention, hyperactivity and impulsiveness: Their impact on academic achievement and progress. British Journal of Educational Psychology, 71(1), 43–56.
Preedy, P., Sanderson, K., & Ball, C. (2019). Early childhood education redefined. Routledge.

Part II

Introduction: The Challenge of Assessing Young Children's Progress Fairly and Making Comparisons

Sarah J. Howie

Introduction

The early years of learning and primary school set the foundations for how and at what level children will develop, have access to opportunities and contribute to the citizenry in the future. However, millions of children miss out on the opportunity to learn. The most recent Global Monitoring Report indicates that "identity, background and ability dictate education opportunities", as "in all but the high-income countries in Europe and Northern America, only 18 of the poorest youth complete secondary school for every 100 of the richest youth" (UNESCO, 2020:10). Furthermore, there is a gender dimension: in at least 20 countries (mostly in sub-Saharan Africa), very few poor rural young women complete secondary school (UNESCO, 2020:10). In middle-income countries (which include several of the countries in this book), only three out of four children are still in school by the age of 15. Of those, it is estimated that only half are learning the basics, and "many assessments overestimate how well students are doing" (UNESCO, 2020:10).

In this book, data and experiences from the iPIPS project are described for Australia, Brazil, England, Lesotho, the Russian Federation, Scotland and South Africa. As can be seen in Table II.1, this diverse group of countries varies greatly across major indicators of schooling, adding both richness and challenges in terms of education and assessment, and leading to the key questions addressed in this section of the book.

In summary, children in low- and middle-income countries are growing up at a disadvantage, with more than 250 million children aged under 5 years worldwide either living in poverty and/or stunted in growth (Black et al., 2017). It is well established that the first 5 years are critical for lifelong development (Shonkoff & Phillips, 2000). Skills developed before entering school help determine children's

S. J. Howie (*) Stellenbosch University, Stellenbosch, South Africa e-mail: [email protected]


Table II.1  Key indicators on pre-primary and primary education for selected countries participating in iPIPS

Country | Primary starting age (years) | Duration, pre-primary / primary (years) | School-going population, pre-primary / primary (000) | Enrolment, pre-primary / primary (000) | Government expenditure per pupil, pre-primary / primary (US$)
Australia | 5 | 1 / 7 | 382 / 227 | 528 / 2127 | 5192 / 9524
Brazil | 6 | 2 / 5 | 5298 / 13,952 | 5102 / 13,952 | – / 3267
Lesotho | 6 | 3 / 7 | 143 / 304 | 54 / 368 | – / 636
Russian Federation | 7 | 4 / 4 | 7397 / 6838 | 6253 / 6574 | – / –
South Africa | 7 | 3 / 7 | 3510 / 7815 | 862 / 7582 | 785 / 2377
United Kingdom | 5 | 2 / 6 | 1647 / 4895 | 1763 / 4820 | 3069 / 10,280

Source: UNESCO (2020:347–348)

academic success. Furthermore, as is generally known, children from low-income families are at risk of health, academic and social problems that may affect educational achievement (Engle & Black, 2008), and these children are particularly vulnerable. School readiness, therefore, plays an important role in escaping poverty in countries such as the United States of America (USA) as well as, increasingly, in developing countries (Engle & Black, 2008).

Calls for assessing younger children (especially in low- and middle-income countries) accompany the United Nations Sustainable Development Goals (SDGs), which placed early child development on the global policy agenda for the first time (Fernald et al., 2017). In particular, Goal 4.2 aims for access to quality early child development for all and stresses the importance of early childhood development (Fernald et al., 2017).

This focus on assessing younger children follows 30 years of escalating developments in large-scale assessments since the 1990 Jomtien conference on Education for All (UNESCO, 1990; the declaration is available at https://bice.org/app/uploads/2014/10/unesco_world_declaration_on_education_for_all_jomtien_thailand.pdf). The conference is seen by many as a turning point for the introduction of increased monitoring and evaluation of the quality of education systems around the world. International debates have arisen about the nature and frequency of assessment and its impact on education systems, with its intended and unintended consequences (Howie, 2012). High-profile international agencies such as the OECD and UNESCO promoted this thinking through statements suggesting that the quality of education influences how quickly societies can become richer and how individuals can improve their productivity and income (Ross & Genevois, 2006).


Concerns have arisen about the consequences of high-stakes testing, namely "those assessments that have serious consequences attached to them" (Nichols & Berliner, 2008: xv). The situation in the USA has led to what some call high-stakes educational accountability systems (Linn, 2008; McDonnell, 2008), with undesirable and often unintended consequences (Howie, 2012). However, with the focus now moving earlier in the child's development and targeted at early schooling (Diaz-Diaz et al., 2019; Lin & Lin, 2019; Moss et al., 2016), new debates rage, particularly in Europe and the USA. Large-scale assessments of young children are emerging, with a shift from a discourse of early childhood education and care towards one of outcomes and investment (Moss et al., 2016). A previous Global Monitoring Report warned against "high-stakes tests based on narrow performance measures negatively impacting on learning and disproportionately punishing the marginalized" (UNESCO, 2017:7).

In this book, the story of the iPIPS project (see Part I) in designing and developing assessment for young children addresses the concerns and issues raised above. The experience of the researchers involved in iPIPS, and the precision involved in adapting, contextualising, collecting and analysing the iPIPS data across the participating countries, reveal the extraordinary efforts required to obtain valid and reliable data.

Part II of the book sets the international assessment context for the research and its outcomes reported in the book. Firstly, it describes the landscape of educational assessment conducted internationally on children up to the end of their first year of schooling, including assessment undertaken in formal early childhood settings (Chap. 3). Methodological considerations are the focus of Chap. 4, which describes the designs and methods adopted in cross-national and regional studies. Chapter 5 discusses the current debates and practices on the assessment of young children in the early years of formal education. Chapter 6 addresses the general issue of teacher assessments and the roles of teachers in research projects, and specifically in the iPIPS project. The section concludes with a reflection on the methodological challenges and the solutions found and addressed by the iPIPS team.

References

Black, M., Walker, S., Fernald, L., Andersen, C., DiGirolamo, A., Lu, C., & McCoy, D. (2017). Advancing early childhood development: From science to scale 1. Journal of Autism Development Disorders, 47(3), 549–562. https://doi.org/10.1016/S0140-6736(16)31389-7
Diaz-Diaz, C., Semenec, P., & Moss, P. (2019). Editorial: Opening for debate and contestation: OECD's International Early Learning and Child Well-being Study and the testing of children's learning outcomes. Policy Futures in Education, 17(1), 1–10. https://doi.org/10.1177/1478210318823464
Engle, P. L., & Black, M. M. (2008). The effect of poverty on child development and educational outcomes. Annals of the New York Academy of Sciences, 1136, 243–256. https://doi.org/10.1196/annals.1425.023


Fernald, L. C. H., Prado, E., Kariger, P., & Raikes, A. (2017). A toolkit for measuring early childhood development in low- and middle-income countries. Prepared for the Strategic Impact Evaluation Fund, the World Bank. http://documents.worldbank.org/curated/en/384681513101293811/pdf/WB-SIEF-ECD-MEASUREMENT-TOOLKIT.pdf
Howie, S. (2012). High-stakes testing in South Africa: Friend or foe? Assessment in Education: Principles, Policy and Practice, 19(1), 81–98. https://doi.org/10.1080/0969594X.2011.613369
Lin, P. Y., & Lin, Y. C. (2019). International comparative assessment of early learning in exceptional learners: Potential benefits, caveats, and challenges. Policy Futures in Education, 17(1), 71–86. https://doi.org/10.1177/1478210318819226
Linn, R. (2008). Educational accountability systems. In K. E. Ryan & L. A. Shepard (Eds.), The future of test-based accountability (pp. 3–24). Lawrence Erlbaum Associates.
McDonnell, L. A. (2008). The politics of educational accountability: Can the clock be turned back? In K. E. Ryan & L. A. Shepard (Eds.), The future of test-based accountability (pp. 25–46). Lawrence Erlbaum Associates.
Moss, P., Dahlberg, G., Grieshaber, S., Mantovani, S., May, H., Pence, A., Rayna, S., Swadener, B. B., & Vandenbroeck, M. (2016). The Organisation for Economic Co-operation and Development's International Early Learning Study: Opening for debate and contestation. Contemporary Issues in Early Childhood, 17(3), 343–351. https://doi.org/10.1177/1463949116661126
Nichols, S. L., & Berliner, D. C. (2008). How high stakes testing corrupts America's schools. Harvard Education Press.
Ross, K., & Genevois, I. (2006). Cross-national studies of the quality of education: Planning their design and managing their impact. https://doi.org/10.4324/9780203882146
Shonkoff, J. P., Phillips, D. A., & National Research Council. (2000). Communicating and learning. In From neurons to neighborhoods: The science of early childhood development. National Academies Press (US).
UNESCO. (1990). World declaration on education for all, and framework for action to meet basic learning needs. UNESCO.
UNESCO. (2017). Global education monitoring report 2017/8. Accountability in education: Meeting our commitments. UNESCO.
UNESCO. (2020). Global education monitoring report 2020. Inclusion and education: All means all. UNESCO. https://unesdoc.unesco.org/ark:/48223/pf0000265866

Chapter 3

Educational Assessment of Young Children

Sarah J. Howie

This chapter describes the landscape of the assessment of young children, given its expansion in scale in recent years. The rationale for these assessments and their functions are presented to provide further context for Part II and the chapters that follow.

3.1 Introduction

The early years of learning and primary school are the foundations for children's development, learning and life opportunities. Links between early childhood development, home background and later outcomes are extensively documented (Tymms et al., 2018). Shonkoff and Phillips (2000) emphasise the first 5 years as critical for lifelong development, and school readiness plays an important role in escaping poverty (Black et al., 2017; Engle & Black, 2008).

One of the key targets in the Sustainable Development Goals is Target 4.2 (Early Childhood): "By 2030, ensure that all girls and boys have access to quality early childhood development, care and pre-primary education so that they are ready for primary education" (UNESCO, 2020). Two of the thematic indicators for that target directly address formal education:

1. Gross early childhood education enrolment ratio in pre-primary education and early childhood educational development.

S. J. Howie (*) Stellenbosch University, Stellenbosch, South Africa e-mail: [email protected]


2. Number of years of free and compulsory pre-primary education guaranteed in legal frameworks.

These targets and indicators are driven partly by the need to reduce rising inequality within nations. As can be seen from the participation and government expenditure figures (in Table II.1 in the Introduction to this Part II), there is wide variation across countries in the starting age for primary school and in investments in pre-primary education. Countries tend to increase participation either by expanding early childhood education or by attaching reception classes to primary schools (UNESCO, 2020), as South Africa did in public education. It is common to find that pre-primary education is driven mainly by the private sector; this has resulted in some countries, such as Morocco, having no public pre-primary system of education. Internationally, pre-primary education varies from 1 to 4 years and, on average, half the children of the relevant age attend pre-primary, although few children in poorer (lower-income) countries benefit from pre-primary education (UNESCO, 2020).

The varying duration of, and access to, early childhood development and pre-primary education is reflected in the starting age for primary school. Across the world, the age of entering primary school varies from 4 years in Northern Ireland, to 5 in the Caribbean (and the UK), to 7 in Central Asia (and the Russian Federation and South Africa). Most regions, including Europe and North America, Latin America, Oceania and Africa, start at 6 years of age (UNESCO, 2020). Therefore, in the discussion in this section of the book, reference to young children stretches from approximately 3 to 7 years of age and includes early childhood development, pre-primary and the first year of schooling.
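As a worked illustration of the first indicator listed above (the calculation here is ours, for illustration only), a gross enrolment ratio (GER) divides total enrolment at a level of education, regardless of pupils' ages, by the population of the official age group for that level:

GER = (total enrolment at the level / population of the official age group) × 100

Applied to the figures in Table II.1, South Africa's pre-primary GER is roughly 862/3510 × 100 ≈ 25%, while Lesotho's primary GER is roughly 368/304 × 100 ≈ 121%. Gross ratios can exceed 100% because the enrolment count includes over-age and under-age pupils.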

3.2 The Escalation in Assessment of Younger Children

Given the emerging evidence of dropouts, under-performance and growing international concerns about the quality of education in upper primary and secondary schooling (UNESCO, 2020), it is not surprising that research has increased its focus on the early childhood years and on education at the start of formal schooling. Teale et al. (2020) identified four reasons for this increased focus on younger children in the USA, namely:

• Research on brain development across the early childhood spectrum, with resultant messaging targeted at parents (UNICEF, 2014), policy makers (Shonkoff, 2010), and early childhood educators and leaders (NAESP, 2014);
• Highly visible reports on the economic benefits of high-quality early childhood education (Council of Economic Advisors, 2014; Heckman et al., 2013);
• Universal pre-K initiatives and proposals at the US state (Barnett et al., 2017) and national (i.e., U.S. Department of Education, 2015) levels; and
• The contribution of high-quality preschool programs to continued academic achievement throughout schooling, especially for children growing up in economically poor situations (Teale et al., 2020; Yoshikawa et al., 2013).


Following this increased interest in research, and the emergence of evidence of underperformance and limited participation at upper primary and secondary school levels in low- and middle-income countries, interest has evolved in monitoring and evaluating the inception phases of development and schooling. Calls for assessing younger children (especially in low- and middle-income countries) accompany the United Nations Sustainable Development Goals (SDGs) (discussed earlier), which placed early child development on the global policy agenda for the first time. Goal 4.2 aims for access to quality early child development for all and stresses the importance of early childhood development. These calls argue the importance of assessing children to ascertain whether they are developing appropriately and to inform interventions in the form of policy and practice (Fernald et al., 2017).

3.3 Rationale for Assessment of Young Children

On a global level, there may be varied and multiple reasons for assessing child development, progress and outcomes. These include population monitoring (also called national or systemic assessment); programme evaluation (to demonstrate the impact of specific policies or programmes); hypothesis-driven or exploratory research (Fernald et al., 2017); and classroom assessment. The purposes of the latter are multiple, serving both a diagnostic function and a focus on teacher decision making (Greaney & Kellaghan, 2008; McMillan, 2001). The goal of systemic assessment is to monitor broad trends to inform policy, whereas the goal of programme evaluation is to assess the impact of programmes or policies. Just as the goals differ, so do the applications and the requirements (Fernald et al., 2017). For instance, a systemic assessment may be intended to be comparable across populations but is not designed to quantify impact on child development, as programme evaluations are.

In contrast to traditional systemic assessments, iPIPS follows a hypothesis-driven approach, consistent with being sensitive to a wider range of effects, both predicted and unpredicted, enabling new discovery and using new technologies to advance the field (Fernald et al., 2017). This is applied at two levels. Firstly, the traditional PIPS approach takes feeding information back to teachers as a prime objective, the hypothesis being that feedback will improve education (Fitz-Gibbon & Tymms, 2002). Secondly, where there was an iPIPS national monitoring approach, the hypothesis is that providing data to policy makers will improve education (Peter Tymms, personal communication, September 2020).

However, since the 1990 Jomtien conference on Education for All, and following 30 years of development of large-scale assessments, the focus has turned to assessing younger children. The conference is seen by many as a turning point for the introduction of increased monitoring and evaluation of the quality of education systems around the world. Internationally, debates have arisen about the nature and frequency of assessment and its impact on education systems, with its intended and unintended consequences (Howie, 2012). The focus shifted significantly from


measuring inputs to an increased emphasis on educational quality outcomes, to ascertain the extent to which education systems deliver quality in education (Kellaghan & Greaney, 2001). The impact of the 1990 Jomtien conference and the 2000 Dakar World Education Forum has been considerable (Ross & Genevois, 2006). Both events called on nations to concentrate on, and broaden access to, education by aiming to "improve all aspects of the quality of education and ensure excellence, so that recognised and measurable learning outcomes are achieved by all" (Ross & Genevois, 2006:26). As interest in measuring outcomes increased, explicit linkages were made between educational outcomes and quality and the notion that these were essential for educational development within a global economy (Ryan & Feller, 2009); increased political interest led to direct links between educational quality and economic imperatives. High-profile international agencies such as the OECD and UNESCO promoted this thinking, suggesting that the quality of education influences the speed at which societies become richer and the ways individuals can improve their productivity and income (Ross & Genevois, 2006).

3.4 Functions of the Assessment of Children

Assessment serves a number of different purposes and has different characteristics depending on the level of information required (see Table 3.1) (Howie, 2012). For instance, at the student level, it can be used to describe students' learning and to diagnose learning problems. At the system level, the main purpose is to reach a judgement on the effectiveness of an education system, or part thereof, which is primarily the interest of governments and policymakers. In these ways, the nature of the assessment follows from the intended purpose (McMillan, 2001).

Table 3.1  Comparing assessments and their purposes

Classroom assessment – Purpose: multiple, primarily diagnostic and focused on teacher decision making. Frequency: continuous. Who is tested? Individual students. Coverage: tailored to individual classes.

System assessments – Purpose: to provide feedback to policymakers. Frequency: country dependent, but individual subjects are offered on a regular basis (annually to every 4 years). Who is tested? Usually a sample of students at a particular grade or age level. Coverage: generally confined to one or two subjects.

Public examinations – Purpose: to certify and select students. Frequency: annually, and more often where the system allows for repeats. Who is tested? All students who wish to take the examination, at the examination grade. Coverage: covers the main subject areas.

Source: Based on McMillan (2001:7) and Greaney & Kellaghan (2008:18)


Critical questions have been raised (Nichols & Berliner, 2008; Ryan & Feller, 2009; Tamassia & Adams, 2009; Torrance, 2009) about the relative merits of system assessment, also known as national assessment, learning assessment or assessment of learning outcomes (Kellaghan & Greaney, 2001), or large-scale testing using standardised assessments at district, province, state, national or international level (McMillan, 2001), as employed across the Western world, where national assessment of some kind has been in place for many years. A national assessment is designed to describe the achievement of students in a curriculum area, aggregated to provide an estimate of the achievement level in the education system as a whole at a particular age or grade level (Greaney & Kellaghan, 2008), and is normally conducted on either a sample or a whole population of students.

System assessments are primarily concerned with quality in education, a dynamic concept (Ross & Genevois, 2006). Assessment of the kind used by the (post-Jomtien) reform movements suggests that the type of assessment likely to have an impact on quality is one that focuses on outcomes, is conducted externally and carries the expectation that the assessment will act as a lever of reform (Ross & Genevois, 2006:29). There are others, such as the editors of this book, who are of the view that quality could be improved by giving teachers detailed information about their students (as iPIPS does).

Furthermore, concerns about the consequences of high-stakes testing, namely "those assessments that have serious consequences attached to them" (Nichols & Berliner, 2008: xv), have arisen in the USA (Bracey, 2000; Clarke et al., 2000; Jones et al., 2003; Kohn, 2000; Nichols & Berliner, 2008; Orfield & Kornhaber, 2001; Ryan, 2004), which has a long history of standardised testing. The situation there has led to what some call high-stakes educational accountability systems (Linn, 2008; McDonnell, 2008), with sometimes undesirable and often unintended consequences (Howie, 2012). Part I of this book (see the chapters by Tymms and by Preedy) highlighted the tension between high-stakes population assessment of outcomes for accountability and classroom assessment designed to inform teachers, a tension which lies at the heart of this book.

3.5 Conclusion

As will become clear in the discussion in later chapters in this section, iPIPS is not considered a high-stakes assessment in any of the countries where it was implemented. However, with its hypothesis-driven approach, as highlighted earlier, and its direct implementation within the school setting, there is an argument to be made for providing teachers and heads of schools with information directly about the capabilities of their students. This would allow teachers and principals to be better informed, so that they are able to enhance the quality of teaching and learning within their schools. It is clear from the literature that PIPS and iPIPS are unique internationally in their focus on the monitoring and evaluation of young children, as will be seen in Chap. 4.


References

Barnett, W. S., Friedman-Krauss, A. H., Weisenfeld, G., Horowitz, M., Kasmin, R., & Squires, J. H. (2017). The state of preschool 2016: State preschool yearbook. National Institute for Early Education Research.
Black, M., Walker, S., Fernald, L., Andersen, C., DiGirolamo, A., Lu, C., & McCoy, D. (2017). Advancing early childhood development: From science to scale 1. Journal of Autism Development Disorders, 47(3), 549–562. https://doi.org/10.1016/S0140-6736(16)31389-7
Bracey, G. (2000). High stakes testing. Educational Policy Studies Laboratory, Arizona State University.
Clarke, M., Haney, W., & Madaus, G. (2000). High stakes testing and high school completion. National Board on Educational Testing and Public Policy.
Council of Economic Advisors. (2014). The economics of early childhood investments. Executive Office of the President of the United States.
Engle, P. L., & Black, M. M. (2008). The effect of poverty on child development and educational outcomes. Annals of the New York Academy of Sciences, 1136, 243–256. https://doi.org/10.1196/annals.1425.023
Fernald, L. C. H., Prado, E., Kariger, P., & Raikes, A. (2017). A toolkit for measuring early childhood development in low- and middle-income countries. Prepared for the Strategic Impact Evaluation Fund, the World Bank. http://documents.worldbank.org/curated/en/384681513101293811/pdf/WB-SIEF-ECD-MEASUREMENT-TOOLKIT.pdf
Fitz-Gibbon, C. T., & Tymms, P. (2002). Technical and ethical issues in indicator systems. Education Policy Analysis Archives, 10, 6. https://doi.org/10.14507/epaa.v10n6.2002
Greaney, V., & Kellaghan, T. (2008). Assessing national achievement levels in education (National Assessments of Educational Achievement 1). World Bank.
Heckman, J., Pinto, R., & Savelyev, P. (2013). Understanding the mechanisms through which an influential early childhood program boosted adult outcomes. American Economic Review, 103(6), 2052–2086.
Howie, S. (2012). High-stakes testing in South Africa: Friend or foe? Assessment in Education: Principles, Policy and Practice, 19(1), 81–98. https://doi.org/10.1080/0969594X.2011.613369
Jones, G. J., Jones, B., & Hargrove, T. (2003). The unintended consequences of high stakes testing. Rowman and Littlefield.
Kellaghan, T., & Greaney, V. (2001). Using assessment to improve the quality of education. UNESCO. http://lst-iiep.iiep-unesco.org/cgi-bin/wwwi32.exe/[in=epidoc1.in]/?t2000=014672/(100)
Kohn, A. (2000). The case against standardised testing: Raising the scores, ruining the schools. Heinemann.
Linn, R. (2008). Educational accountability systems. In K. E. Ryan & L. A. Shepard (Eds.), The future of test-based accountability (pp. 3–24). Lawrence Erlbaum Associates.
McDonnell, L. A. (2008). The politics of educational accountability: Can the clock be turned back? In K. E. Ryan & L. A. Shepard (Eds.), The future of test-based accountability (pp. 25–46). Lawrence Erlbaum Associates.
McMillan, J. (2001). Essential assessment concepts for teachers and administrators. Corwin Press.
National Association of Elementary School Principals (NAESP). (2014). Leading pre-K-3 learning communities: Competencies for effective principal practice. Executive summary. https://www.naesp.org/sites/default/files/leading-pre-k-3-learning-communities-executive-summary.pdf
Nichols, S. L., & Berliner, D. C. (2008). How high stakes testing corrupts America's schools. Harvard Education Press.
Orfield, G., & Kornhaber, M. L. (2001). Raising standards or raising barriers? Inequality and high stakes testing in public education. Century Foundation Press.
Peter Tymms, personal communication, 24 September 2020.


Ross, K. N., & Genevois, I. J. (2006). Cross-national studies of the quality of education: Planning their design and managing their impact. UNESCO.
Ryan, J. (2004). The perverse incentives of the No Child Left Behind Act. New York University Law Review, 79(3), 932–989.
Ryan, K. E., & Feller, I. (2009). Evaluation, accountability, and performance measurement in national education systems. In K. E. Ryan & J. B. Cousins (Eds.), The Sage international handbook of educational evaluation (pp. 169–190). Sage.
Shonkoff, J. P. (2010). Building a new biodevelopmental framework to guide the future of early childhood policy. Child Development, 81(1), 357–367.
Shonkoff, J. P., Phillips, D. A., & National Research Council. (2000). Communicating and learning. In From neurons to neighborhoods: The science of early childhood development. National Academies Press (US).
Tamassia, C. V., & Adams, R. J. (2009). International assessments and indicators. In K. E. Ryan & J. B. Cousins (Eds.), The Sage international handbook of educational evaluation (pp. 213–230). Sage.
Teale, W. H., Whittingham, C. E., & Hoffman, E. B. (2020). Early literacy research, 2006–2015: A decade of measured progress. Journal of Early Childhood Literacy, 20(2), 169–222. https://doi.org/10.1177/1468798418754939
Torrance, H. (2009). Pursuing the wrong indicators. In K. E. Ryan & J. B. Cousins (Eds.), The Sage international handbook of educational evaluation (pp. 483–498). Sage.
Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261. https://doi.org/10.1080/09243453.2017.1404478
U.S. Department of Education. (2015). A matter of equity: Preschool in America. U.S. Department of Education.
UNESCO. (2020). Global education monitoring report 2020. Inclusion and education: All means all. UNESCO. https://unesdoc.unesco.org/ark:/48223/pf0000265866
United Nations Children's Fund (UNICEF). (2014). UNICEF annual report 2014. UNICEF.
Yoshikawa, H., Weiland, C., Brooks-Gunn, J., Burchinal, M., Espinosa, L., Gormley, W. T., Ludwig, J., Magnuson, K., Phillips, D., & Zaslow, M. (2013). Investing in our future: The evidence base on preschool. Society for Research in Child Development.

Chapter 4

International Comparative Assessments in Education

Sarah J. Howie

This chapter discusses the methodology of, and the methodological challenges involved in, assessing young children within the international landscape.

4.1 Introduction

International comparative assessments of school children have escalated over the past 60 years, increasing in scale and complexity alongside technological advancements. They have also shifted in focus, from treating the world as a laboratory to being, in some cases, the political instruments of multiple agencies. Over the years they have served many functions, mainly in developed economies, although in the past 20 years they have also taken root in several emerging nations. Given their varying functions, the studies have evolved in their disciplinary and grade/age-level foci. Designs and methods have increased in sophistication with disciplinary and technological innovations, resulting in the multiple and complex considerations required at the initiation of such studies, some of which are discussed in this chapter.

S. J. Howie (*) Stellenbosch University, Stellenbosch, South Africa e-mail: [email protected]


4.2 Purposes and Functions of International Comparative Assessments

There is a long tradition of international comparative studies in education (also known as international large-scale assessments), dating back to the 1960s with the inception of the International Association for the Evaluation of Educational Achievement (IEA). Until the 1980s, the studies were driven by the interests of researchers. However, due to structural problems in our societies (including an oil crisis), as well as the increased media attention that the larger comparative studies attracted, policy makers began to develop an interest in, and a concern for, the quality and the cost-effectiveness of education (Plomp et al., 2003). The interest of policy makers in cost-effectiveness and the quality of education (instead of just managing the quantitative growth of educational systems) also became manifest at a meeting of Ministers of Education of the Organisation for Economic Co-operation and Development (OECD) in 1984 (Husén & Tuijnman, 1994; Kellaghan, 1996), following which interest in internationally comparable indicators developed rapidly from the 1980s to the 2000s. The expansion and development of these types of studies was made possible by developments in sample survey methodology, group testing techniques, test development and data analysis (Husén & Tuijnman, 1994). The studies involve extensive collaboration, funding and negotiation between participants, organisers and funders, resulting in a long-term commitment from all those involved (Plomp et al., 2003).

International large-scale assessments have a variety of purposes, which include comparing levels of national achievement between countries; identifying the major determinants of national achievement, country by country; examining the extent to which these are the same or differ across countries; and identifying factors that affect differences between countries (Postlethwaite, 1999). The functions of these studies have been analysed and described by a number of authors, including Kellaghan (1996), Plomp (1998), Postlethwaite (1999) and Plomp et al. (2003). Plomp (1998) summarises these functions as description, benchmarking, monitoring, enlightenment, understanding and cross-national research (see also Plomp et al., 2003). Porter and Gamoran (2002) highlighted, in addition, the purpose of contributing to the advancement of methodology.

Finally, in the late 1980s and early 1990s, international large-scale assessment studies served another important purpose, namely the integration of formerly excluded and isolated education systems (e.g., countries in the former Soviet Bloc and South Africa). These studies allowed them to emerge from their previously isolated positions and join the international educational community and its debates through participation in projects such as the Third International Mathematics and Science Study (TIMSS, 1995 and 1999). Financial sponsorship was provided by the World Bank for selected countries, and training was administered by the IEA (Howie, 1999; Plomp et al., 2003).


Benefits to Emerging Countries

In addition to the general purposes of international comparative studies, there are other benefits that pertain more to developing or less developed countries. These may be divided into four areas (see Howie, 2000).

Firstly, international studies (e.g., the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ) and the IEA studies) contribute substantially to the development of research capacity in developing countries. TIMSS significantly developed the capacity in South Africa (and in other countries) to undertake large-scale surveys involving assessment. In nine African countries, SACMEQ has made a particular contribution to developing the capacity of ministerial staff. Researchers have been introduced to the latest research methods and provided with substantial training and assistance in their use throughout the research process.

Secondly, these studies present an opportunity for developing countries to collect data that serve as a starting point (baseline) in certain subject areas at different grade/age levels, where previously there was a vacuum.

Thirdly, establishing a national baseline through an international study heightens awareness of what other countries around the world are doing, and points to lessons that can be drawn from them. For example, TIMSS was the first international educational research project in which South Africa participated after years of political and academic isolation, providing the first opportunity to review and compare South African data with those of other countries. The disappointing result of this comparison led the then Minister of Education to announce, during a parliamentary debate, that his department would review the data in order to design new curricula to be introduced by 2005.

Finally, education jostles for attention with many other needs in developing countries (e.g., health, poverty, HIV/AIDS, rural development). The fact that the results of international, and not merely national, studies are available assists researchers, practitioners and policymakers in highlighting priorities.

Benchmarking

Benchmarking was an increasingly dominant purpose of the studies in the 1990s (Gustafsson, 2010) and the early twenty-first century (UNESCO, 2015). In the past decade, the international studies have continued to serve a descriptive function (and the other functions described by Plomp, 1998, and Plomp et al., 2003), but have moved increasingly into attempting to infer causal relations (Wößmann, 2003), despite the issues around doing so with cross-sectional survey data (Gustafsson, 2010). This has been driven largely by needs arising from the tensions between the research-oriented, explanation-seeking IEA frameworks and the policy-oriented and policy-driven UNESCO development agendas and OECD frameworks, as pressures have increased to explain and predict achievement patterns and deviations across contexts. There was a need to "go beyond the international student-achievement horse race and see


what works in and out of the classroom to improve student learning…" (Loveless, 2007, p. viii).

These tensions, and the demands for rigour regarding the data collected, further exacerbate well-known, long-standing challenges in comparative research, such as threats to content validity, problems of relevance and fitness (Dahllöf, 1971; Gustafsson, 2010; Rosen et al., 2013; Tymms, 1998), the analytical challenges of endogeneity and omitted variables that arise from cross-sectional designs (Gustafsson, 2010; Raudenbush & Kim, 2002), and challenges related to language and culture. Regarding the latter, Tymms (1998), using language as an example, highlighted problems arising from differences in script (English, Russian, Hindi and Mandarin), differences in culture, which may emphasise different aspects of cognitive development in early childhood, and differences in school starting ages. More is said about these methodological considerations later in this section.

Some of these types of challenges prompted the early IEA interest in 'the world as a laboratory' (Keeves, 1995) for educational researchers and cross-national research. Whilst much progress has been made in addressing some of the challenges highlighted by Dahllöf (1971) and others, Porter and Gamoran (2002) and Gustafsson (2010) note that much work still remains, "despite four decades of experience leading to improvements in methodology" (Porter & Gamoran, 2002:4). This includes a need to develop "a better appreciation of differences in the social and cultural contexts in which education takes place in different nations and the manner in which those contextual differences may be reflected in the results of achievement tests" (Porter & Gamoran, 2002:4), including cross-national validity (Howie & Plomp, 2001, 2005). Solid theoretical foundations are required to assist in addressing the challenges for cross-national research in the form of international large-scale assessments. To date, despite many efforts, equating attitudinal variables cross-nationally, such as those using Likert and Likert-type scales, has remained elusive, and such variables need to be interpreted with care.
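A toy simulation illustrates why such equating is difficult: if two populations hold identical underlying attitudes but differ in response style, their raw Likert means diverge. The sketch below uses entirely hypothetical data and is not drawn from any of the studies discussed here.

# Illustrative only: why raw Likert means can mislead cross-nationally.
# Two hypothetical countries share the same latent attitude distribution, but
# country B has an acquiescent response style (a tendency to agree).
import numpy as np

rng = np.random.default_rng(42)
latent = rng.normal(0.0, 1.0, 10_000)  # identical latent attitudes

def to_likert(latent_scores, style_shift):
    # Map latent scores to 1-5 Likert responses with a country-specific shift
    raw = 3.0 + latent_scores + style_shift
    return np.clip(np.rint(raw), 1, 5)

country_a = to_likert(latent, style_shift=0.0)
country_b = to_likert(latent, style_shift=0.5)  # response-style bias only

print(f"Country A mean: {country_a.mean():.2f}")  # about 3.0
print(f"Country B mean: {country_b.mean():.2f}")  # noticeably higher, despite
                                                  # identical underlying attitudes

A naive ranking of the two 'countries' on raw means would mistake a response style for a real attitudinal difference, which is why such comparisons need careful interpretation.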

Monitoring Education Systems

Increasingly, the purpose of monitoring education systems in general is to evaluate progress in achievement across school subjects, in response to global calls for improving the quality of education for all (Howie, 2012; UNESCO, 2015). Specifically, it is to formally regulate desired levels of quality in educational outcomes and provision; to hold education systems to account for their functioning and performance; to stimulate improvement; and to follow the effects of the decentralisation of education (Scheerens et al., 2003). There is considerable interest in accountability as it pertains to the quality of provision, the relevance of educational objectives and their effectiveness, equity in terms of the fair and equal distribution of educational resources, and efficiency (the economic utilisation of resources) (Scheerens et al., 2003). To gain insight into the quality of education, and into the direct and indirect causal effects needed for designing and improving effective monitoring systems,


considerable effort is being put into the design of frameworks comprising possible explanatory factors for student achievement, with data on these factors collected simultaneously (Howie, 2002; Scherman, 2007; van Staden, 2010). In the past decade, however, these efforts have been overtaken by the drive towards randomised controlled trials in education, which are deemed the gold standard for evaluating such relationships.

Explanatory Factors and Frameworks

A turning point in the literature on explanatory factors for students' achievement is considered to be the Coleman report of 1966 (Coleman et al., 1966), as this seminal report revealed the strength of the link between the home environment and achievement in the USA. Since its publication, a variety of frameworks have emerged, each seeking to explain the achievement of children in subjects such as language, mathematics and science, known broadly as studies in educational effectiveness (Creemers, 1996; Creemers & Reezigt, 1999; Creemers & Kyriakides, 2008, 2015) and school effectiveness (Creemers, 1994; Scheerens & Bosker, 1997; Scheerens et al., 2003; Stringfield & Slavin, 1992). Large-scale studies have revealed the importance of school characteristics, in terms of facilities, teachers, materials and curriculum, their ability to provide equal opportunities, and their potential impact on skills and knowledge (Tymms et al., 2018).

As discussed in the previous chapter, there are few international comparative assessment studies of early childhood and early primary school, even fewer involving developing nations, and therefore few frameworks have been developed based on international comparative empirical data. Most studies of these age groups have been either local or national. The largest of the studies are described below. The first attempt at a cross-national assessment of pre-primary children was the IEA's Preprimary Study and, 30 years later, the OECD's International Early Learning and Child Well-being Study (IELS). In between these, iPIPS emerged as an international study, evolving from its origins in the UK.

4.3 Cross-National and Regional Validity of International Comparative Assessments: Methodological Considerations

Methodological issues regarding international comparative assessments of young children involve a number of considerations that apply to all international comparative assessments. I will first describe some of the general methodological considerations before focusing on those affecting young children in particular. After 60 years of international comparative studies, there is quite a large body of literature to draw upon regarding methodology. International comparative


assessments, by their nature, aims and scope, are complex studies requiring specific conditions for successful, valid and reliable implementation (see Kellaghan, 1996; Plomp et al., 2003; Greaney & Kellaghan, 2008). Of importance are the relevance of such studies for policy makers (see below under research questions) and the design of the studies, namely accurately representing what students achieve, permitting valid comparisons, having clear purposes and justifying the resources needed to conduct them (Greaney & Kellaghan, 2008; Kellaghan, 1996). All of these require careful design, planning and monitoring. Firstly, I will discuss the phases in the design of this type of assessment, followed by a number of considerations, drawing extensively on previous work (Plomp et al., 2003; Howie, 2022).

Design of International Comparative Assessments

The IEA (Martin et al., 1999) and others (e.g., Westat & US Department of Health and Human Services, 1991) developed standards for conducting international studies of achievement. These are applied during the study, and regular monitoring is required to ensure that the study proceeds according to the standards. Nowadays, many studies have well-documented and detailed procedures for quality monitoring. For the Programme for International Student Assessment (PISA), the OECD has produced detailed technical standards, revised for PISA 2003 in the light of experience with PISA 2000, covering sampling, translation, test administration, quality monitoring, and data coding and entry. Other reference works, such as the International Encyclopaedia of Education (Husén & Postlethwaite, 1994), the IEA technical handbook (Keeves, 1992), the International Encyclopaedia of Educational Evaluation (Keeves, 1994) and the World Bank series on national assessments, provide many excellent sources on the design and conduct of such studies, as, most recently, does Nilsen et al. (2022). International comparative studies of educational achievement usually comprise a number of phases (e.g., Beaton et al., 2000; Kellaghan & Grisay, 1995; Loxley, 1992; Postlethwaite, 1999) that constitute a framework for planning and design. Each of the main phases is discussed below.

Research Questions and Conceptual Framework

The design of any study is determined by the questions that it is required to answer (see the examples in Tables 4.1 and 4.2). Research questions in international comparative studies often stem from a feeling within some countries that all is not well in education, that quality is a relative notion and that, although education systems may differ in many aspects, much can be learned from other countries ('the world as a laboratory'). The research questions therefore address important theory-oriented and policy issues.


Table 4.1  Examples of research questions from SACMEQ I
1. What are the baseline data for selected inputs to primary schools?
2. How do the conditions of primary schooling compare with the Ministry's own benchmark standards?
3. Have educational inputs to primary schools been allocated in an equitable fashion among and within education districts?
4. What is the level of reading achievement for Grade 6 pupils?
5. Which educational inputs to primary schools have most impact upon the reading achievement of Grade 6 pupils?

Table 4.2  Examples of research questions from TIMSS 1995
1. What are students expected to learn?
2. Who provides the instruction?
3. How is instruction organised?
4. What have students learned?

For example, in SACMEQ, involving 15 countries in Africa, the first study, SACMEQ I, aimed its research questions at the interests of policy makers and educational planners, and these questions and the instruments were developed by the policymakers themselves. In contrast, the researchers contracted from professional research entities (including universities) for PISA 2000 designed policy-oriented and internationally comparable indicators of student achievement for the OECD on a regular basis. It is important to note that SACMEQ was not intent on designing comparable indicators of achievement; rather, it was decided explicitly not to compare across countries. In addition to such policy-related aims, the IEA studies and the OECD's PISA also include research questions aimed at understanding the reasons for observed differences. The general research questions underlying a study are often descriptive in nature (see Table 4.2). In PISA, one finds descriptive questions about differences in patterns of achievement within countries, and about differences among schools in the strength of relationships between student achievement levels and the economic, social and cultural capital of their families. There are also questions about the relationship between achievement levels and the context of instruction, and a variety of other school characteristics. The first step from the research questions to the design of a study is the development of a conceptual framework. We have already noted that the research questions largely determine this framework, but there are two distinct dimensions to be considered. One is the nature of the outcome measures to be used (e.g., reading literacy, mathematical literacy and scientific literacy, as defined in PISA). The other is the factors to be studied in relation to the chosen outcome measures (e.g., as specified in the IEA conceptual framework). These are also known as assessment frameworks in some studies (see IEA's Progress in International Reading Literacy Study (PIRLS) 2016). Interestingly, until PISA 2015, PISA did not have a conceptual framework underpinning its studies, unlike the IEA studies, and SACMEQ to date has not published a framework. Perhaps these two examples reflect the differences between studies started by academics and those started by policymakers.


Research Design

The research design focuses on the issues that arise from the research questions and provides a balance between the complexity required to respond to specific questions and the need for simplicity, timeliness and cost-effectiveness (Beaton et al., 2000). The choice between a cross-sectional design (where students are assessed at one point in time) and a longitudinal design (where the same students are evaluated at two or more points in time) depends on the goals of the study, as well as on the significant funding requirements and logistical challenges of longitudinal studies; the latter two factors may force a reformulation of the project's goals. Most studies have a cross-sectional design in which data are collected from students at certain age/grade levels. However, the IEA's TIMSS provides an example of a compromise. TIMSS' multi-grade design was a compromise between a cross-sectional and a longitudinal survey design, as it included adjacent grades for the populations of primary and junior secondary education. This allowed more information about achievement to be obtained across five grades, and an analysis of variation in cumulative schooling effects. Both PISA and TIMSS gather data in successive waves, with cross-sectional samples at the same age/grade levels several years apart to examine trends, but these are not longitudinal studies. TIMSS offered a further variant by organising a repeat (TIMSS-R or TIMSS 1999) in which data were gathered for the lower secondary population (Grade 8 in 1999) in the year in which the primary population in the original TIMSS (Grade 4 in 1995) had reached that year of secondary schooling: a follow-up survey design. Because of their complexity, international comparative longitudinal studies are rare. Not only is it difficult to retain contact with students over time, but the expected high attrition rates have to be reflected in the design, and this adds to the cost of the studies. One example, however, is the sub-study of the IEA Second International Mathematics Study (SIMS), in which eight countries collected achievement data at the beginning and the end of the school year in addition to the main study, as well as data on classroom processes (Burstein, 1992). Given the type of research questions that were posed, most IEA studies included curriculum analysis as part of the research design; in this, they differ from the SACMEQ and OECD studies. The research design allows for good estimates of the variables in question, both for international comparisons and for national statistics. An important decision in each study relates to the level at which analysis will be focused. If the focus of interest is the school, then it is appropriate to select only one class per school at the appropriate grade level, or a random sample of students. However, if the research questions also relate to class and teacher effects, then more than one class has to be sampled within each school: multilevel designs and analysis techniques require two or more classes per school to distinguish class and teacher effects from school effects.


Target Population

An important issue is whether to use a grade-based or an age-based target population (see Table 4.3). The choice reflects the underlying question being addressed. If the question concerns what outcomes school systems achieve with students at a particular grade level (e.g., Grade 8), regardless of how long they have taken to reach the grade (e.g., delays caused by grade repetition), then a grade-based population is appropriate. If, however, the question concerns what outcomes school systems have achieved with students of a particular age, an age-based population is appropriate. In the SACMEQ I study, the question of interest was the former, so a grade-based population (Grade 6) was chosen. On the other hand, for OECD/PISA, the question of interest was the latter, so an age-based population definition was chosen (15-year-olds, since education is compulsory to this age in most OECD countries). A grade-based target population is easier to work with. The difficulty with an age-based population is that, due to grade repetition and the timing of the cut-off date for school commencement, students at the target age can be in more than one grade, even in as many as four or more in some countries. The IEA uses a 'grade-age' definition as a compromise to reduce the complexity of sampling while still throwing light on what systems have achieved with their students by the time they have reached a particular age, rather than a particular grade. For example, in TIMSS, where the interest at the junior secondary level was in the achievements of 13-year-olds, the target population was defined as the two adjacent grades where most 13-year-olds could be found at the chosen point in the school year. This approach avoids the additional sampling complexity of a pure age-based sample such as that used by PISA but, to the extent that there are 13-year-olds in grade levels not sampled, provides only a partial estimate of the achievements of students at the age of interest. Countries also determine whether to identify separate sub-populations (language groups, ethnic groups, geographical areas), since an increase in sample size will usually be required to obtain estimates for them. In some countries, there is interest in regional differences, particularly where responsibility for education is decentralised to the regional (state or provincial) level. There may also be interest in differences between distinct language or cultural communities.

Table 4.3  Examples of target populations across selected international studies and year of initiation

Studies | First year of study | Target population of pupils
PIRLS | 2001 | Grade 4 (5 or 6)
TIMSS | 1995 | Grades 4 and 8 (Grades 4, 8 and 12 in 1995); Advanced TIMSS: Grade 11; TIMSS Numeracy 2015: Grades 4–6
PISA | 2000 | 15-year-old students (minimum Grade 7)
SACMEQ | 1995, 2004, 2011, 2014 | Students (Grade 6)


In any definition of a target population, the possibility of exclusions is addressed. For example, it may be expensive to collect data from students in special education schools or in isolated areas. Normally, the research design indicates that the excluded population in a country is not more than, for example, 5%.
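To make the 'grade-age' compromise concrete, here is a minimal sketch (in Python, with invented enrolment figures; an illustration, not the procedure of any named study) that selects the two adjacent grades containing the most students of the target age:

# Invented counts of 13-year-olds per grade in a hypothetical system.
thirteen_year_olds_per_grade = {5: 1200, 6: 8400, 7: 61000, 8: 54000, 9: 2100}

def grade_age_population(counts):
    """Return the pair of adjacent grades covering the most target-age students."""
    grades = sorted(counts)
    pairs = [(g, g + 1) for g in grades if g + 1 in counts]
    return max(pairs, key=lambda p: counts[p[0]] + counts[p[1]])

pair = grade_age_population(thirteen_year_olds_per_grade)
covered = sum(thirteen_year_olds_per_grade[g] for g in pair)
total = sum(thirteen_year_olds_per_grade.values())
print(pair, f"{covered / total:.1%} of 13-year-olds covered")  # (7, 8), about 91%

The residual share of target-age students in other grades is exactly the 'partial estimate' limitation noted above.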

Sampling

In the past decade, considerable attention has been paid to the issue of sampling, which critiques of earlier international studies had highlighted as a problem. International comparative studies need good quality, representative samples. These cannot be obtained without careful planning and execution, following established procedures such as those developed by the IEA (Martin et al., 1999) or in the PISA technical standards. These procedures consist of rules concerning the selection of the samples and the evaluation of their technical precision. For example, IEA standards require (among other things) that the target population(s) have comparable definitions across countries, that they are defined operationally, and that any discrepancy between nationally desired and defined populations is clearly described (Martin et al., 1999). In each participating country, basic requirements are met in a number of areas, such as the specification of domains and strata, sampling error requirements, size of sample, preparation of the sampling frame, mechanical selection procedure, sampling weights and sampling errors, and response or participation rates (Postlethwaite, 1999). All samples in large-scale studies involve two or more stages (see the examples in Table 4.4). In a two-stage sample, the first stage consists of drawing a sample of schools with probability proportional to their size. At the second stage, students within the school are drawn (either by selecting one or more classes at random or by selecting students at random across all classes). In a three-stage sampling design, the first stage would be the region, followed by the selection of schools, and then students within schools. The size of sample required is a function of a number of factors, such as the size of sampling error that is acceptable, the degree of homogeneity or heterogeneity of students attending schools, and the levels of analysis to be carried out. Depending on the research questions, researchers may choose one or two classes per school, or a number of students drawn at random from all classes at a particular grade level. The IEA usually recommends 150 classrooms. The simple equivalent sample size (that is, if students had been sampled randomly from the total population rather than from schools) should not be less than 400 students (Beaton et al., 2000; Postlethwaite, 1999). PISA requires a minimum national sample of 150 schools and a random sample of 35 15-year-olds from each school, or all students where there are fewer than 35.


Table 4.4  Examples of selected international studies' sampling designs

PIRLS | Minimum 150 schools with 4000 students per country; three-stage stratified cluster sample: schools sampled randomly with PPS, classes sampled randomly, all students in sampled classes
TIMSS | Minimum 150 schools with 4000 students per country; three-stage stratified cluster sample: schools sampled randomly with PPS, classes sampled randomly, all students in sampled classes
PISA | Minimum 4500 students per country; two-stage stratified PPS sample, students sampled randomly
SACMEQ | Minimum 25 students per pre-selected school; two-stage stratified sample: (1) schools sampled with PPS; (2) students sampled randomly. Assessment of teachers: the teachers who teach relevant subjects in the three largest Grade 6 classes are selected by the test administrator in each selected school

As most studies are dependent on the voluntary participation of schools, there is the possibility that a school declines to participate. 'Replacement schools' matching the characteristics of the 'first choice' schools are therefore also drawn, so that a school in the primary sample that declines to participate can be substituted. In TIMSS, two replacement schools were drawn for each school in the primary sample. The response rate of the sample is an important quality consideration, since non-response can introduce bias. Postlethwaite (1999) gives the minimum acceptable response rate as 90% of students in 90% of schools in the sample (including replacement schools where they have been chosen). The IEA uses 85% after the use of replacement schools. PISA requires 85% if no replacement schools are used, but a progressively higher figure as the proportion of schools drawn from the reserve sample rises: if the response rate from the primary sample drops to 65%, the minimum acceptable in PISA, then the required minimum response rate after replacement is 95%. It has become common practice in international comparative studies to 'flag' countries that do not meet sampling criteria, or even to drop them from the study. In the IEA studies, the results for countries that fail to meet the minimum criteria are reported, but the countries are identified. In IEA studies and PISA, countries that fail to satisfy the sampling criteria are excluded from the report unless they fall only marginally short and can provide unequivocal quantitative evidence from other sources that non-responding schools do not differ systematically from responding schools, thus allaying fears of response bias in their results. Interestingly, it is common for these flags to be ignored by the press and/or politicians when it is convenient to do so, an issue of which some writers of reports are aware.
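The two-stage logic described above can be sketched as follows (a minimal illustration with an invented school frame; operational studies use systematic PPS selection without replacement on a stratified, sorted frame, with pre-designated replacement schools adjacent to each sampled school, and random.choices below is only a stand-in for the PPS draw):

import random

random.seed(1)
# Invented sampling frame: 600 schools and their target-grade enrolments.
frame = [{"id": f"S{i:03d}", "enrolment": random.randint(20, 400)} for i in range(600)]

def two_stage_sample(frame, n_schools=150, students_per_school=35):
    # Stage 1: draw schools with probability proportional to size (PPS).
    weights = [s["enrolment"] for s in frame]
    schools = random.choices(frame, weights=weights, k=n_schools)
    sample = []
    for school in schools:
        # Stage 2: simple random sample of students within each school,
        # or all students where enrolment falls below the cluster size.
        k = min(students_per_school, school["enrolment"])
        students = random.sample(range(school["enrolment"]), k)
        sample.append((school["id"], students))
    return sample

drawn = two_stage_sample(frame)
print(len(drawn), "schools;", sum(len(s) for _, s in drawn), "students")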

Instrument Development

A crucial element in international studies is the development of achievement tests, background questionnaires and attitude scales.


In curriculum-driven studies, such as the IEA assessments, the first step in test development is to analyse the curricula of the participating countries (analysing curriculum guides, textbooks and examinations but also, for example, interviewing subject experts or observing teachers). This forms the basis of a framework or test grid that represents the curricula of the countries involved in the study as far as possible. It usually has two dimensions: content and performance expectations, meaning the kinds of performance expected of students (such as understanding, theorising and analysing, and problem solving). Sometimes more dimensions are added, such as the 'perspectives' included in the TIMSS study to refer to the nature of the presentation of the content in the curriculum materials (e.g., attitudes, careers, participation, increasing interest in the subject). This analysis leads to an agreed-upon blueprint for tests that cover most of the curricula of the participating countries. In studies that are not curriculum-driven (OECD/PISA, SACMEQ), it is important that the tests reflect the research questions and the practical needs of policy makers (Kellaghan, 1996). In PISA, the assessment frameworks are developed by international expert panels, negotiated with national project managers, and finally approved by a Board of Participating Countries consisting of senior officials from all the countries involved. Item formats (such as multiple-choice, free-response and/or performance items) are decided upon (see the example in Fig. 4.1), and test items are written to cover the cells in the blueprint. There are a variety of possible item types for assessing students. Large-scale surveys conducted by the IEA and other bodies have traditionally used tests with multiple-choice questions. This type of test is popular since the conditions for the assessment are standardised, the cost is low and the tests can be machine-scored.

Example of item from PISA 2018: closed item (Annex Figure 2.B.1, Task 1, 'Sample reading ease and efficiency task'; https://www.oecd-ilibrary.org/docserver/5c07e4f1en.pdf)
Directions: Circle YES if the sentence makes sense. Circle NO if the sentence does not make sense.
The red car had a flat tire. (YES / NO)
Airplanes are made of dogs. (YES / NO)
The happy student read the book last night. (YES / NO)
If the cat had stayed out all night, it would not have been in the house at 2 a.m. (YES / NO)
The man who is taller than the woman and the boy is shorter than both of them. (YES / NO)

Fig. 4.1  Example of item from PISA 2018 closed item


Example of released item: free-response/open-ended/extended, TIMSS 2019 Grade 4 (https://research.acer.edu.au). The item presents a graph, 'Water Level in the Dam', showing the water level in a dam (in metres, 0–18) over 10 weeks, and asks: 'What was the water level for week 8?' (Answer: 16 m.)

Fig. 4.2  Example of released item: free-response/open-ended/extended response. (Source: TIMSS 2019 Grade 4 (http://research.acer.edu.au))

However, there has recently been a growing awareness among educators that some important achievement outcomes are either difficult or impossible to measure using the multiple-choice format. Hence, during the development phase of TIMSS, it was decided that open-format questions would be included (see the example in Fig. 4.2), where students write their answers. In PISA, one third of the items were open-ended and marked manually. TIMSS developed a two-digit coding scheme to diagnose students' answers to the open-ended questions: the first digit registered the degree of correctness, while the second coded the type of correct or incorrect answer given. The aim of this scheme was to provide a rich database for research on students' cognitive processes, problem-solving strategies and common misconceptions (Robitaille & Garden, 1996).
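As an illustration of how such two-digit codes can be processed, the sketch below (in Python, with invented code values rather than the actual TIMSS codebook) separates the correctness digit from the answer-type digit and tallies both:

from collections import Counter

# Invented two-digit codes: the first digit is the degree of correctness
# (here 2 = fully correct, 1 = partially correct, 7 = incorrect), and the
# second digit is the type of correct or incorrect answer given.
responses = ["20", "21", "10", "70", "79", "21"]

correctness = Counter(code[0] for code in responses)
answer_types = Counter(responses)

print(correctness)   # distribution of correctness levels
print(answer_types)  # distribution of specific answer categories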

A third type of test item (not often utilised in large-scale studies) involves performance assessment, which consists of a set of practical tasks. Its proponents argue that the practical nature of the tasks permits a richer and deeper understanding of aspects of student knowledge and understanding than is possible with written tests alone. In TIMSS, performance assessment was an option (Harmon et al., 1997).

In TIMSS 1995, performance assessments were included as an option for countries. They proved logistically demanding and time-consuming, and were not included in subsequent studies. When student responses are scaled, agreement is reached on the substantive meaning of the scale in terms of student performance at specified points on the scale (Beaton et al., 2000). International agreement is also reached on the appropriateness of the items and on the reliability of the tests, which requires extensive try-outs (Beaton et al., 2000). A rotated test design may be applied (see the example in Fig. 4.3), in which individual students respond to only a proportion of the items in the assessment. This reduces testing time per student while allowing data to be collected on a large number of items covering major aspects of the curricula. However, it also has implications for how achievement is analysed.

Fig. 4.3  Example of a rotated test design overview. (Source: Wu, n.d.)
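The principle of such a rotation can be illustrated with a small sketch: seven hypothetical item clusters are arranged so that each booklet carries only three of them, each cluster appears in three booklets, and every pair of clusters appears together in exactly one booklet (this layout is illustrative and not the design of any particular study):

# Rotate seven item clusters (C1..C7) into seven booklets using the
# difference set {0, 1, 3} mod 7, so every pair of clusters co-occurs once.
n = 7
clusters = [f"C{i + 1}" for i in range(n)]
booklets = [[clusters[(b + offset) % n] for offset in (0, 1, 3)]
            for b in range(n)]
for i, booklet in enumerate(booklets, start=1):
    print(f"Booklet {i}: {', '.join(booklet)}")

Because each student sees only a subset of the items, results from different booklets must be placed on a common scale using latent-variable (IRT) models rather than raw scores.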

In most studies, questionnaires are used to collect background data about students, teachers and schools, which serve as contextual information in the interpretation of achievement results. Such data allow researchers to develop constructs relating to the inputs and processes of education, and to address research questions concerning the factors that contribute to good quality education (e.g., Postlethwaite & Ross, 1992, who concluded that a large number of background variables influenced reading achievement). The collection of such data also enables countries to search for determinants of national results in an international context.

Example items from questionnaires, demonstrating possible cultural differences (PISA 2018 student questionnaire):

Which of the following are in your home? (Please select one response, Yes or No, in each row)
A desk to study at
A room of your own
A quiet place to study
A computer you can use for school work
Educational software
A link to the Internet
Classic literature (e.g., Shakespeare)
Books of poetry
Works of art (e.g., paintings)
Books to help with your school work
Technical reference books or manuals
A dictionary
Books on art, music, or design
A guest room
A high-speed internet connection
A musical instrument

(Source: https://nces.ed.gov/surveys/pisa/pisa2018/questionnaires/School_Q_English.html)

Fig. 4.4  Example of item that demonstrates possible cultural differences cross-nationally

A problem with these types of data, however, is that many variables are culturally dependent (see Part IV). The questions that researchers may want to ask about the home environment vary from country to country (see the example in Fig. 4.4). For example, when asking about the family members living in the student's home, the interpretation must take account of differences in family structure. Questions regarding socio-economic status raise another set of problems: in countries in Western Europe, satellite television, computers and other digital devices, and motor vehicles are commonplace, and so are not helpful as discriminators of socio-economic status. They may distinguish status in newly developed countries, but would probably not be very useful in many less developed ones, where indicators of wealth (especially in rural areas) could more appropriately be other possessions, such as animals or specific types of housing. These issues point to the need for a thorough understanding of the context in which data are gathered, both when designing questions and when interpreting responses. That is challenging in any international comparative study, but particularly so in ones that cover a diverse range of countries.
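A hypothetical sketch of such a home-possessions index is shown below; the item lists are invented and serve only to show why each country might score a different set of possessions before results are mapped onto a common scale:

# Illustrative, country-specific possession items (invented for this example).
items_western_europe = ["own_room", "internet_link", "dishwasher", "car"]
items_rural_low_income = ["radio", "bicycle", "livestock", "brick_housing"]

def possessions_index(answers, items):
    """Proportion of the country-relevant possessions present in the home."""
    return sum(answers.get(item, 0) for item in items) / len(items)

answers = {"radio": 1, "bicycle": 1, "livestock": 0, "brick_housing": 1}
print(possessions_index(answers, items_rural_low_income))  # 0.75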

In international comparative studies, the translation and adaptation of instruments is an important and culturally sensitive issue. In IEA studies, for example, participants use a system for translating the tests and questionnaires that incorporates an independent check on the translations. All deviations (from the international version of the instruments) in vocabulary, meaning or item layout are recorded and forwarded to the International Study Centre. All translations and adaptations of tests and questionnaires are verified by professional translators, and deviations identified by the translator that may affect results are addressed by the national study centre. Further verification is provided by the International Study Centre, which also conducts a thorough review of item statistics to detect any errors that might not have been corrected (Martin et al., 1999). In PISA, source tests and questionnaires are provided in English and French. Countries not using these languages are required to produce two independent translations, and are strongly urged to produce one from each of the English and French sources. This clarifies which features of the instruments are essential and which are unique to a particular language of presentation. The independent translations are checked by a third national translator, and all are finally checked by an independent international team. All instruments are piloted extensively in each language of testing, which makes instrument development a difficult, time-consuming and complex endeavour. When attitude scales are used, it is necessary to describe the dimensions. Beaton et al. (2000) indicate that often three times as many items are needed for trial testing as will be required in the final instruments.

Data Collection and Preparation

Data collection requirements are specified carefully to ensure the comparability of the procedures followed in the data collection. Manuals are generally written for the national centres (or national research coordinators), for the person coordinating the administration of the instruments in each school, and for the persons administering the instruments in each classroom or to a group of students. These are piloted simultaneously with the survey instruments, to ensure that the stated procedures can be followed or to identify changes that need to be made. In many countries, it is necessary to train the people who will administer the tests. In some developing countries, it is not possible to follow the specification that test materials be sent to schools by post.


For a country such as South Africa, that approach has three problems. First, many schools do not have functioning postal addresses, the names of the schools change, or the schools are difficult to locate. Secondly, the postal system is unreliable, which means that it would be almost impossible to ensure that materials reached their destination safely or in time for the scheduled administration. Finally, there is an overriding need to ensure the credibility of the results by employing test administrators who are independent of the participating school. In TIMSS and PIRLS in South Africa, an external field agency was employed to collect the data. Out of similar considerations of credibility, PISA required that test administration not be undertaken by any teacher of the sampled students in the subjects being tested, and recommended that it not be undertaken by any teacher from the school or from any other school in the PISA sample. Student responses or, in the case of open-ended items, their coded results are usually transferred onto computers at the national centre using special data-entry software. The data are then sent to the International Study Centre for cleaning and file-building. Following this, country sampling weights are calculated to account for disproportionate selection across strata, inaccurate sampling frames and missing data. Advanced psychometric techniques are available to compensate for the effects of a rotated test design (e.g., Item Response Theory) and to reduce the effects of missing data (e.g., imputation). The results of this phase are data files that are the starting point for the international and national data analyses and reports. Data archives are often made available in the public domain (such as those already available from the IEA TIMSS and PIRLS studies and the OECD PISA) as the basis for secondary analyses (national and international).
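The core of the design-weight logic mentioned above can be sketched as follows (the numbers are invented, and operational weighting adds further adjustments, for example trimming and post-stratification):

def student_weight(p_school, n_sampled_in_school, enrolment,
                   schools_expected, schools_participating):
    # Base weight: inverse of the overall inclusion probability,
    # i.e., the school's probability times the within-school probability.
    p_student = n_sampled_in_school / enrolment
    base = 1.0 / (p_school * p_student)
    # Simple non-response adjustment at the school level.
    nonresponse_adj = schools_expected / schools_participating
    return base * nonresponse_adj

w = student_weight(p_school=0.05, n_sampled_in_school=35, enrolment=280,
                   schools_expected=150, schools_participating=141)
print(round(w, 1))  # about 170: each sampled student 'represents' ~170 students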

Data Analysis and Reporting

Analyses of data can follow simple or complex procedures. The first international and national reports of a study are usually descriptive and provide policy makers and educational practitioners with the most relevant outcomes of the study. Such reports are clearly written and address the research questions stated at the beginning of the study. Decisions are made at an early stage about issues such as the units of analysis and the type of analyses to be employed, as well as various considerations regarding policy, educational and societal relevance. After the first report, technical reports covering topics such as the psychometric properties of instruments and the characteristics of the achieved samples are usually prepared. Additional analytic reports explore relationships in the data sets in more detail than can be achieved in the time available for the preparation of the first report. These analyses often use more advanced techniques, such as multilevel analysis (to examine school-level, classroom-level and student-level variation) and (multilevel) path analysis to test models explaining some of the complex relationships between achievement and its determinants.


Additional reports can also include more detailed analyses of the errors that students made in incorrect responses, to provide teachers and other educational practitioners with insights into the nature of students' misconceptions. Secondary analyses aimed at understanding differences between and within educational systems are also often carried out. These analyses are possible because many studies' data (including all of the IEA's) are released within days of the publication of the main reports by the respective study centres. These reports are available on websites, and researchers can apply for access and permission to use the data.
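As an indication of what such a multilevel analysis looks like in practice, the sketch below fits a random-intercept model with the publicly available statsmodels library to a toy data frame; a real secondary analysis would use the full released student files together with sampling weights and plausible values:

import pandas as pd
import statsmodels.formula.api as smf

# Toy data: student scores, a socio-economic index and school membership.
df = pd.DataFrame({
    "score":  [512, 487, 530, 455, 401, 476, 598, 563, 541, 498, 463, 505],
    "ses":    [0.3, -0.2, 0.8, -1.1, -1.4, -0.9, 1.4, 0.9, 0.7, -0.4, -0.9, 0.1],
    "school": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
})

# A random intercept per school separates school-level from student-level
# variation in achievement while estimating the fixed SES effect.
model = smf.mixedlm("score ~ ses", df, groups=df["school"]).fit()
print(model.params)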

Other Non-methodological Considerations

There are a number of considerations besides the methodological ones already discussed. The World Bank produced a series of books covering many of the methodological and non-methodological issues to be taken into account when undertaking large-scale and national assessments (see Greaney & Kellaghan, 2008). Non-methodological issues include management, implementation, leadership, planning, quality assurance and documentation, and these are discussed briefly below. Managing and implementing studies is a fundamental issue, given the complexity and costs of international comparative achievement studies. The IEA developed a number of standards for managing and implementing studies (Martin et al., 1999), which are also used in the SACMEQ studies, as did the World Bank in its series on national assessments (Greaney & Kellaghan, 2008). Other important components are the selection of an international coordinating centre, detailed planning, the development of a quality assurance programme, and technical reports and documentation. Similar issues are addressed at the national level. Leadership, and particularly effective leadership, is key at both international and national levels to manage and coordinate the design and implementation of international comparative studies. At the international level, the need for a centre to coordinate and lead the study is obvious. Such centres require people knowledgeable and experienced in areas such as assessment, psychometrics, statistics and the content of the subjects being studied, as well as others skilled in communication and management. At the national level, the centre conducting the study requires strong project leadership skills that are both technical and substantive, but that also include management and communication skills. It is essential that national coordinators are knowledgeable about their contexts. They review the design of the study critically to make the necessary adaptations for their country, and identify the experts that will be required to support the study technically. Detailed planning is required, given that international comparative studies usually take a number of years to design, implement, analyse and report. Short-, medium- and long-term planning is therefore required (Greaney & Kellaghan, 2008; Loxley, 1992), involving consideration of the objectives, methods, time schedule, costs and outcomes of the study. Data gathering in the northern and southern hemispheres is undertaken at different times in the calendar year, to obtain data at equivalent points in the school year.


A quality assurance programme ensures that studies are conducted to a high standard. Such a programme is particularly important for data collection activities, which are often conducted by school personnel, teachers or specially appointed data collectors, and so are outside the control of the study staff. In the IEA's TIMSS and PIRLS studies, as well as in the OECD's PISA, quality control observers or independent monitors in each country made unannounced visits to a sample of schools to check whether data collection followed the agreed procedures. Guidelines for the verification of the translation of instruments are also part of such a programme. Technical reports and documentation are particularly important so that studies can be evaluated and replicated. Typically, the study design, instrument development, data collection, and analysis and reporting procedures of large-scale international studies are described in technical reports and documentation (for example, see Martin & Kelly, 1996; Martin et al., 1999; http://www.iea.nl; http://www.pisa.oecd.org).

References

Beaton, A. E., Postlethwaite, T. N., Ross, K. N., Spearritt, D., & Wolf, R. M. (2000). The benefits and limitations of international educational achievement studies. International Institute for Educational Planning/International Academy of Education.
Burstein, L. (Ed.). (1992). The IEA study of mathematics III: Student growth and classroom processes. Pergamon Press.
Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D., & York, R. L. (1966). Equality of educational opportunity. U.S. Government Printing Office.
Creemers, B. P. (1994). Effective instruction: An empirical basis for a theory of educational effectiveness. In Advances in school effectiveness research and practice (pp. 189–205). Pergamon.
Creemers, B. P. M. (1996). The school effectiveness knowledge base. In D. Reynolds et al. (Eds.), Making good schools: Linking school effectiveness and school improvement (pp. 36–58). Routledge.
Creemers, B. P. M., & Kyriakides, L. (2008). The dynamics of educational effectiveness: A contribution to policy, practice and theory in contemporary schools. Routledge.
Creemers, B., & Kyriakides, L. (2015). Process-product research: A cornerstone in educational effectiveness research. Journal of Classroom Interaction, 50(2), 107–119.
Creemers, B. P. M., & Reezigt, G. J. (1999). The concept of vision in educational effectiveness theory and research. Learning Environments Research, 2, 107–135.
Dahllöf, U. S. (1971). Ability grouping, content validity, and curriculum process analysis. Teachers College Press.
Greaney, V., & Kellaghan, T. (2008). Assessing national achievement levels in education (National Assessments of Educational Achievement 1). World Bank.
Gustafsson, J. E. (2010). Causal inference in international comparative research on student achievement: Methodological challenges and developments. International Association for the Evaluation of Educational Achievement, International Research Conference.
Harmon, M., Smith, T. A., Martin, M. O., Kelly, D. L., Beaton, A. E., Mullis, I. V., Gonzalez, E. J., & Orpwood, G. (1997). Performance assessment in IEA's Third International Mathematics and Science Study (TIMSS). TIMSS International Study Centre.
Howie, S. J. (1999). Challenges facing the reform of science education in South Africa: What do the TIMSS results mean for South Africa? In S. Ware (Ed.), World Bank series on science education in developing countries (pp. 199–214). World Bank.
Howie, S. J. (2000). TIMSS-R in South Africa, a developing country perspective. Paper presented at the American Educational Research Association annual meeting, April 24–28, New Orleans.
Howie, S. J. (2002). English language and other factors influencing Grade 8 pupils' achievement in mathematics. University of Twente Press.
Howie, S. (2012). High-stakes testing in South Africa: Friend or foe? Assessment in Education: Principles, Policy and Practice, 19(1), 81–98. https://doi.org/10.1080/0969594X.2011.613369
Howie, S. J. (2022). Regional studies in non-Western countries, and the case of SACMEQ. In T. Nilsen, A. Stancel-Piątak, & J. E. Gustafsson (Eds.), International handbook of comparative large-scale studies in education (Springer International Handbooks of Education). Springer. https://doi.org/10.1007/978-3-030-38298-8_18-1
Howie, S. J., & Plomp, T. (2001, April). English language proficiency and other factors influencing mathematics achievement at junior secondary level in South Africa. American Educational Research Association annual meeting, Seattle.
Howie, S., & Plomp, T. (2005). International comparative studies of education and large-scale change. In International handbook of educational policy (pp. 75–99).
http://www.iea.nl
http://www.pisa.oecd.org
Husén, T., & Postlethwaite, T. N. (1994). International education. The International Encyclopedia of Education, 5, 2973–2976.
Husén, T., & Tuijnman, A. C. (1994). Monitoring standards in education: Why and how it came about. In A. C. Tuijnman & T. N. Postlethwaite (Eds.), Monitoring the standards of education (pp. 1–22). Pergamon.
Keeves, J. P. (Ed.). (1992). Methodology and measurement in international educational surveys. International Association for the Evaluation of Educational Achievement (IEA).
Keeves, J. P. (1994). Methods of assessment in schools. In T. Husén & T. N. Postlethwaite (Eds.), International encyclopedia of education (2nd ed., pp. 362–370). Pergamon Press.
Keeves, J. P. (1995). The world of school learning: Selected key findings from 35 years of IEA research. International Association for the Evaluation of Educational Achievement.
Kellaghan, T. (1996). IEA studies and educational policy. Assessment in Education: Principles, Policy & Practice, 3(2), 143–160.
Kellaghan, T., & Grisay, A. (1995). International comparisons of student achievement: Problems and prospects. In Centre for Educational Research and Innovation, Measuring what students learn (pp. 41–61). OECD.
Loveless, T. (2007). Lessons learned: What international assessments tell us about math achievement. Brookings Institution.
Loxley, W. (1992). Managing international survey research. Prospects, 22(3), 289–296.
Martin, M., & Kelly, D. L. (1996). Third International Mathematics and Science Study. Technical report. Volume III: Implementation and analysis (pp. 91–120). Boston College.
Martin, M. O., Rust, K., & Adams, R. J. (Eds.). (1999). Technical standards for IEA studies. IEA.
Nilsen, T., Stancel-Piątak, A., & Gustafsson, J. E. (Eds.). (2022). International handbook of comparative large-scale studies in education (Springer International Handbooks of Education). Springer, Cham. https://doi.org/10.1007/978-3-030-38298-8_18-1
Plomp, T. (1998). The potential of international comparative studies to monitor the quality of education. Prospects, 28(1), 45–59.
Plomp, T., Howie, S. J., & McGaw, B. (2003). International studies of educational achievements. In D. Stufflebeam & T. Kellaghan (Eds.), International handbook on educational evaluation (pp. 951–978). Kluwer Academic Publishers.
Porter, A., & Gamoran, A. (2002). Progress and challenges for large-scale studies. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 3–23). National Academies Press.
Postlethwaite, T. N. (1999). International studies of educational achievement: Methodological issues. Comparative Education Research Centre, University of Hong Kong.
Postlethwaite, T. N., & Ross, K. N. (1992). Effective schools in reading: Implications for educational planners. An exploratory study. IEA.
Raudenbush, S. W., & Kim, J. S. (2002). Statistical issues in analysis of international comparisons of educational achievement. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 267–294). National Academy Press.
Robitaille, D. F., & Garden, R. A. (Eds.). (1996). TIMSS monograph no. 2: Research questions and study design. Pacific Educational Press.
Rosén, M., Gustafsson, J. E., & Hansen, K. Y. (2013). Influences of early home factors on later achievement in reading, math and science: An analysis of the Swedish data from PIRLS and TIMSS 2011. International Association for the Evaluation of Educational Achievement, International Research Conference.
Scheerens, J., & Bosker, R. (1997). The foundations of educational effectiveness. Emerald Group Publishing.
Scheerens, J., Glas, C. A., & Thomas, S. M. (2003). Educational evaluation, assessment, and monitoring: A systemic approach (Vol. 13). Taylor & Francis.
Scherman, V. (2007). The validity of value-added measures in secondary schools (Unpublished doctoral dissertation). University of Pretoria.
Stringfield, S. C., & Slavin, R. E. (1992). A hierarchical longitudinal model for elementary school effects. In B. P. M. Creemers & G. J. Reezigt (Eds.), Evaluation of educational effectiveness (pp. 35–69). ICO.
Tymms, P. (1998). A universal baseline. Presented at the British Educational Research Association conference, Belfast.
Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261. https://doi.org/10.1080/09243453.2017.1404478
UNESCO. (2015). Education for all 2000–2015: Achievements and challenges (EFA Global Monitoring Report, 2015). UNESCO.
Van Staden, S. (2010). Reading between the lines: Contributing factors that affect Grade 5 learner reading performance (Unpublished doctoral dissertation). University of Pretoria.
Westat, Inc., & US Department of Health and Human Services. (1991). A national evaluation of Title IV-E foster care independent living programs for youth: Phase 2. Author.
Wößmann, L. (2003). Schooling resources, educational institutions and student performance: The international evidence. Oxford Bulletin of Economics and Statistics, 65(2), 117–170.

Chapter 5

International Comparative Assessments of Young Children: Debates and Methodology

Sarah J. Howie

This chapter reflects on the international comparative assessment of young children, introducing some of the issues that arise when conducting large-scale assessments of young children. This is followed by two examples of international assessments of young children. Critique of the most recent study is discussed, followed by further consideration of the methodological requirements for valid and reliable cross-national assessments. Subsequently, some of the strategies used internationally outside the international assessments are also described. In considering the methodological issues specifically relevant to assessing young children in international assessments, the chapter culminates with the challenges and solutions experienced and addressed by the iPIPS team.

5.1 Introduction

As the focus in education shifts to earlier in the child's development and is targeted at early schooling (Diaz-Diaz et al., 2019; Lin & Lin, 2019; Moss et al., 2016), new debates rage, particularly in Europe and the USA, with the implementation of large-scale assessments for young children, such as IELS. Implemented in three countries in 2018, IELS aims to measure and compare performance in participating countries in early learning 'domains' that include emerging literacy, emerging numeracy, self-regulation, empathy and trust.


The initial critique was levelled at the secrecy of the study and the exclusion of ECD expertise (Moss & Urban, 2017; Urban, 2019), growing into significant methodological and validity concerns (Moss & Urban, 2019), political concerns regarding the "neoliberal, neoconservative, and globalising discourses which underpin the study" (Delaune, 2019:60) and concerns about it being a 'Baby PISA' (Urban, 2019). To understand the debates and concerns better, a short description of the two previous large-scale assessments, apart from iPIPS, is provided, followed by a discussion of the issues regarding large-scale assessments of young children.

5.2 International Comparative Assessments of Young Children

Apart from iPIPS, there are two other major international comparative studies, namely the IEA's Preprimary Study and the OECD's IELS, and these are briefly described below.

The IEA Preprimary Study, 1987–1997

The Preprimary Study was a 10-year cross-national project (1987–1997) which aimed to identify and document the process and structural characteristics of care and education settings for 4-year-olds and to relate these characteristics to the children's cognitive and language performance 3 years later (Olmsted & Weikart, 1995). Participation varied according to which of the three phases of the project countries opted to join:
• In Phase 1, a household survey was administered to families to identify the major types of early childhood care and education used by families with 4-year-old children (Olmsted & Weikart, 1994, 1995; Olmsted & Montie, 2001).
• In Phase 2, an observational study documented the structural and process characteristics of selected setting types identified in Phase 1 and assessed children's cognitive and language skills at age 4 (Olmsted & Montie, 2001; Weikart, 1999; Weikart et al., 2003).
• In Phase 3, the children observed in Phase 2 were followed to age 7, when the language and cognitive assessments were repeated.
In total, 17 countries participated in one or more phases of the project. The participating countries were from Africa (only one, in Phase 1), Asia, Europe and North America, and included both developed and developing nations. The IEA Preprimary Project used variables derived from direct observation of children's and teachers' behaviours and activities in preschool, as well as the structural characteristics of settings, to predict children's language and cognitive performance at age 7. The analysis aimed to examine whether children's experience in settings at age 4 was related to their language and cognitive performance at age 7, whether the relationships between children's experience in settings and their performance at age 7 varied across participating countries, and then to explore the reasons for varying effects (Olmsted & Weikart, 1995).


Researchers across countries collaborated to develop common instruments to measure family background, teachers' characteristics, the structural characteristics of settings, children's experiences within settings and children's developmental status. After the publication of the main reports, further research continued as secondary analyses were undertaken (Montie et al., 2006). Montie and colleagues used hierarchical linear modelling to investigate variation in child outcomes in terms of variation among children within settings, among settings within countries, and among countries. This analysis revealed that across all 10 countries included in the analysis:
• Age-7 language performance improves as teachers' number of years of full-time teacher training/education increases;
• Age-7 language performance is higher where the predominant type of activity teachers propose in settings is free choice (the individual child chooses) rather than personal/social; and
• Age-7 cognitive performance improves as children spend less time in whole group/class activities and as the variety of equipment and materials available increases.
A number of findings varied across countries depending on particular country characteristics. The findings support child-initiated activities and small group activities, and are consistent with developmentally appropriate practices promoting active learning.

International Early Learning Study (IELS)

The International Early Learning Study (IELS) (OECD, 2020a), an international large-scale assessment of 5-year-olds conducted under the auspices of the OECD, was implemented between 2016 and 2020, with data collected in three participating countries (England, Estonia and the US). Its aim was to measure and compare performance in participating countries in early learning 'domains' that include emerging literacy, emerging numeracy, self-regulation, empathy and trust. IELS appears to have developed from earlier OECD initiatives on early childhood policies and services, named 'Starting Strong' (OECD, 2001, 2006, 2017a, b, c). Starting Strong, which involved 20 countries, used case-study methods "sensitive to the diversity and complexity of systems and pedagogies" (Moss et al., 2016:344) to identify common features and inform policy. Thereafter, two further Starting Strong cycles followed, each differing in focus and tone, with Starting Strong III (OECD, 2011) offering "a quality toolbox for early childhood education and care" (Moss et al., 2016:344), while Starting Strong IV (OECD, 2015) focused on "monitoring quality" (Moss et al., 2016:344). The focus of IELS was different from that of the Starting Strong projects. The study claimed to focus on those aspects of children's early learning that have been found to "best predict positive later outcomes" (OECD, 2020a:31). However, given the contents of the language test, this could be contested.


For instance, letter knowledge, which is known to be a good predictor of learning, was omitted from the test. IELS notes that the areas of early learning that best support positive later outcomes are inter-related and mutually reinforcing. IELS aims to:
• Provide robust empirical data on children's early learning through a broad scope of domains that comprise cognitive, social and emotional development;
• Identify factors that foster and hinder children's early learning, both at home and in early childhood education programmes;
• Provide findings that will allow parents and caregivers to learn about the interactions and learning activities that are most conducive to child development;
• Inform early childhood education centres and schools about the skill levels of children at this age, as well as contextual factors related to them, that they could use to make more informed decisions about curriculums and pedagogical [sic]; and
• Provide researchers and educators in the field of early education with valid and comparable information on children's early learning and characteristics, obtained from a range of sources and accompanied by a broad scope of contextual variables.
The data for the 'direct assessment' were collected from almost 7000 children who engaged with 'developmentally appropriate stories and activities' on a tablet. There was no reading or writing involved (OECD, 2020a, b, c) and therefore, presumably, no reading comprehension was explicitly assessed. It may not be immediately clear how children could engage with developmentally appropriate stories and activities on a tablet without reading and being able to comprehend what they were engaging with; however, as the study assessed emergent literacy (oral language, listening comprehension and phonological awareness), it may have obviated the need to read. Nonetheless, not assessing reading may lead to a ceiling effect on the test and underestimate the development of 5-year-olds in some countries. At the time of writing this book, the findings of IELS had just been published (OECD, 2020a). There is evidence from the study of three countries that some of the key findings mirror those of studies of older age groups conducted cross-nationally (PIRLS and TIMSS in upper primary and secondary schooling). For example, girls perform better than boys in (emergent) literacy; higher socio-economic status is correlated with higher achievement; more books in the home are associated with higher performance; greater parental involvement is related to higher achievement and participation; and there is less variation in performance in the Baltic states than in the USA. The critique of the study has continued (Moss, 2020), with some of the more political aspects of the debate discussed in the following section and other, more methodological, issues discussed later in the chapter.


Critique and Issues Related to IELS

Whilst there is some willingness for international collaboration and sharing of information, fears are rising about "joint learning at the international level being replaced by universal standardised assessment of children, decontextualized comparisons, and, as a consequence, ranking of countries" (Urban & Swadener, 2016:7). There are real worries about the assessment of early childhood education becoming a technical practice, "adopting a vision of comparative education as a technical process modelled on industrial benchmarking" (Auld & Morris, 2016:226). These fears are further fuelled by previous reports of the low reliability and validity of standardised tests in contexts of large-scale comparison (Madaus & Clarke, 2001; Meisels, 2004, 2006; Meisels & Atkins-Burnett, 2006; Raudenbush, 2005). Although many of these validity and reliability issues have been addressed in studies of older children over the past 20 years through the intensive efforts of organisations such as the IEA and OECD, they are less widely reported for assessments of young children, given the paucity of large-scale assessments of that age group. The seeming lack of attention to these issues, particularly in the IELS design and development phase, therefore raised anxieties in the early childhood communities internationally (Urban & Swadener, 2016). This critique of IELS is in contrast to the wider support for the earlier initiatives of the Organisation for Economic Co-operation and Development (OECD), which instituted a major international project on early childhood policies and services, 'Starting Strong' (OECD, 2001, 2006). The study, which included 20 countries, identified common features and important policy conclusions using case-study methods "sensitive to the diversity and complexity of systems and pedagogies". Thereafter, two other Starting Strong reports followed, different in focus and tone, with Starting Strong III offering "a quality toolbox for early childhood education and care" (OECD, 2011), while Starting Strong IV was reportedly about 'monitoring quality' (OECD, 2015). These developments reflected a shift from early childhood education and care towards a discourse of outcomes and investment (Moss et al., 2016). This shift in discourse raises concerns, given the negative experiences of high-stakes testing and its impact on learning often seen at later ages. According to the Global Education Monitoring (GEM) Report 2017/2018, "There is extensive evidence showing that high-stakes tests based on narrow performance measures can encourage efforts to 'game the system', negatively impacting on learning and disproportionately punishing the marginalized" (UNESCO, 2017:7). This reveals the necessity of balancing the needs of monitoring and developing education, including early childhood education, with empirical data, whilst avoiding negative, albeit unintentional, consequences. The GEM report also recognises the importance of collecting data on learning outcomes, and in particular of focusing this effort on identifying "factors that drive inequality in education" (UNESCO, 2017:7).


The main critique of the IELS study concerns the small number of participating countries, the low reliability and validity of the tests, the approach of technical benchmarking, the neglect of context, and the political and economic pressures bearing on the education of young children. These issues are briefly expanded on below.

Technical Concerns  There appear to be concerns about the absence of any acknowledgement of the problems alluded to in the external critique of the IELS study. None of the validity and reliability issues were addressed explicitly; nor was there any discussion of the issues raised by computer-based assessment of five-year-olds, with the website simply asserting that "even children with no previous experience with tablets have no problems in understanding and following" (https://www.oecd.org/fr/education/scolaire/the-international-early-learning-and-child-well-being-study-faq.htm).

Political and Corporate Interests  There are widespread concerns about the appropriateness of ranking young children or imposing a common framework of education, as well as the apparent emphasis on political and corporate interests over educational interests. These were some of the reasons why only three countries participated. For example, Germany decided not to participate in IELS after an intervention by a coalition of national organisations arguing that the OECD approach failed to recognise children's rights, diversity and the sociocultural contextualisation of early childhood practices (Urban & Swadener, 2016). The move by the OECD from using an international consortium of professional organisations up to 2012 to using international for-profit corporations from 2013 onwards sent alarm bells through the education community. In the minds of many, the entanglements of big business and government policymaking (Unwin & Yandell, 2016; Urban & Swadener, 2016; Moss & Urban, 2019) undermine the education (and assessment) agenda internationally. Of deep concern is that the "current international initiatives for standardised assessment contribute to opening public education sectors to corporate profit interests and to channelling scarce resources from the public sphere to private, corporate profit" (Urban & Swadener, 2016:11). Of particular concern was the call for tender for the IELS pilot in the US. The call provided insight into the values guiding the initiative, with IELS presented as a 'business opportunity' by the US government.¹ Revealingly, the tender document specified that 'expert help' in developing and piloting IELS was 'optional'. This fed directly into the fears of the education community that the interests of the children were not foremost and that corporate interests were being promoted.

¹ https://www.fbo.gov/index?s=opportunity&mode=form&id=f1f768686c6d0a656582efc61e23b3ad&tab=core&_cview=0

Insufficient Contextualisation  IELS abandons meaningful contextualised evaluation in order to create comparability, according to Urban and Swadener (2016:11). In brief, IELS is considered by some authors to be a blunt instrument, seeking to reduce the rich diversity and complexity of the early childhood education context to a common standard, measure and outcome.


Contextual information collected in IELS is viewed as limited and narrow (Moss et al., 2016). It over-emphasises policy over the cultural and social contexts (Sellar & Lingard, 2013) which are critical for this age group. There are concerns that IELS moves the emphasis away from pedagogies that are meaningful and relevant in children's lives and their learning, towards an emphasis on achieving assessment results that "fit a universal framework" (Mackey et al., 2016:448).

Unaccountable Power  The early childhood community has expressed concerns about the effects of comparing national performance in an unnuanced way, standardising and narrowing early childhood education. Concerns have been raised that IELS will follow the PISA experience, where the quickest way for countries to raise their PISA score is to align their curricula to the PISA measurements, with a detrimental effect on national priorities for teaching and learning in diverse communities (Carr et al., 2016).

Lack of Transparency and Secrecy  There is a perception amongst early childhood researchers that IELS was undertaken in secrecy, without broader consultation with or awareness of early childhood specialists and educators, and that the study and its processes, especially before implementation, lacked transparency. Concerns raised by early childhood researchers and their research were dismissed and not taken into account in the design and setting up of IELS (Moss & Urban, 2017).

5.3 Methodological Issues Regarding International Assessments of Young Children

The experiences of the IEA's Preprimary project and the OECD's IELS give rise to a number of issues with implications for the methodology of international assessments, in addition to those described in Chap. 4. The Preprimary project was unique: it was the first international comparative study to target this younger age group. It had a number of limitations, including the limited number and diversity of the countries involved. Whilst there was no intention to exclude nations, given that participation was voluntary, Africa and South America were not included and many cultures were thereby excluded. One critique was that

…although the inclusion of country-level variables permitted exploration of reasons for varying effects across countries, these variables covary among themselves and with country characteristics that were not measured directly. We can only speculate about what cultural characteristics the country-level variables represent (Montie et al., 2006).


Furthermore, there was critique of the lack of standardisation in the data collection, which led to the conclusion that "The fact that age-4 developmental status measures were administered at different times during the school year means that any setting effects that took place prior to age-4 testing were adjusted for", thereby preventing exact comparisons with other large-scale studies (Montie et al., 2006).

The most recent study, IELS, has come under close scrutiny and been sharply critiqued on a wide range of concerns, not all of which are methodological, as mentioned earlier. With only three countries, the study had even fewer participants than the IEA Preprimary project. The methodological concerns (Moss & Urban, 2017, 2019) appear to focus on the validity and reliability of the tests that were administered, which the critics describe as low, quoting a "vast literature critiquing aspects of PISA methodology" (Moss & Urban, 2017). The validity and reliability of the tests comprise the heart of the assessment and are central to the validity of a study overall, in addition to cross-cultural appropriateness and ease of administration (Fernald et al., 2017). In their development of a toolkit for measuring early childhood development in low- and middle-income countries for the World Bank, Fernald and colleagues emphasise the difficulties of developing 'culture-free' cognitive tests, stating that no test (even a non-verbal one) can avoid bias due to cultural and linguistic differences. Test developers nevertheless strive to design measures with low enough bias to make meaningful comparisons. To this end, Fernald and colleagues foreground the steps needed to adapt and standardise existing assessments in new contexts: accurate translation, cultural adaptation, pre-testing, pilot testing and test modification. This holds not only for the assessment of young children but also when assessing children of all ages, as well as adults.

Related to test design and the adaptation of assessments of young children, Fernald et al. (2017) highlight criterion, convergent and discriminant validity as important to strive for. Firstly, for criterion validity, one would typically compare the test score to a gold standard measure. The challenge is that this is not possible for assessments of young children in low- and middle-income countries, because no gold standard measure exists. Secondly, convergent validity would typically involve checking whether test scores are associated with factors that are expected to be related to them. For example, language and motor development would be expected to be associated with each other and with factors such as maternal education, home stimulation and linear growth. Finally, for discriminant validity, one would usually check that test scores are not associated with factors not expected to be related to them. This is particularly important for very young children.
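To make the last two checks concrete, the sketch below simulates how convergent and discriminant validity might be examined in practice using simple correlations. It is purely illustrative: the variable names and data are hypothetical and are not drawn from any of the studies discussed here.

```python
import numpy as np
import pandas as pd

# Hypothetical child-level data for 200 children.
rng = np.random.default_rng(0)
n = 200
maternal_edu = rng.normal(size=n)                    # standardised years of schooling
language = 0.5 * maternal_edu + rng.normal(size=n)   # language test score
motor = 0.4 * language + rng.normal(size=n)          # motor development score
birth_month = rng.integers(1, 13, size=n)            # expected to be unrelated to ability

df = pd.DataFrame({"language": language, "motor": motor,
                   "maternal_edu": maternal_edu, "birth_month": birth_month})

# Convergent validity: scores should correlate with theoretically related factors.
print(df[["language", "motor", "maternal_edu"]].corr().round(2))

# Discriminant validity: scores should NOT correlate with unrelated factors.
print(round(df["language"].corr(df["birth_month"]), 2))  # expect near zero
```

In a real study the 'expected' and 'unexpected' relationships come from theory and prior evidence rather than from a simulation, but the logic of the checks is the same.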
Given the age of the children, there is also a difficult balance to be maintained in assessment between the learner's ability to guide their own learning and the environment providing opportunities for learning; maintaining this balance is a challenge for teachers (Buzzelli, 2018). Other tensions arise for teachers with regard to international, national and provincial high-stakes assessments, which create conflicting expectations and roles (Jang & Sinclair, 2018); these are further discussed in Chap. 4 of Part 2. The use of assessment for accountability leads to challenges for the implementation of assessment in the classroom, given the conflicting policy narrative that contrasts with relational pedagogy and care ethics (Archer, 2017). This has implications for how the format and methods of assessment are designed and implemented for young children.



Strategies Used to Assess Young Children outside of International Assessments

Apart from the international assessments, there is extensive knowledge of assessment instruments and strategies available internationally. These have been widely reported across the globe and are briefly summarised below. With regard to assessing early childhood development and its determinants, there are a number of important aspects (Fernald et al., 2017) to consider when designing and developing assessments, as well as interventions based upon assessments:

• Child development represents a dynamic interplay between biological and environmental factors;
• Evidence from impact evaluations in both high- and low-income settings suggests that development is malleable and can be improved by interventions affecting the child;
• Any assessment of child development should be accompanied by a measure of the quality and quantity of nurturing care that the child experiences in his or her environment, to aid the interpretation of developmental scores; and
• The breadth and depth of behaviours that can be assessed increase with age, and the advancement in communication and other skills during the preschool and early primary years provides additional modes for testing (Fernald et al., 2017).

It is notable that most of the commonly available assessment instruments have been designed, developed and implemented in the developed world (see Table 5.1), with few exceptions such as BIL and EAP-ECDS, in addition to iPIPS. One international development is the Toolkit for Measuring Early Childhood Development in Low- and Middle-Income Countries (Fernald et al., 2017), originally compiled and published in 2009. The toolkit was compiled after a review of 41 assessment tools that had been developed or used for children aged 0–5 years in low- and middle-income countries. The most recent version was released in 2017. The intention is to provide a resource for researchers, evaluators, and programme personnel from various disciplines interested in assessing early childhood development (ECD) in low- and middle-income countries – either for planning and evaluating interventions, monitoring development over time, or conducting a situation analysis. The Toolkit is intended to help produce reliable, actionable data on child development (Fernald et al., 2017:18).

In total, the Toolkit now includes 106 new assessment tools for children aged 0–8 years. Notably, a third of the tools in the previous version of the toolkit originated from a low- or middle-income country (e.g., the Malawi Developmental Assessment Tool) or had been developed for multiple countries, and nearly half (44%) of the 106 newly added tools originated from low- or middle-income countries.


Table 5.1  Commonly found assessments internationally

Assessment | Country (if specified) | Source
British Ability Scales | UK | Eliott and Smith (2011); Duncan et al. (2018)
Best Start Kindergarten Assessment of numeracy and literacy | Australia | NSW Department of Education and Training (2009)
BIL (Batería de Inicio a la Lectura) | Peru, Spain | Selles et al. (2018)
Clinical Evaluation of Language Fundamentals – Preschool, Second Edition (CELF-Pre-2) | USA | Semel et al. (2004)
Clinical Evaluation of Language Fundamentals – Preschool – Spanish, Second Edition | USA | Landry et al. (2019)
The Phoneme Elisions Subtest of the Comprehensive Test of Phonological Processing (CTOPP) | USA | Torgesen et al. (1998)
Dynamic Indicators of Basic Early Literacy Skills (DIBELS Next) | USA | Good et al. (2011)
Wechsler Individual Achievement Test, Second UK Edition (WIAT-II) | UK, Ireland, USA | Wechsler (2005); Nally et al. (2018)
EAP-ECDS | China (mainland), Cambodia, Mongolia, Papua New Guinea, Timor-Leste, Vanuatu | Zhang et al. (2018)
Early Development Instrument (EDI) | Canada | Gagné et al. (2020); Youmans et al. (2018)
Toolkit for Measuring Early Childhood Development in Low- and Middle-Income Countries | Low- and Middle-Income Countries | Fernald et al. (2017)

The current updated toolkit involved extensive searches for new, additional tools, including those cited in reviews (Fischer et al., 2014; Sabanathan et al., 2015; Semrud-Clikeman et al., 2016), the Bill & Melinda Gates Foundation landscaping analysis and a test inventory for school-age children (Wuermli et al., 2015), leading to an additional 71 tools developed or used in low- and middle-income countries. The search found a further 35 tools specifically for children aged 5–8 years (Fernald et al., 2017:18). A variety of tests cover the nine domains of child development, with cognitive, language and motor development most often assessed. More than half of the tests (88 tests, or 60%) cover multiple domains, while those that covered a single domain most commonly measured cognitive development (12 tests), social-emotional development (17 tests), academic skills (10 tests) or executive function (10 tests).


The majority of tests are individual-level screening or ability tests, with only 11 population-level assessments. According to Fernald et al. (2017), the most widely used tools, found to have been applied in at least 20 different countries, originated from the United States. These include the Achenbach Child Behavior Checklist (CBCL), Bayley Scales of Infant Development (BSID), Wechsler Intelligence Scale for Children (WISC), Wechsler Preschool and Primary Scale of Intelligence (WPPSI), Ages & Stages Questionnaires (ASQ), Strengths and Difficulties Questionnaire (SDQ), and Denver Developmental Screening Test. Newer tests that have been developed in multiple countries and have already gained widespread use include the Early Grade Reading Assessment (EGRA) (Dubeck & Gove, 2015), used in 65 countries; the Early Grade Mathematics Assessment (EGMA) (Reubens, 2009), used in 22 countries; Save the Children's Literacy Boost assessment toolkit, used in 24 countries; and the Multiple Indicator Cluster Surveys (MICS) Early Child Development Index (ECDI), used in 36 countries. Of the tools developed in a low- or middle-income country before 2009, two have been used in a growing number of countries: the Kilifi Developmental Inventory, originally developed in Kenya, has been used in studies in Uganda, Malawi, Ghana and South Africa, and the Guide for Monitoring Child Development, originally developed in Turkey, has been used in Argentina, India and South Africa (Fernald et al., 2017).

Based upon an extensive review of the literature, Fernald et al. (2017) concluded that the ideal characteristics of an assessment for early childhood development are the following (Table 5.2).

Table 5.2  Ideal characteristics of an ECD assessment

Ideal 1 | The test score represents the child's true ability.
Ideal 2 | The test is appropriate, interpretable, and has high reliability and validity in all contexts and cultures.
Ideal 3 | The test shows variance in scores at all ages and ability levels.
Ideal 4 | The test is easy to administer.
Ideal 5 | The test can be administered quickly and at low cost.
Ideal 6 | The test provides information on all developmental domains.
Ideal 7 | The test score is relevant to a child's practical function in daily life and therefore relevant to policy and program design.
Ideal 8 | The test is a good indicator of future success.
Ideal 9 | The brain systems and neural mechanisms underlying test performance are well-understood.
Ideal 10 | The impact of health, nutrition, and environmental factors on the test score is well-understood.
Source: Fernald et al., 2017:63

Notably, the characteristics address different types of validity (content, construct, predictive, ecological), the test's ability to discriminate between ages and ability levels, and a number of practical qualities (ease of use and cost). Reliability, another essential quality of an assessment which, if high, improves the consistency of results, receives comparatively little attention beyond Ideal 2.
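Where reliability is reported, it is most often quantified as internal consistency. The following is a minimal sketch, using hypothetical item-level data, of Cronbach's alpha, one standard internal-consistency statistic; it is an illustration, not a procedure prescribed by any of the toolkits above.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Internal-consistency reliability for a children x items score matrix.

    scores: 2-D array, rows = children, columns = items (e.g. 0/1 correct).
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of children's totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data: 6 children answering 4 items (1 = correct).
responses = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")  # higher alpha = more consistent test
```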
In the next section, the experiences of the team who worked on the iPIPS project are encapsulated in a number of reflections. The theoretical and methodological issues highlighted in this chapter, as well as in Chap. 2, are discussed in terms of how the iPIPS team managed them across several diverse landscapes.

Reflection on Methodological Challenges and Solutions Addressed by the IPIPS Team

The iPIPS study built upon almost 20 years of experience gained in running the PIPS assessment of children in their first year of school, mainly in the UK (see Part 1, Chap. 1). The original aim of PIPS was to provide teachers and schools with user-friendly information about what children know and can do when they start school, and about their progress during their first school year. From this basis, the research questions for iPIPS evolved:

1. What do children know and what can they do when they start formal education?
2. How much progress do they make during their first year of school?

These questions are pertinent for policymakers, practitioners and the academic community. When exploring what children know and can do, iPIPS takes a broad perspective, examining, where possible, personal, social and emotional development, behaviour and physical development as well as cognitive development. The same holds for the progress that children make during that important first year of school. Data about children's home background and other contextual factors are collected, providing variables against which outcome data can be interpreted.

Moving on to consider the conceptual framework, it might be expected to follow a curriculum. At the start of school, however, children will have had different pre-school experiences, which in many countries are not compulsory, and they will not necessarily have followed a common curriculum. It therefore made sense to turn to the literature on child development and identify aspects which predict later outcomes. This is indeed how the iPIPS assessment was constructed. Data from England, where children have been monitored from their first PIPS assessment at the start of school through primary school and up to the end of compulsory education at age 16, show that PIPS scores do predict achievement, with a correlation of about 0.5 (Merrell & Bailey, 2012; Tymms et al., 2009, 2018).
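To put that correlation in perspective, a simple worked consequence (a standard identity, not an additional finding from the PIPS data):

$$ r \approx 0.5 \quad\Longrightarrow\quad r^{2} \approx 0.25 $$

That is, roughly a quarter of the variance in achievement at age 16 is statistically accounted for by the start-of-school assessment, which is substantial for a measure taken more than a decade earlier, while leaving most of the variation to other influences.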


For iPIPS, the decision was made to use grade-based samples, in contrast to the OECD's age-based IELS study. This does have the complication of children starting school at different ages in different countries, which, at the age of 4 or 5 years, is a significant proportion of a child's life. However, the rationale was that looking at what children know and can do at the start of school can provide valuable information about the effectiveness of countries' preschool policies, and give teachers information to guide the next steps of teaching their new intake of students. Progress made in the first year of schooling can also be readily assessed and compared. Under an age-based approach, for example at age 5, children in countries such as Russia and South Africa would not yet have experienced the school setting, and the numeracy and literacy measures would have been hard to interpret because many children become literate during their first year at school.

Sampling strategies have varied according to the nature of the iPIPS projects in different countries. For example, in Australia, England and Scotland, where the assessment was used by several thousand schools on an annual basis in a model where schools paid a registration fee, a two-state/territory sample was drawn in which the whole state/territory was involved. In Brazil, Russia and South Africa, sampling methods were used in the participating regions to ensure that the schools and students were representative of the regions as a whole. In the projects in these three countries, trained researchers assessed the pupils' cognitive development and teachers were asked to assess personal, social and emotional development, and behaviour. In Brazil, where children's physical development was assessed, this was done by trained researchers.

The assessment of young children brings particular challenges (Merrell & Tymms, 2016), over and above those already discussed in this chapter. For older children and adults, the assessment of cognitive development can be efficiently and validly conducted through a written test. However, many young children cannot read at the start of school, nor do they have the self-management and maturity to participate in a group test. Their concentration span can be quite short and their working memory holds very few pieces of information. Assessments of young children are therefore often made using observation methods. However, it is possible that children do not show their full abilities in a classroom setting. For example, a child may have an advanced understanding of mathematical concepts, but if the activities in the setting do not challenge them to display this understanding, it will be missed. This may be particularly relevant in countries where class sizes are large. A further issue with observations is bias against individuals and groups (Harlen, 2004, 2005; Hosterman et al., 2008). Having said that, there are areas, such as personal, social and emotional development and behaviour, which are more appropriately assessed using observation methods, and the information gleaned from them validly represents a child's interactions in the classroom setting, albeit with the potential for bias. This requires instruments which are, as far as possible, resistant to the potential biases of the rater; for example, a description of the behaviour with clear criteria for what constitutes each point on the scale.

iPIPS uses a combination of approaches. To assess children's cognitive development, an adaptive test is administered on a one-to-one basis: short and engaging, to hold the child's attention, yet providing reliable and valid results. For personal, social and emotional development and behaviour, observation rating scales are used.
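To illustrate the adaptive principle, the sketch below selects each question to match a running estimate of the child's ability, assuming a simple one-parameter (Rasch) response model. The item bank, difficulties and stopping rule are all hypothetical; this is not the actual iPIPS algorithm, only the general idea behind adaptive tests.

```python
import math
import random

# Hypothetical item bank: each item has a difficulty on a logit scale.
ITEM_BANK = {f"item_{i}": d for i, d in enumerate(
    [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])}

def p_correct(ability: float, difficulty: float) -> float:
    """Rasch model: probability that a child answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def run_adaptive_test(get_response, n_items: int = 5) -> float:
    """Administer n_items, always picking the unused item whose difficulty
    is closest to the current ability estimate, then stepping the estimate
    up after a correct answer and down after an incorrect one."""
    ability, step, used = 0.0, 1.0, set()
    for _ in range(n_items):
        item = min((i for i in ITEM_BANK if i not in used),
                   key=lambda i: abs(ITEM_BANK[i] - ability))
        used.add(item)
        ability += step if get_response(item) else -step
        step *= 0.7  # shrink the steps so the estimate settles
    return ability

# Simulate a child of true ability 0.8 responding probabilistically.
child = lambda item: random.random() < p_correct(0.8, ITEM_BANK[item])
print(f"estimated ability: {run_adaptive_test(child):.2f}")
```

Operational adaptive assessments use calibrated item banks and maximum-likelihood or Bayesian ability estimates rather than this simple staircase, but the effect is the same: each child mostly sees items close to their own level, which keeps the test short without sacrificing precision.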
Adaptations for each country have been undertaken with care, noting the issues outlined earlier in this chapter. The mathematics section of iPIPS readily lends itself to local translation and cultural adaptation without significant threats to the validity of making international comparisons.


Other parts, such as vocabulary and reading comprehension, are much more challenging due to the nuances of different languages. Similarly, when translating a rating scale of behaviours into different languages, extreme care is needed to ensure that the meaning is retained. An additional challenge is varying perceptions of what constitutes acceptable and unacceptable behaviour across countries, which emphasises the importance of assessing what, and how frequently, a child is doing something, rather than how acceptable or severe it is.

iPIPS has generally been coordinated by a linked centre in each country (Scotland and England being jointly coordinated by the centre at Durham University), with a core team who have the skills outlined earlier in the chapter. The centre organised data collection and analysis, along with feedback to teachers, reports and research papers. The one interesting exception to this is Lesotho, where the local team consisted of school principals, teachers and education specialists from the Ministry of Education and Training. This team was well placed to work on the adaptation of the assessment content and administration to suit the local context but did not have the psychometric expertise. While the project was at the pilot stage, the local team worked directly with the team at Durham to develop, administer and analyse the data. As the project scales up beyond the pilot stage, there is an opportunity for more established centres to help with capacity building in countries, as needed, so that in time such countries can establish their own centres.

When making comparisons between countries, iPIPS firstly considers the validity of the comparisons; see, for example, the 'comparison hypothesis' suggested by Merrell and Tymms (2016). This hypothesis sets out different areas of children's development and considers whether international comparisons provide meaningful insights. Once it has been established that a comparison is useful, iPIPS uses methodology similar to that of the other international studies to equate scales.
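As an illustration of the simplest form of scale equating, the sketch below applies the mean-sigma method to a set of common ('anchor') items taken by both samples. The data are hypothetical, and large-scale studies typically rely on more sophisticated IRT-based linking; this only conveys the core idea of placing two samples on one scale.

```python
import numpy as np

def mean_sigma_equate(scores_from, anchor_from, anchor_to):
    """Linearly map scores from one test form onto another form's scale,
    choosing the slope and intercept that make the anchor items' mean and
    standard deviation agree across the two administrations."""
    a = np.std(anchor_to, ddof=1) / np.std(anchor_from, ddof=1)
    b = np.mean(anchor_to) - a * np.mean(anchor_from)
    return a * np.asarray(scores_from) + b

# Hypothetical anchor-item totals from two country samples.
anchors_country_x = np.array([12.0, 15.0, 9.0, 14.0, 11.0])
anchors_country_y = np.array([20.0, 24.0, 16.0, 23.0, 18.0])

# Two country-X scores expressed on country Y's scale.
print(mean_sigma_equate([10.0, 13.0], anchors_country_x, anchors_country_y))
```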
In conclusion, although iPIPS explores the development and progress of children much younger than those in studies such as TIMSS, PIRLS and PISA, it faces similar issues and takes similar approaches. There are additional challenges, which have largely been overcome to ensure the validity of the information produced. iPIPS differs from the other studies of early childhood development described earlier in that it makes direct assessments of children's cognitive development and uses a grade-based rather than an age-based approach to sampling. It also differs in producing longitudinal information, assessing students twice in the same year to estimate the amount of progress made in the first year of school.
References

Archer, N. (2017). Where is the ethic of care in early childhood summative assessment? Global Studies of Childhood, 7(4), 357–368. https://doi.org/10.1177/2043610617747983
Auld, E., & Morris, P. (2016). PISA, policy and persuasion: Translating complex conditions into education 'best practice'. Comparative Education, 52(2), 202–229.


Buzzelli, C. A. (2018). The moral dimensions of assessment in early childhood education. Contemporary Issues in Early Childhood, 19(2), 154–166. https://doi.org/10.1177/1463949118778021
Carr, M., Mitchell, L., & Rameka, L. (2016). Some thoughts about the value of an OECD international assessment framework for early childhood services in Aotearoa New Zealand. Contemporary Issues in Early Childhood, 17(4), 450–454.
Delaune, A. (2019). Neoliberalism, neoconservativism, and globalisation: The OECD and new images of what is 'best' in early childhood education. Policy Futures in Education, 17(1), 59–70. https://doi.org/10.1177/1478210318822182
Diaz-Diaz, C., Semenec, P., & Moss, P. (2019). Editorial: Opening for debate and contestation: OECD's international early learning and child well-being study and the testing of children's learning outcomes. Policy Futures in Education, 17(1), 1–10. https://doi.org/10.1177/1478210318823464
Dubeck, M. M., & Gove, A. (2015). The early grade reading assessment (EGRA): Its theoretical foundation, purpose, and limitations. International Journal of Educational Development, 40, 315–322.
Duncan, R. J., Schmitt, S. A., Burke, M., & McClelland, M. M. (2018). Combining a kindergarten readiness summer program with a self-regulation intervention improves school readiness. Early Childhood Research Quarterly, 42, 291–300. https://doi.org/10.1016/j.ecresq.2017.10.012
Eliott, C. D., & Smith, P. (2011). British ability scales: Administration and scoring manual. Manual 3. GL Assessment.
Fernald, L. C. H., Prado, E., Kariger, P., & Raikes, A. (2017). A toolkit for measuring early childhood development in low- and middle-income countries. Prepared for the Strategic Impact Evaluation Fund, the World Bank. http://documents.worldbank.org/curated/en/384681513101293811/pdf/WB-SIEF-ECD-MEASUREMENT-TOOLKIT.pdf
Fischer, V. M., Morris, J., & Martines, J. (2014). Developmental screening tools: Feasibility of use at primary healthcare level in low- and middle-income settings. Journal of Health, Population, and Nutrition, 32(2), 314–326.
Gagné, M., Janus, M., Muhajarine, N., Gadermann, A., Duku, E., Milbrath, C., Minh, A., Forer, B., Magee, C., & Guhn, M. (2020). Disentangling the role of income in the academic achievement of migrant children. Social Science Research, 85, 102344. https://doi.org/10.1016/j.ssresearch.2019.102344
Good, R. H., Kaminski, R. A., Cummings, K., Dufour-Martel, C., Petersen, K., Powell-Smith, K., & Wallin, J. (2011). DIBELS next assessment manual. Dynamic Measurement Group.
Harlen, W. (Ed.). (2004). A systematic review of the evidence of reliability and validity of assessment by teachers used for summative purposes. EPPI-Centre, Institute of Education, University of London.
Harlen, W. (2005). Trusting teachers' judgment: Research evidence of the reliability and validity of teachers' assessment used for summative purposes. Research Papers in Education, 20(3), 245–270.
Hosterman, S. J., DuPaul, G. J., & Jitendra, A. K. (2008). Teacher ratings of ADHD symptoms in ethnic minority students: Bias or behavioral difference? School Psychology Quarterly, 23(3), 418–435.
Jang, E. E., & Sinclair, J. (2018). Ontario's educational assessment policy and practice: A double-edged sword? Assessment in Education: Principles, Policy and Practice, 25(6), 655–677. https://doi.org/10.1080/0969594X.2017.1329705
Landry, S. H., Assel, M. A., Carlo, M. S., Williams, J. M., Wu, W., & Montroy, J. J. (2019). The effect of the Preparing Pequeños small-group cognitive instruction program on academic and concurrent social and behavioral outcomes in young Spanish-speaking dual-language learners. Journal of School Psychology, 73, 1–20.
Lin, P. Y., & Lin, Y. C. (2019). International comparative assessment of early learning in exceptional learners: Potential benefits, caveats, and challenges. Policy Futures in Education, 17(1), 71–86. https://doi.org/10.1177/1478210318819226


Mackey, G., Hill, D., & De Vocht, L. (2016). Response to the colloquium 'The Organisation for Economic Co-operation and Development's international early learning study: Opening for debate and contestation'. Contemporary Issues in Early Childhood, 17(4), 447–449.
Madaus, G. F., & Clarke, M. (2001). The adverse impact of high-stakes testing on minority students: Evidence from one hundred years of test data. In G. Orfield & M. L. Kornhaber (Eds.), Raising standards or raising barriers? Inequality and high-stakes testing in public education (pp. 85–106). The Century Foundation Press.
Meisels, S. J. (2004). Should we test four-year-olds? [Commentary]. Pediatrics, 113(5), 1401–1402.
Meisels, S. J. (2006). Accountability in early childhood: No easy answers. Occasional Paper 6, March 2006. Erikson Institute.
Meisels, S. J., & Atkins-Burnett, S. (2006). Evaluating early childhood assessments: A differential analysis. In K. McCartney & D. Phillips (Eds.), Handbook of early childhood development (pp. 533–549). Blackwell Publishing.
Merrell, C., & Bailey, K. (2012). Predicting achievement in the early years: How influential is personal, social and emotional development? Online Educational Research Journal. http://www.oerj.org/View?action=viewPaper&paper=55
Merrell, C., & Tymms, P. (2016). Assessing young children: Problems and solutions. UNESCO Institute for Statistics (UIS).
Montie, J. E., Xiang, Z., & Schweinhart, L. J. (2006). Preschool experience in 10 countries: Cognitive and language performance at age 7. Early Childhood Research Quarterly, 21(3), 313–331. https://doi.org/10.1016/j.ecresq.2006.07.007
Moss, P. (2020). The Organisation for Economic Co-operation and Development's international early learning and child well-being study: The scores are in! https://doi.org/10.1177/1463949120929466
Moss, P., & Urban, M. (2017). The Organisation for Economic Co-operation and Development's international early learning study: What happened next. Contemporary Issues in Early Childhood, 18(2), 250–258. https://doi.org/10.1177/1463949117714086
Moss, P., & Urban, M. (2019). The Organisation for Economic Co-operation and Development's international early learning study: What's going on. Contemporary Issues in Early Childhood, 20(2), 207–212. https://doi.org/10.1177/1463949118803269
Moss, P., Dahlberg, G., Grieshaber, S., Mantovani, S., May, H., Pence, A., Rayna, S., Swadener, B. B., & Vandenbroeck, M. (2016). The Organisation for Economic Co-operation and Development's international early learning study: Opening for debate and contestation. Contemporary Issues in Early Childhood, 17(3), 343–351.
Nally, A., Healy, O., Holloway, J., & Lydon, H. (2018). An analysis of reading abilities in children with autism spectrum disorders. Research in Autism Spectrum Disorders, 47, 14–25. https://doi.org/10.1016/j.rasd.2017.12.002
NSW Department of Education and Training. (2009). https://education.nsw.gov.au/teaching-and-learning/curriculum/literacy-and-numeracy/assessment-resources/best-start-kindergarten
Olmsted, P. P., & Montie, J. (Eds.). (2001). Early childhood settings in 15 countries: What are their structural characteristics? High/Scope Press.
Olmsted, P. P., & Weikart, D. P. (Eds.). (1994). Families speak: Early childhood care and education in 11 countries. High/Scope Press.
Olmsted, P. P., & Weikart, D. P. (Eds.). (1995). The IEA preprimary study. Pergamon Press.
Organisation for Economic Co-operation and Development (OECD). (2001). Starting strong: Early childhood education and care. OECD Publishing.
Organisation for Economic Co-operation and Development (OECD). (2006). Starting strong II: Early childhood education and care. OECD Publishing.
Organisation for Economic Co-operation and Development (OECD). (2011). Education at a glance: OECD indicators. OECD Publishing.
Organisation for Economic Co-operation and Development (OECD). (2015). Universal basic skills: What countries stand to gain. OECD Publishing.


Organisation for Economic Co-operation and Development (OECD). (2017a). Starting strong 2017: Key OECD indicators on early childhood education and care. OECD Publishing.
Organisation for Economic Co-operation and Development (OECD). (2017b). Starting strong V: Transitions from early childhood education and care to primary education. OECD Publishing.
Organisation for Economic Co-operation and Development (OECD). (2017c). Early learning matters: The international early learning and child well-being study. http://www.oecd.org/education/school/Early-Learning-Matters-Project-Brochure.pdf. Accessed 31 Mar 2018.
Organisation for Economic Co-operation and Development (OECD). (2020a). Early learning and child well-being: A study of five-year-olds in England, Estonia, and the United States. OECD. https://doi.org/10.1787/3990407f-en
Organisation for Economic Co-operation and Development (OECD). (2020b). Early learning and child well-being in the United States. OECD. http://www.oecd.org/education/school/early-learning-and-child-well-being-study/early-learning-and-child-well-being-in-the-united-states-198d8c99-en.htm
Organisation for Economic Co-operation and Development (OECD). (2020c). International early learning and child well-being study: A summary of findings. http://www.oecd.org/education/school/early-learning-and-child-well-being-study/International_Early_Learning_and_Child_Well-being_Study_Summary.pdf
Raudenbush, S. (2005). Newsmaker interview: How NCLB testing can leave some schools behind. Preschool Matters, 3(2). Rutgers University: National Institute of Early Education Research.
Reubens, A. (2009). Early grade mathematics assessment (EGMA): A conceptual framework based on mathematics skills development in children. RTI International.
Sabanathan, S., Wills, B., & Gladstone, M. (2015). Child development assessment tools in low-income and middle-income countries: How can we use them more appropriately? Archives of Disease in Childhood, 100(5), 482–488. https://doi.org/10.1136/archdischild-2014-308114
Sellar, S., & Lingard, R. (2013). The OECD and global governance in education. Journal of Education Policy, 28(5), 710–725. https://doi.org/10.1080/02680939.2013.779791
Selles, P. A., Ávila, V., Martinez, T., & Ysla, L. (2018). The skills related to the early reading acquisition in Spain and Peru. PLoS One, 13(3), e0193450.
Semel, E., Wiig, E. H., & Secord, W. A. (2004). Clinical evaluation of language fundamentals – Preschool-2 (CELF Preschool-2).
Semrud-Clikeman, M., Regilda, A., Romero, A., Prado, E. L., Shapiro, E. G., Bangirana, P., & Chandy, J. (2016). Selecting measures for the neurodevelopmental assessment of children in low- and middle-income countries. Child Neuropsychology, 23(7), 761–802.
Torgesen, J., Wagner, R., & Rashotte, C. (1998). Test of word reading efficiency. Pro-Ed.
Tymms, P., Jones, P., Albone, S., & Henderson, B. (2009). The first seven years at school. Educational Assessment, Evaluation and Accountability, 21, 67–80.
Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261. https://doi.org/10.1080/09243453.2017.1404478
UNESCO. (2017). Global education monitoring report 2017/8. Accountability in education: Meeting our commitments. UNESCO.
Unwin, A., & Yandell, J. (2016). PISA-envy, Pearson and Starbucks-style schools. New Internationalist, 491, 42–43.
Urban, M. (2019). The shape of things to come and what to do about Tom and Mia: Interrogating the OECD's international early learning and child well-being study from an anti-colonialist perspective. Policy Futures in Education, 17(1), 87–101. https://doi.org/10.1177/1478210318819177
Urban, M., & Swadener, B. B. (2016). Democratic accountability and contextualised systemic evaluation. International Critical Childhood Policy Studies Journal, 5(1), 6–18. http://journals.sfu.ca/iccps/index.php/childhoods/article/view/71/pdf
Wechsler, D. (2005). Wechsler individual achievement test (WIAT-II) (2nd ed.). The Psychological Corp.


Weikart, D. P. (Ed.). (1999). What should young children learn? Teacher and parent views in 15 countries. High/Scope Press.
Weikart, D. P., Olmsted, P. P., & Montie, J. (Eds.). (2003). A world of preschool experience: Observations in 15 countries. High/Scope Press.
Wuermli, A. J., Tubbs, C. C., Petersen, A. C., & Aber, J. L. (2015). Children and youth in low- and middle-income countries: Toward an integrated developmental and intervention science. Child Development Perspectives, 9(1), 61–66. https://doi.org/10.1111/cdep.12108
Youmans, A. S., Kirby, J. R., & Freeman, J. G. (2018). How effectively does the full-day, play-based kindergarten programme in Ontario promote self-regulation, literacy, and numeracy? Early Child Development and Care, 188(12), 1786–1798. https://doi.org/10.1080/03004430.2017.1287177
Zhang, L., Sun, J., Richards, B., Davidson, K., & Rao, N. (2018). Motor skills and executive function contribute to early achievement in East Asia and the Pacific. Early Education and Development, 29(8), 1061–1080. https://doi.org/10.1080/10409289.2018.1510204

Chapter 6

Teachers' Roles in the Assessment of Young Children

Sarah J. Howie

The iPIPS study depends, in part, on data collected by teachers. Given that the study focuses on young children, who themselves vary significantly (Merrell & Tymms, 2015) and who are assessed across different contexts, different approaches are required, raising a range of issues. This chapter therefore discusses the role of teachers in the assessment process. The chapter is split into two parts: the first addresses the general issue of teacher assessments and the second focuses on the roles of teachers in research projects, specifically in the iPIPS project. Teachers have important insight into learners' development, behaviours and backgrounds. Questions of validity and reliability are explored in relation to their assessments, and the chapter probes whether it is possible to compare teachers' ratings (e.g., for Behaviour and Personal and Social Development, as used in iPIPS) across schools, countries and regions.

6.1 Introduction

There are different approaches to assessing cognitive development as well as children's behaviour and their social development, including posing questions to children and direct observation of the way that they work and interact within an educational setting (Merrell & Tymms, 2015). In the classroom, teachers are uniquely placed to carry out assessment, both in their role as teacher and in assisting research. In this chapter, these aspects are addressed separately, with the first part addressing the general issue of teacher assessments and the second focusing on the roles of teachers in research projects and specifically in the iPIPS project.


6.2 Teacher Assessments

Assessment is not something added onto a lesson or crammed into the lessons at the end of term for summative purposes; it is an integral part of teaching and learning (Bennett, 2010; Black & Wiliam, 1998; James, 2006; van den Akker, 2003). There are a number of critical elements of assessment, including the role of the teacher (see Fig. 6.1), and several related to the characteristics of the curriculum and its various levels (see Table 6.1). The implementation of the curriculum includes how the school and its teachers perceive what has to be taught and learnt, the process of teaching and learning, and the learning experiences of the pupils. The quality of the teacher becomes central to the interpretation of the intended curriculum (van den Akker, 2003) when the curriculum is implemented in the classroom. However, the effectiveness of the teaching and learning is only measured by the extent to which the curriculum is attained. This attainment is partly experiential in nature, in that the learning experiences (participation and attitudes) of the pupils may be evaluated in addition to the learning outcomes (achievement) of the pupils (Howie, 2003; van den Akker, 2003).

Fig. 6.1  Elements of an Assessment Strategy. (Source: Howie, 2007) [The figure links an assessment strategy to its elements: aim, goal, content, activity, resources, criteria, reporting method, location, time, and the roles and tasks of the teacher and the student.]


Table 6.1  Curriculum levels and their interpretation

Intended | Ideal | Vision (rationale or basic philosophy underlying a curriculum)
Intended | Formal/Written | Intentions as specified in curriculum documents and/or materials
Implemented | Perceived | Curriculum as interpreted by its users (especially teachers)
Implemented | Operational | Actual process of teaching and learning (also: curriculum-in-action)
Attained | Experiential | Learning experiences as perceived by learners
Attained | Learned | Resulting learning outcomes of learners
Source: van den Akker, 2003:3

It is the latter that is most often assessed by teachers, standardised tests and systemic tests (Howie, 2012) at the local, national and international levels. The alignment between the intended, implemented and attained curriculum is important but often difficult to achieve: whilst it may be clear what society wants children to learn, achieving it is often difficult given the conduit through which it is implemented.

The style, type and format of the assessment used in the classroom should depend on its purpose and function (Airasian, 1991; Popham, 1995). The rise of the Assessment for/as Learning movements over the past few decades, in which assessment is seen as formative, has sensitised both policymakers and practitioners to a continuum of assessment (Harlen, 2006) which runs from 'informal formative' to 'formal summative'. This appears logical and comprehensive to researchers but has been found to be more challenging for teachers to implement (Black & Wiliam, 2018). Barnes et al. (2015), Brown et al. (2019), Fulmer et al. (2015) and Xu and Brown (2016) have found that a variety of factors affect teachers' effectiveness, both in their teaching and in their ability to assess pupils. These include teachers' personal characteristics, their pedagogical content knowledge, their beliefs, the pupils' personal, social and emotional characteristics, the context of the school, the policy framework, and the subject being assessed. Clearly, teachers' understanding of the purpose and function of assessment is fundamental to how they implement assessment in their classroom.

Teachers' Conceptions of Assessment

Using assessment formatively to improve teaching and learning depends, at least partially, on factors such as the context and policy framework within which teachers function. In high-stakes environments, assessment for learning (improvement) may be diametrically opposed to assessment of learning for accountability purposes (Howie, 2012). Further challenges may arise when the same education policy attempts to use assessment processes both to improve educational outcomes and to indicate the quality of teaching and student learning (Brown et al., 2019). These conceptions include beliefs about and attitudes toward assessment (Brown et al., 2019).


Teachers may also think about assessment as mainly supporting improvement in teaching and student learning whilst accepting that it includes making pupils accountable, but not teachers and schools (Brown et al., 2019). This is likely to be contrary to the views of policymakers (Howie, 2012). Major reviews of teacher conceptions of assessment (Barnes et al., 2015; Bonner, 2016; Brown et al., 2019; Fulmer et al., 2015) indicate that teachers are aware of and react to the tension between using assessment for improved outcomes and processes in classrooms, and using assessment to hold teachers and schools accountable for outcomes. In reaction to the increased pressure teachers experience to raise assessment scores in countries such as England, teachers are less likely to see assessment as a formative process (Black & Wiliam, 1998) in which they could discover and experiment with different practices (Brown & Harris, 2009). This contrasts with education systems where the consequences associated with assessment are relatively low. For example, in New Zealand there is a significant difference between primary school and secondary school assessment, and in Queensland, Australia, Grades 1–10 are an assessment-free zone; Crooks (2010) notes that in such settings teachers' perception of assessment as a formative means sustains improvement far more.

In the USA, perhaps given its federal system, teachers apparently can, and do, hold multiple beliefs about assessment simultaneously (Barnes et al., 2017), and Barnes et al. (2015) framed teachers' beliefs on a continuum from an extreme pedagogical to an extreme accounting (accountability) conception. Whilst there are elements related to Harlen's continuum mentioned earlier, Barnes' conception places the purpose of the assessment as central, in contrast to Harlen's focus on the mode of assessment (Fig. 6.2). Furthermore, Barnes et al. (2015) found that, in all the groups studied, the US teachers indicated that they conceived of assessment as being for improvement and for accountability; in some cases, they saw it as irrelevant to their practice. Teachers' beliefs about assessment were important, and it was considered vital to assist them to make sense of these beliefs, given that such beliefs may filter and guide their interpretation of assessment information.

Fig. 6.2  Conceptions of assessment continuum. (Source: Barnes et al., 2015) [The figure shows a continuum of what the purpose of assessment is, running from an extreme pedagogical conception (advancing learning), through student accountability and irrelevance, to an extreme accounting conception (teacher/school accountability).]


In addition, an intervention was considered important, given the contradictory findings that those teachers perceived assessment as valid for accountability and for informing teaching and learning, whilst simultaneously viewing assessment as irrelevant. The need for the intervention was based on the fact that those teachers held multiple, and sometimes competing, beliefs about assessment. These competing beliefs may influence their varied assessment practices and their willingness to engage in ongoing professional learning. Similar findings about multiple and conflicting conceptions of assessment have been reported in New Zealand (Brown, 2011). Brown (2008, in Deneen & Brown, 2016) had previously proposed that teachers' varied conceptions of assessment could be aggregated into four major purposes: informing educational improvement, evaluating students, evaluating schools and teachers, and irrelevance. Improvement focuses on assessment informing teachers and students about what students need to learn next. The evaluation purpose focuses on appraising pupils' performance against standards, assigning scores/grades and awarding qualifications. School evaluation includes using assessment results to determine the performance of teachers and schools. Irrelevance means rejecting assessment as having any meaningful connection to learning, or believing it to be bad for students. Understanding teachers' experiences and conceptions of assessment is central to interventions to improve effective assessment. These experiences and conceptions are related to their assessment literacy, as described further below.

Assessment Literacy of Teachers

The importance of teachers' capabilities in undertaking assessments in classrooms, and of assessment's role in pupils' learning, is increasingly highlighted in the literature (Baird et al., 2014; Brown & Remesal, 2017; Black et al., 2006; Black & Wiliam, 1998; Merrell & Tymms, 2015; Tymms, 2015). In several parts of the world, there have been reports of weak or under-developed knowledge and skills in the assessment practices of some teachers (Black & Wiliam, 1998, 2018; Januário, 2007; Zimmerman et al., 2011). These studies include one of Greek and Cypriot primary school language teachers (Tsagari, 2016), who took a summative approach towards evaluating their pupils' performance and lacked clarity about the purposes and implementation of formative assessment. This was largely attributed to a lack of professional training in assessment.

According to Stiggins (1991), the assessment literacy of teachers requires:

• the possession of knowledge about the basic principles of sound assessment practice, including its terminology;
• the development and use of assessment methodologies and techniques; and
• familiarity with standards of quality in assessment.

Teachers' possession and acquisition of such capabilities, which have expanded to meet the growing demands of education, have been the subject of considerable research over the past 40 years. Originally, teacher education did not include much formal training in assessment (Stiggins, 1991).


However, in the 2000s this changed, as global assessments drove the need for change in classroom practice, and assessment as learning was increasingly prioritised for teacher education and teacher practice (Stiggins, 2006), at least in the Western world. Seemingly, a third, current stage of assessment literacy is underway, which seeks to "synthesize the best priorities of earlier stages and prepare teachers to negotiate increasingly complex tensions and demands" (Deneen & Brown, 2016:3). This stage is seen as one of balancing tensions (Howie, 2014; Brown & Harris, 2016) as teachers navigate between different requirements and demands on their capabilities. There is growing recognition that technological developments in digitally adaptive assessments can potentially make teachers' work more challenging and highly demanding (Selwyn et al., 2017). There are also increasing demands within curricula, leading to concerns that teachers may find it challenging to reach the required levels of assessment literacy or quality assessment practice (Looney et al., 2018). Others (such as Rea-Dickins, 2007; Deneen & Brown, 2016) believe that assessment literacy may not be sufficient to cover the range and complexity of the dimensions of assessment currently demanded (Looney et al., 2018). Additionally, the international context of using assessment as a lever for system and school reform (Howie, 2016) has introduced added 'complexity' for teachers in delivering changes in teaching and learning practices (Heitink et al., 2016), in addition to "unintended and undesirable consequences across settings in behaviour and outcomes" (Howie, 2014:91) in some environments.

Some authors believe that these capabilities depend on teachers' identity as professionals, their beliefs about assessment, their disposition towards enacting assessment, and their perceptions of their role as assessors, all of which are significant for their assessment work (Thompson, 1992; Brown, 2011; Looney et al., 2018; Pastore & Andrade, 2019) (Fig. 6.3).

Internationally, the demands and requirements of teachers' assessment literacy are evolving and have expanded significantly over the past 30 years. From its origins in teacher knowledge and capabilities for practice (Stiggins, 1991), the concept of assessment literacy has evolved towards integrating knowledge and capabilities with the need to increase teachers' confidence in their own classroom assessment practice (Brown, 2011), with their identity as assessors, and with their self-efficacy in an increasingly demanding education environment and broader society (Pastore & Andrade, 2019). There is some evidence that interventions to increase assessment literacy, even those designed to modern, defensible specifications, may provide gains in skills and knowledge but may not be sufficient: if conceptions of assessment are not also enhanced, teacher assessment practices will not be consistently or reliably improved (Deneen & Brown, 2016). Courses in assessment must address pre-existing conceptions and beliefs and their causes (Brookhart, 2011; Shepard, 2006).


Fig. 6.3  Three-dimensional model of assessment literacy. (Source: Pastore & Andrade, 2019:135) [The model situates a conceptual dimension, a praxeological dimension and a socio-emotional dimension within national education policy and the classroom context, linking professional wisdom to professional practice.]

Teachers' Experiences of Implementing Assessment Systems and Accountability

As mentioned earlier, there appear to be increasing challenges facing teachers in implementing assessment systems linked to accountability (Scherman et al., 2014). Studies in the USA reveal that teachers experienced changes in their roles and in their teaching as a result of implementing a criterion-referenced, commercial online child assessment system; these changes apparently led to increased responsibilities and expanded work for teachers (Kim, 2018). Increasingly, international movements and discourse around quality and performance are affecting teaching and learning in early childhood settings, with examples provided from the USA (Brown & Weber, 2016; Bullough et al., 2014), Australia (Bown & Sumsion, 2007; Kilderry, 2015; Nuttall et al., 2014) and the UK (Bradbury, 2012, 2014a, b; Osgood, 2006; Roberts-Holmes, 2019). There is growing concern in some quarters about the way assessment requirements are being implemented and, as noted in previous chapters, about the impact of data-driven accountability at the early childhood education level (Goldstein & Flake, 2016). Under growing external pressures, teachers internationally are expected to accomplish an increasing number of tasks both inside and outside the classroom,


in less time and with fewer resources, to comply with multiplying policy changes and reforms. As a response to work overload, teachers often rely on simplified technological solutions such as pre-packaged curriculum and assessment materials, leading to a process of 'deskilling' that involves a growing separation of planning from "execution" in teachers' work (Apple, 1986: 40). In other words, teachers are expected to execute decisions made by others concerning curricular goals, processes, outcomes and assessment criteria. Research in the USA revealed that demands to produce ongoing child assessment data intensified early childhood teachers' work there, indicating the global trends of 'data-based accountability' that Roberts-Holmes and Bradbury (2016) described in the English context, or the 'paper warfare' that Australian teachers experience in the "culture of new managerialism" (O'Brien & Down, 2002: 123). With increasing emphasis on data-based performativity in the USA and internationally, it seems likely that the number of early childhood programmes relying on online assessment solutions will continue to grow.

Teacher Assessment in Research Studies

Teachers have important insight into learners' development, behaviours and backgrounds. Following the discussion of teacher assessment and assessment literacy earlier in this chapter, this section describes and discusses the role of teachers and schools in research projects, including international comparative studies. Firstly, the validity and reliability of teachers' assessments and their ratings of young children are addressed. Thereafter, Sect. 6.3.1 probes whether it is possible to compare teachers' ratings (e.g., for Behaviour and Personal and Social Development as used in iPIPS) across schools, countries and regions. Finally, the chapter concludes with the experiences of the iPIPS project.

6.3 Validity and Reliability of Teachers' Assessments and Ratings of Young Children Studies

We turn now to the role of teachers and schools in research projects, including international comparative studies. At the heart of such studies are the validity and reliability of teacher assessments and ratings, particularly of young children. Teachers and schools may take on one of three roles in research projects. The first is to provide ratings of children using their own professional knowledge, the second is to administer objective assessments and the third is simply to provide the facilities for independent researchers to carry out assessments. Any project might involve teachers in one or more of these roles.


In general terms, research projects (should) aim for the most reliable and valid measures that they can obtain, given their budget. The most psychometrically sound data will involve external researchers visiting schools and administering tests. But this only makes sense for some learning outcomes, in learning areas such as mathematics and reading. For other outcomes, such as behaviour, short administered tests are not suitable, and ratings by individuals who know the children well are more appropriate. It may also be that teachers administer the objective assessments. Because they are human, teachers, when asked to provide ratings, are bound to be influenced by their own particular mindsets and by the details and limitations of their knowledge of individual children. As a result, all ratings are limited in their accuracy and are biased to a greater or lesser extent. This does not make the data unusable, but it does mean that researchers need to know how accurate the ratings are and how much bias there is. What follows is a discussion of these issues, followed by the solutions adopted by the iPIPS project.

Assessing the cognitive development of young children in particular is challenging and requires different approaches and strategies from those used with older children. Younger children are not able to follow the typical administration instructions in group assessments, given their reading level (Merrell & Tymms, 2007). They cannot manage the processes required in pencil and paper tests; they also have a limited concentration span, restricting the length of assessments (Sperlich et al., 2015). Their limited short-term memory restricts, or even eliminates, reliable information from complex questions (Demetriou et al., 2015). This issue is picked up in Part IV.

Several authors, whilst acknowledging the role of teacher ratings, have raised a number of concerns. One concern is that observations may fail to reveal a child's full cognitive potential. This could happen when the activities in a learning setting have not challenged children sufficiently to display their understanding of the concepts, or the extent or level of the knowledge being assessed (Merrell & Tymms, 2015). Another concern, as mentioned above, lies in possible bias against individuals as well as groups, which has been well reported (Harlen, 2004, 2005; Sonuga-Barke et al., 1993; Wilmut, 2005). These concerns may have contributed to the fact that few international studies prior to iPIPS attempted to assess young children on a large scale using rating scales. The Preprimary Project (Montie et al., 2006) (see Chap. 5) was one of the first, and attempted to include all major variables based upon significant research of the time (Katz, 1968; Ruopp et al., 1979; Sylva et al., 1980). This led to the decision to include various aspects of setting structure and process, such as group size; staff:child ratio; teachers' education, experience and organisation of activities; children's interactions with adults and other children; and children's activities, some of which are also included in iPIPS. The Observational Study of Early Childhood Programs (Layzer et al., 1993) revealed the potential of direct observation of 'process characteristics' in pre-primary classrooms to inform teaching practice and classroom organisation in a way that, the authors felt, measures of global quality could not.
Using both rating scales provided detailed descriptions of classroom activities and groupings, teacher behaviours and


interactions, and child activities and interactions. The study concluded that the direct observations were better predictors than measures of global quality of the child behaviours (e.g., task engagement, use of higher-level social strategies) that they defined as child outcomes. Process characteristics (such as teaching and learning processes) are considered more difficult than structural characteristics (such as physical resources) to measure with instruments such as questionnaires, and group testing is unsuitable for this age group (as noted above); this led to the use of direct observations of teacher and child behaviours in the classroom, together with global rating scales (Montie et al., 2006). It should be noted, though, that this study pre-dated the concerns about ratings noted earlier.

The issue of the validity and reliability of teacher ratings has been highlighted more recently in studies such as IELS (see Chap. 5), where differences occurred between the direct assessments and the teacher ratings. One example lies in the IELS findings on gender, where differences were found in all three participating countries. Girls were found to have stronger emergent literacy and social-emotional skills than boys. Girls scored more highly in empathy, and parents and teachers rated girls as having higher trust and prosocial skills and being less disruptive than boys. The direct assessment, however, found no discernible differences between girls and boys in emergent numeracy, in contrast to the teachers' ratings and reports that girls had higher levels of emergent numeracy than boys (OECD, 2020), suggesting that the concerns about bias raised earlier remain live.

The use of observation and rating scales would seem to require accuracy and some training. As mentioned earlier, not all teachers receive assessment training that prepares them adequately for the use of particular rating scales, although most teachers would be expected to be astute and conscious observers of pupils' learning and behaviours in their classrooms. In training, a reference point is required against which the teacher or trainee can be compared, and guidelines such as a correlation of .70 may be applied to check the rater's accuracy during the training period; periodic re-evaluation is also advised (Fernald et al., 2017). Furthermore, training should cover the manual for the instrument, various scenarios, and building rapport with the child prior to the administration (the latter most relevant to teachers new to the assessed child, or to external fieldworkers).

An effective strategy for dealing with the reliability of ratings is to include multiple raters where possible, although not all studies and classroom settings may be able to implement this. This allows for the evaluation of 'inter-rater reliability', which measures the degree to which scores assigned by different teachers, fieldworkers or assessors agree or differ. Such a strategy may reveal the need for standardisation training of all involved, with trainers requiring in-depth knowledge of the rating instrument and fluency in the language of implementation. Standards are also set specifying the degree of agreement raters are required to reach: a decision is made about the level of correlation coefficient that will be regarded as the requirement for inter-rater accuracy. Inter-rater reliability thus assesses how closely the scores of different raters align.
The intention is


ultimately to reduce the measurement error or bias introduced by a particular assessor (Fernald et al., 2017). This is mostly achieved by extensive training on the instrument to be administered, as well as by standardising its administration across the group of assessors, to achieve accuracy and reliability. This has been done extensively in international studies such as PIRLS 2016 (Howie et al., 2017), where the rating of open-ended items required multiple raters in order to ascertain the reliability of the scores assigned. The scores assigned by multiple raters were also captured digitally and quality assured by a third rater to identify and resolve differences between raters. The final data were sent to the international data centre, where the data were verified and the inter-rater reliability coefficients finally calculated. The iPIPS study in South Africa relied on fieldworkers and quality assurers who received intensive, standardised training following some of the principles above, as described at the end of this chapter.

A key feature of the iPIPS assessment is teacher ratings of Personal and Social Development and of Behaviour. These are described in Chap. 6, and here we simply add some details about their reliability and validity, which were established mainly in the UK. The inter-rater reliability of the ratings at the pupil level was first established by asking teachers and classroom assistants in the sample class to provide independent ratings of the same pupils without discussion. The internal reliabilities were established by comparing the responses to items measuring the same construct, and these are reported in Merrell and Tymms (2005). Because the Behaviour measure is based on the American Psychiatric Association's DSM-IV, there are numerous reports which deal with the measures. The predictive validity of the measure can be found in Merrell et al. (2017).
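To make the inter-rater check concrete, the short sketch below computes a Pearson correlation between two raters' scores for the same pupils and compares it against the .70 guideline cited earlier from Fernald et al. (2017). This is an illustration only, not the procedure used by iPIPS or PIRLS; the rating values and the pass/fail threshold are assumptions made for the example.

from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical behaviour ratings of the same ten pupils by two independent raters.
rater_a = [3, 5, 2, 4, 4, 1, 5, 3, 2, 4]
rater_b = [3, 4, 2, 4, 5, 1, 5, 2, 2, 4]

r = pearson(rater_a, rater_b)
# .70 is used here as the pass mark for rater accuracy during training.
print(f"inter-rater r = {r:.2f}; {'acceptable' if r >= 0.70 else 'retrain rater'}")

In practice a study would run such a check over many rater pairs and, as described above, trigger re-training or re-evaluation where the agreement falls below the agreed standard.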

Factors Influencing the Validity and Reliability of Teachers' Assessments and Ratings of Young Children

As mentioned earlier, a number of challenges arise in cognitive assessment, and adaptations are needed (Howie & Chamberlain, 2017), particularly for young children (Merrell & Tymms, 2015). When assessing non-cognitive attributes, additional challenges emerge for valid and reliable evaluation (Merrell & Tymms, 2015), as self-report questionnaires are regarded as appropriate only for children aged 10 years and over (Soto et al., 2011). The experience and knowledge of teachers and trained professionals, developed through years of observing and interacting with young children, are crucial for valid and reliable assessment. Nonetheless, there may be other difficulties, such as unfamiliarity with individual children where classes are large and an in-depth assessment cannot be provided, or where the trained professional is external to the school.


Involving parents could complicate the process, introducing a risk of bias, and parents may not have the necessary skills to undertake the assessment (Merrell & Tymms, 2015). Whilst there may be advantages to teachers rating their young pupils (Harlen, 2005), and many may be well qualified and knowledgeable enough to do so, there is nonetheless "evidence of low reliability and bias in teachers' judgements" (Harlen, 2005, in Tymms, 2015: 3). There are therefore concerns that a number of factors may affect the judgement of teachers, as well as of others administering the ratings (Tymms, 2015). For instance, there are studies indicating that children of different ethnic groups may be under- or over-rated in various educational settings, as has been reported in the UK (Burgess & Greaves, 2009). A German study found a significant relationship whereby teachers rated more highly the work of pupils who shared similar personality traits with them (Tymms, 2015). In high-stakes assessments (Howie, 2012, 2014) such as examinations, steps are often taken to anonymise pupils' identities from the markers by using unique identity numbers and avoiding indicators of ethnicity, gender, religion and social background. However, in settings where anonymity is not possible, ratings and assessments of all ages appear to be prone to bias, particularly where observational data form the foundation for the ratings. Given the challenges emphasised previously regarding the specific nature of young children and the one-on-one interaction needed, anonymity is unrealistic. Observational data can, however, be used to check on obvious validity issues such as the relationship to age. Proposals have also been made that, where observation data need to be collected, the characteristics of the observers (such as sex, ethnicity and personality) and those of the pupils to be rated should be matched (Tymms, 2015). It is clear, however, that the purpose of the evaluation and its associated methods should be carefully considered for each environment to determine what would be most appropriate and effective.

6.4 iPIPS Methodology for Ensuring Validity and Reliability of the Assessments

The iPIPS methodology is based on the fundamental premise that assessing young children is quite different from administering an examination to teenagers, for reasons already outlined (see Chap. 1). Building on the experience of PIPS (see Chap. 1), the designers of iPIPS recognised that children starting school usually cannot read or write and have a small short-term memory, perhaps limited to recalling two or three items. Furthermore, children of this age have a limited ability to focus on a task, so designing an assessment requiring engagement for more than 20 minutes is likely to be problematic. Additionally, the contents of such an assessment required an alternative to test-curriculum matching, as there is no universal global curriculum at this age. Predictive validity was a strong consideration when selecting the content and structure of the original items and scales. Selecting predictive validity as a significant factor in the decision making is further


supported by the importance of the first year of schooling and the argument (see Chap. 7 in Part III) that the cognitive priorities of each of the two successive developmental cycles linking preschool with primary school are the best predictors of school achievement in later cycles of schooling. iPIPS is seen as an important tool for assessing possible problems at transitions in learning accurately, in order to identify problems and interventions (Demetriou et al., 2017; Kardanova et al., 2014).

An adaptive approach comprising a set of scales with multiple items (mini tests), using sequences and stopping rules, was designed and extensively trialled with a focus on content and construct validity (a simple illustration of such stopping-rule logic is sketched below). The bank of items ranges from the lowest levels of development to the high levels expected at older ages. This permits the testing of wider populations of pupils across contexts, from younger pupils with less well-developed skills to older pupils with already well-developed literacy and numeracy skills. The adaptive assessment is time efficient and reduces the stress on the pupils being assessed, as easier or more difficult items appear automatically, following the prescribed sequence and stopping rules. This approach increased the reliability of the administration, removing subjective judgement on the part of the administrator. To capture these data across contexts, the project accounted for different country contexts, which would permit either computer adaptive testing on laptops or capturing the data on tablets, as described in Parts I and III.

The development of the original PIPS tests was trialled extensively (see Chap. 1) with considerable consultation (such as with the NAHT group and the Solihull group) and the involvement of the teachers and researchers in the UK project. The project took cognisance of the initial feedback about the apprehension towards assessing young children prevalent in the UK at that time (Chap. 1). This feedback accompanied the empirical trialling in schools; thereafter, the instruments underwent piloting and adaptation across participating country contexts (see Chaps. 1 and 7; Merrell & Tymms, 2015; Scherman et al., 2017). The latter included a multi-year project in South Africa where the original PIPS items were trialled in 22 primary schools (Archer et al., 2010; Scherman et al., 2013) and the instruments were extensively contextualised, including the redrawing of the pictures and adaptation to the South African context. For the finalisation of the iPIPS instruments for the South African context, similar contextualisation occurred and adaptations were made to items and the accompanying pictures. For example, Fig. 1.4 was changed to a South African context with no castle, no marsupials and two people whose images were not associated with any ethnicity or sex (Howie, 2017; Howie et al., 2016; Tymms et al., 2017). Furthermore, the translation of the UK English instruments into Afrikaans and isiXhosa required extensive work in verifying and validating the integrity of the translations. Pre-piloting and piloting were undertaken before the instruments were finalised for data collection. Specific challenges were reported with regard to the isiXhosa translation process. Similar contextualisation processes and adaptations were undertaken in other countries, and are reported in Parts IV and V, to ensure that the data were valid and reliable for those contexts.
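To illustrate the kind of sequence-and-stopping-rule logic described above, the sketch below walks a pupil through an ordered item bank and stops after a run of consecutive failures. This is a minimal illustration only, not the actual iPIPS algorithm; the item bank, the threshold of three consecutive errors and the function names are all assumptions made for the example.

def administer(items, answer_fn, max_consecutive_errors=3):
    """Present ordered items (easiest first); stop after a run of failures.

    items      -- list of (prompt, correct_answer) pairs, ordered by difficulty
    answer_fn  -- callable returning the pupil's answer to a prompt
    Returns the number of items answered correctly before stopping.
    """
    score = 0
    consecutive_errors = 0
    for prompt, correct in items:
        if answer_fn(prompt) == correct:
            score += 1
            consecutive_errors = 0
        else:
            consecutive_errors += 1
            # Stopping rule: do not press on once the pupil is clearly beyond
            # their level; this keeps the session short and low-stress.
            if consecutive_errors >= max_consecutive_errors:
                break
    return score

# Hypothetical mini test of early number items, easiest first.
number_items = [("2 + 1", 3), ("4 + 3", 7), ("10 - 6", 4), ("13 + 9", 22)]
print(administer(number_items, answer_fn=lambda prompt: 3))  # toy pupil answers 3 every time

In a real adaptive administration, items would also branch to easier or harder sequences rather than simply stopping, but the principle is the same: pre-specified rules, rather than the administrator's judgement, determine what each child sees.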
The project view was that whilst there are some areas where data could be compared across participating countries, the methodology did


not permit precise comparisons between countries in general terms (see, for example, Part III, Chap. 9 on the different mechanisms within countries for collecting data). This is evident from the tentative use of the data comparatively and the greater focus on countries' national data as rich case studies, with a cross-case analysis approach for the learnings and challenges illustrated by the data. Each country was able, based upon a common core of items, to design a valid and reliable way to collect and analyse data within their country that was time-efficient and enjoyable for pupils (see Merrell in Part III). As was noted by Bailey in Part III, "data from England and South Africa have shown us that progress is made in reading and mathematics in both countries but fallen short of comparing those levels of progress".

Coming back to the role of teachers in the iPIPS research, the schools and teachers are always involved. Their support is needed for access in the first instance, and their knowledge is needed to assess some constructs, such as behaviour, which cannot be tested with a short one-off assessment. Where possible, iPIPS has used researchers (fieldworkers) to collect the data for reading, maths and vocabulary, although teachers were used when necessary. Where teachers were used, a researcher has always reassessed a sample of children to check that there were no significant problems. These exercises have been very reassuring, with correlations up to 0.98 between the teacher and researcher results. For the ratings of non-cognitive outcomes, iPIPS collected data from two teachers, or teachers and their assistants, asking them not to discuss their ratings. The results are generally reassuring; only rarely do teachers stand out as having very different perceptions from their colleagues.

Whilst participating countries utilised different means to collect their data and ensure its quality, in the iPIPS study in South Africa fieldworkers were trained intensively in a standardised way, according to a fieldwork manual designed for this purpose, to conduct the one-on-one assessments and collect the additional data required by the study. The fieldworkers were all former teachers of this age group who were experienced in collecting education and assessment data. Furthermore, researchers from the CEA monitored the conduct of the fieldworkers to quality-assure the data being collected. This included, for instance, sitting in the same venue where the one-on-one assessments were taking place, as unobtrusively as possible, observing and recording the administration of the test by the fieldworker. The behaviour of both the assessor and the pupil was observed. This was done to monitor the comfort levels of the pupils (watching for any excessive anxiety), to make sure that pupils had sufficient opportunity to answer fully, and to note the professional conduct of the fieldworker and that the procedures followed were correct and consistent with the training. This additional quality assurance provided an opportunity to interrogate the validity as well as the reliability of the data being collected from the young children in the study. Whilst it did add costs, it improved the quality of data in some settings where an intervention was needed to improve the performance of a fieldworker and, in a couple of cases, to remove fieldworkers who were not able to administer the instruments to the required standard.
The collected data were uploaded from tablets regularly and scrutinised for any irregularities before being sent for collation and data cleaning (see Tymms et al., 2017 for further information on the study in South Africa). Other chapters in this book


provide deeper insights into the individual national efforts and strategies to obtain valid and reliable data, some of the results of which are reported in this book (see Parts III, IV and V). Finally, there is a firm belief that iPIPS was able "to collect reliable and valid data in a short time with young children at varying developmental stages, which is useful to schools and at the same time can provide analyses for policymakers"; the project has shown that this can be done across cultures using suitable adaptations (Merrell & Tymms, 2015: 131).

References

Airasian, P. W. (1991). Classroom assessment. McGraw-Hill.
Apple, M. W. (1986). Teachers and texts: A political economy of class and gender relations in education. Routledge.
Archer, E., Scherman, V., Coe, R., & Howie, S. J. (2010). Finding the best fit: Adaptation and translation of the Performance Indicators for Primary Schools (PIPS) for the South African context. Perspectives in Education, 28(1), 77–88.
Baird, J., Hopfenbeck, T. N., Newton, P., Stobart, G., & Steen-Utheim, A. T. (2014). State of the field review: Assessment and learning. Report for the Norwegian Knowledge Centre for Education.
Barnes, N., Fives, H., & Dacey, C. M. (2015). Teachers' beliefs about assessment. In H. Fives & M. Gregoire Gill (Eds.), International handbook of research on teacher beliefs (pp. 284–300). Routledge.
Barnes, N., Fives, H., & Dacey, C. M. (2017). U.S. teachers' conceptions of the purposes of assessment. Teaching and Teacher Education, 65, 107–116. https://doi.org/10.1016/j.tate.2017.02.017
Bennett, R. E. (2010). Cognitively based assessment of, for, and as learning (CBAL): A preliminary theory of action for summative and formative assessment. Measurement, 8(2–3), 70–91. https://doi.org/10.1080/15366367.2010.508686
Black, P., & Wiliam, D. (1998). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), 139–148.
Black, P., & Wiliam, D. (2018). Classroom assessment and pedagogy. Assessment in Education: Principles, Policy and Practice, 25(6), 551–575. https://doi.org/10.1080/0969594X.2018.1441807
Black, P., McCormick, R., James, M., & Pedder, D. (2006). Learning how to learn and assessment for learning: A theoretical inquiry. Research Papers in Education, 21, 119–132.
Bonner, S. M. (2016). Teachers' perceptions about assessment: Competing narratives. In G. T. L. Brown & L. R. Harris (Eds.), Handbook of human and social conditions in assessment (1st ed., pp. 21–39). Routledge.
Bown, K., & Sumsion, J. (2007). Voices from the other side of the fence: Early childhood teachers' experiences with mandatory regulatory requirements. Contemporary Issues in Early Childhood, 8(1), 30–49.
Bradbury, A. (2012). Education policy and the "ideal learner": Producing recognisable learner-subjects through early years assessment. British Journal of Sociology of Education, 34(1), 1–19. https://doi.org/10.1080/01425692.2012.692049
Bradbury, A. (2014a). Learning, assessment and equality in early childhood education (ECE) settings in England. European Early Childhood Education Research Journal, 22(3), 347–354. https://doi.org/10.1080/1350293X.2014.912897


Bradbury, A. (2014b). Early childhood assessment: Observation, teacher "knowledge" and the production of attainment data in early years settings. Comparative Education, 50(3), 322–339. https://doi.org/10.1080/03050068.2014.921371
Brookhart, S. M. (2011). Educational assessment knowledge and skills for teachers. Educational Measurement: Issues and Practice, 30(1), 3–12.
Brown, G. T. L. (2008). Conceptions of assessment: Understanding what assessment means to teachers and students. Nova Science Publishers.
Brown, G. T. L. (2011). Teachers' conceptions of assessment: Comparing primary and secondary teachers in New Zealand. Assessment Matters, 3, 45–70.
Brown, G. T. L., & Harris, L. R. (2009). Unintended consequences of using tests to improve learning: How improvement-oriented resources engender heightened conceptions of assessment as school accountability. Journal of Multi-Disciplinary Evaluation, 6(12), 68–91.
Brown, G. T. L., & Harris, L. R. (Eds.). (2016). Handbook of human and social conditions in assessment (1st ed.). Routledge.
Brown, G. T. L., & Remesal, A. (2017). Teachers' conceptions of assessment: Comparing two inventories with Ecuadorian teachers. Studies in Educational Evaluation, 55, 68–74. https://doi.org/10.1016/j.stueduc.2017.07.003
Brown, C. P., & Weber, N. B. (2016). Struggling to overcome the State's prescription for practice: A study of a sample of early educators' professional development and action research projects in a high-stakes teaching context. Journal of Teacher Education, 67(3), 183–202.
Brown, G. T. L., Gebril, A., & Michaelides, M. P. (2019). Teachers' conceptions of assessment: A global phenomenon or a global localism. Frontiers in Education, 4, 1–13. https://doi.org/10.3389/feduc.2019.00016
Bullough, R. V., Hall-Kenyon, K. M., MacKay, K. L., & Marshall, E. (2014). Head Start and the intensification of teaching in early childhood education. Teaching and Teacher Education, 37, 55–63.
Burgess, S., & Greaves, E. (2009). Test scores, subjective assessment and stereotyping of ethnic minorities. Working paper no. 09/221.
Crooks, T. J. (2010). Classroom assessment in policy context (New Zealand). In B. McGraw, P. Peterson, & E. L. Baker (Eds.), The international encyclopedia of education (3rd ed., pp. 443–448). Elsevier.
Demetriou, A., Spanoudis, G., & Shayer, M. (2015). Mapping mind-brain development. In M. Farisco & K. Evers (Eds.), Neurotechnology and direct brain communication. Routledge.
Demetriou, A., Merrell, C., & Tymms, P. (2017). Mapping and predicting literacy and reasoning skills from early to later primary school. Learning and Individual Differences, 54, 217–225. https://doi.org/10.1016/j.lindif.2017.01.023
Deneen, C. C., & Brown, G. T. L. (2016). The impact of conceptions of assessment on assessment literacy in a teacher education program. Cogent Education, 3(1), 1225380. https://doi.org/10.1080/2331186X.2016.1225380
Fernald, L. C. H., Prado, E., Kariger, P., & Raikes, A. (2017). A toolkit for measuring early childhood development in low- and middle-income countries. Prepared for the Strategic Impact Evaluation Fund, the World Bank. http://documents.worldbank.org/curated/en/384681513101293811/pdf/WB-SIEF-ECD-MEASUREMENT-TOOLKIT.pdf
Fulmer, G. W., Lee, I. C. H., & Tan, K. H. K. (2015). Multi-level model of contextual factors and teachers' assessment practices: An integrative review of research. Assessment in Education: Principles, Policy & Practice, 22, 475–494. https://doi.org/10.1080/0969594X.2015.1017445
Goldstein, J., & Flake, J. K. (2016). Towards a framework for the validation of early childhood assessment systems. Educational Assessment, Evaluation and Accountability, 28(3), 273–293. https://doi.org/10.1007/s11092-015-9231-8
Harlen, W. (2004). A systematic review of the evidence of reliability and validity of assessment by teachers used for summative purposes (Research Evidence in Education Library). EPPI-Centre, Social Science Research Unit, Institute of Education.


Harlen, W. (2005). Trusting teachers' judgement: Research evidence of reliability and validity of teachers' assessment used for summative purposes. Research Papers in Education, 20, 245–270.
Harlen, W. (2006). On the relationship between assessment for formative and summative purposes. In J. Gardner (Ed.), Assessment and learning (pp. 103–118). Sage.
Heitink, M. C., Van der Kleij, F. M., Veldkamp, B. P., Schildkamp, K., & Kippers, W. B. (2016). A systematic review of prerequisites for implementing assessment for learning in classroom practice. Educational Research Review, 17, 50–62.
Howie, S. J. (2003). Language and other background factors affecting secondary pupils' performance in mathematics in South Africa. African Journal of Research in Mathematics, Science and Technology Education, 7(1), 1–20. https://doi.org/10.1080/10288457.2003.10740545
Howie, S. J. (2007). Assessment practices. Keynote address at the Western Cape Department of Education Conference on Assessment, March 2007.
Howie, S. (2012). High-stakes testing in South Africa: Friend or foe? Assessment in Education: Principles, Policy and Practice, 19(1), 81–98. https://doi.org/10.1080/0969594X.2011.613369
Howie, S. J. (2014). High-stakes testing in South Africa: Friend or foe? In T. Eggen & G. Stobart (Eds.), High-stakes testing in education: Value, fairness and consequences. Routledge.
Howie, S. J. (2016). Assessment for political accountability or towards educational quality? Keynote address at the International Association for Educational Assessment (IAEA) Conference, 21–26 August 2016, Cape Town, South Africa.
Howie, S. J. (2017). International Performance Indicators in Primary Schools (iPIPS): Setting the scene. Presentation to the International Seminar on Young Children's Growth, Development and Progress, The First Year at School in the Western Cape, 24 October 2017, London.
Howie, S., & Chamberlain, M. (2017). Reading performance in post-colonial contexts and the effect of instruction in a second language. Policy brief no. 14. International Association for the Evaluation of Educational Achievement, 14, 1–12. http://search.ebscohost.com/login.aspx?direct=true&db=eric&AN=ED574333&site=ehost-live
Howie, S. J., Tymms, P., & Combrinck, C. (2016, April). Challenges facing translating international comparative assessment instruments: The case of isiXhosa in South Africa. Presented at the World Educational Research Association conference, Washington, DC.
Howie, S., Combrinck, C., Roux, K., Tshele, M., Mokoena, G., & Palane, N. (2017). PIRLS Literacy 2016: Progress in International Reading Literacy Study 2016: South African children's reading literacy achievement. https://doi.org/10.13140/RG.2.2.33072.61446
James, M. (2006). Assessment, teaching and theories of learning. In J. Gardner (Ed.), Assessment and learning (1st ed., pp. 47–60). Sage.
Januário, F. M. (2007). Investigating and improving assessment practices in physics in secondary schools in Mozambique. Unpublished PhD thesis, University of Pretoria.
Kardanova, E., Ivanova, A., Merrell, C., Hawker, D., & Tymms, P. (2014). The role of the iPIPS assessment in providing high-quality value-added information on school and system effectiveness within and between countries (Basic Research Program Working Papers). Moscow Higher School of Economics.
Katz, L. G. (1968). A study of changes in behavior of children enrolled in two types of Head Start classes. Unpublished doctoral dissertation, Stanford University.
Kilderry, A. (2015). The intensification of performativity in early childhood education. Journal of Curriculum Studies, 47(5), 633–652.
Kim, K. (2018). Early childhood teachers' work and technology in an era of assessment. European Early Childhood Education Research Journal, 26(6), 927–939. https://doi.org/10.1080/1350293X.2018.1533709
Layzer, J., Goodson, B., & Moss, M. (1993). Observational study of early childhood programs, final report volume I: Life in preschool. U.S. Department of Education.
Looney, A., Cumming, J., van der Kleij, F., & Harris, K. (2018). Reconceptualising the role of teachers as assessors: Teacher assessment identity. Assessment in Education: Principles, Policy and Practice, 25(5), 442–467. https://doi.org/10.1080/0969594X.2016.1268090


Merrell, C., & Tymms, P. (2005). Rasch analysis of inattentive, hyperactive and impulsive behaviour in young children and the link with academic achievement. Journal of Applied Measurement, 6(1), 1–18.
Merrell, C., & Tymms, P. (2007). What children know and can do when they start school and how this varies between countries. Journal of Early Childhood Research, 5(2), 115–134.
Merrell, C., & Tymms, P. (2015). Assessing young children: Problems and solutions (pp. 126–133). UNESCO.
Merrell, C., Sayal, K., Tymms, P., & Kasim, A. (2017). A longitudinal study of the association between inattention, hyperactivity and impulsivity and children's academic attainment at age 11. Learning and Individual Differences, 53, 156–161.
Montie, J. E., Xiang, Z., & Schweinhart, L. J. (2006). Preschool experience in 10 countries: Cognitive and language performance at age 7. Early Childhood Research Quarterly, 21(3), 313–331. https://doi.org/10.1016/j.ecresq.2006.07.007
Nuttall, J., Thomas, L., & Wood, E. (2014). Travelling policy reforms reconfiguring the work of early childhood educators in Australia. Globalisation, Societies and Education, 12(3), 358–372.
O'Brien, P., & Down, B. (2002). What are teachers saying about new managerialism? Journal of Education Enquiry, 3(1), 111–133. https://www.ojs.unisa.edu.au/index.php/EDEQ/article/view/553/423
OECD. (2020). Early learning and child well-being: A study of five-year-olds in England, Estonia and the United States. OECD Publishing.
Osgood, J. (2006). Deconstructing professionalism in early childhood education: Resisting the regulatory gaze. Contemporary Issues in Early Childhood, 7, 5–14.
Pastore, S., & Andrade, H. (2019). Teacher assessment literacy: A three-dimensional model. Teaching and Teacher Education, 84, 128–138. https://doi.org/10.1016/j.tate.2019.05.003
Popham, W. J. (1995). Classroom assessment: What teachers need to know. Allyn and Bacon.
Rea-Dickins, P. (2007). Classroom-based assessment: Possibilities and pitfalls. In J. Cummins & C. Davison (Eds.), The international handbook of English language teaching (Vol. 1, pp. 505–520). Springer.
Roberts-Holmes, G. (2019). Governing and commercialising early childhood education: Profiting from the International Early Learning and Well-being Study (IELS)? Policy Futures in Education, 17(1), 27–40. https://doi.org/10.1177/1478210318821761
Roberts-Holmes, G., & Bradbury, A. (2016). Governance, accountability and the datafication of early years education in England. British Educational Research Journal, 42(4), 600–613.
Ruopp, R., Travers, J., Glantz, F., & Coelen, C. (1979). Children at the center: Final report of the National Day Care Study (Vol. 1). Abt Associates.
Scherman, V., Howie, S. J., & Archer, E. (2013). The interface between monitoring performance and how data is used: Striving to enhance the quality of education in schools. In J. MacBeath, C. Sugrue, & M. Younger (Eds.), Millennium goals revisited: A common wealth of learning. Routledge.
Scherman, V., Zimmerman, L., Howie, S. J., & Bosker, R. (2014). Setting standards and primary school teachers' experiences of the process. Perspectives in Education, 32(1), 92–104.
Scherman, V., Bosker, R., & Howie, S. (Eds.). (2017). Monitoring the quality of education in schools: Examples of feedback into education systems from developed and emerging economies. Sense Publishers. https://doi.org/10.1007/978-94-6300-453-4
Selwyn, N., Nemorin, S., & Johnson, N. (2017). High-tech, hard work: An investigation of teachers' work in the digital age. Learning, Media and Technology, 42(4), 390–405.
Shepard, L. A. (2006). Classroom assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 623–646). American Council on Education/Praeger.
Sonuga-Barke, E. J. S., Minocha, K., Taylor, E. A., & Sandberg, S. (1993). Inter-ethnic bias in teachers' ratings of childhood hyperactivity. Journal of Developmental Psychology, 11, 187–200.


Soto, C. J., John, O. P., Gosling, S. D., & Potter, J. (2011). Age differences in personality traits from 10 to 65: Big Five domains and facets in a large cross-sectional sample. Journal of Personality and Social Psychology, 100(2), 330–348.
Sperlich, A., Schad, D. J., & Laubrock, J. (2015). When preview information starts to matter: Development of the perceptual span in German beginning readers. Journal of Cognitive Psychology, 27(5), 511–530.
Stiggins, R. J. (1991). Assessment literacy. Phi Delta Kappan, 72(7), 534–539.
Stiggins, R. (2006). Assessment for learning: A key to motivation and achievement. Edge, 2(2), 3–19. http://ati.pearson.com/downloads/edgev2n2_0.pdf
Sylva, K., Roy, C., & Painter, M. (1980). Oxford preschool research project: Child watching at playgroup & nursery school (Vol. 2). High/Scope Press.
Thompson, A. G. (1992). Teachers' beliefs and conceptions: A synthesis of the research. In D. A. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 127–146). National Council of Teachers of Mathematics.
Tsagari, D. (2016). Assessment orientations of state primary EFL teachers in two Mediterranean countries. Center for Educational Policy Studies Journal, 6(1), 9–30.
Tymms, P. (2015). Teachers show bias to pupils who share their personality. The Conversation, 25 February 2015. http://theconversation.com/teachers-show-bias-to-pupils-who-share-their-personality-38018
Tymms, P., Howie, S., Merrell, C., Combrinck, C., & Copping, L. (2017). The first year at school in the Western Cape: Growth, development and progress. Nuffield Foundation. http://www.nuffieldfoundation.org/sites/default/files/files/Tymms%2041637%20-%20SouthAfricaFinalReport%20Oct%202017.pdf
Van den Akker, J. (2003). Curriculum perspectives: An introduction. In J. van den Akker, W. Kuiper, & U. Hameyer (Eds.), Curriculum landscapes and trends (pp. 1–10). Kluwer.
Wilmut, J. (2005). Experiences of summative teacher assessment in the UK (Qualifications and Curriculum Authority). QCA.
Xu, Y., & Brown, G. T. L. (2016). Teacher assessment literacy in practice: A reconceptualization. Teaching and Teacher Education, 58, 149–162. https://doi.org/10.1016/j.tate.2016.05.010
Zimmerman, L., Howie, S. J., & Smit, B. (2011). Time to go back to the drawing board: Organisation of primary school reading development in South Africa. Educational Research and Evaluation, 17(4), 215–232. https://doi.org/10.1080/13803611.2011.620339

Part III

Growth of the International Performance in Primary School Indicators Project

Christine Merrell

Introduction

This section presents children's development in the first few years of life and their progress during the first year of school. Cognitive development is rapid in early childhood and lays the foundations for lifelong development (Shonkoff et al., 2000a) and outcomes, including academic achievement (Duncan et al., 2007), health behaviours (Pagani & Fitzpatrick, 2013) and college graduation (McClelland et al., 2013). Some aspects of development are taught, such as language, reading and mathematics, and others, such as executive functions, develop naturally. The latter include the higher-order cognitive processes associated with self-regulation, shifting attention and working memory (Melby-Lervåg & Hulme, 2013; Miyake et al., 2000). Social and emotional skills, including pro-social behaviour and emotional competencies, have also been shown to develop rapidly in the early years of life (Shonkoff et al., 2000b; Zelazo & Carlson, 2012). Between the ages of 3 and 5 years, children become aware of and begin to understand various aspects of language, including the function of words as representations of objects, actions and concepts, the phonological sounds in words, and relations between word sounds and writing or pictures (Otto, 2013). In mathematics, they develop basic arithmetic skills, including beginning to establish a mental number line, understanding differences between quantities and mapping spoken language onto written representations of numbers (Dehaene, 2011). At this age, children also learn general inferential and problem-solving skills (Carey, 2009). This early development is discussed in more depth in Chap. 7 of this section, in which a theory of cognitive development is presented.

C. Merrell (deceased)
Durham University, Durham, UK


At around the age of 5, children often enter formal education, but starting ages vary: children can be as young as 4 in England (noting that the compulsory starting age in England is the start of the school term following a child's fifth birthday) and as old as 7 in Russia and South Africa. Finding out what children know and can do when they start school, during their first school year and beyond, is important for teachers to be able to plan the curriculum, and for policy-makers to reflect on pre-school as well as primary school provision. With the interest in international comparative studies of student performance and countries' educational policies, it is pertinent to explore whether children's development in their early years follows similar pathways across countries, cultures and languages, and to assess the extent to which it deviates in the first year of school.

This section of the book includes three chapters, which examine children's growth and development alongside countries' contexts and school curricula. The first chapter, Chap. 7, 'Packing 200,000 years of evolution in one year of individual development: Developmental constraints on learning in the first primary school year', presents a theory of cognitive development and then discusses its educational implications. This in-depth theoretical perspective forms a useful basis from which to consider the similarities and differences that we see in children starting school in different countries and the progress which they make in their first school year. In the final part of the chapter, Demetriou expands upon the implications of developmental stages for education. It would be possible to link these implications and the information from iPIPS through to policy initiatives, following the recommendations of Parker (2013), to bridge the gap between evidence and policy.

The second chapter, Chap. 8, 'Children's developmental levels at the start of school', draws upon data collected in the iPIPS study on children's reading and mathematics development, and their personal, social and emotional development, to explore what they know and can do when they start school. It presents a comparison of children's early reading and mathematics development, and their personal, social and emotional development, at the start of school in some of the different countries where iPIPS projects have been conducted: Australia, Brazil, England, Lesotho, Russia, Scotland and South Africa. 'Pedagogical ladders' for early reading and mathematics depict developmental pathways for each of the countries with data, offering new insights from these different countries into children's development at the start of school that have implications for practice and policy. These countries vary greatly in their cultural contexts and early years policies and make for some interesting examples.

The third chapter in this section, Chap. 9, entitled 'Progress made during the first year in school', reports on the progress made by children during their first year of school in the same countries, and on changes in their personal, social and emotional development and their behaviour. It seeks to explore questions including: What difference does schooling make? How does progress vary across children, teachers and countries? Similarities and differences are discussed in relation to local contexts and curricula.


References

Carey, S. (2009). The origins of concepts. Oxford University Press.
Dehaene, S. (2011). The number sense: How the mind creates mathematics. Oxford University Press.
Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P., Pagani, L. S., Feinstein, L., Engel, M., Brooks-Gunn, J., & Sexton, H. (2007). School readiness and later achievement. Developmental Psychology, 43, 1428–1446.
McClelland, M. M., Acock, A. C., Piccinin, A., Rhea, S. A., & Stallings, M. C. (2013). Relations between preschool attention span-persistence and age 25 educational outcomes. Early Childhood Research Quarterly, 28(2), 314–324.
Melby-Lervåg, M., & Hulme, C. (2013). Is working memory training effective? A meta-analytic review. Developmental Psychology, 49(2), 270–291.
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology, 41(1), 49–100.
Otto, B. W. (2013). Language development in early childhood education. Pearson.
Pagani, L. S., & Fitzpatrick, C. (2013). Prospective associations between early long-term household tobacco smoke exposure and antisocial behaviour in later childhood. Journal of Epidemiology and Community Health, 67(7), 552–557.
Parker, I. (2013). Early developments: Bridging the gap between evidence and policy in early-years education. Institute for Public Policy Research.
Shonkoff, J. P., Phillips, D. A., & National Research Council. (2000a). Communicating and learning. In From neurons to neighborhoods: The science of early childhood development. National Academies Press.
Shonkoff, J. P., Phillips, D. A., & National Research Council. (2000b). Acquiring self-regulation. In From neurons to neighborhoods: The science of early childhood development. National Academies Press.
Zelazo, P. D., & Carlson, S. M. (2012). Hot and cool executive function in childhood and adolescence: Development and plasticity. Child Development Perspectives, 6(4), 354–360.

Chapter 7

Packing 200,000 Years of Evolution in One Year of Individual Development: Developmental Constraints on Learning in the First Primary School Year

Andreas Demetriou

This chapter summarises a theory of cognitive development and discusses its educational implications.

A. Demetriou
Cyprus Academy of Sciences, Letters, and Arts, Nicosia, Cyprus
University of Cyprus, Nicosia, Cyprus
University of Nicosia, Nicosia, Cyprus
e-mail: [email protected]

7.1 Introduction

It took humankind about 200,000 years to acquire written language, at about 5000 BC, and another 3000 years to start doing formal arithmetic. These extended periods of human history are compressed into about the first 3–4 years of a child's life. Children speak at about 2 years of age. At about 4 years, when they go to primary school in some countries, they are expected to learn to read and write, grasp the basics of arithmetic and grasp concepts distanced from their everyday experience. Obviously, the early primary school years are demanding for children, because learning these skills requires mental effort and persistence. If teachers are to match children's education to their level of development, it is important to understand the theoretical underpinnings of learning and development. This chapter outlines a cognitive developmental theory of learning that is based on six empirically based assumptions.
1. Learning depends on the state of four central mental processes: Attention control (A), Working memory (W), Awareness (A), and Reasoning (RE), the AWARE set.
2. These processes are always present, but their relative importance varies in different phases of development. Cognitive developmental priorities gradually shift


from executive processes (attention control and working memory) to inferential processes (inductive, analogical, and deductive reasoning) as children develop.
3. Awareness of cognitive processes (cognisance) is a liaison mechanism connecting executive and inferential processes. With development, it changes to reflect the processes dominating each phase. Early in development, it focuses on perceptions, actions or perceptually based representations; later it focuses on reasoning and other cognitive processes.
4. General processes do not operate in a void. To function efficiently, general mental processes need to have available the representational and relational units of the domain concerned (see Demetriou & Spanoudis, 2018; Demetriou et al., 2018a, b); for example, words in a language and amounts in arithmetic. If, for any reason, the units in a domain cannot be recognised, represented accurately and mentally processed, performance in this domain suffers, even if the general mechanisms are intact.
5. Therefore, learning is the product of general and domain-specific processes. Learning will be deficient if either component is deficient. By implication, boosting the one may compensate for deficiencies in the other, provided both operate above a certain minimum level of sufficiency relative to a learning goal.
6. Developmental priorities are closely related to school learning (see Table 7.1). Research shows that the developmental priorities of each phase are the best predictors of school learning at that phase or later ones (Demetriou et al., 2019a, b, 2020a, b). Thus, it is important for school learning to capitalise on the developmental priorities of each phase if it aspires to meet the learning goals associated with progressing to the next phase. Ignoring developmental priorities may deprive learners of the support they need to consolidate the cognitive processes involved to reach the functional level needed to move forward. In addition, it may cause difficulties and delays in grasping and consolidating the concepts and skills of interest.
Below I outline cognitive development from early preschool to primary school, which is the focus of this chapter. I then focus on learning in language and mathematics, highlighting typical and atypical forms of learning and development, such as dyslexia and dyscalculia. Finally, I outline a programme that may better bridge developmental and educational priorities in these domains.

7.2 Changing Developmental Priorities from Preschool to Primary School

A long tradition of research suggests that cognitive development occurs in four major developmental cycles starting at birth, 2, 6, and 11 years of age. Each of these boundaries is associated with major changes in developmental priorities: mastering interaction with objects and persons in the first 2 years of life, which culminates in acquiring a vocabulary and speaking at 2 years; mastering mental representations,


110

A. Demetriou

which culminates in the ability to stay focused and process representations in preschool and mastering inferential processes integrating representations in primary school. In psychometric terms, the priorities of each cycle define the nature of general intelligence within it (Demetriou et al., 2017). These cut-offs in years are not fixed boundaries but signs that a dominant mode of thinking starts to fade away and a new mode emerges. The exact age of transition reflects individual differences. Some children, because of genetic, epigenetic, and social reasons, move faster than others through the cycles. This poses a challenging problem for education because some children may be more ready than others to respond to learning tasks (Demetriou & Spanoudis, 2018). In the subsequent section/s, I outline the cognitive profile of the preschool and the primary school years in chronological order.

Episodic Representations

Episodic representations are perceptual or action-based abstractions that stand for dominant or recurrent patterns of interaction with the environment, such as visual patterns, sounds in one's language, and actions such as eating, walking and grasping. These abstractions are specific to different forms of perception, such as images or sounds; they are mostly unnamed and inflexible beyond the context of their emergence. Thus, they cannot be analysed into components, such as the sequence of events involved in a mental image and the sequence of sounds involved in a sentence describing that image, that may be mapped onto each other to be meaningfully connected (Piantadosi et al., 2018; Xu, 2019). Eventually, these crude representations are connected to language at about the age of 2 years. At this age, they are sufficiently stable and accurate, and the acoustic patterns of their respective words are sufficiently stable in working memory, for the two to be connected. When transformed into language, they are gradually disembodied from their episodic context and generalised. As a result, infants start to scan representations intentionally, search for specific elements in them, align them and use them to guide their action. This is the beginning of the next cycle.

Realistic Representational Thought

By the end of the second year, children reach a highly symbolic stage, focusing on the refinement and interlinking of episodic representations. Children are thus interested in symbols, learn them quickly, use them extensively in their interactions with objects and persons, and develop awareness of their existence and functions. This representational insight ('I can imagine the world', 'I can think of my parents, of my toys, etc.', and 'I can try my thoughts out on objects') makes representational control an important challenge. At this phase, representations are central, as information generated by the senses needs to be evaluated and used. If the information is
relevant to a represented goal of action, it is processed; if it is not, it is inhibited or ignored. Thus, control of attention is the major developmental priority of this cycle. Gradually, pre-school children develop the ability to inhibit responses to irrelevant stimuli in order to respond to goal-relevant stimuli. In addition, pre-schoolers become increasingly able to set up action plans requiring a shift between actions, such as sorting objects according to one rule (e.g., colour) and then shifting to another rule (e.g., shape) if needed (Zelazo, 2015). This implies that children have developed executive control, which enables them to represent a major goal (such as sorting objects into piles), alternative options (such as colour if red, shape if square), and to shift between them as specified. Executive possibilities co-exist with awareness of representations. Pre-schoolers are aware of representations stored in their memory (Paulus et al., 2013), that representations emerge from perception, and that one's own perceptions shape one's own representations. This is what is known as children's Theory of Mind (ToM) (Demetriou et al., 2019a; Spanoudis et al., 2015; Wellman, 2014). Pre-school children integrate information by reasoning, but they are neither aware of the inferential processes involved nor can they control them at will. For instance, seeing dad's car outside the house, they may infer that 'dad is in' but they cannot yet justify why. Thus, they may map representations onto each other and start abstracting their relations. For instance, they can map counting, the number name sequence, and the ordinal and cardinal values of object sets onto each other and abstract the concept of number (Carey, 2009; Gelman, 1986; Siegler & Braithwaite, 2017). All in all, building realistic representational thought from 2 to 6 years generates the mental background for school learning.

Rule-Based Thought

With attentional control established, links between representations take priority. This presents a new developmental challenge: to identify relations between representations and to organise them so that they can be called upon efficiently to strengthen understanding and interaction. Thus, inferential control is the major developmental priority in this cycle because it links representations. Inductive and analogical inference is the major mental tool for understanding relationships between representations and using them to build new concepts expressing these relations (Gentner & Hoyos, 2017). This allows a mental fluency that is evident in many seemingly different mental tasks, such as using metaphors in language (e.g., he is fast like lightning) and in arithmetic (e.g., 2/4 is the same as 3/6). Explicit deductive reasoning also emerges in this cycle. For instance, children may infer that if 'A implies B' then two possibilities are necessarily true: when A occurs then B occurs too, and when B does not occur then A does not occur either. Obviously, these tasks are central for learning all subjects taught in primary school. Thus, in the cycle of rule-based thought, attention control is turned inwards to connect conceptual spaces (e.g., object categories), activate space-specific instances indexing underlying
relations, and inter-relate them according to specific conceptual or procedural constraints. For instance, children in early primary school may search their memory to name different instances of a category (e.g., apple, banana, pear, etc. for the category of 'fruits') or say words starting with a specific letter (e.g., box, banana, baby, etc., for the letter 'b'). Within this cycle, reasoning develops in parallel with awareness of the underlying inferential processes and of the cognitive processes involved in different reasoning tasks. Interactions between reasoning development and related awareness become explicit at 8–10 years of age (Kazi et al., 2019), indicating that inferential choices become the object of reflection, allowing conclusions to be optimised. In turn, this generates awareness of inferential control: inference may take alternative roads depending on the representations connected and how they are connected. This awareness may be used to arrange representations according to the rule at hand, such as sequencing executive actions to formulate a plan, extrapolating dimensions in inductive reasoning tasks according to the relations involved, or deducing conclusions from premises in deductive reasoning tasks (Kazi et al., 2019).

7.3 Learning in Language and Mathematics

Efficient learning in different domains depends on two major dimensions: aligning the demands of learning in each school subject with the developmental priorities of the age concerned, and facilitating the representation of information as required in each domain. Language or arithmetic cannot be learned without the four AWARE general processes specified in the introduction. To identify words in one's own language and grasp the rules of grammar and syntax inter-relating them, children must recognise recurring patterns of sound, store them in memory and match them with earlier memories; the aim is to recognise similarities and differences between word forms (e.g., sometimes words end in -ed) and relate these to actions (e.g., understanding that words ending in -ed refer to actions which took place in the past) (Dehaene, 2010). To grasp number, children must discriminate between individual objects, recognise the spatial or other relations organising them into amounts (of something), keep these patterns in memory and compare them so that the relations standing for numerical operations may be abstracted (Dehaene, 2011; Gelman, 1986). Recent research suggests that even very early sensitivity to number, which was once ascribed to an innate 'number sense', reflects the operation of general learning mechanisms that capture key characteristics of numerosity in the perceptual field, such as a particular aggregation of objects which gives rise to numerical interpretations (Testolin et al., 2020). For instance, having 1 RED and 2 GREEN toy cars in the visual field may attract attention to the numerical difference between the two sets, generating an early understanding of number relations (there are MORE green cars). Early reading and arithmetic learning are related to each other, and both are related to reasoning development (Demetriou et al., 2017). With development, emphasis and importance gradually shift from early reading and arithmetic to
reasoning. If, for any reason, units in a domain, such as words in language or amounts in arithmetic, cannot be recognised or represented accurately and mentally processed, performance in that domain will suffer, even if the general mechanisms operate well in other domains. For instance, if amounts are represented inaccurately, learning arithmetic will be difficult. Specific learning difficulties will also arise if teaching is too far ahead of a child's development. For instance, delays in mastering the developmental priorities of preschool (attention control, representational awareness, and a minimal flexibility in analysing and inter-relating representations) will hinder learning the symbolic skills required for reading or arithmetic in early primary school. Deficient attention control prevents children from accurately registering and encoding letters or numbers; deficient representational awareness hinders them from understanding that written words stand for the words of oral language, or that numerals stand for number words and related quantities. Delays in mastering rule-based thought in primary school, such as difficulty in rule induction, may cause difficulties in reading comprehension or in carrying out arithmetic operations on numbers. Therefore, deficiencies must be diagnosed and children helped in those specific areas, both to enable them to be ready to learn and to direct learning systematically so that it strengthens mental development.

Language

Reading involves three hierarchical levels (Kintsch & Rawson, 2007) that align with the cycles of development described above. The first is the encoding level, which is dominated by perceptual attentional processes. Users of alphabetic writing systems learn that letters stand for sounds, composed into syllables standing for blocks of sounds, which form words, the main units of meaning. These are composed into sentences according to grammatical and syntactic rules which signify meaning. Learning to read starts at circa 5 years, sometimes earlier, and typically extends over the next 3 years. At circa age 5, children start to recognise script or words as symbols for speech and they engage in pre-writing activities aimed at mastering the basic skills required for writing. Learning to read (and write) is one of the major goals of the first two primary school years. In terms of the present model, learning to recognise letters and compose them into syllables and words draws heavily on the major developmental priorities of the cycle of realistic representations. Firstly, children must understand that written words stand for spoken words. Secondly, attention control is important at the early stages of learning to read, when focusing and shifting are required for the integration of letters into words and words into sentences. Deficiencies in control of attention hinder learning to read even after intelligence (IQ), hyperactivity, and other behavioural problems are controlled (Franceschini et al., 2012). These difficulties are universal; they have been found in different languages, such as Arabic (Friedmann & Haddad-Hanna, 2014) and Chinese
(Chung & Ho, 2010), implying that they derive from central cognitive processes rather than the specifics of a particular language. Thirdly, representational awareness, expressed here as phonological awareness and comprehension monitoring, allows children to monitor, reflect on and evaluate their comprehension, and to reprocess and revisit the text if necessary. Notably, phonological awareness and reading are reciprocally related: phonological awareness enhances literacy development, which results in further growth of phonological awareness (McCardle et al., 2001). Kim (2015) found that comprehension monitoring and theory of mind were the strongest predictors of reading performance at the age of 6 years, and that the relation between working memory and listening comprehension was mediated by comprehension monitoring and theory of mind. General representational awareness operates as a top-down guide directing visual search and the integration of mental units into meaningful symbolic ensembles such as alphabetically-composed words or numbers. The second level goes beyond the technical level of the recognition and production of words to gaining meaning from them. At this level, a text-based representation is constructed, drawing on language and cognitive processing mechanisms. This text-based representation relies on vocabulary and syntactic knowledge, enabling the reader to process the meaning of words and phrases in the text. At this level, the representation may be based literally on the words involved rather than on both the words themselves and the rules of syntax that indicate the specific meaning of the words in the present context; thus, the meaning grasped may be inaccurate or incoherent. The third level goes beyond words, standing for the ability to distil propositions from sentences in order to abstract a mental model of the situation described by the text (Kintsch & Rawson, 2007:211). Therefore, this level draws primarily on rule-based thought. As expected, reading comprehension at the age of 8 to 9 years, at the second level, is predicted best by working memory and fluid intelligence (Demetriou et al., 2019a, b; García-Madruga et al., 2014).

Mathematics

An Approximate Number System (ANS) (Butterworth, 2005; Dehaene, 2011) is the background for the development and learning of mathematics, in the same fashion that natural language is the background for the development of reading and writing. The core of the ANS is subitisation, the ability to perceive automatically the number of elements in a set of up to three or four elements. Subitisation is present from infancy (Dehaene, 2011) and develops into a mental number line in early childhood; the mental number line is thus the core of the ANS. For most individuals, numbers on the mental number line are ordered from left to right according to magnitude. The number line allows approximate comparisons between numbers; the accuracy of these comparisons decreases with increasing number
magnitude or decreasing distance between numbers. For instance, it is easier to judge that 27 is larger than 23 than to judge whether 727 is larger than 723. The ANS and the mental number line develop throughout childhood and adolescence (Dehaene, 2011), being established between 3 and 5 years and covering only the small numbers between 1 and 10. In this cycle, children can also map number words onto small sets of up to three dots. They cannot map number words onto arrays of four to six dots, dot arrays onto digits, or number words onto digits, nor can they use arithmetic operations (Benoit et al., 2013). Evidently, they have a global representation of quantities limited to no more than three or four elements. The number line extends to 100 between 5 and 7 years. At this phase, representations from the different representational spaces become accessible as distinct mental entities that can be aligned. For instance, 4-year-olds map both number words and number digits onto arrays of up to six elements, but they do not map number words onto digits. At 5 years, children map all representations with each other for all sizes. They can also compute additions and subtractions on numbers smaller than 10, using their fingers as tools for counting. Counting at this stage includes both reciting a series of numbers and understanding a symbol as an index of a quantity. It is interesting to specify the cognitive parameters of the development of this mathematical thinking. Research shows that attention control and phonological awareness are the main predictors of learning and performance in arithmetic. Specifically, inhibitory control, flexibility in shifting, and planning during preschool have been found to account for substantial variability in fluency in executing arithmetic addition and subtraction and in reading comprehension. These associations between executive functions and arithmetic persist even after controlling for individual differences in general cognitive ability and reading achievement (Clark et al., 2010). The relationship between executive functions at age 4 and learning mathematics at age 6 years is similar to the relationship between these functions and reading. Recent research shows that phonological awareness predicts both early reading and early arithmetic; similarly, numerical recognition predicts performance in both domains (Vanbinst et al., 2020). This pattern suggests that the critical factor is representational awareness in general rather than awareness of a specific notational system.
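The size and distance effects just described are often captured in the numerical cognition literature by a Weber-fraction model of the ANS. The sketch below is illustrative rather than anything used in the studies cited here: the Gaussian model form is a common one, and the Weber fraction w = 0.2 is an assumed value chosen only for the example.

```python
import math

# Illustrative Weber-fraction model of approximate number comparison.
# The internal estimate of a numerosity n is assumed Gaussian with standard
# deviation w * n, so discriminability depends on the distance between the
# two numbers relative to their magnitudes.

def p_correct(n1: float, n2: float, w: float = 0.2) -> float:
    """Probability of correctly judging which of two numerosities is larger."""
    return 1 - 0.5 * math.erfc(abs(n1 - n2) / (math.sqrt(2) * w * math.hypot(n1, n2)))

print(f"27 vs 23:   {p_correct(27, 23):.2f}")    # ~0.71: well above chance
print(f"727 vs 723: {p_correct(727, 723):.2f}")  # ~0.51: same distance, but near chance
```

The same distance of 4 yields very different accuracy for small and large magnitudes, which is the pattern the chapter describes for 27 vs. 23 and 727 vs. 723.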

Dyslexia and Dyscalculia: Understanding Developmental Learning Difficulties

A recent meta-analysis of 680 studies shows that fluid intelligence (reasoning) is moderately related to reading and mathematics (r is circa .4) throughout childhood and adolescence, and this relation increases over time; also, learning to read and do arithmetic relies on fluid intelligence early on and, as schooling progresses, learning in these two subjects boosts fluid intelligence. This means that the two domains are moderately inter-related (r = .42) throughout school. However, each of the two
domains also has very specific characteristics which are related to learning difficulties in that domain but not the other. In the domain of language, about 20% of children in early primary school face strong difficulty in learning to read and write; a proportion of them, about 5–10%, meet the criteria for dyslexia, a serious condition interfering with every aspect of school life. These children face serious problems in phonological processing, which underlies the translation of letters into sounds and their integration into meaningful words. About the same number of children face similar difficulties in the learning of mathematics. Specifically, about 5% of children present developmental dyscalculia (Reigosa-Crespo et al., 2012). These children "have a poor intuitive sense of quantity, … poor understanding of more and less, and slow learning of Arabic numerals, number words, and their meanings" (Chu et al., 2013:9). For instance, they have difficulty in enumerating small sets of up to nine elements, in comparing small magnitudes, such as 5 to 7 and 7 to 5, and in doing simple mental arithmetic by adding or subtracting numbers between one and nine. Butterworth (2005) claims that dyscalculia is caused by a 'defective number module' which hinders accurate encoding of numerosity: that is, representing one as a quantity of one, two as a quantity of two, three as a quantity of three, etc. This deficit makes learning to count difficult because counting words would lack the exact corresponding representations. In turn, this would hinder both the functioning of the ANS and the learning of the rules underlying the relationships between quantities. For instance, there is evidence that numbers on the number line overlap in dyscalculic children, causing difficulties in number comparisons (Mussolin et al., 2010). Evidence suggests that reading difficulties, including dyslexia, and mathematics difficulties, including dyscalculia, are caused by specific representational deficits in each domain, which are separate from each other. In reading difficulties, the phonological system does not have the resolution required for letter recognition and composition into words. In arithmetic difficulties, the numerosity coding system is not precise enough to allow building representations for different quantities. Children with dyscalculia have problems in associating Arabic numerals with their representations of magnitude but not usually in associating letters with phonemes; dyslexics face problems with letter and digit recognition and naming but not usually with magnitude processing, symbolic or non-symbolic (Rubinstein & Henik, 2006).

7.4 Outline for the FOSTIR Programme

The model outlined above has several clear implications for education and assessment. These implications are specified in the FOSTIR programme, which aims to consolidate the functioning of the processes underlying learning and to capitalise on them by aligning school learning with the changing developmental priorities of learning. The programme is designed to train children to Focus (F), Search representations and concepts of interest (S), Try new concepts and solutions (T),
Infer relations across concepts and relations (I), and Reflect on them vis-à-vis goals, experiences and future plans (R). Notably, FOSTIR (ΦΩΣΤΗΡ) in Greek means 'wise' (Demetriou & Spanoudis, 2018; Demetriou et al., submitted). The programme involves a diagnostic module and a learning module. The diagnostic module maps the cognitive profile of children across all of the processes above. Specifically, it first addresses attention control with speeded performance tasks examining the ability to focus on and respond to specific stimuli while resisting interference from irrelevant stimuli. Secondly, it addresses working memory and the integration processes which enable an individual to mentally combine component representations, such as words and sentences, and numbers and the results of number operations. Thirdly, it addresses inductive, analogical and deductive reasoning along the lines outlined above. Finally, it addresses mental awareness, including awareness and self-evaluation of mental processes, and linguistic and numerical awareness. Examples of tasks and norms related to these processes are presented in our recent book (Demetriou & Spanoudis, 2018) and several technical papers (e.g., Kazi et al., 2019). It should be noted that standardisation and tuning of these tasks to specific national, social and school environments is needed if the information they generate is to be maximally useful in guiding the operation of the FOSTIR programme for individual children. Several studies show that this programme is highly effective in accelerating the development of reasoning (Christoforides et al., 2016) and the mastery of relational thought in mathematics (Papageorgiou et al., 2016). To implement the FOSTIR programme in school environments, instruction should lead children to engage in activities that sharpen and automate each of the processes discussed here. The general scheme for this kind of instruction, with some examples, is outlined below. To exercise attention control, pre-schoolers and early primary school children are asked to identify things of interest amidst the variety of objects and events usually present in the environment and to focus on them, resisting distraction. Tasks may involve identifying various stimuli, including objects and sounds, among other relevant and irrelevant, similar or different, stimuli. These exercises are embedded in set-ups of increasing complexity and associated with hierarchies of behavioural activities that come under increasingly pre-planned mental control. Exercising working memory is related to the activities addressed to attention control, as outlined above. Children are systematically directed to 'think of these things' and their own actions, hold them in mind and relate them with other things of relevance or with past knowledge for as long as required to make sense of them, given their context. In doing so, they are encouraged to resist the temptation to look at or think of other, irrelevant things. To practise representational awareness, children are trained to reflect on these activities with the aim of identifying possible sources of difficulty in thinking and understanding. They also come to realise that alternative representations, such as drawings, photographs and constructions, may stand for the same thing and for each other; modern technology is obviously helpful here. Children also practise understanding that knowledge may come from different sources, such as perception,
other people, books, films, etc. Thus, one's knowledge may vary from other persons' knowledge depending on each person's contact with the world, perceptual or other. To improve problem solving, children come to realise how their representations may guide their behaviour. For instance, when jumping from the sofa onto the floor, a child needs to think of his or her current position, the floor, the distance between them, and possibly objects spread on the floor. Thus, children need to plan their actions (e.g., contracting feet and hands in a specific way before jumping), direct the body to land at a specific spot, and then evaluate whether the jump was precise, whether it was painful, etc., so as to improve next time. To practise inference, children make connections by inference when information is missing, and evaluate whether the conclusions fit or deviate from what was known or believed so far. They may extrapolate a series of things according to a property, such as size or colour, make connections across groups of objects by analogy, or draw inferences by deduction. Children learn that they can 'decipher' missing representations from other ones once their common referent is known. Thus, they learn how to recruit available knowledge, look for new knowledge when available knowledge does not suffice, and use reasoning to relate, bridge and fill in gaps in representations vis-à-vis a problem (as in arithmetic problem solving, text comprehension, etc.); all of these are important for problem solving. In one of our studies, we showed that building awareness of different logical schemes, and facility in transforming them into relevant mental models, consolidated rule-based thinking at the age of 8 years and enabled children to grasp the logical principles underlying each logical rule (Christoforides et al., 2016).

Early Learning in Language and Mathematics

Integrated remedial programmes in language and arithmetic learning are organised to provide experiences across all of the levels of learning specified above. Children practise executive control, raise awareness of representations at different levels of resolution and exercise reasoning at each of them. In language, children acquire phonological, syntactic, grammatical and semantic awareness and map oral language onto written language at each of these levels of awareness. In developing phonological awareness, children become aware that words are composed of specific sounds; for instance, that the word 'cat' is composed of the sounds /c/, /a/, and /t/. In grammatical awareness, they become aware that components of words stand for different meanings, such as the time of an action in the past or present: 'the cat chases the ball' vs. 'the cat chased the ball'. In syntactic awareness, children become aware that word order defines meaning and is usually mandatory (with rules varying across languages). For instance, we can say 'the cat chased the ball' but we cannot say 'the chased cat the ball'. They also become aware that syntax and semantics do not always coincide. For instance, the sentence 'the ball chased the cat' is syntactically correct but semantically wrong, unless other intentions are conveyed by the speaker. Thus, language learning also involves the inferential mechanisms
needed to integrate sound patterns, the production of written symbols, and sentence production into semantically meaningful structures. Training metaphorical, analogical and deductive reasoning so that thinkers can access and systematically operate on the processes involved may be a means to this end. In mathematics, even before the age of 3 years, children have a number sense: they recall number names from memory, they may count from one, associating counting with number names, and they may express judgements about relative differences between numbers. The curriculum in mathematics for 4 to 7-year-old children aims to develop number sense and build on it to facilitate understanding of numbers, operations on them, and the use of strategies for solving problems involving numbers. The main objective of the pre-school mathematics curriculum at the age of 3 to 4 years is to develop quantitative and numerical awareness. Children are instructed to quantify reality by noting how changes in aggregations of objects change their quantities and how these changes may be numerically named. They are also able to quantify groups of objects according to properties, such as colour, size and shape, and build correspondences between dimensions. A first understanding of the relationship between numbers is mastering global quantitative comparisons, such as 'more', 'less' and 'same as'. Understanding quantity comes next, when children grasp cardinality: i.e., they understand that the last number stated when counting a set of objects represents the number of objects in the set. At the age of 4 to 6 years, emphasis is given to aligning and connecting complementary representations of number, such as number names (e.g., one, two, three, …, ten), the corresponding symbols (e.g., 1, 2, 3, …, 10) and the quantities represented (e.g., one thing, two things, three things, …, ten things). The number line typically extends to 100 between 5 and 7 years, but it may come later for many children. At this phase, representations from different representational spaces become accessible as distinct mental entities that can be aligned, but alignment is not yet fluid. For instance, 4-year-olds map both number words and number digits onto arrays of up to six elements, but they do not map number words onto digits. At 5 years, children map all representations with each other for all sizes. Therefore, children at this age need experiences in using a variety of tools and resources (e.g., number lines, base-ten blocks, maths stories) in order to realise that the number system is pattern-based (3 + 5 = 8 and 83 + 5 = 88). Children use manipulatives and pictures to represent mathematical concepts before the introduction of symbols, thereby building the necessary links between the mathematical concepts and the symbolic language which may stand for them. The introduction of symbols as a means of communication and representation is important for pre-school education because it allows the consolidation of representational thought. A quantity of seven objects can be represented by seven objects, by seven pictures, by seven images, or by seven points on a line. Children can also compute addition and subtraction on numbers smaller than ten, using their fingers, other manipulatives, or images as tools for counting. The use of different representations for the same concept facilitates the representational awareness and abstraction underlying rule induction.


At the age of 6 to 7 years, emphasis shifts to the exploration of the patterns between one-digit numbers, tens and decades in order to grasp the rules underlying the structure of the place value system (Kilpatrick et al., 2001). For instance, TEN is a very important anchor for children at this age because it helps them to remember the combinations that make 10 (e.g., 3 + 7) and also that a two-digit number such as 12 is divided into 10 and 2, and 25 may be decomposed into 2 × 10 and 5 units. Number sense is enriched by place value: a numeral represents the number symbol, the word, the placement on a number line, a quantity, and a place value position (e.g., 2 can mean 2, 20 or 200, depending on its place value). Thus, 6 to 7-year-old children grasp that a general rule connects different aspects of number, and that knowing this rule allows numbers to be transformed into each other. Grasping this rule requires that 7 to 8-year-old children have opportunities to experience counting to 100 and to establish a link between the numbers and their visual representations as numerals. They are expected to use their knowledge about the order of numbers on the number line in order to conduct basic calculations. Ideally, 7 to 8-year-old children are able to calculate the cost of items which may be bought with a given sum of money and can calculate the best estimate of the sum or difference of two two-digit numbers. They also show understanding of the associative property of addition and of the connection between two-step word problems and their corresponding numerical expressions. Obviously, mathematics involves many other concepts and skills going beyond natural numbers and numerical operations. For instance, children extend the skills discussed here to fractions and decimals. These present special difficulties because operations on them differ from operations on natural numbers. However, learning in these aspects of mathematics goes beyond the scope of this chapter.
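The place-value decompositions described above (12 as 10 and 2; 25 as 2 × 10 and 5 units) can be made concrete in a few lines of code. This is a minimal illustrative sketch only; the function name and output format are invented for the example, not taken from any curriculum material.

```python
# Place-value decomposition of a natural number, e.g. 25 -> 2 tens and 5 units.
def decompose(n: int) -> list[tuple[int, int]]:
    """Return (digit, place value) pairs, most significant digit first."""
    digits = [int(d) for d in str(n)]
    return [(d, d * 10 ** (len(digits) - i - 1)) for i, d in enumerate(digits)]

print(decompose(25))   # [(2, 20), (5, 5)]       i.e. 2 x 10 and 5 units
print(decompose(252))  # [(2, 200), (5, 50), (2, 2)]
```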

7.5 Conclusions

We have summarised a theory of cognitive development unifying cognitive developmental, psychometric and educational research on intellectual development and learning. This theory postulates that cognitive development occurs in cycles of representational expansion and reorganisation, allowing an increasingly accurate multidimensional representation of the world. Each new form of representation poses a new problem of mental control, such as attention at pre-school, inference in childhood and logical control in adolescence. Mental awareness is an important component of this process, reflecting the cognitive processes coming under control in each cycle. Control is exercised according to the symbol systems used; symbol systems stand for different levels of mental complexity and may express the same aspects of reality at different levels of resolution. Commanding a learning domain requires facility with the symbol systems involved; for instance, acoustic patterns standing for spoken words, or visual patterns standing for written words or quantities. If grasping and representing patterns in the environment is deficient, children's learning will also be deficient, as in speech delays at the transition from infancy
to early pre-school, or reading and arithmetic difficulties at the transition from pre-school to primary school. This is indicated by individual differences in mastering major developmental tasks at transitions between developmental cycles. All major transitions are associated with fast learning of a new symbol system: speech at the transition between episodic and realistic representations; reading and writing at the transition between realistic representations and rule-based thought; and highly specific, idiosyncratic symbol systems at the transition between rule-based and principle-based thought. Accurate assessment of possible problems at transitions is an important first step in identifying sources of learning problems that need to be addressed through special programmes. iPIPS is an important tool in the service of these aims (Demetriou et al., 2017; Kardanova et al., 2014).

References

Benoit, L., Lehalle, H., Molina, M., Tijus, C., & Jouen, F. (2013). Young children's numerical judgment of arrays, number words, and digits. Cognition, 129(1), 95–101.
Butterworth, B. (2005). Developmental dyscalculia. In J. I. D. Campbell (Ed.), Handbook of mathematical cognition (pp. 455–467). Psychology Press.
Carey, S. (2009). The origin of concepts. Oxford University Press.
Christoforides, M., Spanoudis, G., & Demetriou, A. (2016). Coping with logical fallacies: A developmental training program for learning to reason. Child Development, 87, 1856–1876.
Chu, F. W., van Marle, K., & Geary, D. C. (2013). Quantitative deficits of preschool children at risk for mathematical learning disability. Frontiers in Psychology, 4, 195.
Chung, K. K., & Ho, C. S. (2010). Second language learning difficulties in Chinese children with dyslexia: What are the reading-related cognitive skills that contribute to English and Chinese word reading? Journal of Learning Disabilities, 43, 195–211.
Clark, C. A., Pritchard, V. E., & Woodward, L. J. (2010). Preschool executive functioning abilities predict early mathematics achievement. Developmental Psychology, 46(5), 1176–1191.
Dehaene, S. (2010). Reading in the brain: The new science of how we read. Penguin.
Dehaene, S. (2011). The number sense (2nd ed.). Oxford University Press.
Demetriou, A., & Spanoudis, G. (2018). Growing minds: A general theory of intelligence and learning. Routledge.
Demetriou, A., Merrell, C., & Tymms, P. (2017). Mapping and predicting literacy and reasoning skills from early to later primary school. Learning and Individual Differences, 54, 217–225.
Demetriou, A., Makris, N., Kazi, S., Spanoudis, G., & Shayer, M. (2018a). The developmental trinity of mind: Cognizance, executive control, and reasoning. WIREs Cognitive Science, 2018, e1461. https://doi.org/10.1002/wcs.1461
Demetriou, A., Makris, N., Kazi, S., Spanoudis, G., Shayer, M., & Kazali, E. (2018b). Mapping the dimensions of general intelligence: An integrated differential-developmental theory. Human Development, 61, 4–42. https://doi.org/10.1159/000484450
Demetriou, A., Kazi, S., Spanoudis, G., & Makris, N. (2019a). Predicting school performance from cognitive ability, self-representation, and personality from primary school to senior high school. Intelligence, 76, 101381. https://doi.org/10.1016/j.intell.2019.101381
Demetriou, A., Makris, N., Tachmatzidis, D., Kazi, S., & Spanoudis, G. (2019b). Decomposing the influence of mental processes on academic performance. Intelligence, 77, 101404. https://doi.org/10.1016/j.intell.2019.101404
Demetriou, A., Kazali, E., Kazi, S., & Spanoudis, G. (2020a). Cognition and cognizance in preschool predict school achievement in primary school. Cognitive Development, 54, 100872. https://doi.org/10.1016/j.cogdev.2020.100872
Demetriou, A., Kazi, S., Spanoudis, G., & Makris, N. (2020b). Cognitive ability, cognitive self-awareness, and school performance: From childhood to adolescence. Intelligence, 79, 101432. https://doi.org/10.1016/j.intell.2020.101432
Demetriou, A., Greiff, S., Makris, N., Spanoudis, G., Panaoura, R., & Kazi, A. (submitted). Bridging educational priorities with developmental priorities: Towards a developmental theory of instruction.
Franceschini, S., Gori, S., Ruffino, M., Pedrolli, K., & Facoetti, A. (2012). A causal link between visual spatial attention and reading acquisition. Current Biology, 22(9), 814–819.
Friedmann, N., & Haddad-Hanna, M. (2014). Types of developmental dyslexia in Arabic. In E. Saiegh-Haddad & R. M. Joshi (Eds.), Handbook of Arabic literacy, literacy studies (Vol. 9, pp. 119–151). Springer.
García-Madruga, J. A., Vila, J. O., Gómez-Veiga, L., Duque, G., & Elosúa, M. R. (2014). Executive processes, reading comprehension and academic achievement in 3th grade primary students. Learning and Individual Differences, 35, 41–48.
Gelman, R. (1986). The child's understanding of number. Harvard University Press.
Gentner, D., & Hoyos, C. (2017). Analogy and abstraction. Topics in Cognitive Science, 9, 672–693. https://doi.org/10.1111/tops.12278
Kardanova, E., Ivanova, A., Merrell, C., Hawker, D., & Tymms, P. (2014). The role of iPIPS assessment in providing high quality value added information on school and system effectiveness within and between countries. Basic Research Program Working Paper. National Research University, Higher School of Economics.
Kazi, S., Kazali, E., Makris, N., Spanoudis, G., & Demetriou, A. (2019). Cognizance in cognitive development: A longitudinal study. Cognitive Development, 52, 100805. https://doi.org/10.1016/j.cogdev.2019.100805
Kilpatrick, J., Swafford, J., & Findell, B. (2001). Adding it up: Helping children learn mathematics. National Academy Press.
Kim, Y. S. G. (2015). Developmental, component-based model of reading fluency: An investigation of predictors of word-reading fluency, text-reading fluency, and reading comprehension. Reading Research Quarterly, 50(4), 459–481.
Kintsch, W., & Rawson, K. A. (2007). Comprehension. In M. J. Snowling & C. Hulme (Eds.), The science of reading: A handbook (pp. 209–226). Blackwell.
McCardle, P., Scarborough, H. S., & Catts, H. W. (2001). Predicting, explaining, and preventing children's reading difficulties. Learning Disabilities Research & Practice, 16(4), 230–239.
Mussolin, C., Mejias, S., & Noël, M.-P. (2010). Symbolic and nonsymbolic number comparison in children with and without dyscalculia. Cognition, 115, 10–25.
Papageorgiou, E., Christou, C., Spanoudis, G., & Demetriou, A. (2016). Augmenting intelligence: Developmental limits to learning-based cognitive change. Intelligence, 56, 16–27.
Paulus, M., Proust, J., & Sodian, B. (2013). Examining implicit metacognition in 3.5-year-old children: An eye-tracking and pupillometric study. Frontiers in Psychology: Cognition, 4, 145. https://doi.org/10.3389/fpsyg.2013.00145
Piantadosi, S. T., Palmeri, H., & Aslin, R. (2018). Limits on composition of conceptual operations in 9-month-olds. Infancy, 23, 1–15. https://doi.org/10.1111/infa.12225
Reigosa-Crespo, V., Valdes-Sosa, M., Butterworth, B., Estevez, R., Rodríguez, M., Santos, E., Torres, P., Suarez, R., & Lage, A. (2012). Basic numerical capacities and prevalence of developmental dyscalculia: The Havana survey. Developmental Psychology, 48, 123–135. https://doi.org/10.1037/a0025356
Rubinstein, O., & Henik, A. (2006). Double dissociation of functions in developmental dyslexia and dyscalculia. Journal of Educational Psychology, 98, 854–867.
Siegler, R. S., & Braithwaite, D. W. (2017). Numerical development. Annual Review of Psychology, 68, 12.1–12.27. https://doi.org/10.1146/annurev-psych-010416-044101
Spanoudis, G., Demetriou, A., Kazi, S., Giorgala, K., & Zenonos, V. (2015). Embedding cognizance in intellectual development. Journal of Experimental Child Psychology, 132, 32–50.
Testolin, A., Zou, Y., & McClelland, J. L. (2020). Numerosity discrimination in deep neural networks: Initial competence, developmental refinement and experience statistics. Developmental Science, 23, e12940. https://doi.org/10.1111/desc.12940
Vanbinst, K., van Bergen, E., Ghesquière, P., & De Smedt, B. (2020). Cross-domain associations of key cognitive correlates of early reading and early arithmetic in 5-year-olds. Early Childhood Research Quarterly, 51, 144–152.
Wellman, H. M. (2014). Making minds: How theory of mind develops. Oxford University Press.
Xu, F. (2019, June 10). Towards a rational constructivist theory of cognitive development. Psychological Review. Advance online publication. https://doi.org/10.1037/rev0000153
Zelazo, P. D. (2015). Executive function: Reflection, iterative reprocessing, complexity, and the developing brain. Developmental Review, 38, 55–68.

Chapter 8

Children's Developmental Levels at the Start of School

Christine Merrell (deceased), Durham University, Durham, UK

Abstract Children's developmental levels, as measured at the start of school in a range of countries, are described. These include cognitive levels as well as personal, social and emotional development.

8.1 Introduction

At the start of school, children's development has already been influenced by their home environment (see Chaps. 10 & 11). In addition to this important context, systematic reviews have suggested that early childhood development programmes can make a positive difference to later cognitive, social and emotional outcomes (e.g. Anderson et al., 2003; Tanner et al., 2015). In these reviews, programmes were based in centres (public schools or child development centres) which provided a different physical and social environment to the home, although a few involved a home-visit component. Head Start was one such programme. The reviews found positive impacts for children from deprived backgrounds, and benefits from attending centre-based programmes in combination with nutritional supplement programmes. Both reviews pointed to gaps in the literature, particularly in relation to social and emotional outcomes. The home environment is an important and distinct environment, sitting separately from the broader context of local and national policies and global initiatives such as the United Nations' Sustainable Development Goals (SDGs). The 2030 Agenda for Sustainable Development sets out 17 SDGs, within which SDG 4 focuses on ensuring inclusive and equitable quality education and promoting lifelong learning opportunities for all. The SDGs were adopted by all member states in 2015 (United Nations, 2015). In a report on progress towards the SDGs (United Nations, 2019a), it was noted that by 2017, two-thirds of children globally had accessed 'organised learning' in the year before the official school entry age. However, the rate was lower than 50% in sub-Saharan Africa and in the least developed countries (United Nations, 2019a), where there remained a lack of basic infrastructure and facilities to provide effective learning environments. In this chapter, we look at children's early language, reading and mathematics development, as well as their personal and social development (where available), at the start of school in seven countries which have contrasting contexts and policies, and different ages for starting school: Australia, Brazil, England, Lesotho, Russia, Scotland and the Western Cape of South Africa. The data come from pupils assessed with iPIPS upon entry to school, except for Brazil, where the assessments were conducted at the end of pre-school (the later section on Brazil contains a more extended explanation of the rationale behind this).

8.2 The iPIPS Assessment

The historical development of the iPIPS assessment is described in Part 1. Currently, it comprises four main parts: (i) cognitive development; (ii) personal, social and emotional development; (iii) behaviour; and (iv) physical development. Projects in different countries have used some or all of these parts, and some have also added parental questionnaires to collect more information about home environments. Cognitive development and personal, social and emotional development levels at the start of school are discussed in this chapter. These assessments generally take place within the first few weeks of children starting school and are repeated at the end of the first school year.

8.3 Cognitive Development

Vocabulary, early reading and early mathematics are assessed by an adult working with one pupil at a time. Each of these sections has a number of sub-sections and, in total, the assessment of cognitive development takes around 20 minutes per pupil to administer. Pupils are asked a series of questions within each section, presented in order of increasing difficulty. When they get a certain number of questions wrong in a section, the assessment moves on to the next section. This process, whereby the route through the assessment adapts in response to the pupils' ability levels, is known as 'sequences with stopping rules'. The assessment is delivered using either computer software (Australia, England, Scotland and Russia), a booklet of questions along with an App which guides the adult through the sequences and stopping rules and records the pupils' responses (Brazil and South Africa), or
a booklet with paper record sheets (Lesotho). We have found each format to provide reliable and valid information in a time-efficient way and, importantly, to be enjoyable for pupils. The vocabulary part assesses pupils' receptive vocabulary, firstly by asking them to point to objects within a scene. In the versions used in Lesotho and South Africa, the vocabulary section is extended to include questions about the names for parts of the body. The easiest items are presented using a kitchen scene; items such as knife, fork and pan are familiar to most young pupils. The scenes required local modification but retained as many features in common with other language and country versions as possible. In the early reading part, children are asked to write their full name from memory. This is followed by sections to assess ideas about reading (e.g., can a pupil distinguish between someone who is reading and someone who is writing, pick out some writing, and point to a letter of the alphabet?). This section is based upon Marie Clay's research into concepts about print (Clay, 1989). There are also sections on phonological awareness (repeating words and recognition of rhyming words), letter and word recognition, reading words and comprehension. One of the pictures used in the ideas about reading section is shown in Fig. 8.1 for the Lesotho version, which is administered in Sesotho. For all versions, the people in the pictures are drawn to be gender and ethnically neutral.

Fig. 8.1 Ideas about reading scene from Lesotho version

The early mathematics part includes sections to assess ideas about mathematics (e.g., Does a pupil understand concepts such as biggest and least?), counting and
numerosity (e.g., Can a pupil count a small number of objects and then recall the number counted?), number identification (naming written numbers), informally presented arithmetic and more advanced, formally presented calculations. An example from the informally presented arithmetic section starts by showing pupils a picture of three balls and asking, "Here are three balls. If I took one away, how many would be left?" For addition, they see pictures such as two rabbits and are asked, "Here are two rabbits. If I add one more to the picture, how many would there be?" In this format, subtraction is found to be easier for children to answer than addition. For more information about the content of the cognitive development assessment and the research underpinning each section, see Tymms (1999a); for the predictive validity of the cognitive parts of the iPIPS assessment, see Tymms (1999b) and Tymms et al. (2009, 2018).
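The 'sequences with stopping rules' routing described in this section can be sketched schematically. The code below is an illustration only, not the actual iPIPS software: the section names, item lists, error threshold and toy response rule are all invented for the example.

```python
# Illustrative sketch of "sequences with stopping rules" (not the real iPIPS code).
# Items in each section are ordered by increasing difficulty; once a child makes
# a set number of errors in a section, the assessment moves to the next section.

MAX_ERRORS = 3  # assumed threshold, for illustration only

def run_section(items, ask):
    """Present items in order of difficulty until MAX_ERRORS wrong answers."""
    responses, errors = [], 0
    for item in items:
        correct = ask(item)          # ask() administers one item, returns True/False
        responses.append((item, correct))
        if not correct:
            errors += 1
            if errors >= MAX_ERRORS:
                break                # stopping rule: skip remaining, harder items
    return responses

def run_assessment(sections, ask):
    """Route the pupil through every section; each section stops independently."""
    return {name: run_section(items, ask) for name, items in sections}

# Hypothetical section contents, easiest items first:
sections = [
    ("vocabulary", ["fork", "knife", "pan", "tendon", "bicep"]),
    ("early reading", ["point to writing", "letter names", "word recognition"]),
]

demo_ask = lambda item: len(str(item)) <= 12  # toy stand-in for a child's response
print(run_assessment(sections, demo_ask))
```

The stopping rule keeps the session short (around 20 minutes in iPIPS) while still letting more able pupils reach harder items.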

8.4 Personal, Social and Emotional Development

Personal, social and emotional development is assessed by teachers completing a rating scale based upon their observations of their pupils during the first few weeks of school. There are eleven areas:

1. Comfortable (how comfortable the pupil is at being left in the school setting).
2. Independence (how independent the pupil is, for example able to dress unaided).
3. Confidence.
4. Concentration – teacher-directed activities.
5. Concentration – self-directed activities.
6. Actions (the extent to which the pupil considers the consequences of his/her actions).
7. Relationship to peers.
8. Relationship to adults.
9. Rules (adherence to rules in social situations).
10. Cultural awareness.
11. Communication (ability to communicate non-verbally and verbally).

The teacher rates each pupil in each of these areas on a scale of 1–5, and the assessment contains a descriptor for each point on the scale for each area. For example, for Confidence, the scale is: 1 (very hesitant, does not join in group activities, rarely talks); 2 (fairly hesitant, reluctant to join in group activities or talk); 3 (will join in group activities or talk when prompted); 4 (quite confident, keen to join in group activities or talk within the school setting); 5 (very confident, keen to participate in group activities within school). For more information about the content of the personal, social and emotional development part of iPIPS, and its association with later attainment, see Merrell and Bailey (2008) for pupils in England and Aleksic et al. (2019) for pupils in Serbia.


8.5 Developmental Levels at the Start of School

The chapter now moves on to consider the cognitive and personal, social and emotional developmental levels of pupils starting school in the countries listed in the introduction. I provide an overview of the context of three of those countries (South Africa, Lesotho and Brazil) and report a selection of findings from the iPIPS assessment conducted at the start of the first school year. These three countries have some similarities in their contexts, containing particular areas of significant deprivation, and in their current aim of achieving the United Nations' Sustainable Development Goal 4 (inclusive and equitable quality education for all). Australia, England and Scotland use English as their mother tongue and, in the case of England and Scotland, the age of starting school is younger than in many countries. The children in Russia are older at the start of school and the Russian language uses the Cyrillic alphabet, which makes an interesting comparison.

South Africa Context In South Africa, children enter compulsory education (Grade 1) in the year they turn 7 years of age, although they may enter at a younger age. Policy is currently underway to make a Reception year (the year prior to Grade 1) compulsory. In 2019, in his state of the nation address, the President announced that responsibility for early childhood development centres would be transferred from the social development sector to the Department for Basic Education, and that the department would implement the provision of 2 years of compulsory early childhood provision before entering Grade 1. Net enrolment is 99% at primary level. By 2017, it was reported that well over 90% of children in the year below the compulsory school starting age were participating in organised learning, however this includes private fee-paying settings. Just 33% of children aged 6 attended a Grade R class in public primary schools (Statistics South Africa, 2019). In a recent report of poverty rates using UNICEF’s MoDA methodology, it was found that around 62% of children across South Africa as a whole, were multidimensionally poor but the percentage was lower for children in the Western Cape at 34% (Maluleke, 2020). iPIPS Sample The iPIPS study took place in 2016 in the Western Cape province of South Africa. It was coordinated by the Centre for Evaluation and Assessment (CEA) at the University of Pretoria. The Western Cape was chosen because it has a well organised administration and the senior administration were keen to participate. Despite


The project was funded by the Nuffield Foundation and participation of schools and pupils was voluntary. A representative sample of 3000 pupils starting Grade 1 was drawn from 112 public schools across three urban districts. Approximately equal numbers of pupils were assessed in Afrikaans, English and isiXhosa in the schools where those languages were the medium of instruction; although many languages are spoken in the Western Cape, these are the province's languages of instruction. Trained research staff conducted the cognitive parts of iPIPS, and teachers were asked to rate the personal, social and emotional development of their pupils after observing them for the first few weeks of the school year. Teachers completed these ratings for about half of the participating pupils. The average age of pupils assessed at the start of Grade 1 was 6.8 years (standard deviation 0.5). The distribution of ages was close to normal, indicating a wider spread of younger and older children in the first year than the official entry age alone would imply, and this did not differ greatly across the three languages of instruction. A small number of children were older than would be expected, the oldest being 9.7 years. There are two possible reasons for this: Grade 1 is the grade repeated most often in the South African education system, and population movements mean that some children arrive late in the year. (See Tymms et al. (2017) for the full report of the project.)

Cognitive Development

The Rasch person reliabilities for the vocabulary part of the iPIPS assessment were 0.75, 0.82 and 0.64 for Afrikaans, English and isiXhosa respectively. Scales were constructed for the reading and mathematics parts which included all languages; the Rasch person reliabilities were 0.73 for reading and 0.78 for mathematics. For the vocabulary section, almost all pupils assessed in Afrikaans could point to items commonly found in a kitchen such as a fork, knife and pan, and at the difficult end of the scale virtually no pupils could identify the body parts joint, tendon and bicep. This was echoed in the groups assessed in English and isiXhosa. Whilst it is interesting to note the vocabulary development of children learning in different languages, it is difficult to make valid comparisons because the difficulty of words varies across languages (see Part 2 and Chap. 9). However, the majority of pupils in each language group had acquired at least a basic functional vocabulary of high-frequency words at the start of school, with the vocabulary of many being more advanced. Pupils' levels of development are represented on pedagogical 'ladders', with Rasch measurement used to establish the stages and 'rungs' of each ladder (a sketch of this mapping follows Fig. 8.3). Figures 8.2 and 8.3 show the percentage of pupils at each stage of the ladder at the start of school across the Western Cape, with all three language groups reported together.


Reading Ladder (% of pupils at each rung)
5th Rung: Can read with understanding. (2%)
4th Rung: Can read simple sentences aloud. (4%)
3rd Rung: Can recognise some high-frequency words. (33%)
2nd Rung: Can name letters of the alphabet. (57%)
1st Rung: Can point to someone who is writing and someone who is reading in a classroom scene; can point to a word in a classroom scene but not read it. (4%)

Fig. 8.2  Reading at the start of school, Western Cape Province, South Africa

Mathematics Ladder (% of pupils at each rung)
5th Rung: Advanced. Examples include 15+21=. (1%)
4th Rung: More difficult calculations. Examples include 7+3=, 8-3=. (24%)
3rd Rung: Simple formal calculations: more formally presented arithmetic such as 'What is three more than seven? What is two more than six?' (29%)
2nd Rung: Names numbers up to 10 and does informally presented arithmetic such as 'Here are three balls; if I took one away, how many would be left?' (35%)
1st Rung: Counts to four and understands vocabulary such as 'biggest' and 'smallest'. (11%)

Fig. 8.3  Mathematics at the start of school, Western Cape Province, South Africa
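To make the construction of these ladders concrete, here is a minimal Python sketch of how pupils' Rasch ability estimates (in logits) might be binned into rungs to produce percentages like those in Figs. 8.2 and 8.3. The logit thresholds and ability values are invented for illustration; in the actual studies the rung boundaries were established from the Rasch item calibrations.

```python
# A minimal sketch: bin Rasch ability estimates into ladder rungs and
# report the percentage of pupils at each rung. All numbers are invented.
import bisect

# Hypothetical logit boundaries between rungs 1|2, 2|3, 3|4 and 4|5.
THRESHOLDS = [-2.0, -0.5, 1.0, 2.5]

abilities = [-3.1, -1.2, -0.8, 0.3, 1.4, 2.9, -1.9, 0.0]  # invented sample

counts = [0] * 5
for theta in abilities:
    rung = bisect.bisect_right(THRESHOLDS, theta)  # index 0-4 -> rungs 1-5
    counts[rung] += 1

for i, c in enumerate(counts, start=1):
    print(f"Rung {i}: {100 * c / len(abilities):.0f}% of pupils")
```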


Over half of the pupils assessed had a sound understanding of concepts about print (first rung) and could name letters of the alphabet. A third of the sample could also read some high-frequency words, although this involved decoding rather than reading fluently with comprehension. Nearly half of the pupils in the sample had a working knowledge of mathematical vocabulary associated with concepts such as 'biggest', 'smallest', 'most' and 'least', could identify single digits and could perform simple formally presented arithmetic. A quarter of pupils could also perform more difficult calculations, presented in a formal format.

Personal, Social and Emotional Development


Pupils' personal, social and emotional development was rated by their teachers on a scale ranging from 1 to 5. The mean scores for the start of year are shown in Fig. 8.4. Pupils' levels of comfort and independence at school were given the highest ratings by their teachers. The scale point descriptor for comfort was: 'never upset on separation from carer at the start of the session; comfortable for most of the time during the session; has no difficulty coping with transitions'. The scale point descriptor for independence was: 'independent of others for most of the time but still needs occasional support; copes well with most clothing and personal activities'. The lowest ratings were given for cultural awareness, for which the scale point descriptor was: 'is aware that they are a member of a wider community within their local neighbourhood and pre-school setting; talks about experiences relating to those environments'.

Fig. 8.4  Personal, social and emotional development mean scores, Western Cape Province, South Africa


For communication, the scale point descriptor was: 'begins to combine statements to present a coherent argument or explanation; spoken sentences are generally a combination of ideas and not usually grammatically correct'. Even these lowest average ratings of 2.7 on the 1–5 scale indicated a level that would enable pupils to interact and communicate within the classroom setting.

Lesotho

Context

Lesotho is an independent, largely mountainous country situated within South Africa, with a population of around two million. There are three years of pre-primary education, with an official entry age of 3 years, followed by primary education, with an official entry age of 6 years. In 2017, 30% of eligible children were enrolled in pre-primary education; for primary school, the enrolment rate was 87%, with regional variation such that poorer districts and families recorded lower rates of enrolment (United Nations, 2019a, b). A study of child and adolescent poverty in Lesotho, using the UNICEF Multiple Overlapping Deprivation Analysis (MODA) methodology, found that in 2018, 65% of all children and adolescents (aged 0–17 years) were multidimensionally poor (Ministry of Development and Planning/UNICEF, 2018). High investment in primary education reflected the Government's priority of providing free primary education; by contrast, spending on early childhood care and development was low, at 0.36% of the Ministry of Education and Training annual budget in 2018 (Ministry of Finance/UNICEF, 2018). According to a report by the World Bank (2019), Lesotho had achieved a mean student-teacher ratio of 33 in primary schools, although there is variation in class sizes.

iPIPS Sample

The iPIPS project in Lesotho was a pilot study funded by the David and Elaine Potter Foundation, which took place in 2019 and involved class teachers assessing the cognitive development of 15 children in their classes. In total, 180 children from 12 state-funded primary schools in Maseru were assessed at the start of school (Grade 1). This was not a representative sample, but it gives an interesting insight into children's development at the start of school within the capital city of Lesotho and shows the possibilities for a further study with a representative sample. Participation was voluntary. Given the often large class sizes in primary schools in Lesotho, teachers were asked to select pupils by their position on the register: every fourth pupil was sampled to achieve a selection of 15 pupils per class, an approach intended to avoid selection bias.


The assessment format was a booklet with paper record sheets, administered in the Sesotho language. Teachers attended workshops to learn about the assessment, how to administer it and how to interpret the results. The record sheets were designed to help the teachers navigate the adaptive assessment easily and to see immediately what each pupil knew and could do, to inform their pedagogy. For each of vocabulary, early reading and early mathematics, the record sheet set out the items in a column of ascending difficulty, with the easiest item at the bottom and the most difficult at the top. Teachers put a tick next to an item if the child answered correctly and a cross if incorrectly. When a child had answered a number of questions incorrectly, it was easy to see that the section should be stopped and the next begun; and since the items were in order of increasing difficulty, the teachers could see what kind of items the child was able to answer and which were too difficult (a sketch of this procedure follows below). At the end-of-year assessment, the teachers used the same record sheet and re-assessed the child from the stage where they had given wrong answers at the start of the year, thus not repeating items which were very easy for them. Teachers completed the cognitive development parts of the assessment but not the personal, social and emotional development ratings. The average age of participating pupils at the start of Grade 1 was 6.1 years (standard deviation 0.5); the youngest was 5 and the oldest 7.5.

Cognitive Development

The internal reliabilities for the vocabulary, reading and mathematics parts of the iPIPS assessment were 0.80, 0.89 and 0.96 respectively (Rasch person reliabilities from start and end of Grade 1 data combined). Pupils' levels of development are represented on a developmental 'ladder', with Rasch measurement used to establish its stages and 'rungs'. Figures 8.5, 8.6 and 8.7 show the percentage of pupils at each stage of the ladder at the start of school, noting that these were informed by data from the 146 pupils who were assessed at both the start of school and the end of the first year. At the start of school, all of the pupils assessed understood words commonly used in the home, and the vocabulary acquisition of the vast majority was more developed: five percent of pupils could identify parts of their body that involved less frequently used vocabulary. A rich vocabulary supports learning about the world and is linked to academic progress and social skills (Beck et al., 2013; Lonigan & Shanahan, 2009; Sparapani et al., 2018; Tymms et al., 2018). In this sample of children starting school in Lesotho, the majority were equipped with a functional level of vocabulary that would enable them to access the early stages of learning to read. Pupils started school with some understanding of the activities of reading and writing, and some understanding of concepts of print. Around half of the pupils assessed could name some letters of the alphabet, but almost nobody could read any words. In other countries, this more formal learning of reading may start in pre-school or the home, but given the modest rate of pre-school attendance in Lesotho and the low educational level across the country, it is unlikely that many children are helped to read before they start school.
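The register-sampling and stopping rules described above can be sketched in Python as follows. This is a minimal illustration: the stop threshold of three consecutive errors and the item names are assumptions made for the example, not the manual's actual rule.

```python
# A minimal sketch of two procedures described above: selecting every
# fourth pupil on the class register, and stopping a section once a pupil
# has answered several items in a row incorrectly.

def sample_register(register, step=4, target=15):
    """Take every `step`-th pupil from the register, up to `target` pupils."""
    return register[step - 1::step][:target]

def administer_section(items_in_difficulty_order, answer_fn, max_errors=3):
    """Present items easiest-first; stop after `max_errors` consecutive
    errors. Returns the (item, correct?) outcomes recorded on the sheet."""
    record, streak = [], 0
    for item in items_in_difficulty_order:
        correct = answer_fn(item)
        record.append((item, correct))
        streak = 0 if correct else streak + 1
        if streak >= max_errors:
            break
    return record

# Example: a pupil who manages only the easier vocabulary items.
items = ["fork", "carrots", "petal", "kite", "yacht", "saxophone", "tendon"]
print(administer_section(items, lambda it: it in {"fork", "carrots", "petal"}))
```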


Vocabulary Ladder (% of pupils at each rung)
5th Rung: Can point to parts of the body including bicep, tendon and knuckle. (5%)
4th Rung: Can point to parts of the body including chin and forehead. (11%)
3rd Rung: Can point to pictures of a saxophone and a yacht in a toy room scene. (11%)
2nd Rung: Can point to pictures of a petal, a lizard and a kite in a countryside scene. (59%)
1st Rung: Can point to pictures of a fork, carrots, a bowl and a knife in a kitchen scene. (14%)

Fig. 8.5  Vocabulary at the start of school, Lesotho

Reading Ladder (% of pupils at each rung)
5th Rung: Can read with understanding. (0%)
4th Rung: Can read simple sentences aloud. (0%)
3rd Rung: Can recognise some high-frequency words. (1%)
2nd Rung: Can name letters of the alphabet. (53%)
1st Rung: Can point to someone who is writing and someone who is reading in a classroom scene; can point to a word in a classroom scene but not read it. (46%)

Fig. 8.6  Reading at the start of school, Lesotho


Mathematics Ladder (% of pupils at each rung)
5th Rung: Advanced. Examples include 15+21=. (0%)
4th Rung: More difficult calculations. Examples include 7+3=, 8-3=. (1%)
3rd Rung: Simple formal calculations: more formally presented arithmetic such as 'What is three more than seven? What is two more than six?' (23%)
2nd Rung: Names numbers up to 10 and does informally presented arithmetic such as 'Here are three balls; if I took one away, how many would be left?' (67%)
1st Rung: Counts to four and understands vocabulary such as 'biggest' and 'smallest'. (9%)

Fig. 8.7  Mathematics at the start of school, Lesotho

A high percentage of pupils started school with a good level of conceptual understanding of number and arithmetic, and were able to do informally presented calculations. This conceptual understanding could have been developed through everyday activities involving counting and calculation, supported, perhaps informally, by family members and carers. Almost a quarter of the sample was also able to tackle simple formally presented calculations.

Brazil

Context

Since 2009, pre-school from age 4 has been part of compulsory education in all states in Brazil, and universal access to pre-school was scheduled to be in place by 2016. In 2018, net enrolment in pre-school in Brazil was 93.8%. The curriculum for pre-school and primary education is defined by each local municipality, respecting general guidelines from the Ministry of Education (Federal Government). Given differences in local context, such as city


size and wealth, pupils' socio-economic background, curricular guidelines and public policies, there is great variation in educational results across municipalities. Across Brazil as a whole, it was estimated that in 2015, 6.5% of children aged between 4 and 17 were not at school and were described as educationally deprived; using a multiple-deprivation approach, 26% of children and adolescents were considered to live in extreme poverty (Paz & Arévalo, 2018). This is lower than in Lesotho and the Western Cape of South Africa, but still a substantial proportion.

iPIPS Sample

The data reported here came from a two-year longitudinal study (2017–2018) conducted in a representative sample of 41 municipal schools in the city of Rio de Janeiro (see also the appendix to Chap. 1). The study was funded by the Inter-American Development Bank, FAPERJ (Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro) and Instituto Alfa e Beto, and participation was voluntary. In wave one, 2716 children were assessed while they were in pre-school; the sample then grew during the study as the researchers followed children who moved to other municipalities and included new children entering the 41 schools. The study sample excluded private and state-subsidised schools. At the time of the study, the municipal education system was responsible for 61% of total pre-school enrolment in the city of Rio de Janeiro and private/state-subsidised schools for 39%. In the third wave of the longitudinal study, the assessment was administered in November/December 2018 to pupils aged 5 to 6 years attending publicly funded schools. By the design of the study, these pupils were at the end of their second year of pre-school, assessed just before moving up to Grade 1; the data are therefore slightly different from those from Lesotho and the Western Cape, where pupils were assessed just after the start of the first school year rather than just before it. Nevertheless, for comparison with Lesotho and the Western Cape, this wave represents the start of formal school, unlike the earlier waves, which covered pre-school, albeit compulsory pre-school. The cognitive development part of the assessment in this wave was administered in Portuguese to 3552 pupils by trained researchers, and teachers rated the personal, social and emotional development of 800 pupils. (See Bartholo et al., 2019, 2020a, b and Koslinski & Bartholo, 2020 for additional information about the longitudinal study.)

Cognitive Development

The Rasch person reliabilities for vocabulary, reading and mathematics were 0.55, 0.76 and 0.87 respectively.


Reading Ladder (% of pupils at each rung)
5th Rung: Can read with understanding. (0%)
4th Rung: Can read simple sentences aloud. (2%)
3rd Rung: Can recognise some high-frequency words. (10%)
2nd Rung: Can name letters of the alphabet. (75%)
1st Rung: Can point to someone who is writing and someone who is reading in a classroom scene; can point to a word in a classroom scene but not read it. (13%)

Fig. 8.8  Reading at the end of pre-school, Rio de Janeiro

In the vocabulary part, the pupils in Rio de Janeiro were, in the main, able to identify the items commonly found in the kitchen and countryside scenes. Their scores were normally distributed around identifying items in the toy room scene (items including microscope, yacht, saxophone, cash, jewellery and cosmetics). The version used in Brazil did not include the items relating to parts of the body. This distribution for vocabulary is different from Lesotho and the Western Cape of South Africa. Of course, as previously noted, any comparison of vocabulary should be interpreted with caution because translations may not be equivalent; the difference may, however, be a consequence of the focus of the pre-school curriculum in Rio de Janeiro (Fig. 8.8). Eighty-eight percent of pupils had a sound understanding of concepts about print (first rung) and could name letters of the alphabet. A much smaller percentage could read any words, suggesting that while there was a focus on understanding print and learning the alphabet, there was little evidence of formal teaching of reading. This is similar to the results from the Lesotho sample, with a relatively small proportion of pupils able to read words. Both samples included children from very deprived socio-economic backgrounds, with the Rio sample comprising municipal schools in a city where almost 40% of pre-school enrolment was in private or state-subsidised settings. Pre-school attendance is much lower in Lesotho, which raises the question of how much informal learning of basic concepts of print took place in the home rather than at pre-school. The mathematics results were very low, and so the first rung of the ladder was split into two sections (Fig. 8.9).


Mathematics Ladder (% of pupils at each rung)
5th Rung: Advanced. Examples include 15+21=. (0%)
4th Rung: More difficult calculations. Examples include 7+3=, 8-3=. (3%)
3rd Rung: Simple formal calculations: more formally presented arithmetic such as 'What is three more than seven? What is two more than six?' (18%)
2nd Rung: Names numbers up to 10 and does informally presented arithmetic such as 'Here are three balls; if I took one away, how many would be left?' (63%)
1st Rung (upper): Identifies numbers 1–5; counts a few objects by rote; identifies shapes. (13%)
1st Rung (lower): Counts to four and understands vocabulary such as 'biggest' and 'smallest'. (3%)

Fig. 8.9  Mathematics at the end of pre-school, Rio de Janeiro

The distribution of mathematics scores is not dissimilar to that in Lesotho, with both being lower, in general, than in the Western Cape of South Africa.

Personal, Social and Emotional Development

Figure 8.10 shows the mean scores for each of the items on the personal, social and emotional development scale (Santos, 2020), noting that the scale used in Rio de Janeiro did not include the item about concentration on self-directed activities.



Fig. 8.10  Personal, social and emotional development mean scores, Rio de Janeiro

Australia, England and Scotland

Context

Three countries with English as the majority first language are now explored: Australia, England and Scotland. The school starting age is lower than in the countries considered so far, with mean ages of 5.5 years in Australia, 4.5 years in England and 5 years in Scotland. Schools across two states/territories were included in the Australian sample (4770 children, considered to be representative of those states/territories): one had a policy of children starting school if they were 4 years old on or before the end of April of the commencement year, the other a policy of children starting if they were 5 years old on or before the first of January of the commencement year (see also the appendix to Chap. 1). Early education and care are a priority in Australia, and at the time of the iPIPS data reported here (the 2012 academic year), children attending kindergarten prior to preparatory schooling followed Belonging, Being and Becoming: The Early Years Learning Framework. Teachers conducted the assessments with all pupils in their classes. For the English sample, the iPIPS data discussed here were collected in the 2011–12 academic year (6986 pupils, a nationally representative sample). At that time, children were entitled to state-funded part-time pre-school education with its own curriculum and assessment framework; in 2010, when the children in this study would have been attending pre-school, uptake of the free entitlement was 95% (Department for Education, 2010). UK Government spending on education peaked in 2009–2011 at around 5.7% of GDP before declining in more recent years.


Spending on early years education was 5% of this total (Bolton, 2019). Teachers conducted the assessments with all pupils in their classes. Children in Scotland usually start school between the ages of 4.5 and 5.5 years. As in Australia and England, pre-school education and care are a national priority, with children entitled to free part-time early learning and childcare between the ages of 3 and 5 years. Scotland's Curriculum for Excellence sets out experiences and outcomes for children from pre-school right through to the end of compulsory education (Education Scotland, 2020). For the Scottish sample, the iPIPS data discussed here were collected in the 2012–13 academic year, and teachers conducted the assessments with all pupils in their classes. (See Tymms et al., 2016 and Tymms et al., 2014 for full reports of the projects.)

Cognitive Development

Pupils in all three countries were, on average, able to identify commonly used items in the kitchen scene, many items in the country scene and very few items in the toy room scene. In the reading part, on average, pupils were able to identify some letters. In England, only 10% of the sample were able to recognise some high-frequency words and virtually no children were able to read sentences. In Scotland, 18% of pupils were able to recognise some high-frequency words, with a further 3% able to read sentences. Pupils in Australia were ahead of their peers in England and Scotland. In mathematics, 27% of pupils in England reached the simple formal calculations rung of the ladder and a small percentage reached the higher rungs; in Scotland, 37% reached the simple formal calculations rung and 10% reached the next rung. Australian pupils were further ahead.

Personal, Social and Emotional Development

Personal, social and emotional development data were available from 1566 pupils in England and 699 pupils in Scotland at the start of the school year. These were sub-groups of the full samples and not necessarily nationally representative; no data were available for Australia (Fig. 8.11). The mean scores of pupils' personal, social and emotional development are lower across all domains for England compared with Scotland. The English pupils are the youngest and clearly less mature in their development compared with children in the other countries, though the mean score for concentration on teacher-directed activities was around 3 for all countries.


Fig. 8.11  Personal, social and emotional development mean scores, England and Scotland

Russia

Context

Children typically start school in Russia at age 7, which makes it an interesting contrast to the preceding countries: the children are older, and the language is different, with its Cyrillic alphabet. The iPIPS data discussed in this chapter come from a sample of children assessed when they started school in September 2014. The participating schools were located in two regional capital cities: Krasnoyarsk (Krasnoyarsk region) and Kazan (Tatarstan). Participating schools were representative of each city and pupils within the schools were selected at random to be assessed (see also the appendix to Chap. 1). In total, 2407 children were assessed with both the cognitive development and personal, social and emotional development parts of iPIPS; the mean age at the start of the school year was 7.35 years. The project in Russia used a computer-adaptive version of iPIPS, in which each pupil sat at a computer with an adult; questions were presented using sound files and the adult recorded the pupil's right and wrong answers on-screen. The sample reported here consisted of 1440 children, 50% of them girls, with an average age of around 7.3 years (one teacher did not complete the survey, so data are available for only 1416 children). For more details of the sample and the internal reliabilities of the Russian version of the assessment, see Orel et al. (2018).
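As background, a computer-adaptive assessment tailors the questions it presents to the child's running performance. The following is a minimal, generic Python sketch of one common approach (pick the unused item whose difficulty is closest to the current ability estimate); it illustrates the general idea only, not the actual iPIPS algorithm, and the item names, difficulties and update sizes are all invented.

```python
# A generic sketch of adaptive item selection: keep a running ability
# estimate and choose the unused item with difficulty nearest to it.

def next_item(items, used, theta):
    """Return the unused item whose difficulty is nearest the estimate."""
    candidates = [(abs(d - theta), name)
                  for name, d in items.items() if name not in used]
    return min(candidates)[1]

items = {"letters": -1.5, "words": -0.5, "sentences": 0.5, "comprehension": 1.5}
theta, used = 0.0, set()
for _ in range(3):
    item = next_item(items, used, theta)
    used.add(item)
    correct = item in {"letters", "words"}   # pretend the child knows these
    theta += 0.6 if correct else -0.6        # crude up/down ability update
    print(item, "correct" if correct else "incorrect", f"theta={theta:+.1f}")
```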


Reading Ladder (% of pupils at each rung)
5th Rung: Can read with understanding. (66%)
4th Rung: Can read simple sentences aloud. (11%)
3rd Rung: Can recognise some high-frequency words. (16%)
2nd Rung: Can name letters of the alphabet. (7%)
1st Rung: Can point to someone who is writing and someone who is reading in a classroom scene; can point to a word in a classroom scene but not read it. (Not applicable)

Fig. 8.12  Reading at the start of school, Russia

Cognitive Development

The Rasch person reliabilities for vocabulary, reading and mathematics were 0.72, 0.89 and 0.91 respectively. The data from Russia were noticeably different from those of the other countries described: the children were older, were assessed in a different script and came from a different culture. The reading levels can be broadly described using the same ladder as used above (Fig. 8.12). About two thirds of the children starting school were reading with understanding, about one in ten were reading sentences, and the rest were reading some words and identifying letters. The mathematics and vocabulary sections did not lend themselves to descriptions using the ladder format, but a careful analysis by Ivanova et al. (2018) showed that the Russian levels of mathematics were, age for age, largely in line with the data from Scotland and England.



Fig. 8.13  Personal, social and emotional development mean scores, Russia

Personal, Social and Emotional Development

Remarkably, given the differences in age and culture, the ratings of personal, social and emotional development in Russia (Fig. 8.13) were similar to those seen in other countries. The children were generally older, and this is reflected in the high ratings for Independence, Relationships with adults and Communication.

8.6 Discussion

On average, children across the countries started school with a functional vocabulary, an understanding of concepts of print, and knowledge of quite a few letters. Variable percentages of children were able to recognise high-frequency words or sentences, and very few children reached the comprehension section except in Russia. In countries where children are approaching age 6, they are entering the third developmental cycle suggested by Andreas Demetriou in Chap. 7 of this section, although it is important to note that these cycles do not have fixed boundaries: some younger children may have reached it, while some older children may not. In this third cycle, making connections between representations so that they can be called upon efficiently to strengthen understanding is a priority. This is evident in the data, which indicate that many children have made connections between spoken and written representations of letters. At this stage, though, few children are able to read. In addition to the developmental processes proposed by Demetriou, this is likely to be influenced by pre-school policies, which tend to focus on personal, social and emotional development and vocabulary.


Informal learning of basic concepts in the home is likely to be seen as important, and there is a sense that formal teaching, including the teaching of reading, does not commence until school. In mathematics, across all countries except Brazil, which had a slightly lower distribution, a high proportion of pupils were able to count, identify single digits and do informally presented arithmetic problems, with some able to do simple calculations involving formal notation. Again, this is in line with the theory of cognitive development proposed by Demetriou and perhaps also reflects an informal curriculum and input from the home. In countries with an older school starting age, pupils' personal, social and emotional development tends to be more advanced: the mean scores of children in Brazil and South Africa were higher than those in England and Scotland, and higher still in Russia. However, concentration on teacher-directed activities was found to be at a similar level for all, at about scale point 3, for which the description was 'able to settle to a task for a sustained period but may be distracted'. At the start of school, regardless of age, it appeared that pupils had not yet learned to maintain concentration in the face of competing activities; so although Demetriou's chapter suggests that children acquire, in the second major cycle of development, a level of attention that enables them to focus on developing understanding and knowledge, the length of time a child can concentrate continues to develop with age. This chapter has provided an insight into children's developmental levels at the start of school. The developmental phases which occur during pre-school and into the primary school years, as described by Andreas Demetriou in Chap. 7 of this section, underpin the skills and knowledge that children have acquired by the age of four and a half onwards, and those developmental processes are evident in the types of questions that children were able to answer in the iPIPS assessment. There was some variation by age and country context, as would be anticipated. The data from the various iPIPS projects suggest that, across the different settings explored, with their varied early years policies and home environments, children's backgrounds appear to have provided them with a functional vocabulary, an informal grounding in literacy and mathematics concepts, and social skills which, in general, position them well to access the school curriculum.

References

Aleksic, G., Merrell, C., & Tymms, P. (2019). Links between socio-emotional skills, behaviour, mathematics and literacy of preschool children in Serbia. European Journal of Psychology of Education, 32(4), 417–438. https://doi.org/10.1007/s10212-018-0387-8

Anderson, L. M., Shinn, C., Fullilove, M. T., Scrimshaw, S. C., Fielding, J. E., Normand, J., & Carande-Kulis, V. G. (2003). The effectiveness of early childhood development programs: A systematic review. American Journal of Preventive Medicine, 24(3), 32–46. https://doi.org/10.1016/S0749-3797(02)00655-4


Bartholo, T. L., Koslinski, M. C., Costa, M., & Barcellos, T. M. (2019). What do children know upon entry to pre-school in Rio de Janeiro? Ensaio (Rio de Janeiro, online), 1–22.

Bartholo, T. L., Koslinski, M. C., Costa, M., Tymms, P. B., Merrell, C., & Barcellos, T. M. (2020a). The use of cognitive instruments for research in early childhood education: Constraints and possibilities in the Brazilian context. Pró-Posições (UNICAMP, online), 31, 1–24. https://doi.org/10.1590/1980-6248-2018-0036

Bartholo, T. L., Koslinski, M. C., Andrade, F. M., & Castro, D. L. (2020b). School segregation and education inequalities at the start of schooling in Brazil. Revista Electronica Iberoamericana sobre Calidad, Eficacia y Cambio en Educacion, 18, 77–96.

Beck, I. L., McKeown, M. G., & Kucan, L. (2013). Bringing words to life: Robust vocabulary instruction (2nd ed.). Guilford Press. ISBN 1462508162.

Bolton, P. (2019). Education spending in the UK (Briefing Paper Number 1078, October 2019). UK Parliament. https://dera.ioe.ac.uk/34232/1/SN01078%20%281%20%28redacted%29%29.pdf. Accessed 18 March 2023.

Clay, M. (1989). Concepts about print in English and other languages. The Reading Teacher, 42(4), 268–276.

Department for Education. (2010). Provision for children under five years of age in England: January 2010 (Statistical first release). https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/218962/main_20text_20sfr162010.pdf. Accessed 28 May 2020.

Education Scotland. (2020). What is Curriculum for Excellence? https://education.gov.scot/education-scotland/scottish-education-system/policy-for-scottish-education/policy-drivers/cfe-building-from-the-statement-appendix-incl-btc1-5/what-is-curriculum-for-excellence. Accessed 31 July 2020.

Ivanova, A., Kardanova, E., Merrell, C., Tymms, P., & Hawker, D. (2018). Checking the possibility of equating a mathematics assessment between Russia, Scotland and England for children starting school. Assessment in Education: Principles, Policy & Practice, 25(2), 141–159.

Koslinski, M. C., & Bartholo, T. L. (2020). Desigualdades de oportunidades educacionais no início da trajetória escolar no contexto brasileiro. Lua Nova (Impresso), 110, 215–245. https://doi.org/10.1590/0102-215245/110

Lonigan, C. J., & Shanahan, T. (2009). Developing early literacy: A scientific synthesis of early literacy development and implications for intervention (Report of the National Early Literacy Panel, Executive Summary). National Institute for Literacy.

Maluleke, R. (2020). Child poverty in South Africa: A multiple overlapping deprivation analysis (Report No. 03-10-22). Statistics South Africa. ISBN 978-0-621-48540-0.

Merrell, C., & Bailey, K. (2008). Predicting achievement in the early years: How influential is personal, social and emotional development? Paper presented at the International Association for Educational Assessment conference, Cambridge, September 2008. https://iaea.info/documents/predicting-achievement-in-the-early-years-how-influential-is-personal-social-and-emotional-development/. Accessed 29 June 2020.

Ministry of Development and Planning/UNICEF. (2018). Child poverty in Lesotho: Understanding the extent of multi-overlapping deprivation. UNICEF Lesotho. https://www.unicef.org/esa/sites/unicef.org.esa/files/2018-12/UNICEF-Lesotho-2018-Child-Poverty-Report-Summary.pdf. Accessed 29 June 2020.

Ministry of Finance/UNICEF. (2018). Lesotho national budget brief: Fiscal year 2018/19. UNICEF. https://www.unicef.org/esaro/UNICEF-Lesotho-2018-National_Budget_Brief.pdf. Accessed 29 June 2020.

Orel, E., Brun, I., Kardanova, E., & Antipkina, I. (2018). Developmental patterns of cognitive and non-cognitive skills of Russian first-graders. International Journal of Early Childhood, 50, 297–314. https://doi.org/10.1007/s13158-018-0226-8

Paz, J., & Arévalo, C. (2018). Wellbeing and multiple deprivations in childhood and adolescence in Brazil. UNICEF. https://www.unicef.org/brazil/sites/unicef.org.brazil/files/2019-07/br_well-being-and-multiple-deprivations.pdf. Accessed 30 July 2020.


Santos, K. P. A. O. (2020). Primeiro ano na pré-escola: a relação entre desenvolvimento cognitivo, comportamento e habilidades socioemocionais. Dissertação (Mestrado em Educação), Universidade Federal do Rio de Janeiro.

Sparapani, N., McDonald Connor, C., McLean, L., Wood, T., Toste, J., & Day, S. (2018). Direct and reciprocal effects among social skills, vocabulary, and reading comprehension in first grade. Contemporary Educational Psychology, 53, 159–167. https://doi.org/10.1016/j.cedpsych.2018.03.003

Statistics South Africa. (2019). Sustainable development goals country report 2019 South Africa. Statistics South Africa. ISBN 978-0-621-47619-4.

Tanner, J. C., Candland, T., & Odden, W. S. (2015). Later impacts of early childhood interventions: A systematic review (IEG Working Paper 2015/3). World Bank. ISBN 978-1-60244-261-0.

Tymms, P. (1999a). Baseline assessment and monitoring in primary schools: Achievements, attitudes and value-added indicators. David Fulton Publishers.

Tymms, P. B. (1999b). Baseline assessment, value-added and the prediction of reading. Journal of Research in Reading, 22(1), 27–36.

Tymms, P., Jones, P., Albone, S., & Henderson, B. (2009). The first seven years at school. Educational Assessment, Evaluation and Accountability, 21, 67–80.

Tymms, P., Merrell, C., Hawker, D., & Nicholson, F. (2014). Performance indicators in primary schools: A comparison of performance on entry to school and the progress made in the first year in England and four other jurisdictions (Research report). Department for Education. https://www.gov.uk/government/publications/performance-indicators-in-primary-schools. Accessed 31 July 2020.

Tymms, P., Merrell, C., & Buckley, H. (2016). Children's development at the start of school in Scotland and the progress made during their first school year: An analysis of PIPS baseline and follow-up assessment data (Research report for the Scottish Government). ISBN 9781785448942. http://www.gov.scot/Publications/2015/12/5532/0. Accessed 31 July 2020.

Tymms, P., Howie, S., Merrell, C., Combrinck, C., & Copping, L. (2017). The first year at school in the Western Cape: Growth, development and progress (Project report funded by the Nuffield Foundation). https://www.nuffieldfoundation.org/sites/default/files/files/Tymms%2041637%20-%20SouthAfricaFinalReport%20Oct%202017.pdf. Accessed 30 July 2020.

Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261.

United Nations. (2015). Sustaining our world: The 2030 agenda for sustainable development (A/RES/70/1). www.sustainabledevelopment.un.org. Accessed 18 May 2020.

United Nations. (2019a). Special edition: Progress towards the sustainable development goals (Report of the Secretary-General, high-level political forum on sustainable development, 2019 session). https://undocs.org/E/2019/68. Accessed 18 May 2020.

United Nations. (2019b). The Kingdom of Lesotho voluntary national review of the implementation of the agenda 2030 (Report 2019). https://sustainabledevelopment.un.org/content/documents/23777Lesotho_VNR_Report_2019_Final.pdf. Accessed 18 May 2020.

World Bank. (2019). Kingdom of Lesotho education public expenditure review (Report No. 136894-LS). World Bank. http://documents1.worldbank.org/curated/en/419381558335864401/pdf/Lesotho-Education-Public-Expenditure-Review.pdf. Accessed 29 June 2020.

Chapter 9

Progress Made During the First Year at School

Katharine Bailey

The first year of school is characterised by developmental leaps for young children, and this is typical wherever they are in the world. It can be tempting to compare the extent of these gains between countries and make claims about the quality of educational provision, but contextual factors undermine such comparisons.

9.1 Introduction

The start of full-time school represents a transition that is often followed by a period of rapid development for children. They continue to acquire language, learn how to read, increase their understanding of mathematics, develop their social skills and start to moderate their behaviour to suit the classroom setting. Knowing the skills and aptitudes of children as they start school, and understanding the rate of progression during the first year of school, is essential for teachers, curriculum developers and policy makers. The first year of school is unique. It is characterised by an intake with little uniformity in terms of previous educational experiences: some children have been to nursery, others have not; some have been exposed to a rich environment, while others may have had little exposure to books. Their teachers are adept at evaluating the range of skills and capabilities in their classroom and using that information to help each child start their school career with strong foundations. This chapter considers the first year of school from three perspectives. Firstly, I explore what pupils know and can do when they start school and the progress that


they make during that all-important first year. I do this by comparing two very different contexts: England and the Western Cape of South Africa. Secondly, I consider what we can learn, and what we cannot, from a comparison of those two countries given the deep differences in context; some of those contextual differences are explored, as well as their implications for comparing performance and progress. Finally, I consider the long-term implications of early schooling experiences. In the previous chapter (Chap. 8), the author discussed children's cognitive, personal, social and emotional development on starting school in different countries. Although there was expected variation in context and in the age of the children, their pre-school environments appear to have provided them with the early experiences needed to commence formal schooling. Continuing this discussion, I focus on two of the countries previously discussed and explore the first year of school in more detail, describing what children know and can do at the end of that year and discussing factors which may be associated with the progress seen, including child-level characteristics, the different school curricula and educational policies.

9.2 The First Year of School in England and the Western Cape of South Africa

iPIPS studies of pupils' progress in reading and mathematics during their first year of school were conducted in England and the Western Cape of South Africa. A common assessment instrument, adapted for the local context and language groups, was used. (For more information about the history and development of iPIPS, see Part I, Chap. 1; for more information about the content of the assessment, see the previous chapter, Chap. 8.) Data from the assessment, which was administered within the first few weeks of starting school and then again at the end of the first year, describe pupils' levels of development, which can be interpreted in terms of progression up the pedagogical ladders presented in the previous chapter and shown in Fig. 9.1 below. The samples from both England and the Western Cape of South Africa were longitudinal, allowing the results from the two time-points to be linked.

England

The characteristics of the sample participating in iPIPS in England, which was nationally representative, are described in the previous chapter (Chap. 8). Rather than presenting the data from the assessments numerically, the results are described below: quantitative presentation of these findings would invite comparison between countries, but the validity of such comparisons would be questionable, an argument explicated later in the chapter.


Reading Ladder and Mathematics Ladder

5th Rung. Reading: Comprehension (can read with understanding). Mathematics: Advanced (examples include 15+21=).
4th Rung. Reading: Sentences (can read simple sentences aloud). Mathematics: Formal arithmetic (examples include 7+3=, 8-3=).
3rd Rung. Reading: Words (can recognise some high-frequency words). Mathematics: Simple formal arithmetic (more formally presented arithmetic such as 'What is three more than seven? What is two more than six?').
2nd Rung. Reading: Letters (can name letters of the alphabet). Mathematics: Informal arithmetic (names numbers up to 10 and does informally presented arithmetic such as 'Here are three balls; if I took one away, how many would be left?').
1st Rung / Ground level. Reading: Can point to someone who is writing and someone who is reading in a classroom scene; can point to a word in a classroom scene but not read it. Mathematics: Counts to four and understands vocabulary such as 'biggest' and 'smallest'.

Fig. 9.1  Summary of pedagogical ladders for reading and mathematics

At the start of school in England, nine percent of children were at the 'Ground' level, with a functional vocabulary and some knowledge of concepts about print. The highest proportion (80 percent) could name letters, with a much smaller group able to read some high-frequency words. In general, it seems that they were in a good position to start learning to read. By the end of the year, pupils had progressed from knowing letters to reading words: almost half of the children were at the level of reading words, with some able to read with comprehension. This is a significant educational achievement: the progress in that first year amounts to an effect size of 1.76 after accounting for progress due to maturity alone (Tymms et al., 2014). This effect size is very large and rarely seen in education. Hattie (1999) and Hill et al. (2008) note that effect sizes in education rarely exceed one, and Luyten et al. (2020) suggest that effect sizes of 0.2 should be considered important for educational interventions, of which going to school is one. A similarly substantial amount of progress is seen in mathematics. At the beginning of school, a high proportion of children were at the informal stage of being able to perform simple calculations, but by the end of the year, around two-thirds of children were able to do some simple formally presented arithmetic or answer more complex number problems. Tymms et al. (2014) estimated this progress to be an effect size of 1.78, again very large in educational terms.
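To illustrate the kind of calculation behind figures such as the 1.76 reported above, here is a minimal Python sketch: a standardised mean gain, with the maturation component removed by estimating the expected gain per year of age from the cross-section of scores at entry. All data are simulated for illustration; the published estimates were derived from the full iPIPS data with more sophisticated models.

```python
# A minimal sketch of a maturation-adjusted effect size. All values simulated.
import numpy as np

rng = np.random.default_rng(0)
age = rng.uniform(4.0, 5.0, 500)                 # age in years at entry
start = 10 + 4 * age + rng.normal(0, 3, 500)     # start-of-year score
end = start + 12 + rng.normal(0, 3, 500)         # end-of-year score

# Expected gain per year of age, from the cross-section at entry.
gain_per_year = np.polyfit(age, start, 1)[0]

raw_gain = end.mean() - start.mean()
maturation = gain_per_year * 1.0                 # one school year elapsed
effect_size = (raw_gain - maturation) / start.std(ddof=1)
print(f"effect size net of maturation: {effect_size:.2f}")
```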


Western Cape of South Africa

As part of the iPIPS study, around 2500 children were assessed at the start and again at the end of the year (Tymms et al., 2017). In the previous chapter (Chap. 8), the author described the starting points of children as they entered school in the Western Cape: over half of the pupils had a sound understanding of concepts about print and could name letters of the alphabet, with one third of the sample able to read some high-frequency words; very few children were able to read simple sentences or read with understanding. Compared with the children starting school in England, a higher proportion (33 percent) of those in South Africa had reached the 'Words' level when they started school. Overall, the pupils made considerable progress in reading over the year, with 40 percent of the sample able to read at least simple sentences compared with just 6 percent at the start of the year: almost a quarter were able to read sentences and a smaller percentage were reading with comprehension. In mathematics, at the start of the year, 90 percent of children were at or above the informal arithmetic stage. At this stage, pupils were able to identify numbers 6 to 10 and to do simple informal sums, and they had some basic mathematical vocabulary including words such as 'most' and 'more'. Over half of the pupils were at the simple formal arithmetic stage, where they were able to identify two-digit numbers, do some more difficult informal calculations and do simple formal sums. One quarter of the children were able to perform formal arithmetic: they could identify three-digit numbers, do formal sums, identify coins and use simple fractions. Progress was made over the year: at the repeat assessment, two thirds of the pupils had moved to the formal arithmetic stage. As in England, this represents significant educational progress in both reading and mathematics, of a magnitude rarely seen in other aspects of education. Closer inspection of the language groups showed that, in general, the children attending isiXhosa-medium schools began the year at a lower point on the ladder than their peers in English-medium and Afrikaans-medium schools. The rate of progress during the year was on a par across all three language groups, but because the children in the isiXhosa schools started at a lower level, they remained at a lower level at the end of the year (Tymms et al., 2017). Nevertheless, looking at the progress made in the schools of the Western Cape, as in England, the first year is clearly a most important time in children's educational careers.

9.3 Validity of Comparisons

It is appealing to compare the level of attainment and progress between England and South Africa; by doing this, I might feel able to draw some inferences about the quality of educational experiences in those two countries. But to do so is fundamentally flawed


because measuring learning across countries in this way is complex. Differences observed in performance or progress, although partly explained by curriculum, pedagogy and policy, are also shaped by the significant variation between the two populations in language, culture and socio-economic status. Countries also vary in the way childhood and education have developed historically. The validity of comparing these two countries based on their performance on a single assessment such as iPIPS would justifiably be challenged. In attempting to capture learning in an educational assessment, there is a tendency to reduce layers of complexity into a single score and then make claims about the underlying constructs measured by the assessment. For example, in an assessment of reading, we would use a group of questions that measure performance on specific reading tasks in an attempt to generalise to reading performance more broadly. If we are to make a plausible claim that our assessment can indeed be used as a proxy for reading ability, we need to be able to back up that claim with evidence (for a more detailed examination of validity, see Kane, 2013). Assuming that the assessment correlates well with performance on a range of other reading tasks, I would argue that our claim is a valid one. In the case of the studies described above, we might claim that one country made more progress, as defined by this assessment, than another. But an explicit or implicit claim that one country has higher-quality education provision than another could not be substantiated, because of deep contextual differences between the two countries, and would therefore give us very little credible or useful insight. Some of those contextual differences, and the ways they might confound comparisons, are explored next. Language of instruction presents a first area for consideration. In the Western Cape study, around 12 percent of pupils were taught (and assessed) in a second language, there being different language versions of iPIPS and schools offering their curricula through different languages of instruction in the early years of the primary phase. In England, the percentage of pupils being taught in a second language was 16 percent. Beneath the numbers, the patterns of language in the two countries are very different. In South Africa, multilingualism is common, particularly in urban areas: by the time pupils reach Grades 4 and 5, as many as 70 percent of children will be speaking a different language to that spoken in their home environment (Howie et al., 2012). By contrast, the UK has one of the lowest levels of bilingualism in Europe, with 62 percent of the population not having any knowledge of other languages (European Commission, 2012). In the Western Cape iPIPS study, the language of instruction varied across schools: the sample included approximately one third who were taught in Afrikaans, one third in English and one third in isiXhosa. Within those groups, relatively small numbers of pupils were not taught in their first language (5 percent in Afrikaans settings and 18 percent in English-language settings). In the sample from England, the language of instruction was English in all participating schools, but for 16 percent of pupils that was a second language. Understanding the impact of these language patterns on the findings is complex. One possible implication is the effect of higher levels of multilingualism in South African schools: some research has suggested that bilingualism confers a cognitive advantage on children (Blom et al., 2017) and encourages early use of additional languages.


Other work in the African context suggests that learners benefit from being taught in their home language (Kioko, 2015). Kioko also suggests that particular problems in early development may arise when the teacher is not a native speaker of the language of instruction. It is possible that the complexities of language acquisition and use supported higher levels of development at the start of school for children in the Western Cape, through the cognitive advantage gained from additional language use at home, but equally that they delayed strong progress in the first year because the development of very important early skills and concepts was impaired. The age at entry to school also differed between the two countries. In South Africa, pupils start compulsory schooling in the year in which they turn seven, although parents can choose for their children to start from the age of six. The average age on entry to schools in the Western Cape was 6.8 years with a standard deviation of 0.49, but there was large variation between the youngest pupil, who was 5.7 years, and the oldest, who was 9.7 years. Children entering school in England were younger, with far less variation in age: the youngest child was just under four and the oldest around 5.5 years. Compulsory school in England commences in the reception class year, with most children starting school in the September of the year they turn five. In the Western Cape, a new policy for reception classes was in the process of being implemented and some children had attended such classes. The assessment results show higher levels of development on entry to school in the Western Cape than in England, and the likelihood is that this is partly explained by the difference in age on entry: most children were at least one year older. Socio-economic status (SES) is an important variable in understanding educational development. It is linked to the resources available to the household for spending on education and health, as well as to the time available to parents to dedicate to their children and the likelihood of a stimulating and engaging home environment. However, comparing SES across countries is very challenging: what constitutes an indicator of SES in one country is unlikely to be a valid indicator in another. The common measure of SES in education in England, used by the Department for Education, is the Income Deprivation Affecting Children Index (IDACI). This measure is a particular subset of deprivation data in England that reports the proportion of children aged between 0 and 15 who are living in income-deprived families, and it is based on pupils' home postcodes (Noble et al., 2019). In the South African study, the authors used two measures of SES. Firstly, they considered a school-level measure linked to the affluence of neighbourhoods and used for the funding formula (Department of Education, 2006). They also implemented a parent questionnaire adapted from the Progress in International Reading Literacy Study (PIRLS) in South Africa (Martin et al., 2017); this questionnaire correlated highly with the official quintile SES measure and added important pupil-level variables. At the start of the year in South Africa, the SES measure was found to be the most important predictor of reading and mathematics, with effect sizes of ~0.2 and 0.3 respectively. At the end of the year, however, start assessment scores, language and

9  Progress Made During the First Year at School

155

age were more important with the SES correlations dropping to effect sizes of below 0.1. A similar relationship has been found in other studies in England with effect sizes of around 0.3 at the start of the year (Tymms et al., 2014) and much lower at the end of the year with factors such as prior achievement and school attended becoming more important (Tymms et al., 1997). This exploration of some contextual differences between England and South Africa has highlighted some of the challenges in comparing performance across countries. Even when rigorous measures, such as SES, are taken into consideration, those measures vary in their underlying indicators and, as such, do not support broader claims about education systems. What the two studies do highlight is that in both countries, pupils make good progress in their first year of schooling. As such, I explore the importance of that first year in giving firm foundations for later educational outcomes.
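The contrast between start-of-year and end-of-year effect sizes can be made concrete with a few lines of analysis code. The sketch below is illustrative only: the file and column names (ipips_pupils.csv, ses, reading_start, reading_end) are hypothetical, and a simple correlation stands in for the standardised effect size; the studies cited above used more elaborate models.

```python
import pandas as pd

def ses_effect_size(df: pd.DataFrame, score_col: str, ses_col: str = "ses") -> float:
    """Correlation between SES and a score, used here as a standardised
    effect size comparable in scale to the figures quoted above."""
    sub = df[[ses_col, score_col]].dropna()
    return sub[ses_col].corr(sub[score_col])

pupils = pd.read_csv("ipips_pupils.csv")  # hypothetical file and columns

# Larger association at the start of the year...
print(ses_effect_size(pupils, "reading_start"))
# ...and a smaller one at the end, once the year's progress is in the scores.
print(ses_effect_size(pupils, "reading_end"))
```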

9.4 The Importance of the First Year at School

The previous chapter (Chap. 8) highlighted some of the evidence supporting the importance of formal pre-school experiences for positive social and academic outcomes. In a further study carried out in England, the authors investigated the influence of pre-school provision on children's academic and social-behavioural outcomes. The Effective Pre-school, Primary and Secondary Education (EPPSE) longitudinal study (Sammons et al., 2014) found that pre-school attendance, full or part time, had a positive and long-term impact on attainment, progress and social-behavioural development. Children starting school at the age of five who had attended pre-school before the age of three, and those attending high-quality pre-school, saw greater gains. The impact of those gains remained throughout primary school and into secondary school, with children who had attended pre-school achieving better end-of-compulsory-schooling (GCSE) results. The impact was greatest for those pupils starting pre-school before the age of three or who had attended a high-quality pre-school. Pupils who attended pre-school were also more likely to go on to higher levels of academic study.

The evidence linking effective pre-school provision to later outcomes is well documented. Evidence of the impact of the first year in school is scarce in comparison, but as early as 1978 researchers identified the special importance of this period of schooling. A longitudinal study conducted in the USA tracked pupils from a single first-grade teacher through to the end of secondary school (Pedersen et al., 1978). This teacher achieved higher results for early literacy and numeracy with her class than her fellow teachers with parallel and comparable intakes. Researchers used pupils' annual assessment scores to track their academic progress and found that the boost they received from their first-grade experiences stayed with them throughout the remainder of their elementary and secondary schooling.

A later study carried out in England supported the idea that effective teaching in the first year confers an advantage for pupils later on (Tizard et al., 1988). The Infant School study tracked a cohort of pupils in London from the end of pre-school to the end of infant school and found that the pupils who made the greatest rates of progress during the reception year remained the highest achieving during their infant years.

More recently, large-scale studies have followed children through the first few grades of school (see Lonigan et al., 2008 for a synthesis). In an example from England, researchers studied children's progress from the age of four, at the start of schooling, to age 11, at the end of primary school (Tymms et al., 2009). The PIPS Baseline Assessment was used to measure early reading and mathematics ability in the first few weeks of the reception class, and the assessment was repeated at the end of the first year to measure progress. Effective classes were identified as those that made significantly greater progress than other classes – at least two standard deviations above the mean. Data from national assessments at the ages of 7 and 11 were gathered, and pupils' progress and attainment were tracked over time and compared with their peers'. The study found that pupils who were in effective classes maintained the boost in attainment through the primary years, with a small but statistically significant effect still found seven years later in their last year of primary school. The researchers were able to explore the benefits of membership of effective classrooms for other age groups but found nothing as significant as the effect for the first year in school. They found an additive effect when pupils were in more than one effective classroom during their time in primary school, but this was rare. Five years later, the same cohort of pupils was tracked through to their GCSE qualifications at the age of 16. A small but statistically significant effect was still detected, amounting to 0.23 and 0.18 standard deviations for English and mathematics respectively. This study provides evidence of the long-lasting impact of an effective first year in school. However, it falls short of identifying what constitutes an 'effective class', and further work is needed to identify which policies, practices or pedagogy contributed to this effect.
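The 'effective class' criterion described above – classes whose average progress lies at least two standard deviations above the mean – can be sketched in code. This is a minimal illustration, not the study's actual method: the file and column names are hypothetical, and a single pupil-level regression stands in for the multilevel models typically used with PIPS data.

```python
import pandas as pd
import statsmodels.formula.api as smf

pupils = pd.read_csv("pips_reception.csv")  # hypothetical: one row per pupil

# Value added: the residual of the end-of-year score after regressing on the
# baseline score taken in the first few weeks of the reception year.
model = smf.ols("score_end ~ score_start", data=pupils).fit()
pupils["value_added"] = model.resid

# Class effectiveness: mean value added per class, flagged when it lies at
# least two standard deviations above the mean across classes.
class_va = pupils.groupby("class_id")["value_added"].mean()
threshold = class_va.mean() + 2 * class_va.std()
effective_classes = class_va[class_va >= threshold]
print(effective_classes)
```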

9.5 Summary

The first year of schooling represents an important period of development for pupils. Data from England and South Africa show that progress is made in reading and mathematics in both countries, but they stop short of comparing those levels of progress. Rather, I have challenged the making of claims about the quality of provision from such comparisons, recognising that attainment and progress are rooted in fundamental differences in the history, politics and economics of those countries. The importance of the first year of schooling has been explored, and I have identified that laying strong foundations during this year can have an impact on attainment twelve years later, at the end of compulsory education. As such, it is a responsibility of schools and policy makers to ensure high-quality provision for this first year.


References

Blom, E., Boerma, T., Bosma, E., Cornips, L., & Everaert, E. (2017). Cognitive advantages of bilingual children in different sociolinguistic contexts. Frontiers in Psychology, 8, 552. https://doi.org/10.3389/fpsyg.2017.00552
Department of Education. (2006). Amended national norms and standards for school funding. Pretoria, South Africa. Available at: https://www.gov.za/documents/south-african-schools-act-national-norms-and-standards-school-funding-amendment
European Commission. (2012). Europeans and their languages. Brussels, Belgium. Available at: https://op.europa.eu/en/publication-detail/-/publication/f551bd64-8615-4781-9be1-c592217dad83
Hattie, J. (1999). 'Influences on student learning' [Lecture]. University of Auckland, 2 August 1999.
Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172–177.
Howie, S., Van Staden, S., Tshele, M., Dowse, C., & Zimmerman, L. (2012). PIRLS 2011: South African children's reading literacy achievement report. Available at: http://hdl.handle.net/2263/65996
Kane, M. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.
Kioko, A. (2015). Why schools should teach young learners in home language. British Council. Available at: https://www.britishcouncil.org/voices-magazine/why-schools-should-teach-young-learners-home-language
Lonigan, C. J., Schatschneider, C., & Westberg, L. (2008). Identification of children's skills and abilities linked to later outcomes in reading, writing and spelling. In Developing early literacy: Report of the National Early Literacy Panel. National Institute for Literacy.
Luyten, J. W., Merrell, C., & Tymms, P. (2020). Absolute effects of schooling as a reference for the interpretation of educational intervention effects. Studies in Educational Evaluation, 67, 100939. https://doi.org/10.1016/j.stueduc.2020.100939
Martin, M. O., Mullis, I. V., & Hooper, M. (Eds.). (2017). Methods and procedures in PIRLS 2016. Boston College, TIMSS & PIRLS International Study Center. Available at: https://timssandpirls.bc.edu/publications/pirls/2016-methods.html
Noble, S., McLennan, D., Noble, M., Plunkett, E., Gutacker, N., Silk, M., & Wright, G. (2019). The English indices of deprivation 2019. Ministry of Housing, Communities & Local Government. Available at: https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019
Pedersen, E., Faucher, T. A., & Eaton, W. (1978). A new perspective on the effects of first-grade teachers on children's subsequent adult status. Harvard Educational Review, 48(1), 1–31. https://doi.org/10.17763/haer.48.1.t6612555444420vg
Sammons, P., Sylva, K., Melhuish, E., Siraj, I., Taggart, B., Toth, K., & Smees, R. (2014). Influences on pupils' GCSE attainment and progress at age 16: Effective Pre-school, Primary and Secondary Education Project (EPPSE) research report. Department for Education. Available at: https://www.researchgate.net/publication/266373219_Influences_on_pupils'_GCSE_attainment_and_progress_at_age_16_Effective_Pre-School_Primary_Secondary_Education_Project_EPPSE
Tizard, B., Blatchford, P., Burke, J., Farquhar, C., & Plewis, I. (1988). Young children at school in the inner city. Lawrence Erlbaum.
Tymms, P., Merrell, C., & Henderson, B. (1997). The first year at school: A quantitative investigation of the attainment and progress of pupils. Educational Research and Evaluation, 3(2), 101–118.
Tymms, P., Jones, P., Albone, S., & Henderson, B. (2009). The first seven years at school. Educational Assessment, Evaluation and Accountability, 21(1), 67–80.
Tymms, P. B., Merrell, C., Hawker, D., & Nicholson, F. (2014). Performance indicators in primary schools: A comparison of performance on entry to school and the progress made in the first year in England and four other jurisdictions. Department for Education. Available at: https://www.gov.uk/government/publications/performance-indicators-in-primary-schools
Tymms, P., Howie, S., Merrell, C., Combrinck, C., & Copping, L. (2017). The first year at school in the Western Cape: Growth, development and progress. Project report. Nuffield Foundation. https://doi.org/10.13140/RG.2.2.21670.27209

Part IV

The First Year at School: Education Inequality, Poverty and Children's Cognitive Development

Tiago Bartholo and Mariane Koslinski

Introduction

Part IV of the book discusses the endeavours undertaken within the scope of the iPIPS project to produce robust evidence about educational inequality at the start of school and the challenges of understanding the effect of poverty and family socio-economic status (SES) on children's cognitive development. Evidence from the iPIPS project suggests that children make more progress in numeracy and literacy in their first year at school than in any other year of their school career (Hawker, 2015; Tymms et al., 1997). Moreover, it is widely recognised that children's early development and progress during the first years of school are crucial for their later success. Research on the early years also suggests that good quality provision can benefit children from lower socio-economic backgrounds (Peisner-Feinberg et al., 2001; Sammons et al., 2006; Sylva et al., 2006; Tymms et al., 2009).

Evidence to guide educational policy for the early years begins with understanding what children know and can do when they start compulsory education in a particular education system. The evidence generated from the iPIPS study can provide a learning path to inform curriculum development and a framework to assess the impact of interventions. It also enables the investigation of patterns of educational inequality already existing at the start of school, to identify policies, programmes and school characteristics capable of bridging the gap between pupils of different socio-economic backgrounds at the beginning of school. Finally, the baseline measure can provide critical information to guide researchers and policymakers in shaping policies to address the 'poverty effect' and help disadvantaged children have a chance to learn and develop.

T. Bartholo (*) · M. Koslinski
Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
e-mail: [email protected]; [email protected]


There is increasing pressure in society to improve the quality and equity of education. Very often, the concerns arise from an analysis of pupil outcomes in standardised tests produced locally or by international bodies, such as PISA (Programme for International Student Assessment), TIMSS (Trends in International Mathematics and Science Study), and PIRLS (Progress in International Reading Literacy Study). Compared to other international comparative studies, the iPIPS project has two unique characteristics that enable researchers to evaluate the school effect and education inequalities in a crucial phase of education. Firstly, at the start of compulsory education, the baseline measure provides a starting point against which later progress can be measured. The second measurement, usually collected at the end of the first year in school, provides longitudinal data about children's progress in their first year (see Part II of the book for more information). Unfortunately, despite the growing influence of international surveys such as PISA, there is no widely used international baseline study of children starting compulsory education (Hawker, 2015).

One critical study which presents similar goals to iPIPS is the OECD Early Learning and Child Well-being study, which collected representative data for 5-year-old children in England, the United States of America (USA), and Estonia. However, there are two main differences between the OECD endeavour and iPIPS. The first is related to study design: the OECD study has a cross-sectional design, so even though the data are cheaper and faster to collect, the quality of the information, especially in relation to school effects or the impact of educational programmes, is limited (OECD, 2020). The second is that the OECD work collects data at a specified age, whereas iPIPS collects data at the start of school, whatever age that is. Nevertheless, the OECD reports and analysis are important and highlight the need for better evidence for young children – see Part II.

The iPIPS project allows researchers to observe inequality at the start of compulsory education. Moreover, the importance of value-added models for estimating school and policy impact cannot be overstated. The iPIPS research design provides what is needed to help teachers understand their pupils' progress and governments to make better-informed decisions, including those to reduce inequality.

This section presents two chapters addressing the challenges of measuring poverty and family SES across different countries. Social inequality and early childhood education's role in reducing inequality at the starting point of compulsory education are key topics. The iPIPS project presents unique data that can help estimate the 'poverty effect' early and identify effective programmes implemented in different countries to reduce educational inequality. The first chapter, by Alves and de Paula, highlights the ways iPIPS research teams have measured family SES and the main similarities across them. The second chapter, by Lee Copping, discusses the poverty effect and how value-added data collected in the first year of school can help researchers understand the poverty effect on learning at a very early age.


References

Hawker, D. (2015). Baseline assessment in an international context. In Handbook of international development and education (pp. 305–325). Edward Elgar Publishing.
OECD. (2020). Early learning and child well-being: A study of five-year-olds in England, Estonia, and the United States. OECD Publishing. https://doi.org/10.1787/3990407f-en
Peisner-Feinberg, E. S., Burchinal, M. R., Clifford, R. M., Culkin, M. L., Howes, C., Kagan, S. L., & Yazejian, N. (2001). The relation of preschool child-care quality to children's cognitive and social developmental trajectories through second grade. Child Development, 72(5), 1534–1553. https://doi.org/10.1111/1467-8624.00364
Sammons, P., Sylva, K., Melhuish, E., Siraj-Blatchford, I., Taggart, B., & Hunt, S. (2006). Influences on children's attainment and progress in Key Stage 2: Cognitive outcomes in Year 6. Effective Pre-school and Primary Education 3–11 Project (EPPE 3–11), Research Report No. DCSF-RR048. Department for Children, Schools and Families.
Sylva, K., Siraj-Blatchford, I., Taggart, B., Sammons, P., Melhuish, E., Elliot, K., & Totsika, V. (2006). Capturing quality in early childhood through environmental rating scales. Early Childhood Research Quarterly, 21, 76–92.
Tymms, P., Merrell, C., & Henderson, B. (1997). The first year at school: A quantitative investigation of the attainment and progress of pupils. Educational Research and Evaluation, 3(2), 101–118.
Tymms, P., Jones, P., Albone, S., & Henderson, B. (2009). The first seven years at school. Educational Assessment, Evaluation and Accountability, 21(1), 67–80.

Chapter 10

Measures of Family Background in the iPIPS Project – Possibilities and Limits of Comparative Studies Across Countries

Maria Teresa Gonzaga Alves and Túlio Silva de Paula

The chapter focuses on publications that used family background measures in iPIPS from the United Kingdom (England and Scotland), Australia (Western Australia), Russia (Tatar Republic), South Africa (Western Cape) and Brazil (Rio de Janeiro-RJ and Sobral-CE). It presents the measures of family background by country and by the technical complexity of the measures, organising the results of the review in a table by country according to the following topics: (i) the variables and indicators used to measure family background or SES; (ii) their descriptions; (iii) type; (iv) scale or categories; (v) sources; (vi) a methodological note about the estimation methods or use of the measures; and (vii) references.

10.1 Introduction

This chapter presents a systematic review of family background measures in the International Performance Indicators in Primary Schools (iPIPS) project. The review focuses on publications that used family background measures in iPIPS from the United Kingdom (England and Scotland), Australia (Western Australia), Russia (Tatar Republic), South Africa (Western Cape) and Brazil (Rio de Janeiro-RJ and Sobral-CE).

These countries have vastly different socio-economic contexts. In Australia and the United Kingdom (as there are no separate data for England and Scotland), people aged 25 and older have an average of 13 years of schooling, in Russia 12 years, in South Africa 10.2 years, and in Brazil 7.8 years of schooling (United Nations Development Programme, 2020, data from 2018).

M. T. G. Alves (*) · T. S. de Paula
Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
e-mail: [email protected]

Only South Africa and Brazil report the percentage of illiterate people over 15 years of age: 5.6 percent and 8 percent respectively (United Nations Development Programme, 2020, with data from 2015). Both are also among the most unequal countries in the world, with Gini indices of 63 percent and 52 percent, respectively, on a scale of 0 to 100 – that is, from non-existent to absolute inequality of wealth distribution. Australia and the United Kingdom are much less unequal – each with an index of 34 percent – and Russia is in an intermediate position, with 40 percent (World Bank, 2020, data from 2014). There are many other examples of social, economic, cultural and political differences across these countries, although some countries have more similarities with each other due to historical ties. Such specificities justify the development of their own background measures, albeit creating difficulties for making comparisons.

In school effectiveness research using longitudinal data, it is assumed that backgrounds are implicit in the prior achievement scores. However, when inequalities between and within countries are severe, this assumption may not hold up (Buchmann, 2002; Lee & Burkam, 2002; Rutkowski & Rutkowski, 2013). That said, comparative research on learning and family backgrounds makes more sense when considering the challenge of measuring background variables in such diverse countries.

In educational assessment research, the concept of family background refers to the influence of parental attributes or investments on children's education. The attributes refer to any parental resource or characteristic, including economic and material resources, human, cultural, or social capital, socio-economic status, family structure, ethnic origin, and others (Bourdieu, 1997; Coleman et al., 1966; Coleman, 1988; Fahle et al., 2020; Ogbu, 1997). Family investments are actions taken by parents aiming to influence, directly or indirectly, their children's school performance. For example, family investments include financial expenditure on children's education and well-being, parents' time and effort in supervising school activities, and other types of support that could create a home learning environment and increase children's educational opportunities (Brown, 1990; Lareau, 2003; Vasilyeva et al., 2018; Zhang & Bray, 2020). These dimensions are closely related to each other. For instance, investing resources in education depends on sufficient income, and the ability to supervise children's school life may rely on the level of parental education, the family structure, and the time available.

In empirical research, parents' education and occupation and family income are the usual background indicators. In many studies, researchers synthesise these variables in a construct called socio-economic status (SES), an indicator of the global influence of family background on educational results (Buchmann, 2002). The meanings of family background and SES often overlap, and there is no universal consensus on how to measure them (Avvisati, 2020; Buchmann, 2002; Rutkowski & Rutkowski, 2013; Harwell, 2019). In almost all measures, SES refers to the relative position of families in a social hierarchy in the context – neighbourhood, city, country – where formal schooling takes place. In this chapter, we use the terms background and SES interchangeably, unless we refer to a specific indicator in one of the iPIPS studies.
The review focused on family background in the iPIPS studies, comparing the measures of this construct (data, variables, and methods) to answer the following questions: Are there common family background indicators across countries? Are there specific ones within countries? Can we compare results from different analyses across countries in view of their socio-economic differences?

It is important to note that this work focuses on the implemented measures observed in papers and reports analysing iPIPS. We do not aim to discuss theoretical definitions of family background or measures of socio-economic status, nor to point to a desirable approach or theory. We hope that this analysis can offer a broad framework of the relevant issues concerning the possibilities and limits of comparing family background across iPIPS studies, and indicate directions for future research and educational policies.

10.2 The Measures of Family Background in iPIPS Studies

The primary sources of this review are articles and reports produced by iPIPS partners. We selected those which use SES or other related background variables – at the level of the child, the family, or the school. We also recorded the source of the data, explanations about how the SES was calculated, and any other information useful for understanding the meaning of the measures. We contacted some co-authors of the documents to clarify technical aspects of the variables and indicators that were not clear to us. These measures were not the focus of the publications and were often only briefly described. As the iPIPS projects started at various times in each country, the number of publications differed among them. We consider the areas of the iPIPS studies as the major unit of analysis, and the variety of background and SES measures within them as a secondary one. We have not described the results document by document, as they overlapped.

In the reviewed publications, background measures were used as control variables, as potential confounding factors, or acknowledged as key factors in predicting performance. The relevance of these measures to equal educational opportunities and education outcomes is implicit. Each iPIPS study customised its background measures according to its aim, the data source, the available data and the methodological treatment applied to the information. Some researchers present only partial explanations of the properties of these measures, describing the family background variables briefly where their focus demanded it. The descriptions in this chapter summarise the best available information about the conceptualisation and methodological procedures relevant to each specific indicator or index. When necessary, we consulted other sources. The indicators and indexes analysed here can involve two levels of analysis: the individual level, when the information corresponds to characteristics of pupils or their families, and the school level, which describes differences between schools' attendance or intake.

We present the measures of family background by country and by the technical complexity of the measures, organising the results of the review in a table by country according to the following topics: (i) the variables and indicators used to measure family background or SES; (ii) their descriptions; (iii) type; (iv) scale or categories; (v) sources; (vi) a methodological note about the estimation methods or use of the measures; and (vii) references (see Table 10.1 in the appendix to this chapter). Next, we describe and comment on these results, organising them according to the countries taking part in iPIPS, and then answer our research questions.

The United Kingdom (England and Scotland)

England and Scotland present complex indexes of family background which share a similar approach. Both are based on the notion of deprivation, which takes demographic and spatial inequalities of living conditions as a key concept. People live in poverty if they lack the financial resources to meet their needs, whereas people can be deprived if they lack resources of any kind, not just income (Ministry of Housing, Communities & Local Government, 2019). In these countries, we also find the use of other indicators such as English as a Second Language (ESL) and entitlement to free school meals.

The Index of Multiple Deprivation (IMD) is the official measure of relative deprivation in England. The IMD comprises seven different domains of deprivation related to income, employment, education (skills and training), health and disability, crime, barriers to housing and services, and living environment. It is part of a suite of outputs that form the Indices of Deprivation (IoD). It is used to guide the priorities of local policymakers in areas with great need of services and follows an established methodological framework, defining deprivation broadly to encompass a wide range of an individual's living conditions. A set of relative measures of deprivation for small geographical areas (named Lower-layer Super Output Areas – LSOAs) across England is used to produce the indicators. These are based on Census data, which are collected every 10 years.

The Income Deprivation Affecting Children Index (IDACI) is the indicator of deprivation used in England's iPIPS studies. It measures the proportion of children aged 0 to 15 living in income-deprived families. It is a supplementary index based on the Income Deprivation Domain (IDD), one of the seven IMD dimensions. The IDD measures the proportion of the population in an area experiencing deprivation related to low income, including both people out of work and people with low earnings (McLennan et al., 2019).

As with the IMD, the Scottish Index of Multiple Deprivation (SIMD) is based on work conducted by Oxford University in 1999. The SIMD is one of four indices that together cover the whole of the United Kingdom (UK), so they are quite similar; it combines 38 different indicators for the same seven domains of deprivation. The spatial references for the SIMD are called data zones, which are smaller geographical units than the LSOAs (Scottish Government, 2016). The indicators of each domain are combined using standardisation, transformed to an exponential distribution, then weighted and ranked to create the overall rank. Data sources are the Census or Small Area Population Estimates (SAPE), the Department for Work and Pensions (DWP), Her Majesty's Revenue and Customs (HMRC), National Records of Scotland (NRS), Police Scotland and the Scottish Qualifications Authority (SQA). Scotland's studies use the SIMD measure, which includes all seven deprivation dimensions, to represent family background.
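In practice, linking pupils to IDACI works through the home postcode: the postcode is mapped to its LSOA, and the LSOA carries a published IDACI score. A minimal sketch, with hypothetical file and column names, might look like this:

```python
import pandas as pd

pupils = pd.read_csv("pupils.csv")            # hypothetical; includes 'postcode'
lookup = pd.read_csv("postcode_lookup.csv")   # hypothetical postcode -> LSOA table
idaci = pd.read_csv("idaci_scores.csv")       # hypothetical LSOA -> IDACI table

# Two left joins: pupil -> LSOA via postcode, then LSOA -> IDACI score.
linked = (
    pupils
    .merge(lookup[["postcode", "lsoa_code"]], on="postcode", how="left")
    .merge(idaci[["lsoa_code", "idaci_score"]], on="lsoa_code", how="left")
)
print(linked[["pupil_id", "idaci_score"]].head())
```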


Entitlement to free school meals is administrative information that distinguishes pupils from more affluent and more deprived home backgrounds. The criteria differ between England and Scotland, but both include income-based criteria and/or entitlement to support policies. Merrell et al. (2016) used this indicator in the same multilevel model as the IDACI indicator, both to explain attainment at age 11.

The variable ethnic origin describes groups of pupils in analyses of equal education opportunities. In the England studies, White pupils are the reference category, compared with Black, Asian, Mixed, Chinese, any other, and not recorded. Studies conducted in England and Scotland also include the indicator English as a Second Language (ESL) at home, considering that a mismatch between home and school language affects educational results.

Australia (Western Australia)

Geographical location in Australia is associated with inequality of educational opportunities. The Australian Bureau of Statistics (ABS) has a standard classification of areas to enable the publication of spatially integrated, comparable statistics. The Australian iPIPS studies make use of distinctions between provincial, remote and very remote areas (Wolgemuth et al., 2011), remote and rural areas (Wolgemuth et al., 2014) and metropolitan, regional and remote areas (Styles et al., 2014).

Indigenous status is important to the educational systems of countries such as Australia. Indigenous students show a pervasive trend of being more developmentally at risk than non-Indigenous students. Language minority status, low socio-economic status, low phonological awareness and low letter-naming abilities are related to this trend (Styles et al., 2014). Australian iPIPS studies also include the indicator English as a Second Language (ESL) at home. Over the years, public and private strategies, in the form of additional health and educational resources, have been designed to assist schools with high proportions of Indigenous students to counteract this trend. Remote areas of the country, which are those with the most Indigenous students, have exacerbated problems and difficulties (Styles et al., 2014). Indigenous status thus serves both as an individual characteristic and, as the proportion of Indigenous students, as a school characteristic.

The Australian Curriculum, Assessment and Reporting Authority (ACARA) created the Index of Community Socio-Educational Advantage (ICSEA) for each school that participates in the National Assessment Program – Literacy and Numeracy (NAPLAN). Although none of the reviewed iPIPS studies referred to the ICSEA, this is an interesting measure of school composition available in the country. The ICSEA compares schools with similar conditions of student attendance, geographical location, and proportion of Indigenous students. Parents answer questions about their occupation, schooling, and non-school education when enrolling a child in school. If the measures taken directly from children's enrolment are statistically unreliable, indirect variables from Australian Bureau of Statistics (ABS) census data are used to replace them (Australian Curriculum, Assessment and Reporting Authority, 2020).


Russian Federation (Tatar Republic)

In the Russian Federation iPIPS publications, a parent questionnaire is used to characterise family backgrounds. Its items provide data on the classical sociological variables – that is, parents' education and occupation, and family income – as well as aspects related to family structure, ethnic origin, parents' investments in education, and other issues concerning family-school relationships.1 Although it is not a short questionnaire, its response rate was remarkably high (93 percent) and there was little missing data (Kuzmina & Ivanova, 2018; Vasilyeva et al., 2018).

In the reviewed publications, the parents' educational attainment and income levels are the crucial dimensions of family SES. Parental education was reported on a scale ranging from incomplete high school to graduate degree (Vasilyeva et al., 2018). It was also used as a dichotomous variable contrasting families in which no parent achieved higher education with those in which at least one parent has higher education (Kuzmina & Ivanova, 2018). This cut-off point is consistent with the average schooling of the Russian adult population, which reaches 12 years, as mentioned above. Likewise, family income was recorded in three income bands (in roubles). Low-level income corresponds to households in poverty, in which the family's monthly income was below 20 thousand roubles; the middle level corresponds to 20 to 50 thousand roubles; and high income is above this. In the analysis, categorical dummy variables are used to distinguish incomes below or above the average.

Key types of parental investment related to home activities, cultural resources, books at home, and outside-home activities were tested in a 'Family Investment Model' (Vasilyeva et al., 2018). Home activities address parental practices with the child – such as reading books, writing and playing – before formal schooling. Activities outside the home refer to the child's participation in enrichment activities, such as music lessons, dance classes, sports clubs or arts programmes, and in academic preparation activities before the first grade. The results reinforce the interconnection between the classical variables related to the SES index – mainly the parents' level of education – and actions and beliefs that can favour children's learning.

Ethnicity is addressed through items on nationality and language spoken at home. As Kuzmina and Ivanova (2018) point out, the region where iPIPS was conducted is multi-ethnic (mainly Tatar and Russian people) and may have specific characteristics related to culture and religion. The authors of the reviewed publications emphasise that this region has a similar SES to the average in Russia, including the unemployment rate.

1. We thank Alina Ivanova for providing us with the versions of the Russian iPIPS questionnaires of 2014 and 2016.
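The dichotomisations described above can be illustrated with a short recoding sketch. File and column names are hypothetical, and treating questionnaire codes 4-7 as 'higher education' is only one possible reading of the cut-off used by Kuzmina and Ivanova (2018):

```python
import pandas as pd

families = pd.read_csv("ru_questionnaire.csv")  # hypothetical file and columns

# Education: at least one parent with higher education. Codes follow the
# questionnaire scale (4. incomplete higher ... 7. doctoral) - an assumption.
HIGHER_ED_CODES = {4, 5, 6, 7}
families["parent_higher_ed"] = (
    families["mother_education"].isin(HIGHER_ED_CODES)
    | families["father_education"].isin(HIGHER_ED_CODES)
).astype(int)

# Income: bands 1-4 as in the questionnaire; split below/above 50 thousand
# roubles, i.e. bands 1-2 versus 3-4.
families["high_income"] = (families["income_band"] >= 3).astype(int)
```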


South Africa (Western Cape)

The association between residential segregation and educational opportunities is a prominent topic within the South African iPIPS study. All South African public ordinary schools are classified into an official system called 'quintiles' – largely for the allocation of financial resources – according to the affluence of their location and certain infrastructural factors (Department of Education, 2006). Quintile 1 represents the poorest neighbourhoods, often in the formerly designated black townships; Quintile 5 is the most affluent, often in white suburbs. Schools in Quintiles 1 to 3 have been declared no-fee schools. The quintiles are an indicator of school socio-economic status (SES). There were no Quintile 1 schools in the sample; these are usually in remote areas and extremely poor environments (Tymms et al., 2017).

Besides this, we record two similar indicators of home SES calculated using data from the parents/guardians' questionnaire. Both comprised items related to household attributes (such as possession of durable goods and access to services) and parents' investments through cultural resources that could assist a child's learning (such as children's books, the child's own room, and an internet connection). The first SES indicator – used in the report on the first year of the iPIPS – was estimated using the Rasch model from Item Response Theory (IRT) (Tymms et al., 2017). The second – a predictor variable in an investigation of the influence of dialects and code-switching on the achievement of isiXhosa learners – was developed through Principal Component Analysis (PCA), with the scores converted into an ordinal variable of low, middle and high SES (Mtsatse & Combrinck, 2018).2 The percentage of missing data for the SES variable was quite high (35 percent), according to the report (Tymms et al., 2017).

The language spoken at home plays an important role in describing family background, as it has more variation than the teaching language options at school. The iPIPS report found that the learners came from homes speaking a possible range of 10 languages, although 99% reported that they spoke one of the three assessment languages at home, that is, Afrikaans, English or isiXhosa (Tymms et al., 2017).

Overall, there was an association between all the background indicators applied in the Western Cape iPIPS studies. The SES indicators of families and schools (quintiles) are associated with the language of learning and teaching of schools, which seems to reflect children's ethnic origins. To paraphrase the report, this illustrates the persistent effect of South Africa's historical racially-based policies. The country is among the most income-unequal in the world, according to the Gini index, and the background indicators appear largely to measure this context.

2. Information about estimation methods was obtained from Lee Copping and Celeste Combrinck, whom we thank for their willingness to answer our questions.
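A minimal sketch of the PCA route attributed above to Mtsatse & Combrinck (2018) follows: the first principal component of binary household-asset items is taken as a continuous SES score and then cut into three bands. The item and file names are hypothetical, and equal-sized tertiles are only one way the low/middle/high split could have been made:

```python
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical item names standing in for the questionnaire's binary
# household-asset items (running water, electricity, books, internet, ...).
items = ["running_water", "electricity", "flush_toilet", "satellite_tv",
         "car", "childs_books", "own_room", "internet"]

responses = pd.read_csv("wc_questionnaire.csv").dropna(subset=items)

# First principal component of the asset items as a continuous SES score,
# then cut into three equal-sized ordinal bands.
pca = PCA(n_components=1)
responses["ses_score"] = pca.fit_transform(responses[items])[:, 0]
responses["ses_band"] = pd.qcut(responses["ses_score"], 3,
                                labels=["low", "middle", "high"])
```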


Brazil (Rio de Janeiro-RJ and Sobral-CE)

In the Brazilian iPIPS publications, family background is described by sociological variables related to the socio-economic status construct. They also include measures of family structure and of cultural resources in the household related to investments in education, reflecting a multidimensional conception of background.

Parental education is an ordinal variable with three levels: elementary education, secondary education and higher education (Bartholo et al., 2020c). This variable was also applied in statistical models as a dichotomous variable distinguishing families in which no parent/guardian completed high school from those in which at least one parent has a high school or higher education diploma (Koslinski & Bartholo, 2019). This cut-off point is compatible with the country's level of education, since only in the last decade has an educational reform expanded mandatory schooling to 12 years (Brasil, 2009). At the school level, the proportion of children with at least one parent/guardian with high school or higher education was applied as a measure of school composition.

There were two indirect indicators of family income in Brazil. The poverty indicator records whether the family is a beneficiary of a conditional cash transfer (CCT) programme for mitigating the country's extreme inequality (Bartholo et al., 2020b, c; Koslinski & Bartholo, 2019). It is a reliable measure for distinguishing the families at the bottom of the social hierarchy, as it focuses on extremely poor households. The average of CCT beneficiaries per school was also used as a measure of school composition. The other indicator – possession of goods – stratifies households on a continuous scale according to the combination of their assets and some services at home (Bartholo et al., 2020b). This indicator, like others that encompass multiple items, was constructed using the Rasch model from IRT. While the poverty indicator was based on administrative and questionnaire data, this index was devised based only on questionnaire items, so it had a higher percentage of missing data.

An indicator of the learning environment measures parental investment in a child's education at home, synthesising questionnaire items about cultural resources and practices that can favour a child's development (Bartholo et al., 2020a, b, c). Like the possession-of-goods indicator, it also had a high percentage of missing data. A global measure of the socio-economic status (SES) of families was prepared for the samples of public schools in Rio de Janeiro-RJ and Sobral-CE (Bartholo et al., 2020a; Koslinski & Bartholo, 2020). It relied on the previous indicators (education, poverty, possession of goods), in addition to measures of home density and the existence of a room for the child. The combination of administrative data and questionnaire items reduced the missing data. The SES average at the school level was also applied as a measure of school composition.

Students' ethnicity/colour is an important background dimension, since racial belonging is a dimension of the social hierarchy in Brazil (Marteleto, 2012). In official statistics, racial categorisation is done according to skin colour on a continuum, on which black and white people are the two extremes. Despite the apparent fluidity of this classification system, it is well documented that white people have significant social and economic advantages over the others (Muniz & Bastos, 2017). Thus, this variable has been dichotomised in the analyses into the categories of white and non-white children (Bartholo et al., 2020b, c; Koslinski & Bartholo, 2019). In addition, the average proportion of non-white children per school was applied in statistical models as a variable of racial composition.
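The school-composition measures described above amount to aggregating pupil-level indicators to school means. A brief sketch, with hypothetical file and column names:

```python
import pandas as pd

pupils = pd.read_csv("brazil_pupils.csv")  # hypothetical: one row per pupil

# School composition: the mean of each binary or continuous pupil-level
# indicator within a school.
composition = pupils.groupby("school_id").agg(
    prop_cct=("cct_beneficiary", "mean"),           # share of CCT beneficiaries
    prop_parent_hs=("parent_high_school", "mean"),  # share with a high-school parent
    prop_non_white=("non_white", "mean"),           # racial composition
    mean_ses=("ses", "mean"),                       # average home SES
).reset_index()

# Attach the school-level measures back to each pupil for multilevel analysis.
pupils = pupils.merge(composition, on="school_id", how="left")
```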

10.3 Preliminary Answers: Possibilities and Limits of Comparative SES

In the reviewed publications, we found direct and indirect indicators of family background or home SES, using secondary and/or primary data, with some similarities between them. These indicators measure the dimensions of family attributes and of family investments in children's education. Although they were not built for comparisons across countries, we sought to establish some parallels between indicators by answering the questions we posed.

Are There Equivalent SES Indicators in iPIPS? All iPIPS studies used some kind of broad school SES indicator, and there are similarities between some of them. They capture the levels of school segregation in the territories or communities where the studies were carried out. In England, Scotland and South Africa, the indicators are based on the level of neighbourhood deprivation. In Australia, they relate to school composition (including the percentage of Indigenous students) and geographical location. The school SES indicators in Brazil, as well as in Russia, were constructed from the aggregation of family background measures (at the classroom level) to analyse the compositional effects of schools on learning.

At the student level, family background indicators were developed with data from contextual questionnaires, school administrative data and Census data. Parents' schooling was used in the reviewed studies from Russia and Brazil, which can be made compatible by applying the International Standard Classification of Education (ISCED-2011) (UNESCO, 2012). However, when this indicator was used as a dummy variable, the cut-off point on the scale of educational levels was lower in Brazil, as its population has lower educational attainment. Regarding the level of household income, in Russia the contextual questionnaire included an item with ranges of family income, one dimension of family SES. In Brazil, it is possible to infer whether a family is extremely poor from whether it is a beneficiary of an income transfer programme. In the England and Scotland studies, a student's eligibility for free school meals was an indicator of the child's household income level.

We further note that in England, Scotland and South Africa, the indicators of deprivation of the school surroundings are calculated using data on the schooling and income of the population. Eligibility for specific help relies on income-level requirements which differ across countries, and this hinders comparability. These indicators also rely more on the attributes than on the investments of families; the latter can be measured directly through questionnaires. South Africa and Brazil have developed their own household SES indexes, applied to partially overlapping sets of items.


The iPIPS in Russia stands out for the broadness of its measures of parental investment in the child's education. The Brazilian iPIPS has also collected similar items and developed an indicator of the home learning environment. In South Africa, the indicator of home SES includes items about resources that could assist a child's learning. As Russia is a less unequal country than the other two, there seems to be more room for these other determinants to stand out. In Brazil and South Africa, the most structural variables – poverty, education, the quintiles, home SES – overlap with the variables related to parental practices and investments. However, there are two limitations to the comparison of these measures. First, the age at which the child enters regular education differs between these countries, and, considering the stage of child development, these parental investments can have different meanings in each situation. The other limitation is the missing data: the response rate of Russian parents was much higher than in Brazil and South Africa.

Are There Specific Background Indicators in iPIPS? We found specific background measures in the publications, among which we highlight ethnic origins. The multiple languages of teaching in South Africa, the Indigenous students in Australia, the gap between white and non-white students in Brazil, and the multi-ethnic population in Russia are important indicators of socio-economic and cultural differences within and between countries. England, Scotland, Australia and South Africa have the variable 'English as a second language', which can measure either the condition of a migrant family or, in the latter two countries, the original languages of their peoples. In this way, the child's ethnic origin can be analysed in interaction with English as the child's additional language.

How Can We Compare Them? Although the family background indicators have a similar underlying theoretical basis and there are common items across countries, comparability cannot be guaranteed. England, Scotland and Australia are more comparable with each other than with the other iPIPS partners, given their similarities in education, wealth, language and historical ties; comparisons among them can be seen in Tymms et al. (2015), for example. Although South Africa also has historical ties to the UK, and the Western Cape is probably less unequal than the country's average – as there were no Quintile 1 schools – the variation in affluence, culture and other factors is still vast, as discussed by Tymms and co-authors (2017). All these countries share the English language, although it is not the only one spoken. As mentioned above, language can be an indicator of an underprivileged family, depending on the child's ethnic background.

It is worth mentioning that the location of schools in the territory was a criterion in sample planning. It is therefore possible to draw some parallels between indicators based on the notion of the geography of educational opportunities, whereby the effects of school quality on children's levels of development and progress during their school trajectory are dampened or amplified by residence in an advantaged versus a disadvantaged neighbourhood.

Even with longitudinal data, more specific home-based measures may be more relevant to research on school effectiveness, as Tymms et al. (2015) point out.


Arguably, based on our results alone, this should be even more relevant in very unequal social contexts, such as South Africa and Brazil. The report for the Western Cape records that school SES, as measured by the official quintile classification, was weakly related to the cognitive measures, unlike the home SES indicator developed with data from questionnaires.

Creating a single indicator for socio-economic background is a complex and difficult task. It would be necessary to carry out studies to equalise measures across countries, use questionnaires with some common items related to SES (not necessarily all), and test the psychometric properties of the items – investigating whether there is differential item functioning – before constructing an index. Nevertheless, missing data is a threat to the validity of an index obtained through questionnaires. The response rate of families differs across countries, perhaps evidence of cultural differences in the family-school relationship. This point should be the subject of further investigation.
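One standard way to screen items for differential functioning before pooling them into a cross-country index is a logistic-regression check: if country membership predicts the item response after conditioning on the total score, the item may not work equivalently across countries. A sketch, assuming a hypothetical pooled dataset with a binary item, a total SES score and a country column:

```python
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("pooled_items.csv")  # hypothetical pooled cross-country data

# Uniform DIF check for one binary SES item: a significant country term,
# conditional on the total SES score, flags the item for review.
fit = smf.logit("item_internet ~ ses_total + C(country)", data=data).fit()
print(fit.summary())
```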

10.4 Final Remarks

We started this chapter with some caveats about the contextual differences among the iPIPS countries' partners. When the results of research on school effectiveness in different countries are compared, it is important for policymakers to consider backgrounds, to avoid the risk of confounding the relationship between these variables and educational outcomes with 'school effects'. But it is also important not to take differences in home background as excuses for poor education. This discussion is especially important in more unequal countries, where school systems reproduce disadvantages from one generation to the next, despite educational policies to expand educational opportunities for all.

There are comparative publications regarding the development of children with PIPS data (Tymms et al., 2015; Ivanova et al., 2018). However, except where England, Scotland and Australia are involved – which applied 'English as a second language' as a background variable – socio-economic measures have not been employed in the statistical models. There is an expectation that more comparative studies will be done to explore differences in early childhood development despite contextual differences across countries (Tymms et al., 2015; Tymms et al., 2017). With the measures available, the researchers involved can seek broader explanations of the effect of socio-economic contexts in comparative analyses. After all, the design of all the studies was planned to represent the socio-economic differences between schools and children in their respective geographical contexts. School samples are stratified at levels that consider location and other criteria appropriate for the country, region, or municipality. Besides this, the administrative data on schools give a general sense of their socio-economic context, and the questionnaires answered by parents/guardians generate rich data at the student level. It is up to researchers to analyse their data comparatively to advance this investigation.

Appendix

Table 10.1 Background variables in iPIPS studies

England and Scotland

Deprivation
Type: Scale. Categories/scale: normalised (z-score) / fraction (0 to 1).
Source: The National Pupil Database (NPD) from the Department of Education (England); Scottish educational system (educational analytical services division data) (Scotland).
Methodological note: Census data and other official measures of the population are the primary data source for these indicators. National educational systems use information about home postcode to link pupils' homes to the area's deprivation. In England, the Income Deprivation Affecting Children Index (IDACI) is a supplementary index derived from a specific domain, the Income Deprivation Domain (IDD). In Scotland, the Scottish Index of Multiple Deprivation (SIMD) is used, comprising all domains of deprivation.
References: Tymms et al. (2014); Scottish Government (2016); Tymms et al. (2018).

Entitlement to free school meals
Type: Dummy. Categories/scale: no; yes.
Source: Department of Education.
Methodological note: Administrative data.
References: Merrell et al. (2016); Cramman et al. (2020).

English as Additional Language (EAL)
Type: Dummy. Categories/scale: no; yes.
Source: Questionnaire.
Methodological note: A questionnaire for parents/carers captures basic background information on the children included in the sample.
References: Tymms et al. (2015); Cramman et al. (2020); Tymms et al. (2016).

Ethnicity
Type: Categorical. Categories/scale: White, Asian, Chinese, Black, mixed, other, unclassified.
Source: Questionnaire.
Methodological note: As above.
References: Copping et al. (2016); Tymms et al. (2016).

Australia

Geographical location
Type: Categorical. Categories/scale: metropolitan, regional or remote areas; provincial, remote and very remote areas; or rural and remote areas.
Source: Department of Education.
Methodological note: Based on the Australian Statistical Geography Standard (ASGS) published by the Australian Bureau of Statistics.
References: Styles et al. (2014); Wolgemuth et al. (2014); Wolgemuth et al. (2011).

English as Second Language (EAL)
Type: Dummy. Categories/scale: no; yes.
Source: Questionnaire.
Methodological note: A questionnaire for parents/carers captures basic background information on the children included in the sample, such as ages, special needs, and socio-economic status.
References: Styles et al. (2014).

Indigenous status
Type: Dummy. Categories/scale: non-indigenous; indigenous.
Source: Questionnaire.
Methodological note: As above.
References: Styles et al. (2014); Wolgemuth et al. (2014); Wolgemuth et al. (2011).

Index of Community Socio-Educational Advantage (ICSEA)
Type: Scale. Categories/scale: 500 to 1300.
Source: Australian Curriculum, Assessment and Reporting Authority (ACARA).
Methodological note: The development of ICSEA involved collecting student family background data and identifying, with a statistical model, the combination of variables that have the strongest association with student performance in the National Assessment Program – Literacy and Numeracy (NAPLAN) results. ICSEA values typically range from approximately 500 (representing schools with extremely disadvantaged student backgrounds) to about 1300 (representing schools with extremely advantaged student backgrounds). ACARA calculates an ICSEA value for all schools for which sufficient aggregate-level data is available.
References: Wolgemuth et al. (2014); ACARA (2020).

Russia

Parent education
Type: Ordinal. Categories/scale: 1. did not complete high school; 2. high school; 3. vocational certificate; 4. incomplete higher education; 5. college degree; 6. masters; 7. doctoral.
Source: Questionnaire.
Methodological note: Parents/guardians reported their educational level. Vasilyeva et al. (2018) coded the responses by the number of years of schooling corresponding to the reported level of education, from 9 years (minimal required education) to 20 years (doctoral degree); the mother's years of schooling was used as the measure of parental education.
References: Vasilyeva et al. (2018); Kuzmina and Ivanova (2018).

Household income
Type: Ordinal. Categories/scale: 1. up to 20,000; 2. 20,000–50,000; 3. 50,000–100,000; 4. more than 100,000 (in roubles).
Source: Questionnaire.
Methodological note: Parents/guardians reported the average monthly income of the whole family. Vasilyeva et al. (2018) reduced the original four categories to three: low income (0–20 thousand roubles), middle (20–50 thousand roubles) and high income (above 50 thousand roubles). Kuzmina and Ivanova (2018) reduced the categories to a dichotomous variable (below versus above 50 thousand roubles).
References: Vasilyeva et al. (2018); Kuzmina and Ivanova (2018).

Home activities not related to language/literacy
Type: 7-point Likert-type scale. Categories/scale: 1. never; 2. very rarely; 3. once a month; 4. once in two weeks; 5. once a week; 6. once a day; 7. more than once a day.
Source: Questionnaire.
Methodological note: A list of activities, preceded by the question 'In the year prior to first grade, how often did you, or another member of your family, engage in the following activities with your child?': eight activities not directly related to literacy (e.g., building with construction blocks, playing with puzzles). Scored by the sum.
References: Vasilyeva et al. (2018).

Diversity of resources at home
Type: Ordinal. Categories/scale: scores from 0 to 4, reflecting the total number of selected items.
Source: Questionnaire.
Methodological note: Parents were asked 'Which of the items listed below are present at your home?', followed by a list of four items: computer (iPad), electronic educational games, board games, and non-fiction children's books. Scores on this scale varied from 0 to 4, reflecting the total number of selected items.
References: Vasilyeva et al. (2018).

Number of children's books
Type: Ordinal. Categories/scale: 0. none; 1. 1–25; 2. 26–50; 3. 51–100; 4. more than 100.
Source: Questionnaire.
Methodological note: Parents were asked to estimate the number of children's books by checking one of the categories; scores varied from 0 (none) to 4 (more than 100). Kuzmina and Ivanova (2018) reduced the categories to a dichotomous variable (below versus above 100 books).
References: Vasilyeva et al. (2018); Kuzmina and Ivanova (2018).

Outside-home enrichment activities
Type: Dummy. Categories/scale: no; yes.
Source: Questionnaire.
Methodological note: The question asked whether the child, in the year prior to first grade, participated in enrichment activities, such as music lessons, dance classes, sports clubs, or arts programs.
References: Vasilyeva et al. (2018).

Outside-home academic preparation activities
Type: Dummy. Categories/scale: no; yes.
Source: Questionnaire.
Methodological note: The question asked parents whether the child participated in academically oriented programs designed to prepare children for school. For these items, parents responded on a dichotomous yes/no scale.
References: Vasilyeva et al. (2018).

Language spoken at home
Type: Dummy. Categories/scale: 1. only Russian; 0. other.
Source: Questionnaire.
Methodological note: The question asked parents which language is spoken at home most of the time.
References: Kuzmina and Ivanova (2018).

Proportion of parents with higher education
Type: Scale. Categories/scale: 0 to 1.
Source: School-level variable created.
Methodological note: Variable at the pupil level aggregated by its average at the classroom level.
References: Kuzmina and Ivanova (2018).

Proportion of students from high-income families
Type: Scale. Categories/scale: 0 to 1.
Source: School-level variable created.
Methodological note: Variable at the pupil level aggregated by its average at the classroom level.
References: Kuzmina and Ivanova (2018).

South Africa

School SES (quintiles)
Type: Ordinal. Categories/scale: NQ1 to NQ5.
Source: Provincial Education Department.
Methodological note: Quintiles are nationally defined based on poverty scores of schools calculated with three indicators from the surrounding community – income levels of households, unemployment rate and levels of education (literacy rate) – as well as certain infrastructural factors. There were no NQ1 schools in the Western Cape sample.
References: Tymms et al. (2017); Department of Education (2006).

Home socio-economic status
Type: Continuous. Categories/scale: normalised (z-score).
Source: Questionnaire.
Methodological note: Parents/guardians answered the question 'Do you have the following in your home?', followed by 15 binary items about household assets and basic needs, such as running water, electricity, and flush toilets, as well as luxury items such as satellite television, a car, and resources that could assist a child's learning, such as access to books, their own room, and an internet connection. Tymms et al. (2017) elaborated an SES indicator using a Rasch model comprising 13 items. Mtsatse & Combrinck (2018) created an SES scale through Principal Component Analysis (PCA), divided into three ordinal categories: low-SES, middle-SES and high-SES.
References: Tymms et al. (2017); Mtsatse & Combrinck (2018).

Language spoken at home
Type: Nominal. Categories/scale: 10 languages were cited.
Source: Questionnaire.
Methodological note: Parents/guardians answered the question 'What language do you speak at home most of the time?'. The learners came from homes speaking 10 languages; 99% reported speaking one of the assessment languages at home (Afrikaans, English or isiXhosa).
References: Tymms et al. (2017); Mtsatse & Combrinck (2018).

Brazil
(continued)

Type Ordinal

Proportion of family benefited by CCT programs Possession of goods

Normalized (z-score) Questionnaire

Continuous

School-level variable created

Questionnaire and secondary data

No; yes

0 to 1

School-level variable created

0 to 1

Categories/ scale Source Questionnaire and Elementary education; secondary secondary data education; higher education

Continuous

Continuous Proportion of parent/guardian with high school or highest Poverty/ family Dummy benefited by CCT programs

Background variable Parental education

Koslinski & Bartholo (2019)

Koslinski & Bartholo (2019); Bartholo et al. (2020b, c)

Koslinski & Bartholo (2019)

References Koslinski & Bartholo (2019); Bartholo et al. (2020b, c)

(continued)

Indicator elaborated using the Item Response Theory Bartholo et al. (2020a) (IRT) with the following items: Possession of car, washing machine, computer, tablet, printer; access to Internet and cable TV service.

Methodological note Parents/guardians answered the maximum educational level of the child’s main caregiver. Missing data were partially supplemented by administrative data from academic administration system of the Rio de Janeiro municipal education secretariat. It was also used as dummy variable, dichotomizing parents with schooling below and above high school. Variable at pupil level aggregated by the average at the school level. Proportion of children with at least one parent/guardian with high school and/or complete higher education Parents/guardians answered if the family received money from any conditional cash transfer (CCT) program in the past 3 years. Missing data were partially complemented by administrative data from academic administration system of the Rio de Janeiro municipal education secretariat. Variable at the pupil level aggregated by its average at the school level

Country Brazil

Scale

1. Non-white; 0. White

0 to 1

Continuous

Continuous

Dummy

Continuous

Pupils’ SES aggregated by average

Home learning environment

Colour/race

Proportion of non-white children

Academic administration system of the Rio de Janeiro municipal education secretariat. School-level variable created

Normalized (z-score) Questionnaire

School-level variable created

Categories/ scale Source Normalized (z-score) Questionnaire and secondary data

Type Continuous

Background variable Pupil’s socio-­economic status (SES)

Table 10.1 (continued)

Variable at the pupil level aggregated by its average at the school level.

Indicator elaborated using the IRT with the following items: Parents’/guardians’ activities with children (reading, drawing, singing, playing/ developing activities with numbers, colours, and the alphabet); number of rooms in the home, a bedroom for the child, possession of books and children’s games. At enrolment, parents must inform the child’s colour/race from the categories used in official statistics: White, pardo (mixed-race), black, yellow (Asian descendants), and indigenous. In data analysis, the variable was reduced to the white and non-white categories.

Methodological note Indicator elaborated using the IRT with the following items: Parents’ education; participation in an income transfer program; home density; room only for the child; internet access; car; cable TV; washing machine; computer; tablet; and printer. The missing data on parents’ education and income transfer program were partially supplemented by administrative data from academic administration system of the Rio de Janeiro municipal education secretariat. Variable at the pupil level aggregated by its average at the school level.

Koslinski & Bartholo (2019)

Koslinski & Bartholo (2019); Bartholo et al. (2020b, c)

Koslinski & Bartholo (2020); Bartholo et al. (2020a) Bartholo et al. (2020b)

References Koslinski & Bartholo (2019); Bartholo et al. (2020a)
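Several of the composite indicators in Table 10.1 are built by reducing a battery of binary questionnaire items to a single score, via a Rasch model, IRT or principal component analysis (PCA). The sketch below illustrates the PCA route described for the South African home SES measure; the item names, data and cut points are hypothetical, and the published analyses (Tymms et al., 2017; Mtsatse & Combrinck, 2018) used their own item sets and scaling choices.

```python
# Sketch: deriving a home SES index from binary asset items via PCA, in the
# spirit of the approach described for South Africa in Table 10.1 (Mtsatse &
# Combrinck, 2018). Items, data and cut points here are hypothetical.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
items = ["running_water", "electricity", "flush_toilet", "satellite_tv",
         "car", "books", "own_room", "internet"]
# Hypothetical 0/1 answers to "Do you have the following in your home?"
assets = pd.DataFrame(rng.integers(0, 2, size=(500, len(items))), columns=items)

pca = PCA(n_components=1)
ses = pca.fit_transform(assets)[:, 0]        # first principal component
ses_z = (ses - ses.mean()) / ses.std()       # express as a z-score

# Divide into three ordinal categories, as in the published scale
ses_cat = pd.qcut(ses_z, q=3, labels=["low-SES", "middle-SES", "high-SES"])
print(pd.Series(ses_cat).value_counts())
```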


References

Australian Curriculum, Assessment and Reporting Authority – ACARA. (2020). Guide to understanding the Index of Community Socio-educational Advantage (ICSEA). Retrieved from: https://www.myschool.edu.au/media/1820/guide-to-understanding-icsea-values.pdf
Avvisati, F. (2020). The measure of socio-economic status in PISA: A review and some suggested improvements. Large-Scale Assessments in Education, 8(8). https://doi.org/10.1186/s40536-020-00086-x
Bartholo, T. L., Koslinski, M. C., Andrade, F. M., & de Castro, D. L. (2020a). School segregation and education inequalities at the start of schooling in Brazil. REICE Revista Iberoamericana Sobre Calidad, Eficacia y Cambio en Educacion, 18(4), 77–96. https://doi.org/10.15366/REICE2020.18.4.003
Bartholo, T. L., Koslinski, M. C., Costa, M., & Barcellos, T. (2020b). What do children know upon entry to pre-school in Rio de Janeiro? Ensaio: aval. pol. públ. Educ., 28(107), 292–313. https://doi.org/10.1590/S0104-40362019002702071
Bartholo, T. L., Koslinski, M. C., Costa, M., Tymms, P., Merrell, C., & Barcellos, T. M. (2020c). The use of cognitive instruments for research in early childhood education: Constraints and possibilities in the Brazilian context. Pro-Posições, 31, e20180036. https://doi.org/10.1590/1980-6248-2018-0036
Bourdieu, P. (1997). The forms of capital. In A. H. Halsey, H. Lauder, & P. Brown (Eds.), Education: Culture, economy, society (pp. 46–58). Oxford University Press.
Brasil. (2009). Emenda Constitucional No 59, de 11 de novembro de 2009 [Constitutional Amendment No. 59, November 11, 2009]. Retrieved 28 October 2020 from: https://legislacao.presidencia.gov.br/atos/?tipo=EMC&numero=59&ano=2009&ato=57ccXSE1UeVpWTd7d
Brown, P. (1990). The 'Third Wave': Education and the ideology of parentocracy. British Journal of Sociology of Education, 11(1), 65–85. http://www.jstor.org/stable/1392913
Buchmann, C. (2002). Measuring family background in international studies of education: Conceptual issues and methodological challenges. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 150–197).
Coleman, J. S. (1988). Social capital in the creation of human capital. American Journal of Sociology, 94, 95–120. https://doi.org/10.1086/228943
Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D., & York, R. L. (1966). Equality of educational opportunity. US Department of Health, Education & Welfare, Office of Education (OE-38001 and supp.).
Copping, L. T., Cramman, H., Gott, S., Gray, H., & Tymms, P. (2016). Name writing ability not length of name is predictive of future academic attainment. Educational Research, 58(3), 237–246. https://doi.org/10.1080/00131881.2016.1184948
Cramman, H., Gott, S., Little, J., Merrell, C., Tymms, P., & Copping, L. T. (2020). Number identification: A unique developmental pathway in mathematics? Research Papers in Education, 35(2), 117–143. https://doi.org/10.1080/02671522.2018.1536890
Department of Education. (2006). South African Schools Act, 1996 (Act No 84 of 1996). In Amended National Norms and Standards for School Funding. Retrieved 11 November 2020 from: https://www.gov.za/sites/default/files/gcis_document/201409/29179.pdf
Fahle, E. M., Reardon, S. F., Kalogrides, D., Weathers, E. S., & Jang, H. (2020). Racial segregation and school poverty in the United States, 1999–2016. Race and Social Problems, 12(1), 42–56. https://doi.org/10.1007/s12552-019-09277-w
Harwell, M. (2019). Don't expect too much: The limited usefulness of common SES measures. Journal of Experimental Education, 87(3), 353–366. https://doi.org/10.1080/00220973.2018.1465382
Ivanova, A., Kardanova, E., Merrell, C., Tymms, P., & Hawker, D. (2018). Checking the possibility of equating a mathematics assessment between Russia, Scotland and England for children starting school. Assessment in Education: Principles, Policy and Practice, 25(2), 141–159. https://doi.org/10.1080/0969594X.2016.1231110


Koslinski, M. C., & Bartholo, T. (2019). Impact of child development centers in the first year of preschool. Estudos em Avaliação Educacional, 30(73), 280–311. https://doi.org/10.18222/eae.v30i73.5850
Koslinski, M. C., & Bartholo, T. (2020). Inequalities in educational opportunities at the beginning of the educational trajectory in Brazil. Lua Nova, 110, 215–245. https://doi.org/10.1590/0102-215245/110
Kuzmina, Y., & Ivanova, A. (2018). The effects of academic class composition on academic progress in elementary school for students with different levels of initial academic abilities. Learning and Individual Differences, 64, 43–53. https://doi.org/10.1016/j.lindif.2018.04.004
Lareau, A. (2003). Unequal childhoods: Class, race, and family life. University of California Press.
Lee, V. E., & Burkam, D. T. (2002). Inequality at the starting gate: Social background differences in achievement as children begin school. Economic Policy Institute.
Marteleto, L. J. (2012). Educational inequality by race in Brazil, 1982–2007: Structural changes and shifts in racial classification. Demography, 49, 337–358. https://doi.org/10.1007/s13524-011-0084-6
McLennan, D., Noble, S., Noble, M., Plunkett, E., Wright, G., & Gutaker, N. (2019). The English indices of deprivation 2019: Technical report. Ministry of Housing, Communities & Local Government. Retrieved from: https://www.gov.uk/government/publications/english-indices-of-deprivation-2019-technical-report
Merrell, C., Sayal, K., Tymms, P., & Kasim, A. (2016). A longitudinal study of the association between inattention, hyperactivity and impulsivity and children's academic attainment at age 11. Learning and Individual Differences, 53, 156–161. https://doi.org/10.1016/j.lindif.2016.04.003
Ministry of Housing, Communities & Local Government. (2019). The English Indices of Deprivation 2019 (IoD2019): Statistical release. Retrieved from: https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019
Mtsatse, N., & Combrinck, C. (2018). Dialects matter: The influence of dialects and code-switching on the literacy and numeracy achievements of isiXhosa grade 1 learners in the Western Cape. Journal of Education, 72. https://doi.org/10.17159/2520-9868/i72a02
Muniz, J. O., & Bastos, J. L. (2017). Classificatory volatility and (in)consistency of racial inequality. Cadernos de Saúde Pública, 33(supl. 1), e00082816. https://doi.org/10.1590/0102-311x00082816
Ogbu, J. U. (1997). Racial stratification and education in the United States: Why inequality persists. In A. H. Halsey, H. Lauder, & P. Brown (Eds.), Education: Culture, economy, society (pp. 765–778). Oxford University Press.
Rutkowski, L., & Rutkowski, D. (2013). Measuring socioeconomic background in PISA: One size might not fit all. Research in Comparative and International Education, 8(3), 259–278. https://doi.org/10.2304/rcie.2013.8.3.259
Scottish Government (SG). (2016). The Scottish Index of Multiple Deprivation: SIMD16 technical notes. Retrieved 25 July 2022 from: https://www.webarchive.org.uk/wayback/archive/3000/https://www.gov.scot/Resource/0050/00504822.pdf
Styles, I., Wildy, H., Pepper, V., Faulkner, J., & Berman, Y. (2014). Australian indigenous students' performance on the PIPS-BLA reading and mathematics scales: 2011–2013. International Research in Early Childhood Education, 5(1), 103–123.
Tymms, P., Merrell, C., Hawker, D., & Nicholson, F. (2014). Performance indicators in primary schools: A comparison of performance on entry to school and the progress made in the first year in England and four other jurisdictions: Research report. Department for Education.
Tymms, P., Merrell, C., & Wildy, H. (2015). The progress of pupils in their first school year across classes and educational systems. British Educational Research Journal, 41(3), 365–380. https://doi.org/10.1002/berj.3156
Tymms, P., Merrell, C., & Buckley, H. (2016). Children's development at the start of school in Scotland and the progress made during their first school year: An analysis of PIPS baseline and follow-up assessment data. Research report. The Scottish Government. ISBN: 9781785448942.


Tymms, P., Howie, S., Merrell, C., Combrinck, C., & Copping, L. (2017). The first year at school in the Western Cape: Growth, development and progress. Nuffield Foundation report. Retrieved from: http://www.cem.org/attachments/Tymms-41637-SouthAfricaFinalReport-Oct-2017.pdf
Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261. https://doi.org/10.1080/09243453.2017.1404478
UNESCO. (2012). International Standard Classification of Education: ISCED 2011. UNESCO Institute for Statistics. ISBN: 978-92-9189-123-8. Retrieved from: http://www.uis.unesco.org/Education/Pages/international-standard-classification-of-education.aspx
United Nations Development Programme. (2020, October 10). Human Development Data (1990–2018). Retrieved from: http://hdr.undp.org/en/data
Vasilyeva, M., Dearing, E., Ivanova, A., Shen, C., & Kardanova, E. (2018). Testing the family investment model in Russia: Estimating indirect effects of SES and parental beliefs on the literacy skills of first-graders. Early Childhood Research Quarterly, 42, 11–20. https://doi.org/10.1016/j.ecresq.2017.08.003
Wolgemuth, J. R., Savage, R., Helmer, J., Lea, T., Harper, H., Chalkit, K., Bottrell, C., & Abrami, P. (2011). Using computer-based instruction to improve indigenous early literacy in Northern Australia: A quasi-experimental study. Educational and Psychological Studies Faculty Publications, 173. https://scholarcommons.usf.edu/esf_facpub/173
Wolgemuth, J. R., Abrami, P. C., Helmer, J., Savage, R., Harper, H., & Lea, T. (2014). Examining the impact of ABRACADABRA on early literacy in northern Australia: An implementation fidelity analysis. The Journal of Educational Research, 107(4), 299–311. https://doi.org/10.1080/00220671.2013.823369
World Bank. (2020, October 18). Gini index (World Bank estimate). Retrieved from: https://data.worldbank.org/indicator/SI.POV.GINI
Zhang, W., & Bray, M. (2020). Comparative research on shadow education: Achievements, challenges, and the agenda ahead. European Journal of Education, 55(3), 322–341. https://doi.org/10.1111/ejed.12413

Chapter 11

The Association Between Adverse Socio-economic Circumstances and Cognitive Development Within an International Context

Lee T. Copping

This chapter discusses evidence from the iPIPS and PIPS projects that highlights the detrimental impact of factors associated with low socio-economic circumstances on the early cognitive processes of developing children.

11.1 Introduction

Socio-economic circumstances play a major role in education generally and sit at the forefront of public policy at both national and international levels, with impoverishment consistently highlighted as a key barrier to social mobility. In the UK, for instance, 20 percent of children in poverty do not leave school with a basic qualification, while almost 40 percent of children eligible for free school meals (FSM) do not leave school with at least one GCSE or equivalent (Longfield, 2019). The United Nations Sustainable Development Goals also emphasise the importance of educational empowerment in increasing social mobility and as a mechanism to help children escape poverty (United Nations, 2020). It comes as no surprise that the large-scale international assessments to which governments pay particular attention, such as PISA (OECD, 2018), PIRLS (IEA, 2016) and TIMSS (IEA, 2015), have attempted to measure indicators of socio-economic status within their testing frameworks since their inception, to help countries examine its impact and shape future policy.


Previous research indicates that poverty in the early and pre-school environment has a pervasive longitudinal impact on many cognitive abilities (Dickerson & Popli, 2016) and that the quality of early parent-child interaction declines with increasing levels of impoverishment (Del Bono et al., 2016; Katz et al., 2007). As socio-economic status can change, often dramatically, throughout the life course, it is vital to know about a child's circumstances in those pivotal early years. Generating detailed profiles of learners and identifying children at risk of future educational difficulties are just two of several reasons why valid baseline assessments and an understanding of early environmental contexts are useful to researchers, practitioners and policymakers alike (Blatchford & Cline, 1992; Tymms et al., 2004a). The remainder of this chapter reviews the findings from close to 30 years of international research relating the PIPS/iPIPS assessments to measures of socio-economic status. What follows is a narrative synthesis of the current body of work, highlighting key findings that may inform future research, practice and policy in the areas of early literacy and numeracy development.

11.2 Assessing Socio-economic Status in the Context of the iPIPS Project

From the inception of the PIPS and iPIPS projects, there was an interest in examining key baseline data in relation to readily accessible information on socio-economic status. To begin with, in UK-based schools, this often involved pupil indicators held by schools, such as FSM eligibility. Other designs incorporated additional measures based on local postcode data calculated by local authorities and government agencies, which aggregated various local indicators of deprivation such as access to healthcare, employment figures and crime (for example, the Income Deprivation Affecting Children Index, or IDACI; Ministry of Housing, Communities and Local Government, 2019). These types of measures are useful in profiling areas but are not necessarily linked directly to an individual child's immediate circumstances (it is possible for a relatively wealthy family to live in a deprived area), unlike FSM eligibility, which is assessed for the actual family unit. There are thus at least two distinct levels at which socio-economic status can influence a child's developmental environment: the resources and comfort of the home and family unit itself, and the wider localised environment with its general level of affluence, opportunity and access to services and amenities.

As the iPIPS project developed and additional countries became involved, the variety of measures of socio-economic status increased. These were diverse and reflected local contexts and reporting mechanisms, as well as the inclusion of measures deemed theoretically relevant in the light of progress in developmental and educational research. There was, however, some attempt to create common indices across the various project groups. For instance, measures ranging from access to books through to more advanced amenities (from having one's own desk to work at, to laptop computers) are now routinely included, given their importance as highlighted by other international assessments (IEA, 2015, 2016; OECD, 2018).

What Has the iPIPS Project Contributed to Our Understanding of Early Socio-economic Impact?

This chapter examines the literature by type of socio-economic measure, each of which carries its own implications and limitations when considering cognitive development internationally. The following factors are considered: local composite measures (such as IDACI); specific group indicators such as free school meal eligibility; parental education; and wider home background measures. Some of the studies reviewed therefore appear multiple times, as they included multiple indicators. The section concludes with a summary of the key findings.

Composite Measures of Socio-economic Status

Composite measures of the environment have their uses in establishing levels of deprivation. They are objective indicators, calculated statistically from other national data sets such as a census, and are linked to a specific geographical location, removing the need for parental self-report. They often cover a broad array of important localised contextual factors (such as local unemployment rates, education and skills data, and health and disability data), with different metrics focusing on different domains. To date, in the context of the PIPS/iPIPS project, studies using this kind of data have taken place only within the UK, specifically England and Scotland.

Alongside the earliest incarnations of the PIPS measures, Tymms et al. (1995) used the Townsend index (based on indicators derived from postcode and census data) to show that total scores on PIPS measures were negatively related to deprivation (r = −.28) in England: poorer children performed worse than their wealthier contemporaries. Notably, however, the negative relationship between average PIPS scores and Townsend scores was stronger still for schools and for clusters (schools grouped by meaningful characteristics such as rural/urban), at r = −.54 and −.59 respectively. This suggests that disadvantaged children tend to congregate in the same school environments. Tymms et al. (1997) found similar, albeit slightly weaker, correlations using the reverse of the Townsend measure (termed affluence) in a later UK study. However, when additional demographic and school-based indicators were included in multi-level models, the results differed slightly: progress in reading during the first year appeared to be affected by affluence, but not so for

mathematics. Prior attainment and the school attended had a much larger impact (a point we will return to later).

Later studies examining cohorts in England and Scotland show much the same picture. Using IDACI scores, Tymms et al. (2014) examined PIPS baseline data in the first year of school in England and found statistically significant correlations between this measure and all components of the test, with effect sizes in excess of r = .16. Here again, children from deprived postcode areas did significantly worse than those from more affluent areas. Much the same result was found in a large representative sample of Scottish children using the Scottish Index of Multiple Deprivation (SIMD, 2013), with correlations of much the same strength (Tymms et al., 2016), and these appeared stable over the three cohorts studied. Interestingly, in this analysis, children were divided into quintiles based on their SIMD score, and the difference in ability between the highest and lowest groups was quantified in months. On entry to the first year of formal education in Scotland, the most affluent group was 15.4 months ahead of the least affluent group in reading and 13.0 months ahead in mathematics. Over the first year, the least affluent children made much progress but still lagged behind their more affluent peers in reading by the equivalent of 2.2 months.
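'Months of development' figures like these rest on converting a raw score gap into the time an average child takes to gain that many points. Below is a minimal sketch of one such conversion, using the estimated score-per-month-of-age gradient; the numbers are illustrative, and this is not necessarily the exact procedure used in the Scottish analysis.

```python
# Sketch: converting a score gap between groups into "months of development"
# by dividing by the score gained per month of age. Illustrative numbers only;
# not necessarily the exact procedure of Tymms et al. (2016).
import numpy as np

rng = np.random.default_rng(0)
age_months = rng.uniform(54, 66, 2000)               # ages at school entry
score = 2.0 * age_months + rng.normal(0, 12, 2000)   # assume ~2 points/month

slope, intercept = np.polyfit(age_months, score, 1)  # points gained per month

quintile_gap = 30.0  # hypothetical reading-score gap between extreme quintiles
print(f"A {quintile_gap}-point gap is roughly {quintile_gap / slope:.1f} months")
```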

A final point: because PIPS was created and tested over a long period in the UK, longitudinal work has now been conducted, giving insight into the long-term impacts of deprivation. The baseline assessment scores are moderately correlated with outcomes in mathematics, reading and vocabulary (r = .56 in all cases) at age 11 (Tymms et al., 2012), whilst baseline and end-of-first-year assessments are moderately correlated with English and mathematics outcomes at KS1 (age 7), KS2 (age 11) and GCSE (age 16), with correlations ranging between r = .45 and .64 (Tymms et al., 2018). These outcomes, however, are still associated with postcode measures of deprivation when modelled at each stage. Whilst baseline measures remain predictive of variation in later outcomes, deprivation measures continue to have a significant detrimental effect, even after controlling for factors such as age, sex, English as a second language, ethnicity and special educational needs status. Furthermore, links with deprivation, when allowed to vary by class and school, do not noticeably alter this relationship, suggesting that the effects of deprivation are generally system-wide rather than tied to particular schools. This work has been pivotal in demonstrating links between deprivation, in individuals and in schools, and assessment outcomes.

From this, we see that deprivation, at least in the UK, is present and pervasive from entry into primary school (additional work by Merrell et al. (2003) with a different measure suggests the effect is present even in pre-school), with less affluent children entering school at a considerable disadvantage. Whilst they do catch up somewhat, and many clearly do well, they largely still lag behind wealthier children in less deprived schools.

There are two main challenges to this work, however. Firstly, except for comparisons within UK nations, it is hard to compare these measures of deprivation directly across countries (the indices are based only on UK government census returns and data repositories). This makes them less useful as companion measures in an international framework. Secondly, the unit of analysis is not the individual. It is possible for two individuals to have the same IDACI score but radically different levels of actual wealth (and to attend different schools). This potential error in the measure may weaken the estimated relationship between deprivation, attainment and progress, and it may thus be better to examine factors at the level of the individual child, to which we now turn.

Disadvantaged Group Status

Most education systems have several status flags for groups of children who may be socio-economically disadvantaged. These vary from country to country, and in this section we briefly examine three of them.

In the UK, the FSM measure has been examined in relation to PIPS baseline measures. In Scotland, the difference between FSM and non-FSM pupils on entry to school equates to effect sizes close to .70 (almost seven-tenths of a standard deviation) in favour of wealthier children (Tymms et al., 2004a, b). However, in a sample of close to 1000 deaf children, FSM seemed to have no real impact on baseline attainment and progress measures (Tymms et al., 2003), nor did school-level FSM have a significant impact across several intervention waves in a study in Northern Ireland (McGuinness et al., 2014). FSM is therefore not always a clear-cut measure.

In Brazil, one measure of poverty is whether a family receives support from a cash transfer programme (CTP). When examining pre-school children's abilities in Rio de Janeiro as part of a longitudinal study, Bartholo et al. (2020) found a small, statistically significant effect of CTP status on mathematics outcomes in the iPIPS assessment. However, once more specific measures are added, this coefficient becomes non-significant, a point we return to later. CTP status had no impact on early language development. In a later wave of the investigation, however, Koslinski and Bartholo (2019), in their most inclusive models, showed that while CTP status does not seem to have an impact at the level of the child or class (once baseline performance is accounted for), it does have a significant impact at school level (that is, schools with more CTP pupils do not perform as well).
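Findings like this school-level CTP effect come from multi-level models in which a pupil-level indicator and its school-level aggregate enter together. The sketch below shows the general shape of such a random-intercept model on simulated data; the variable names and coefficients are hypothetical, not those of Koslinski and Bartholo (2019).

```python
# Sketch: a random-intercept model with a pupil-level poverty flag and its
# school-level aggregate, in the spirit of Koslinski and Bartholo (2019).
# Data, names and coefficients are simulated, not theirs.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
school = np.repeat(np.arange(60), 30)              # 60 schools x 30 pupils
df = pd.DataFrame({"school": school,
                   "ctp": rng.integers(0, 2, school.size),
                   "baseline": rng.normal(size=school.size)})
df["school_ctp"] = df.groupby("school")["ctp"].transform("mean")
school_effect = rng.normal(0, 0.4, 60)[school]
df["outcome"] = (0.6 * df["baseline"] - 0.5 * df["school_ctp"]
                 + school_effect + rng.normal(size=len(df)))

model = smf.mixedlm("outcome ~ baseline + ctp + school_ctp",
                    df, groups=df["school"]).fit()
print(model.summary())  # school_ctp captures the school-level association
```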

In South Africa (Western Cape), socio-economic status is often quantified by the government as an allocation to a quintile group based on the affluence of the school's location (the first quintile being the poorest and the fifth the most affluent). This grouping is then used to allocate government resourcing, with schools in Quintiles 4 and 5 being fee-paying. Tymms et al. (2017) found significant differences in baseline performance between Quintiles 2 and 5 in both mathematics and language, with a smaller effect size for mathematics than for language (.14 versus .41 respectively). The variable was not significant for progress over the first year, however. This may reflect issues with the quintile classification system, which is probably not sensitive enough to within-school and within-quintile socio-economic variation. Other general issues relating to the quintile system have also been raised (see Ogbonnaya & Awuah, 2019).

These types of 'flags' leave much to be desired, and their results vary. It cannot be denied that these deprivation indicators are associated with baseline and first-year progress measures, but they are not directly comparable with each other. There are also local variations in how such indicators are used. For instance, the criteria for FSM in the UK can vary and can in some cases be subject to manipulation.

Parental Education

Parental education is an important index to consider when assessing socio-economic status, for two key reasons. First, parents' education is a strong, positive correlate of earnings and of general socio-economic status (Blaug, 1972). Second, it provides potential information on parents' capacity to interact with their children in an educationally stimulating way during key developmental windows. Variations of these measures now appear in international assessments (IEA, 2015, 2016; OECD, 2018).

Several iPIPS studies have integrated this useful background measure into models of performance. In the aforementioned longitudinal study in Brazil (Bartholo et al., 2020; Koslinski & Bartholo, 2019), having parents educated to at least high-school level was a significant predictor of better mathematics and language performance at baseline, and the effect was stronger still for parents with higher levels of education. Interestingly, with regard to progress in language (Koslinski & Bartholo, 2019), this variable had an impact only on individual children, not at the level of the school or class, suggesting that the effect likely reflects better educational provision and support in the home environment rather than some form of group-aggregated advantage within an institution (unlike measures such as FSM or CTP).

In Russia, the literacy measures in the iPIPS assessment were used to test a model of family investments in relation to first-grade literacy in a large representative sample (Vasilyeva et al., 2018). Parental education was one of several factors (the remainder are reviewed in the next section) assessed for its predictive power. The results showed a strong direct effect of parental education on early literacy, as well as effects on other home-environment measures (suggesting that this variable supports learning through multiple routes), resulting also in a significant indirect effect. These models controlled for the age and preschool experience of the children.
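Indirect effects of this kind are typically estimated as products of path coefficients. The sketch below shows a bare-bones product-of-coefficients (mediation) estimate on simulated data; Vasilyeva et al. (2018) used a full structural equation model with additional controls, so treat this purely as an illustration of the logic.

```python
# Sketch: a product-of-coefficients (mediation) estimate of an indirect effect,
# illustrating the logic behind the indirect effects in Vasilyeva et al. (2018).
# Simulated data; their analysis used a full structural equation model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
parent_edu = rng.normal(size=n)
home_resources = 0.5 * parent_edu + rng.normal(size=n)          # path a
literacy = 0.2 * parent_edu + 0.3 * home_resources + rng.normal(size=n)

a = sm.OLS(home_resources, sm.add_constant(parent_edu)).fit().params[1]
full = sm.OLS(literacy,
              sm.add_constant(np.column_stack([parent_edu, home_resources]))).fit()
direct, b = full.params[1], full.params[2]
print(f"direct effect = {direct:.2f}, indirect effect (a*b) = {a * b:.2f}")
```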

A study of Dutch kindergarten children using iPIPS also suggests how disadvantaged backgrounds can potentially be ameliorated. Children from classes with mixed socio-economic origins (defined by parental education and minority status) were compared with classes of targeted socio-economic groups (where those believed to be at risk because of their socio-economic status were taught separately from others). Children taught in mixed kindergarten classes made greater progress over the year in both literacy and mathematics than those in targeted classes (de Haan, Elbers, Hoofs & Leseman, 2013), suggesting that early disadvantage can be compensated for via mixed teaching in the early years (Schechter & Bye, 2007; see also Chap. 20).

In Serbia, the education level of the mother appeared to have no association with mathematics and literacy at the start of formal education (Aleksic et al., 2018). However, in this study the iPIPS assessment had already been administered in a preschool year and, as we have seen in previous studies, including a baseline in relation to a later assessment often reduces the size of the effects of deprivation indices. The relationship between the mother's education and the first preschool assessment was not reported in this study.

Generally, parental education appears across countries to be a good predictor of attainment and, to a lesser extent, of progress in early iPIPS assessments. The measure also has the advantage of being tied directly to individual children and of being objective (so long as parents report truthfully), which makes it more targeted than some of the measures described earlier.

Wider Background Measures

This subsection covers the more specific measures that have occasionally been used to examine the influence of the home environment, such as the number of books available in the home or access to certain amenities. These measures are operationalised in different ways, but many of the item types are present in other international assessments too (IEA, 2015, 2016; OECD, 2018), such is their theoretical importance. Sometimes they are aggregated into an overall index; sometimes they are used as single-item measures. Each method has its advantages and disadvantages, which will not be discussed here.

Starting with the iPIPS Brazil project (Bartholo et al., 2020), which we examined in the previous two sections, two additional measures of socio-economic status were included in the multi-level models: home learning environment (indexed by items such as having one's own desk or room) and possession of goods (indexed by access to luxuries and amenities). Both variables are significant predictors of baseline mathematics and language on entry to education. Interestingly, however, when these measures are included, the CTP poverty measure (see section "Disadvantaged Group Status") becomes non-significant. The impact of the parental education measures diminishes only slightly, yet the overall variance explained in the cognitive outcomes increases by 4.3 percent for mathematics and 5.0 percent for language. This suggests that these variables overlap considerably with other indices of poverty but are themselves better predictors of baseline performance. Nonetheless, these contributions to explained variance are small and may indicate that measures of goods and material resources capture only one aspect of deprivation. Furthermore, the authors noted that these variables rest on parental self-report and that non-response greatly reduced the number of usable observations for modelling.

In iPIPS South Africa (Tymms et al., 2017), a similar pattern emerges when measures tailored to the home environment are included. A measure based on the possessions-in-the-home scale (IEA, 2016) was used alongside the school quintile measure at baseline. The scale total showed a positive, albeit small, correlation with the quintile measure (r = .31), suggesting that those in the higher quintiles came from more affluent families. In modelling baseline performance, both variables are statistically significant and appear to have a large effect on performance in mathematics and language. When we examine progress over the first year of education, however, school quintile is no longer a significant predictor of reading, and neither variable is significant for mathematics. In both cases, the baseline measure is the biggest predictor of yearly progress.

Recall from the previous section that the study in Russia (Vasilyeva et al., 2018) also had multiple indicators of the home learning environment in relation to baseline reading outcomes. These included measures of family income, home resources (such as laptops and books), home activities (interactions between parent and child) and outside-home activities (participation in enrichment programmes, for example). In their structural model, Vasilyeva and colleagues found that home resources, home activities and outside-home activities were all significantly related to early literacy in Russian children. The effect of family income, however, was indirect only: it was related to the other three measures (higher-income families had more resources and spent more time on home and outside-home activities) but did not itself predict child literacy. This suggests that income alone matters less than whether that income is channelled into investments in the child's education, be it time, resources or relevant, stimulating activity.

A literacy study in Germany using iPIPS (Niklas & Schneider, 2017) examined the relationships between the home learning environment (indexed by access to books and time spent on literacy activities), parental occupation and reading at the end of the first year of school. Even though these two predictors were significantly related to each other (r = .38), the home learning environment had a small but statistically significant effect on reading (r = .09), whereas parental occupation did not.

Finally, a UK-based study (Tymms et al., 2000) examined a home environment measure in relation to literacy and mathematics at baseline, at the end of the first year of school and at a two-year follow-up. The measure covered books in the home, frequency of library visits and some home literacy activities, and was termed cultural capital. It was positively correlated with PIPS baseline measures, with PIPS reading and mathematics scores at the end of the first year, and with later mathematics and reading, with correlations all between r = .13 and .17. When modelled alongside the baseline assessment to assess progress, those with higher cultural-capital scores made only a little more progress than those with lower scores, and the difference did not appear statistically significant in the first year of school.

To conclude this section, measures of the wider developmental context do seem to be powerful predictors of performance, particularly compared with the other measures used alongside PIPS baseline measures. Whilst these measures are not identical across the iPIPS programmes to date, many of the items and themes they represent are common (for instance, access to books and to learning resources) and suggest, particularly alongside studies using similar measures in PISA, PIRLS and TIMSS, that these are important sources of variation in educational outcomes.

11.3 Summary

There are clear impacts of socio-economic status on early cognitive development and, so far as we can deduce from iPIPS data, this impact appears universal, varying largely with the type of measure used. Whilst the strength of these relationships varies by location and measure, there appears to be no educational advantage to deprived economic circumstances, and we rarely see instances of deprivation having no impact at all. One of the major barriers to improving cognitive development in children internationally is thus the set of factors surrounding economic and social deprivation.

That said, the news is not all bad. One thing we can show using the PIPS/iPIPS systems is that, despite their initial disadvantage, children in poorer socio-economic circumstances make considerable strides in literacy and numeracy within the first year of formal education, with some able to catch up, although there is still evidence of a lag for others (Tymms et al., 2016). We also see, across the breadth of these studies, that the effect of socio-economic status wanes in the presence of a baseline or prior-attainment measure, suggesting that to know the true impact of complex home circumstances we need some assessment of a child's starting capabilities. These two points underscore the importance of having an effective baseline from which to examine children if we are to understand the barriers to early educational success. The iPIPS project is a testament to what we can learn about development and teaching effectiveness when we do just that.

11.4 Challenges and Future Research Directions for Early Socio-economic Status Researchers

We now turn from what we already know to what more we would like to know, and to the challenges to be overcome in pursuing that knowledge. The iPIPS project has provided much-needed, detailed information on socio-economic status in recent years, a long way from the position at the assessment's inception, when background measures were not part of the original design specification (Tymms & Albone, 2002).

Whilst recent projects have begun to include wider measures of children's backgrounds, as with other international measures there is no single, readily available, standardised measure that would neatly satisfy every project and geographical location. Nations often have different thresholds for what counts as poverty and, by the same token, an impoverished child in the UK is still likely to have a very different educational experience from an impoverished child in China, Australia or Africa. How to measure socio-economic status effectively across such contexts still requires attention from researchers. Whether such measures would be sufficiently invariant (that is, measure the same construct equivalently across nations) for accurate comparison is also at issue: a recent study by Sandoval-Hernandez et al. (2019), which examined socio-economic status measures across international assessments (including PISA and TIMSS), suggests that they are not. For now, international comparisons involving socio-economic status must be made cautiously.

Related to the above is the need to continue identifying the specific factors that are related to later literacy and numeracy outcomes. Many deprivation indices are nation-specific, and others (such as FSM or ESL) are often uninformative because they do not provide specific information on what is going on at home; they merely reflect some threshold or flag for use by schools. Measures of amenities are also interesting to examine but can vary internationally (access to a car, for instance, is more indicative of wealth in some countries than in others where car ownership is largely normative). What we still lack are measures of other aspects of deprived early environments. These could include maternal time inputs, more detailed information on family structure (such as the presence of step-parents or wider kin), and the consistency of structured time and of family functions (such as shared play, meal provisioning and consistency of discipline). We know that these types of activities decline in the presence of impoverishment (Del Bono et al., 2016; Katz et al., 2007), yet many research projects do not account for them (although note some of the exceptions earlier in section "Wider Background Measures"). We also need to include multiple measures of socio-economic status in designs, to build a more accurate picture of what matters for early literacy and numeracy development. The designs implemented in Brazil and Russia, for example, are more informative in that they quantify specific, potentially targetable features of the home environment. Existing international assessments are beginning to expand home background measures to encompass more factors (such as values and time inputs; see IEA, 2015, 2016; OECD, 2018 for examples), but as these lack a baseline to start from, their true impact will be difficult to interpret. Knowing their impact on baseline and later performance would be invaluable, and knowing their impact alongside the factors we already examine would enable more targeted, and potentially preventative, interventions from as early a stage as possible.

We also have limited information on how invariant the iPIPS assessment itself is across levels of socio-economic status (that is, how we can be sure that the assessment works in the same way for all socio-economic groups). This is again largely confounded by the lack of a uniform measure across projects with which to test it. There have been some attempts to examine invariance in particular contexts, or at least on some of the sub-tests (see, for instance, Copping et al., 2016; Cramman et al., 2018), but these are largely based on UK samples, and no systematic attempt has been made thus far.
However, this problem cannot be resolved until the issues raised in the preceding paragraphs are addressed. Current international assessments (e.g., PISA) proceed as though they have comparable background measures, even though the comparability of those measures is not well established and is confounded by the same issues.
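One practical way to probe whether an assessment behaves equivalently across socio-economic groups is to test individual items for differential item functioning (DIF), for instance with logistic regression. The sketch below, on simulated data, flags uniform DIF when group membership predicts an item response after conditioning on overall performance; it illustrates the general approach rather than the method used in the iPIPS studies cited.

```python
# Sketch: a logistic-regression screen for uniform differential item
# functioning (DIF) by SES group -- one standard way to probe measurement
# invariance. Simulated data; not the method of the iPIPS studies cited.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 3000
ability = rng.normal(size=n)
low_ses = rng.integers(0, 2, n)
# Make the item artificially harder for the low-SES group at equal ability
p_correct = 1 / (1 + np.exp(-(1.2 * ability - 0.4 * low_ses)))
df = pd.DataFrame({"correct": rng.binomial(1, p_correct),
                   "total_score": ability, "low_ses": low_ses})

fit = smf.logit("correct ~ total_score + low_ses", data=df).fit(disp=0)
# A significant low_ses coefficient at equal total_score flags uniform DIF
print(fit.summary().tables[1])
```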

11.5 Implications for Policy

What are the potential policy implications of this body of research? The last decade has seen a large rise in the use of iPIPS in ways that have made some degree of international comparison possible. Knowing what we know is not enough in its own right, however, and it remains to be seen how policymakers will respond to the challenges raised by low socio-economic status internationally. Here we make some tentative suggestions that may pay future dividends for research and practice.

Firstly, a point made here (and elsewhere in this volume) is the importance of implementing a valid baseline assessment for use in schools. Such a measure provides invaluable information to practitioners in the classroom, regardless of whether socio-economic status is examined. Where socio-economic measures are included, researchers can combine the data to get a sense of which factors may present barriers to future attainment and progress. This is vital for a complete picture of early years cognitive development.

Secondly, and relatedly, while we recognise the importance and value of the current large-scale international assessment programmes (PISA and PIRLS, for example), these are difficult to interpret without the inclusion of a baseline (Tymms et al., 2004a, b). They in fact tell us very little about those all-important early years, with the earliest assessments taking place only around Grade 4 (around ages 9–10). Having information like that from PIPS on the scale of something like PISA would be an incredibly valuable repository to inform future practice, research and policy. Whilst the PIPS/iPIPS programme has made great strides in profiling what children know and can do across participating countries, there is still much more that could be done. As other chapters in this volume make clear, it is already possible to make successful international comparisons of attainment and progress in literacy and numeracy, and conducting more research of this calibre would be fruitful.

Finally, whilst greater international effort is needed to examine specific features of home backgrounds, the iPIPS literature and other international assessments point to some recurring features that could be addressed through national (and potentially international) policy changes. For instance, interventions focused on the provision of books and reading materials to impoverished families and communities (through static or mobile libraries, school-level provision or other means) could help bolster early child cognitive development. Simple changes such as this might be a relatively cost-effective way of fostering progress towards the UN's Sustainable Development Goal on quality education.


11.6 Conclusion

The PIPS/iPIPS projects have clearly contributed an enormous amount to our understanding of the relationship between socio-economic status and early years cognitive development. Continued work in this area will no doubt help to elucidate other relevant features of early backgrounds that merit research. A final point to end on, though, is the relative importance of the baseline measure: knowing the socio-economic context alone tells only part of the story. A valid baseline assessment, alongside other contextual data, is vital for practitioners and researchers alike to gain a fuller understanding of a child's cognitive capabilities.

References

Aleksic, G., Merrell, C., Ferring, D., Tymms, P., & Klemenović, J. (2018). Links between socio-emotional skills, behaviour, mathematics and literacy of preschool children in Serbia. European Journal of Psychology of Education, 34(2), 417–438.
Bartholo, T. L., Koslinski, M. C., Costa, M. D., & Barcellos, T. (2020). What do children know upon entry to pre-school in Rio de Janeiro? Ensaio: Avaliação e Políticas Públicas em Educação, 28(107), 292–313.
Blatchford, P., & Cline, T. (1992). Baseline assessment for school entrants. Research Papers in Education, 7(3), 247–269.
Blaug, M. (1972). The correlation between education and earnings: What does it signify? Higher Education, 1, 53–76.
Copping, L. T., Cramman, H., Gott, S., Gray, H., & Tymms, P. (2016). Name writing ability not length of name is predictive of future academic attainment. Educational Research, 58(3), 237–246.
Cramman, H., Gott, S., Little, J., Merrell, C., Tymms, P., & Copping, L. T. (2018). Number identification: A unique developmental pathway in mathematics? Research Papers in Education, 35(2), 117–143.
de Haan, A. K. E., Elbers, E., Hoofs, H., & Leseman, P. (2013). Targeted versus mixed preschools and kindergartens: Effects of class composition and teacher-managed activities on disadvantaged children's emergent academic skills. School Effectiveness and School Improvement: An International Journal of Research, Policy and Practice, 24(2), 177–194.
Del Bono, E., Francesconi, M., Kelly, Y., & Sacker, A. (2016). Early maternal time investment and early child outcomes. Economic Journal, 126, 96–135.
Dickerson, A., & Popli, G. K. (2016). Persistent poverty and children's cognitive development: Evidence from the UK Millennium Cohort Study. Journal of the Royal Statistical Society, 179, 535–558.
IEA. (2015). TIMSS 2015 user guide for the international database, supplement 1: International version of the TIMSS 2015 context questionnaires. Available at: https://www.iea.nl/data-tools/repository/timss. Accessed 20 June 2020.
IEA. (2016). PIRLS 2016 user guide for the international database, supplement 1: International version of the PIRLS 2016 context questionnaires. Available at: https://www.iea.nl/data-tools/repository/pirls. Accessed 20 June 2020.
Katz, I., Corlyon, J., La Placa, V., & Hunter, S. (2007). The relationship between parenting and poverty. Report by the Joseph Rowntree Foundation.
Koslinski, M. C., & Bartholo, T. L. (2019). Impact of child development centers in the first year of preschool. Estudos em Avaliação Educacional, 30(73), 280–311.


Longfield, A. (2019). The children leaving school with nothing. Report to the Children's Commissioner. Available at: https://www.childrenscommissioner.gov.uk/wp-content/uploads/2019/09/cco-briefing-children-leaving-school-with-nothing.pdf. Accessed 20 June 2020.
McGuinness, C., Sproule, L., Bojke, C., Trew, K., & Walsh, G. (2014). Impact of a play-based curriculum in the first two years of primary school: Literacy and numeracy outcomes over seven years. British Educational Research Journal, 40(5), 772–795.
Merrell, C., Tymms, P., & Jones, P. (2003). The impact of pre-school education on language and mathematical development. 13th European Early Childhood Educational Research Association annual conference, Glasgow.
Ministry of Housing, Communities and Local Government. (2019). English indices of deprivation 2019: Mapping resources. Available at: https://www.gov.uk/guidance/english-indices-of-deprivation-2019-mapping-resources. Accessed 28 June 2020.
Niklas, F., & Schneider, W. (2017). Home learning environment and development of child competencies from kindergarten until the end of elementary school. Contemporary Educational Psychology, 49, 263–274.
OECD. (2018). PISA 2018 database. Available at: https://www.oecd.org/pisa/data/2018database/CY7_201710_QST_MS_STQ_NoNotes_final.pdf. Accessed 20 June 2020.
Ogbonnaya, U. I., & Awuah, F. K. (2019). Quintile ranking of schools in South Africa and learners' achievement in probability. Statistics Education Research Journal, 18(1), 106–119.
Sandoval-Hernandez, A., Rutkowski, D., Matta, T., & Miranda, D. (2019). Back to the drawing board: Can we compare socioeconomic background scales? Revista de Educacion, 383, 37–61.
Schechter, C., & Bye, B. (2007). Preliminary evidence for the impact of mixed-income preschools on low-income children's language growth. Early Childhood Research Quarterly, 22, 137–146.
SIMD. (2013). The Scottish Index of Multiple Deprivation. Available at: http://www.scotland.gov.uk/Topics/Statistics/SIMD/BackgroundMethodology. Accessed 28 June 2020.
Tymms, P., & Albone, S. (2002). Performance indicators in primary schools. In A. J. Visscher & R. Coe (Eds.), School improvement through performance feedback (pp. 191–218). Swets & Zeitlinger.
Tymms, P., Merrell, C., & Henderson, B. (1995). Pre-school experience: An analysis of reception assessment data. A report for the Audit Commission. Available from the Commission.
Tymms, P., Merrell, C., & Henderson, B. (1997). The first year at school: A quantitative investigation of the attainment and progress of pupils. Educational Research and Evaluation, 3(2), 101–118.
Tymms, P., Merrell, C., & Henderson, B. (2000). Baseline assessment and progress during the first three years at school. Educational Research and Evaluation, 6(2), 105–109.
Tymms, P., Brien, D., Merrell, C., Collins, J., & Jones, P. (2003). Young deaf children and the prediction of reading and mathematics. Journal of Early Childhood Research, 1(2), 197–212.
Tymms, P., Merrell, C., & Jones, P. (2004a). Using baseline assessment data to make international comparisons. British Educational Research Journal, 30(5), 673–689.
Tymms, P., Jones, P., Merrell, C., Henderson, B., & Cowie, M. (2004b). Children starting school in Scotland. A report of research funded by the Scottish Executive Education Department.
Tymms, P., Merrell, C., Henderson, B., Albone, S., & Jones, P. (2012, June). Learning difficulties in the primary school years: Predictability from on-entry baseline assessment. Online Educational Research Journal.
Tymms, P., Merrell, C., Hawker, D., & Nicholson, F. (2014). Performance indicators in primary schools: A comparison of performance on entry to school and the progress made in the first year in England and four other jurisdictions: Research report. Department for Education. Available at: https://www.gov.uk/government/publications/performance-indicators-in-primary-schools. Accessed 25 June 2020.
Tymms, P., Merrell, C., & Buckley, H. (2016). Children's development at the start of school in Scotland and the progress made during their first school year: An analysis of PIPS baseline and follow-up assessment data. Research report for the Scottish Government. ISBN: 9781785448942. Available at: http://www.gov.scot/Publications/2015/12/5532/0

198

L. T. Copping

line and follow-up assessment data. Research report for the Scottish Government. ISBN: 9781785448942. Available at: http://www.gov.scot/Publications/2015/12/5532/0 Tymms, P., Howie, S., Merrell, C., Combrinck, C., & Copping, L. (2017). The First Year at School in the Western Cape: Growth, Development and Progress. Report to the Nuffield Foundation. Available at: http://www.cem.org/attachments/Tymms-­41637-­SouthAfricaFinalReport-­ Oct-­2017.pdf. Accessed 25 June 2020. Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261. United Nations. (2020). Sustainable development goals: Quality education. Available at: https:// www.un.org/sustainabledevelopment/education/. Accessed 20 June 2020. Vasilyeva, M., Dearing, E., Ivanova, A., Shen, C., & Kardanova, E. (2018). Testing the family investment model in Russia: Estimating indirect effects of SES and parental beliefs on the literacy skills of first-graders. Early Childhood Research Quarterly, 42, 11–20.

Part V

Using iPIPS Data for Teaching and Informing Policy

Tiago Bartholo and Mariane Koslinski

Introduction

Teachers and policymakers have used research based on PIPS data in the UK since 1993.1 One significant finding was that the first year in school (the Reception class, age 4) had a substantial impact on pupils' literacy and numeracy – an effect size of 2.0 on average – making it the most significant year in pupils' school careers (Tymms et al., 1997). The 'teacher effect' was the most significant factor related to the classroom (Hawker, 2015): the difference between having a good teacher and a poor teacher over the 3 years from Reception to Year 2, all other things considered, was 0.82 of a standard deviation. Another finding was that the 'school impact' in one year appears to be unrelated to the school effect of the same school in the following year. This UK research suggests that effectiveness is almost entirely dependent on the teacher and the classroom.

Other research groups have used iPIPS outside the UK to promote data use by different stakeholders, such as national and local governments, principals, teachers and parents. However, it is challenging to transform evidence and data produced by research or monitoring systems into compelling information that influences policy decision-making or informs pedagogical/instructional planning. These challenges have been widely discussed in the literature on data-driven decision making (DDDM) or data-based decision making (DBDM) in education (Marsh, 2012; Marsh et al., 2006; Mandinach & Gummer, 2016; Schildkamp et al., 2017). This section presents four case studies, from Brazil, Lesotho, Russia and South Africa, describing different strategies and insights from different research groups on how to promote data use by different stakeholders at the start of compulsory education.

The potential of education to transform society, or even an individual's life, is clear. This potential is increased in countries with a prominent level of social inequality; put differently, disadvantaged children benefit more from high-quality provision of education (Schmidt et al., 2001; Sylva et al., 2010). Researchers, civil society and politicians therefore have a significant responsibility to make wise decisions about education spending. The iPIPS project provides a unique set of international comparative data that can help guide those decisions and positively impact education inequality and children's well-being.

The following four chapters provide examples from Brazil, Lesotho, Russia and South Africa of how to make the iPIPS data/evidence comprehensible and, therefore, able to support decision making. They describe the experiences developed in each country, focusing on how evidence was presented and on the information provided for different groups: parents, teachers, headteachers, school principals, and local/national government. The chapters also offer reflections on the effectiveness of the strategies used to make the data/evidence understandable, with examples from teachers, headteachers/principals and policymakers. The last chapter summarises and compares the four reported experiences. It also discusses ways forward for the iPIPS project in future efforts to share research evidence to inform policy decision-making and pedagogical planning by schools.

1. See Parts I and III for the history and content of iPIPS.

T. Bartholo (*) · M. Koslinski
Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
e-mail: [email protected]; [email protected]
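A note on the metric used in the findings above: the effect sizes quoted (2.0 for the Reception year; 0.82 for the teacher effect) are standardised mean differences. As a hedged illustration, assuming a Cohen's-d-style estimator (the precise estimator used in the PIPS analyses is not specified here):

$$
d = \frac{\bar{x}_{\text{end}} - \bar{x}_{\text{start}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
$$

On this reading, an effect size of 2.0 means that average scores at the end of the year sit about two pooled standard deviations above scores at entry, and the teacher effect of 0.82 is just under one standard deviation.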

References

Hawker, D. (2015). Baseline assessment in an international context. In Handbook of international development and education (pp. 305–325). Edward Elgar Publishing.
Mandinach, E. B., & Gummer, E. (2016). What does it mean for teachers to be data literate: Laying out the skills, knowledge and dispositions. Teaching and Teacher Education, 60, 366–376.
Marsh, J. (2012). Interventions promoting educators' use of data: Research insights and gaps. Teachers College Record, 114(11), 1–48.
Marsh, J., Pane, J. F., & Hamilton, L. (2006). Making sense of data-driven decision making in education: Evidence from recent RAND research. RAND Education Occasional Paper. Available at: https://www.rand.org/pubs/occasional_papers/OP170.html
Schildkamp, K., Poortman, C. L., Luyten, H., & Ebbeler, J. (2017). Factors promoting and hindering data-based decision making in schools. School Effectiveness and School Improvement, 28(2), 242–258.
Schmidt, W. H., McKnight, C. C., Houang, R. T., Wang, H., Wiley, D., Cogan, L. S., & Wolfe, R. G. (2001). Why schools matter: A cross-national comparison of curriculum and learning. The Jossey-Bass Education Series.


Sylva, K., Melhuish, E., Sammons, P., Siraj-Blatchford, I., & Taggart, B. (2010). Early childhood matters: Evidence from the effective pre-school and primary education project. Routledge.
Tymms, P., Merrell, C., & Henderson, B. (1997). The first year at school: A quantitative investigation of the attainment and progress of pupils. Educational Research and Evaluation, 3(2), 101–118.

Chapter 12

Strategies to Enhance Pedagogical Use of iPIPS Data and to Support Local Government Decision-Making in Brazil

Tiago Bartholo, Mariane Koslinski, and Daniel Lopes de Castro

The chapter describes the strategies undertaken to enhance the use of iPIPS data in Brazil by school principals, teachers and local governments.

12.1 Introduction

The educational system in Brazil comprises four stages: early childhood (for children ages 0–5), fundamental (for children/teenagers ages 6–14), high school (for teenagers ages 15–17) and higher education. Until recently, compulsory education started at age 6 (the first grade of fundamental education). In 2009, new legislation changed the age of access to compulsory education: preschool, which covers the last two years of early childhood education in Brazil, became part of mandatory education. This was a major change, and local governments had seven years (from 2009, when the new legislation was put in place, to 2016) to adapt and ensure that all children aged 4 were enrolled in preschool.

Brazil is a federation composed of 26 states and the federal district. The education administration is decentralised, and states and municipalities play a crucial role in ensuring public education quality. There are general national guidelines, but early childhood education is administered by each of the 5570 municipalities, which have the autonomy to regulate and establish local guidelines for Early Childhood and Fundamental Education.

T. Bartholo (*) · M. Koslinski · D. L. de Castro
Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
e-mail: [email protected]; [email protected]; [email protected]

© Springer Nature Switzerland AG 2023
P. Tymms et al. (eds.), The First Year at School: An International Perspective, International Perspectives on Early Childhood Education and Development 39, https://doi.org/10.1007/978-3-031-28589-9_12


Over the past 25 years, comprehensive large-scale assessment systems for fundamental education and high school have been developed at the national level and by local governments. These monitoring systems have enabled researchers and policy-makers to analyse cross-sectional data to look for trends in the quality of Brazilian education. However, there is not yet a national monitoring system in place at the start of compulsory education. As a result, there is little information about the school characteristics and policies associated with childhood development, or about patterns of inequality.

From 2017 to 2019, researchers from the Federal University of Rio de Janeiro (UFRJ), in collaboration with Durham University, analysed children's development in their first two years at school – ages 4 to 5 – within the scope of the iPIPS project. Three studies were conducted with different samples in two cities, and longitudinal data were collected for around 6400 children enrolled in 123 schools (public, private and non-profit private schools).1 The three studies tracked children's development in their first two years of compulsory education in three dimensions: cognitive development; motor skills; and behaviour and personal, social and emotional development. The studies provided an opportunity to describe what children know and can do when they start school and what they learn in the first two years of compulsory education. They also allowed us to observe patterns of educational inequality and to identify the impact of education policies/programmes and the school characteristics associated with children's learning/development, especially for those from disadvantaged backgrounds (Aguiar et al., 2021; Bartholo et al., 2020a, b; Koslinski & Bartholo, 2019, 2020; Koslinski et al., 2022).

This chapter describes the strategies undertaken to enhance the use of iPIPS data in Brazil by school principals, teachers and local governments. The strategies included school/classroom reports; workshops with teachers and principals to explain the data collected, the interpretation of the reports and the use of data for pedagogical planning; and meetings with staff and heads of local departments of education to discuss the main outcomes of the research and its potential to guide policies at the local level and curriculum reform.

1. For more detailed information on the projects developed in Brazil, see Chap. 1.

12.2 Strategies to Provide Incentive for Pedagogical Use by Schools in Brazil

The main strategies to incentivise schools in Brazil to use iPIPS data for pedagogical purposes included producing individual school and classroom reports and offering workshops to school principals and teachers. This approach was appropriate for public and private schools in Rio de Janeiro, as all children starting compulsory education in the selected schools were eligible to participate in the study.

Working in collaboration with colleagues from Durham University, the researchers from Brazil developed reports for teachers and school principals. The aim was to provide, in a friendly and accessible way, a clear description of what children know and can do when they start school and how much progress they made during the school year. The reports contained the following information:
1. A brief description of the research and its main objectives;
2. A description of socio-demographic information about the children of the school and a summary of the whole sample;
3. The number of children at the school who were assessed at the beginning and the end of the academic year; and
4. A pedagogical ladder showing children's starting points and progress for mathematics, reading, vocabulary and phonological awareness (see Fig. 12.1).

Fig. 12.1  Classroom report tables showing information about pupils’ characteristics, number of children assessed and school complexity index. (Source: Laboratório de Pesquisa em Oportunidades Educacionais (LaPOpE)/UFRJ)


Fig. 12.2  Graph and table with pedagogical interpretation showing results for Mathematics at the start (Mar) and end (Nov) of the school year. (Source: Laboratório de Pesquisa em Oportunidades Educacionais (LaPOpE)/UFRJ)

The reports also included a chart showing the school/classroom results and the mean for all schools in the study sample (see Figs. 12.2 and 12.3). The classroom report, incorporating mathematics and reading, was designed to inform teachers about their own pupils,2 on the recommendation of the Durham researchers, whose experience indicates that teachers are more motivated and engaged when discussing their own pupils or school than when discussing generic examples. This was also important to limit access to information at the school level and to reduce the possibility of undesirable uses, such as creating league tables or punishing teachers on the basis of the information provided by the research. Some teachers are indeed familiar with statistics and more complex research designs; however, it is reasonable to assume, at least in Brazil, that talking about regression coefficients and confidence intervals will not help to gain an audience's attention. Describing their pupils' progress, inequalities within the classroom, and pedagogical strategies to help pupils develop seemed to be a more effective strategy. The only person who would know the outcome of one specific classroom would be the teacher responsible for that group. Teachers agreed with this strategy, which might have contributed to their receptiveness to the research and the information contained in the reports.

2. The report for school principals was similar; however, it aggregated the data at the school level.

Fig. 12.3  Graph and table with pedagogical interpretation showing results for Reading at the start (Mar) and end (Nov) of the school year. (Source: Laboratório de Pesquisa em Oportunidades Educacionais (LaPOpE)/UFRJ)

The research team was unsure how much of the information would be understood if teachers or school principals received only the report, without any support to interpret the data. Therefore, along with the reports, the research team in Brazil designed a six-hour workshop for teachers, headteachers and principals. The workshop included: (i) a detailed explanation of the research objectives and the instruments used to collect data; (ii) an explanation of all the indicators, tables and graphs contained in the reports; and (iii) group interaction in which participants planned classroom activities based on evidence/information retrieved from the classroom/school reports. It was administered seven to eight months after the data were collected. Four workshops were offered to participating school principals, headteachers and teachers (one workshop with public school principals, two workshops with public school teachers, and one workshop with private school principals).

During the first workshop, principals provided extensive feedback on improving the format of the information presented in the reports to make them easier to read. There were also suggestions for additional information that could be included. For example, school principals suggested that a parameter be introduced in the blue arrow to help compare their school with other similar schools; they believed that a 'fair comparison' should consider school intake and the complexity of school management.

Researchers from UFRJ designed a quasi-experimental study to assess the impact of the workshops conducted with public school teachers. The teachers were divided into two groups and attended the workshops on different dates. A key aspect of the workshop was developing teachers' ability to read the pedagogical ladder using information contained in the report (Mendonça, 2019). Thus, an assessment was designed to understand how difficult it was for the teachers to interpret the graphs, tables and other information presented in the reports. The first group received the report, attended a six-hour workshop, and then answered the questionnaire. The second group received the report, answered the questionnaire, and thereafter attended the workshop. Overall, we had very positive feedback from the teachers and principals who participated in the workshops. The first group reported less difficulty in interpreting the graphs and tables than the second group. They also presented a more positive perception of the usefulness of the reports for their pedagogical planning than the second group (Mendonça, 2019).
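To make this two-group comparison concrete, here is a minimal sketch of one way the questionnaire scores could be analysed. It is illustrative only: the numbers are invented and the Welch t-test is an assumption, not the analysis reported by Mendonça (2019).

```python
import numpy as np
from scipy import stats

# Hypothetical questionnaire scores: higher = less difficulty interpreting reports.
group1 = np.array([8, 9, 7, 9, 8, 10, 7, 9])   # attended workshop, then answered
group2 = np.array([6, 7, 5, 8, 6, 7, 5, 6])    # answered, then attended workshop

# Welch's t-test: did the workshop-first group report less difficulty?
t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=False)

# Standardised mean difference for the group contrast (pooled-SD Cohen's d)
pooled_sd = np.sqrt((group1.var(ddof=1) + group2.var(ddof=1)) / 2)
d = (group1.mean() - group2.mean()) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, d = {d:.2f}")
```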

Teachers suggested that the iPIPS data were helpful for pedagogical planning and for a better understanding of their classroom and the school. The workshop enhanced teachers' ability to read the information and increased the chance that they would perceive the data as relevant. Future research could examine whether the workshops also increase the chances of teachers and headteachers using the data, or whether they merge the baseline information with other observations and assessments of their pupils throughout the year to build an ongoing, comprehensive picture. Nevertheless, this is an important finding, suggesting that passive ways of providing assessment results to teachers, even in a user-friendly format, might not be enough to help them make informed/evidence-based decisions. Furthermore, the findings discussed by Mendonça (2019) corroborate other evidence about what is needed to get evidence into use by schools (Gorard et al., 2020; Marsh, 2012; Marsh et al., 2006).

One year after the workshop, researchers from UFRJ interviewed the same headteachers and principals who had answered the questionnaire about the usefulness of the iPIPS data, and enquired if and how they had used the reports for pedagogical planning. Below are some of the strategies reported by schools:

The reports presented by the research contributed to a better understanding of our school within the local educational department, as well as enabled us to reflect more on the importance of planning through assessment practices. In our school staff meetings, we usually reflect on and discuss our pupils' situations in their daily lives, portraying a continuous assessment to meet their needs during the school year. In this sense, the reports presented were of great help. (Head teacher, school A)

Based on the information from the report, it was necessary to take a new look at our pedagogical planning. Through the pupils' results, we perceived some specific areas that needed more attention. It also enabled us to advance in reading/vocabulary, where the increase in performance was expressive. Reading activities were intensified, generating an improvement in vocabulary. For the teachers, the reports served as a basis for reflection and a reorganisation of the activities proposed for the classes. Both the information received in the reports and the workshops were used by the teachers at the time of planning, aiming at an improvement in our projects and pedagogical planning. (Head teacher, school B)

After the presentation of the report with the baseline results, we have pursued as a goal planning actions such as: to encourage oracy, writing, concentration games, socialisation in the school groups, counting, among others. The school planning carried out in 2019 on the environment theme enabled pupils to build games with recycled material developed from concepts such as logical reasoning, ideas of greater/smaller, more/less, indicated in the report. Analysis of the report was proposed to the pedagogical team, highlighting the positive points to be maintained and the negative ones [...] (Head teacher, school C)

The feedback provided throughout the workshops seemed crucial for teachers' understanding of the reports. However, it should be highlighted that it was also a learning opportunity for the researchers to better understand the schools and the local education system. The close contact between researchers and teachers/headteachers allowed the research team to formulate new and better hypotheses to be tested in future studies. Moreover, the workshops also helped build a strong bond and trust relationship with school principals and teachers, which enabled the researchers to propose additional data collection using classroom video recording, with significant acceptance by principals and teachers.3

During the workshop for private school principals and headteachers, the participants inquired about ways to report the assessment results to parents comprehensively and clearly. We discussed some forms of reporting aggregate results with them, as they were concerned that individual reports could trigger parents' anxiety about their child's development.

3. The classroom video recording took place in 62 Rio de Janeiro public school classrooms during the second year of the research.

12.3 Strategies to Inform Policy Makers

Brazil has experienced a significant expansion of preschool coverage in the last two decades, especially as, in 2009, school enrolment became mandatory for children from age 4. In 2001, 66.4 percent of all 4–5-year-old children were enrolled in preschools, and by 2018, the enrolment rate had reached 93.8 percent of the same age group. The iPIPS studies conducted by UFRJ in the two municipalities were carried out after the preschool expansion, when national curricular guidelines and monitoring systems for Early Childhood Education (ECE) were being discussed and implemented. This context contributed to the enthusiastic welcoming of the iPIPS research by the heads of the local departments of education, as they were in search of evidence to inform their decisions to improve local public ECE provision, reformulate local curricular guidelines, and/or establish their local ECE monitoring systems.

In both cities, the intention was to deliver feedback shortly after each wave of data collection. Workshops, where the main findings of the research were presented and discussed, were arranged, and members of each local Department of Education and school principals were invited to attend. For example, in Rio de Janeiro, we presented the research findings on the following occasions:
1. One smaller meeting with the head of the Department of Education and central staff;
2. Three more extensive workshops, which included central staff of the Department of Education, staff from local authorities and school principals; and
3. One workshop with a task force of the local Department of Education that was in charge of discussing new curricular guidelines for ECE and their integration with those for fundamental education.

The meetings in Rio de Janeiro lasted 3 to 4 hours, allowing the opportunity to present and discuss the research findings in depth with participants. Some findings were the focus of thorough discussion: for example, the pedagogical ladder (the starting point and children's progress), the evidence of a moderate association between children's absenteeism and cognitive development, and the impact observed for class size. The head of the Department of Education was also interested in analysing the impact of full-day/part-time preschool provision to inform his decision on ECE expansion in Rio de Janeiro.

In the city of Sobral, we presented the research findings during three meetings with the head and central staff of the Department of Education and at a workshop on 'Assessment of early childhood education in Sobral', which focused on the pedagogical ladder, and especially on the high starting points of the children (Gois, 2020). For example, discussions during the workshop included the possible association of the starting points with ongoing intersectoral policies developed by the local government. Participants were also interested in the relationship between motor skills and cognitive development.4 During the last meeting, special attention was given to the impact of a specific ECE provision (schools specialised in ECE, called Centro de Educação Infantil), which appeared to have a significant impact on children's development in language and mathematics (Koslinski & Bartholo, 2020). This specific programme/provision also seems better equipped to bridge the gap between children with higher and lower starting points than the regular schools that provide ECE and Fundamental Education in the same building.

On all those occasions, slide presentations were given containing (i) pedagogical ladders5 for reading and mathematics with the aggregate results of all schools; (ii) analyses of the association of behaviour, personal, social and emotional development and motor skills with the cognitive measures; (iii) analyses of the association between family background and cognitive development; and (iv) the impact of school characteristics and policies/programmes on children's progress in language and mathematics.

The research was conducted in 2019, and it is too early to ascertain whether the research evidence and the discussions during the meetings and workshops have supported any decision-making. There was a local election at the end of 2020, which means that the turnover of the head of the Department of Education at the local level is a constant challenge in assuring the continuity of monitoring and of high-quality assessment and research evidence to inform decision making. However, the Department of Education suggested that we write a short report with the research findings to be incorporated in the documents handed to the next government during the transition period. Moreover, the local Department of Education's consent to research in the municipality was extended to include the first semester of the next administration, to ensure that the continuation of the research would not be interrupted.

4. For an example of the evidence on the relationship between motor skills and cognitive development, see Aguiar et al. (2021).
5. For examples of pedagogical ladders, see Chap. 8.


12.4 Concluding Remarks

Over three years (2017–2019), the research team from UFRJ had the opportunity to present the main findings to two heads of local departments of education (a total of 8 meetings). The research findings have been published in research journals and book chapters and have created some interest in the local community. In addition, two national newspapers published five articles presenting iPIPS findings and interviews with the research team in Brazil. This indicates that the evidence has generated interest among governments and community members in the attempt to improve the quality of education in Brazil.

The research team also observed that simply showing modified descriptive data in the reports was not enough to enhance teachers' and principals' understanding and perception of the usefulness of the iPIPS data for pedagogical planning; some training was necessary to embed understanding and empower teachers to use the information. Likewise, a user-friendly presentation of the assessment results seemed more effective in communicating the main iPIPS research findings: for example, instead of showing tables with regression coefficients, bar charts with effect sizes and their interpretation in months of progress were shown. These appeared to be important aspects of communicating the information. Future efforts will include the development of strategies to inform parents/caregivers and a wider audience with other formats of modified evidence and active engagement in knowledge transfer, and the use of more robust research designs to assess the impact of delivering reports and workshops on teachers' pedagogy.
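The 'months of progress' translation mentioned above can be sketched as a one-line conversion. This is illustrative only: the conversion factor (an assumed average gain per school year, in effect-size units) is hypothetical, not the calibration used by the iPIPS team, and such factors vary by age and by measure in practice.

```python
# Convert a standardised effect size into approximate "months of progress".
# ASSUMPTION: pupils gain about 1.0 SD per 12-month school year on the scale
# in question; the true iPIPS calibration varies by age and by measure.
SD_GAIN_PER_YEAR = 1.0

def effect_size_to_months(d: float, sd_gain_per_year: float = SD_GAIN_PER_YEAR) -> float:
    """Express an effect size d as months of typical pupil progress."""
    return 12 * d / sd_gain_per_year

# Example: an effect of d = 0.25 corresponds to roughly three months' progress.
print(f"{effect_size_to_months(0.25):.1f} months")
```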

References

Aguiar, D. K., Tymms, P. B., Koslinski, M. C., Araújo, C. G. S., & Bartholo, T. L. (2021). Cognitive development and non-aerobic physical fitness in preschoolers: A longitudinal study. Lecturas Educación Física y Deportes, 26, 21–42.
Bartholo, T. L., Koslinski, M. C., Costa, M., & Barcellos, T. (2020a). What do children know upon entry to pre-school in Rio de Janeiro? Ensaio: Avaliação e Políticas Públicas em Educação, 28(107), 292–313.
Bartholo, T. L., Koslinski, M. C., Costa, M., Tymms, P. B., Merrell, C., & Barcellos, T. M. (2020b). The use of cognitive instruments for research in early childhood education: Constraints and possibilities in the Brazilian context. Pró-Posições, 31, 1–24.
Gois, A. (2020). Pesquisa mostra que ações na primeira infância impactam positivamente a aprendizagem. O Globo, Rio de Janeiro, 27 July 2020. Available at: https://oglobo.globo.com/sociedade/pesquisa-mostra-que-acoes-na-primeira-infancia-impactam-positivamente-aprendizagem-24551264
Gorard, S., See, B. H., & Siddiqui, N. (2020). What we know already about the best ways to get evidence into use in education. In S. Gorard (Ed.), Getting evidence into education. Routledge.
Koslinski, M. C., & Bartholo, T. L. (2019). Impact of child development centers in the first year of preschool. Estudos em Avaliação Educacional, 30(73), 280–311.
Koslinski, M. C., & Bartholo, T. L. (2020). Inequalities of educational opportunities at the beginning of educational trajectory in the Brazilian context. Lua Nova, 110, 215–245.


Koslinski, M. C., Gomes, R. C., Rodrigues, B. L. C., Andrade, F. M., & Bartholo, T. L. (2022). Home learning environment and cognitive development during early childhood education. Educação & Sociedade, 43, 1–24.
Marsh, J. (2012). Interventions promoting educators' use of data: Research insights and gaps. Teachers College Record, 114(11), 1–48.
Marsh, J., Pane, J. F., & Hamilton, L. (2006). Making sense of data-driven decision making in education: Evidence from recent RAND research. RAND Education Occasional Paper. Available at: https://www.rand.org/pubs/occasional_papers/OP170.html
Mendonça, L. S. F. (2019). Conhecimento e interpretação de dados educacionais: o potencial das devolutivas e da capacitação para a compreensão e utilização de relatórios pedagógicos. Dissertação (Mestrado em Educação). Universidade Federal do Rio de Janeiro, Rio de Janeiro.

Chapter 13

Using Assessment Data to Inform Teaching: An Example from Lesotho

Ajayagosh Narayanan, Christine Merrell, and Davis Pasa

This chapter describes how the iPIPS Programme was introduced in Lesotho and the key findings that emerged. It also explores opportunities to improve the primary education system at large.

13.1 The Education System in Lesotho

The basic goal of the Lesotho Government's policy for basic education is to provide learners with the opportunity to become responsible and respectful global citizens, through quality-assured, free and compulsory primary education that recognises the importance of individual learning processes (MoET, 2005). This aligns with the Millennium Development Goal to achieve universal primary education and the targets for education within Sustainable Development Goal 4 (United Nations, 2015). The Ministry of Education and Training (MoET) is mandated to ensure the provision of quality education to all learners at all levels, including those with diverse learning needs, with an emphasis on marginalised children. However, it is also important to note that Lesotho is classified as a lower-middle-income country according to World Bank data (www.worldbank.org, accessed 23 January 2021).

Teachers constitute the single most important human resource input in Lesotho's education system and account for a large proportion of its public expenditure. Their development through education and training, employment and management has been recognised by the Government as a key priority in the national quest for Education for All. The MoET (2005) thus initiates, recognises and supports all professional development programmes for teachers to enhance their skills for effective classroom teaching and learning.

As per MoET's (2008) plan, an integrated curriculum and assessment system was adopted in 2009. The Curriculum and Assessment Policy 2009 represents the latest education reform, which marks a departure from the subject- and examination-oriented curriculum to a new dispensation wherein the curriculum is organised into learning areas reflecting practical life challenges (Mahao & Raselimo, 2015). These authors further suggest that the new Curriculum and Assessment Policy can be seen as a shift in education policy intentions from an undemocratic and examination-oriented education system to a more process-oriented curriculum, with a greater integration of assessment with teaching and learning. This approach requires professional tools that systematically assess pupils' knowledge.

A. Narayanan (*) · D. Pasa
Seliba Sa Boithuto, Maseru, Lesotho

C. Merrell (deceased)
iPIPS, Durham University, Durham, UK

© Springer Nature Switzerland AG 2023
P. Tymms et al. (eds.), The First Year at School: An International Perspective, International Perspectives on Early Childhood Education and Development 39, https://doi.org/10.1007/978-3-031-28589-9_13

13.2 A Feasibility Pilot of iPIPS

In the light of these challenges, a pilot study was conducted to assess the feasibility of a teacher-administered assessment of Grade 1 pupils (iPIPS) and the usefulness of the data for pedagogy and evaluation. The pilot study involved the cognitive development part of the iPIPS assessment, which was adapted for use in Lesotho. The format of the assessment was a printed book containing the items and instructions for the teacher on administering the assessment (see Chap. 8 for more information and examples of items). The pupils' responses were recorded on a record sheet. The items in each section of the assessment were organised in order of increasing difficulty, and the assessment record sheet was designed in such a way as to give teachers immediate information about each pupil's developmental level for each section of the assessment. An example sheet is given below (Fig. 13.1). The sheet can be used to assess the vocabulary levels of five pupils in the local language, Sesotho. The teacher starts at the bottom of the list and works up one item at a time, referring to pictures in the booklet, recording right or wrong in the rectangles, and stopping when the child has made four mistakes. At the end of the year, the circles are filled in, indicating the child's progress.

Fig. 13.1  Example of score sheet for vocabulary

The initial stage of the pilot study, undertaken during 2017–18, was for the Lesotho project coordinator (Dr Ajayagosh Narayanan) to establish an action committee to oversee the cultural adaptation and translation of the assessment into Sesotho, to recruit schools, and to run training sessions and workshops throughout the duration of the study for the participating teachers and school principals. The Lesotho team was coordinated by Drs Narayanan and Mathot, who selected representatives from the Ministry of Education and Training (MoET), the Lesotho College of Education (LCE), primary school principals, and primary and secondary school teachers. Twelve primary schools were invited to participate.

The first training session for Grade 1 teachers and school principals was held in January 2018, at which the iPIPS assessment and the rationale for the study were introduced. A second training session was held in December 2018, just prior to the administration of the assessments of pupils starting Grade 1 in January 2019. Teachers were shown how to conduct the assessment and use the assessment record sheets. This involved:
• Role playing by the teachers: conducting the assessment with another teacher to practise, going through the items and using the assessment record sheet.


• Five Grade 1 pupils were invited to the session and were assessed by two teachers using the assessment record sheet while the others observed.
• It was also important to track the progress of the assessment. Therefore, before conducting the final assessment, we suggested that each participant practise using the record sheet with one pupil and share the result with the coordinator, who monitored the initial record. If the work was satisfactory, the final assessment was completed.

Twelve schools from the Berea and Maseru districts were selected as the participating schools, from each of which one Grade 1 teacher was selected by the school administration to be part of the study. These teachers were invited to a single-day workshop, where the team discussed the assessment book (Lesotho version) and the evaluation tool in detail. At the end of the workshop, role play was conducted so that teachers could practise how to evaluate the pupils using the tool. For the pilot study, 15 pupils per school were selected at random by each teacher to be assessed, and permission was sought from their parents/guardians. By conducting the assessment themselves, the teachers could see and experience the pupils' successes and areas of difficulty during the assessment, and afterwards had a detailed profile of what each pupil knew and could do.

A further workshop was conducted in June 2019, after the completion of the start-of-Grade-1 assessments, to discuss how to interpret the results and ways in which they could be used to inform pedagogy. The final workshop for Grade 1 teachers and school principals was held in November 2019, immediately after the repeat administration of iPIPS at the end of Grade 1, to discuss how the results could be used to inform pedagogy as pupils moved up to Grade 2 and also to reflect on the effectiveness of pedagogic practice in Grade 1. This was an extensive training programme which went beyond simply learning how to conduct the assessment with Grade 1 pupils: the Lesotho team worked with teachers and school principals throughout the year, providing training in the use of assessment data and linking them directly to pedagogy.

One limitation of this study is worth pointing out: the data were analysed by iPIPS coordinators from Durham University. Our team, however, made a preliminary analysis of the data with the support of the participating teachers, who shared their experience as well. At the same time, the findings were shared with the participants in detail, with a focus on how to improve teaching methodologies in the classrooms.
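The administration rule described for the record sheet above (Fig. 13.1) – start with the easiest item, work upwards, stop after four mistakes – can be sketched as a small routine. This is a hedged illustration under the stated rule; the actual iPIPS administration and scoring procedures are more detailed.

```python
def administer_section(items, child_answers_correctly):
    """Administer one iPIPS-style section: start with the easiest item,
    work upwards, and stop once the child has made four mistakes.

    items                   -- list of items ordered from easiest to hardest
    child_answers_correctly -- callable returning True for a correct answer

    Hypothetical sketch of the stopping rule described in the text.
    """
    results, mistakes = [], 0
    for item in items:
        correct = child_answers_correctly(item)
        results.append((item, correct))  # the teacher's right/wrong rectangles
        if not correct:
            mistakes += 1
            if mistakes == 4:            # stopping rule: four mistakes in total
                break
    return results
```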

13.3 Using iPIPS in Schools

During the final workshop, held immediately after the second assessment at the end of Grade 1, teachers and school principals were asked whether they had found iPIPS valuable and how they had made use of the information about their pupils. A questionnaire was given to the participants, which they duly completed. In addition, they shared their experience orally; these accounts were recorded by the Durham team.

All Grade 1 teachers who used iPIPS reported that they had found the experience valuable and that it had changed their attitudes towards their pupils. They explained that, on the basis of obtaining individual, detailed profiles of their pupils' cognitive abilities, they now viewed each pupil individually in accordance with their ability and were able to help them make progress from those levels in a way that they had not done previously, in the absence of such information. The iPIPS data had illustrated to them that pupils are not all the same in their rates of progress: even if they come from the same local area, they clearly have different levels of development at the start of school, and during their year in Grade 1 they learn at different rates. They realised that the influence of different home environments is a factor that needs to be taken into consideration in their teaching.

When asked if they found any of the pupils' results surprising, many of the teachers commented that they were surprised at the advanced levels of some of their pupils, who were able to answer questions that the teachers would previously have assumed were too difficult for them. The way in which some pupils answered questions suggested a high level of ability in their thinking and, because of the iPIPS data, one teacher reported that she had changed her method of teaching, especially in reading. The iPIPS information clearly had an impact on teachers' expectations of their pupils and caused them to alter their pedagogy to be more aligned with their pupils' actual abilities rather than with what they supposed children could do upon entry to school.

Some school principals reported that whilst their Grade 1 teachers were conducting the iPIPS assessments, other teachers helped with activities such as looking after pupils in the class. Teachers in the higher grades were interested in the assessment and created their own assessments to discover more about the abilities of their own pupils at the start of the year. This suggests that the good practice of conducting a baseline assessment to inform pedagogy was felt to be an important tool for all grades: it can enhance pupils' learning, provide information to enable individual rather than generalised intervention, and help with placing pupils appropriately for group work activities.

Below, the acting principal of one primary school describes in more detail how his school participated in the pilot and explains how he and his Grade 1 teachers used the iPIPS resources to inform pedagogy and school policy.

I signed up to participate in the iPIPS pilot study in 2017. My school was one of the two that took part in what we called the litmus test. This was a very early-stage trial of the assessment to gauge its suitability for children starting school in Lesotho and to inform the way in which it needed to be adapted. We also participated in the full pilot of the iPIPS assessment in Lesotho. The litmus test and the pilot study revealed that our learners did well in vocabulary, reading and ideas about mathematics. This motivated me to see to it that we maintained, supported and improved on our learners' strengths. Based on the study, I approached a non-government organization and requested donations in the form of reading materials.
Not only did they donate books but they also trained my teachers to establish reading clubs. Teachers have used the iPIPS information to improve their instruction skills. They began having high expectations of their pupils in Grade 1. We were able to see whether learners were progressing as expected or falling behind. An eye opener was that the classroom arrangement plays a major role in the learning process. We found that learners must be seated in such a way that they learn from each other. Our classroom seating arrangement is now a round table model, whereby low achievers learn from high achievers. We also sat down as a staff and developed school-based policies, one of which is a promotion policy. According to this, a pupil must be able to read, write and compute so that he or she can go to the next grade. We have also used the iPIPS assessment as a diagnostic assessment when admitting transfers into our school. I plan to continue using the iPIPS assessment to locate learners so as to meet their learning needs. On a personal level, the project motivated me to explore new skills as a teacher and administrator.

13.4 Next Steps in Lesotho

The iPIPS assessment fills a gap in Lesotho's primary school assessment system, which at present does not include a formal assessment for teachers to use with their pupils on entry to school. This pilot study, conducted in 2019, found that it was feasible for Grade 1 teachers to use iPIPS with their pupils and that the profiles of cognitive ability influenced their expectations and pedagogy. However, teachers assessed only a sample of their pupils for this study, and although many expressed a desire to assess all pupils in their classes, this would need wider school support of the kind shown in some schools, where other class teachers stepped in to help while the assessments were taking place in Grade 1. There was a call from those participating in the pilot study to expand the scale of iPIPS to include more schools, since they had found it to be a valuable tool. There has been support for iPIPS from both the Ministry of Education and Training and the Lesotho College of Education, and the Seliba Sa Boithuto Trust has committed to coordinating the activities involved in scaling up to make iPIPS available to more schools in the years to come.

As noted at the beginning of this chapter, teacher training is recognised as a priority in improving the quality of basic education. The teachers involved in this study attended training and workshop sessions throughout the year, which enabled them to gain a thorough understanding of the assessment, learn to interpret the results and use them to inform their pedagogy. A further valuable aspect of these sessions was the chance to share their experiences of conducting the assessment, the practical challenges, and the ways they used the results, helping to foster an environment of shared ownership. If iPIPS is to be sustainable and more widely adopted, as has been seen in other countries, it is crucial that it is valued by those who use it, supported by the Ministry, and incorporated in teacher training by the Lesotho College of Education, rather than becoming something imposed upon schools.


References

Mahao, M., & Raselimo, M. (2015). The Lesotho curriculum and assessment policy: Opportunities and threats. South African Journal of Education, 35(1), 1–12. http://www.sajournalofeducation.co.za/
MoET. (2005). Education sector plan 2006–2016. Ministry of Education and Training, Kingdom of Lesotho.
MoET. (2008). Curriculum and assessment policy – Education for individual and social development, June 2008. Ministry of Education and Training, Kingdom of Lesotho.
United Nations. (2015). Transforming our world: The 2030 Agenda for Sustainable Development, A/RES/70/1. www.sustainabledevelopment.un.org. Accessed 18 May 2020.
World Bank. www.worldbank.org. Accessed 23 Jan 2021.

Chapter 14

The Use of iPIPS Data for Policy Assessment, Government Evidence-Based Decision Making and Pedagogical Use by Schools in Russia

Alina Ivanova

The chapter describes the use of the international Performance Indicators in Primary Schools (iPIPS) study in Russia. The feedback forms and the instrument were gradually developed to better meet the needs of teachers and other users of the iPIPS data.

14.1 Introduction

The development of the Russian version of the iPIPS baseline and follow-up assessment was initiated by the Institute of Education, National Research University Higher School of Economics, in 2013. Since then, every year at least one regional sample of Russian first-graders has been assessed using iPIPS. During this period, the format of the assessment has developed, starting with a booklet of items and an app to record the results, through to a computer-delivered adaptive assessment. A number of regions in Russia received comprehensive feedback on what children know and can do at the start of schooling and on what progress they made during their first year in school. Several sets of large longitudinal data were used for research purposes, including grant work for the Russian Science Foundation. Based on the iPIPS theoretical framework and design, a Russian instrument called PROGRESS was developed for Grades 3 and 4. Finally, using fundamental ideas of iPIPS, an instrument called START was developed for Grade 1.

A. Ivanova (*)
National Research University Higher School of Economics, Moscow, Russia
e-mail: [email protected]

© Springer Nature Switzerland AG 2023
P. Tymms et al. (eds.), The First Year at School: An International Perspective, International Perspectives on Early Childhood Education and Development 39, https://doi.org/10.1007/978-3-031-28589-9_14

The first pilot assessment, serving as a feasibility study, was conducted in autumn 2013. The Russian sample consisted of 310 children recruited from 21 classes in 21 schools in the Velikiy Novgorod region, located in the central part of Russia. This region was selected because its socio-economic characteristics were similar to those of the country as a whole, based on the 2010 census (Social and demographic portrait of Russia, 2010). For example, the distribution of the region's population by educational level (62 percent college and above; 30 percent high school; 8 percent below high school) was similar to the national figures (65 percent college and above; 29 percent high school; 6 percent below high school), as was the ratio of urban to rural pupils in the region (72 percent urban; 28 percent rural). The school sample was randomly selected after stratification on two parameters: (i) the school location (rural or urban area) and (ii) the status of the school; there are three main types of schools in Russia: comprehensive (general regular) schools, schools specialising in a certain subject, and gymnasia. All the chosen schools consented to participate. Parental consent was obtained for children to participate in the study; the majority of parents, around 90 percent, gave permission for their children to participate.
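To illustrate the sampling scheme just described, here is a minimal sketch of stratified random selection of schools. It is illustrative only: the data structure and the per-stratum sample sizes are hypothetical, not the actual sampling frame used in Velikiy Novgorod.

```python
import random
from collections import defaultdict

# Hypothetical sampling frame: (school_id, location, school_type)
frame = [
    ("school_01", "urban", "comprehensive"),
    ("school_02", "rural", "comprehensive"),
    ("school_03", "urban", "specialised"),
    ("school_04", "urban", "gymnasium"),
    # ... remainder of the regional frame
]

# Group schools into strata defined by location x school type
strata = defaultdict(list)
for school_id, location, school_type in frame:
    strata[(location, school_type)].append(school_id)

# Draw a simple random sample within each stratum (sizes are illustrative)
random.seed(2013)
sample = []
for stratum, schools in strata.items():
    n = min(2, len(schools))  # e.g. up to two schools per stratum
    sample.extend(random.sample(schools, n))

print(sample)
```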

The assessment cycle included two waves of data collection: the baseline assessment was conducted at the beginning of October (children in Russia start the first grade on 1 September) and the follow-up assessment was conducted at the end of April. The average age of children in the sample was around 7.4 years in October. Further cycles used more or less the same scheme of assessment procedures (including the dates of the assessments) and sample construction, and covered several central, southern and eastern (Siberian) regions of Russia. The only exception is a series of assessment cycles in Moscow, which included a non-random sample of around 10–15 schools, as well as private schools in some cases. The table in the appendix to Chap. 1 summarises the main assessment cycles across years and regions. From 2013 to 2018, several regions of Russia took part in iPIPS baseline and follow-up assessment cycles, including Moscow, the Tatar republic, Krasnoyarsk, Sevastopol, Tambov and Velikiy Novgorod, as well as several schools in Petrozavodsk and Ekaterinburg. In total, nearly 23,000 pupils were assessed. The assessment has been refined over time to reflect the specific local pedagogical requests of Russian teachers and regional educational authorities and is now called START. Additionally, based on the iPIPS theoretical framework and design, a Russian instrument called PROGRESS was developed for Grades 3 and 4.

14.2 Use of Assessment Results by Teachers, Head Teachers and Parents

The main formats for presenting data were reports and workshops. Before teachers and head teachers received their reports (after each assessment stage), workshops were conducted with participating school staff. The main goal of the workshops was to explain to teachers how to use the reports. Workshops were hosted both offline and online.

There were different types of reports for teachers, principals and parents. Teachers were provided with comprehensive reports containing both aggregated class results and individual pupil results. Head teachers received reports at the school level, including comparisons of class average scores in mathematics and reading. Parents were provided with their child's individual results. Samples of reports and videos of workshops were published on the project website.

Teachers were provided with two types of report: a main report with the aggregated class data and a report with individual pupil results (a one-page report for each pupil). The main report included two parts: the first with results on vocabulary, phonological awareness, mathematics and reading, and the second with the Personal, Social and Emotional Development (PSED) survey results. The report contained general interpretations of the results, and general recommendations followed each part. Additionally, parents and teachers received one-page reports with individual pupil results in reading and mathematics, as well as on two survey-based scales (called Rules and Communication), in a format similar to a pedagogical ladder. The principals' report provided aggregated results of the cognitive test at the school level. The Higher School of Economics (HSE) Centre for Psychometrics and Educational Measurement provided feedback to each school coordinator. The school coordinator then passed the teacher and parent reports to the classroom teachers, and parents received their reports from the teachers.

In the Krasnoyarsk region, the research team worked with a group of teachers to investigate how primary school teachers face the challenge of teaching heterogeneous classes. Specifically, they looked in detail at how the teachers differentiated their pupils and implemented differentiated instruction within their pedagogy. Differentiated instruction is a time-consuming process, and the reliable, valid iPIPS data could be a time-saving strategy. Based on the iPIPS cognitive and non-cognitive measures (reading and mathematics, plus two PSED measures: children's ability to follow rules and their communication skills), we constructed four clusters to which pupils could belong, and teachers in Krasnoyarsk were provided with this information. Working personally with individual teachers, we found that teachers usually also form groups of pupils within the classroom, based on their observations and overall teaching experience. We discovered that the groups of pupils identified by teachers did not always match the clusters identified by the iPIPS results. After discussions of what iPIPS is measuring, many teachers accepted the possibility of bias in their own judgements of the cluster to which a pupil might belong, and agreed that they could use the iPIPS clusters as objective and reliable information.

Based on our work with teachers, we concluded that the iPIPS data can satisfy their needs because it is not limited to particular educational plans or standards. The iPIPS assessment consists of items of increasing difficulty (for example, multi-digit numbers, or choosing the right word form in context in the reading passages). If a pupil cannot correctly answer these items, this is not considered a negative result; such difficult items help the teacher to assess well-prepared children and their learning interest. While working with the assessment feedback, and especially with the information on pupil clusters, it is possible to see the areas for development for the whole class, to estimate where teaching methods and techniques need to be focused, and to select a portfolio of educational activities for pupils.
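To illustrate how pupils might be assigned to four such clusters from the four measures, here is a minimal sketch. It is an assumption-laden illustration: the chapter does not state the clustering method, so k-means on standardised scores, and the pupil data below, are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical pupil-level scores: reading, mathematics, Rules, Communication
scores = np.array([
    [55, 60, 52, 58],
    [42, 45, 48, 44],
    [60, 58, 61, 63],
    [38, 41, 40, 39],
    [50, 49, 55, 51],
    [47, 52, 43, 46],
])

# Standardise the four measures so no single scale dominates the distances
X = StandardScaler().fit_transform(scores)

# Partition pupils into four clusters, as described in the text
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(labels)  # cluster membership for each pupil
```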
We had several discussions with participating teachers during the iPIPS project in Russia. Firstly, the reports were rather simple, containing basic tables with pupils' general results (on a 100-point scale with a mean of 50 and an SD of 10), as well as some diagrams illustrating, for example, the fall and spring results of individual pupils within classrooms.
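The reporting scale just mentioned (mean 50, SD 10) is a standard T-score transformation. A minimal sketch, assuming raw scores are standardised against the sample mean and SD (the actual iPIPS scaling model, e.g. an IRT-based one, may differ):

```python
import statistics

def to_t_scores(raw_scores):
    """Map raw scores onto a T-score scale: mean 50, SD 10."""
    mu = statistics.mean(raw_scores)
    sd = statistics.stdev(raw_scores)
    return [50 + 10 * (x - mu) / sd for x in raw_scores]

# Example: a pupil one SD above the sample mean gets a T-score of 60.
print(to_t_scores([10, 12, 14, 16, 18]))
```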

After the first discussion, additional information was included in the reports: for example, bar charts for each block of items (such as rhymes, letters and words) showing the number of pupils in the class able to accomplish each item in the block, and additional textual information, based on general pedagogical recommendations, on how to interpret and use the information presented in the report when planning the educational process (see Fig. 14.1).

In close cooperation with the regional centre for education quality evaluation (Krasnoyarsk, Russia), we developed a brochure to help teachers better understand the iPIPS instrument and our feedback, to help them form personal strategies for using the report information, and to share some real-life cases of how teachers used the information to ascertain classroom and individual pupil strengths and weaknesses. The brochure was uploaded to the project website and teachers were provided with a link. After another important discussion, we developed a one-page report for teachers containing individual pupil results on the four measures mentioned above (Reading and Mathematics, as well as the two PSED scales, Rules and Communication) in a format similar to the pedagogical ladder, with an indication of the class average on each scale. Parents were also provided with a one-page report with only their child's results (Fig. 14.2). As teachers mentioned in the discussion and in several subsequent workshops on the use of assessment results, this form of feedback was the most important, convenient and useful.

14.3 Informing Policy Decision-Making

In each region, a comprehensive report was given to schools, teachers and parents. Schools were provided with feedback within two weeks of the last day of data collection. Only teachers, parents and the school coordinator had access to the individual data; regional authorities received an analytical report with aggregated data only.

In 2014–2015, the iPIPS assessment was carried out in the Krasnoyarsk region with a sample of about 1500 pupils. The Krasnoyarsk Regional Centre for Education Quality Evaluation was the regional coordinator of the assessment. The Centre's specialists reported that one of the most important directions in the development of the regional assessment system in the Krasnoyarsk region is the assessment of the individual educational achievements of pupils. This approach shifts the emphasis from assessment as control (accountability) to assessment for learning. Analysis and interpretation of such data are aimed at supporting the educational progress of a particular pupil and at planning the development of the educational organisation. This approach forms the basis of the federal educational standards for primary schools that have been in place in Russia since 2014.


Fig. 14.1  Snippet from the teacher report of the vocabulary section (in Russian)

The key problem in implementing assessment for learning is the inconsistency of mass teaching practice in educational institutions. The teaching and learning process is guided by the goals of education, which include the federal educational standards, as stated in the strategic documents of the Ministry of Education and the Government. Many teachers do not understand the new types of pupil educational results or how to work towards their achievement. In a traditional situation, teachers interpret the results of subject tests to determine the skills that the class has not mastered, and this informs further work. However, today teachers are required to work with the subject, meta-subject and personal results of each pupil.

[Figure 14.2 (below) reproduces the one-page parent report in Russian: a short explanation of iPIPS, pedagogical ladders for reading and mathematics, scales for classroom behaviour and self-confidence, and an arrow marking the child's position on each scale.]

Fig. 14.2  One-page report for parents

This means that it is necessary to analyse the relationships among these educational results and their mutual influence, while also taking the context into account. Teachers also need to learn how to 'read' the feedback on testing, which is often presented in an unfamiliar way, and present it to different agents (pupils, parents) in a logical and clear form. This is why schools and teachers need to work with new assessment tools that address these requirements.

In the schools that conducted iPIPS testing, a series of discussions was launched among participating teachers in order to analyse the results of each stage of diagnostics. This work was carried out by the Krasnoyarsk regional centre for education quality evaluation. To assess the individual progress of each pupil, the specialists of the Centre, in cooperation with teachers, discussed the following questions:
• What is the pupil able to do at the moment?
• Which of these skills can become a resource for his or her further development?
• How are problems in cognitive development related to social and emotional development?
• What is the forecast regarding his or her educational advancement?
• What recommendations can be given for the pupil's successful training?

The Centre's specialists note that it turned out to be effective to identify groups (clusters) of pupils with similar results (based on a combination of subject results with successes or difficulties in social and emotional development). Those teachers who, based on the analysis of the data from the first stage of diagnostics, planned their
work with each group of pupils, took into account pupil resources (what they are successful in) and deficits (what needs to be worked on), and observed an increase in the educational achievements of each pupil. Finally, the Centre's experts believe that a thorough analysis of the results at the regional level will also help to identify priority areas for change at the preschool and school levels. For example, the iPIPS assessment has shown that it is necessary to adjust educational policy concerning the preschool education system: greater emphasis is needed on the development of the child's mental functions, such as attention, the emotional-volitional sphere and experience of interacting with peers.

Chapter 15

Using Data to Inform Teaching: An Example from the Western Cape, South Africa

Christine Merrell

The chapter describes the strategies undertaken in the Western Cape, South Africa, to enhance the use of iPIPS data to inform teaching.

15.1 The Education System in the Western Cape and the iPIPS Project

The official school starting age in South Africa is 7, although many children enter at a younger age: at 4–5 years into Grade R, which is becoming more widespread, and at 5–6 years into Grade 1 (Department of Basic Education, South Africa, 2020). The iPIPS project in the Western Cape of South Africa took place in 2016 and involved a sample of 3000 pupils: 1000 pupils each were assessed in representative samples of Afrikaans-, English- and isiXhosa-medium schools. Schools offering education in the three media were chosen at random, then classes within those schools were chosen at random and finally, since class sizes could be large, 25 pupils were chosen at random within each school to participate. Trained researchers assessed the pupils' cognitive development, whilst class teachers were asked to rate their pupils' personal, social and emotional development and behaviour. Ratings of attitudes to learning were provided by the learners themselves. For more information about the project in the Western Cape, see Chap. 8 and Tymms et al. (2017).
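As an illustration of the multistage design just described (random schools within each language medium, a random class within each school, then up to 25 pupils), the sketch below draws such a sample; the sampling frame, function name and the decision to draw pupils from the chosen class are illustrative assumptions rather than the project's actual code.

import random

random.seed(42)

def draw_sample(schools, n_schools, pupils_per_class=25):
    """schools maps school_id -> {class_id: [pupil_ids]} for one language medium."""
    sampled = {}
    for school_id in random.sample(sorted(schools), n_schools):
        class_id = random.choice(sorted(schools[school_id]))   # one class per school
        pupils = schools[school_id][class_id]
        k = min(pupils_per_class, len(pupils))                 # cap large classes at 25
        sampled[(school_id, class_id)] = random.sample(pupils, k)
    return sampled

# Hypothetical frame: three schools, each with two classes of 30 pupils.
frame = {s: {c: [f"{s}-{c}-p{i}" for i in range(30)] for c in ("A", "B")}
         for s in ("school1", "school2", "school3")}
print(draw_sample(frame, n_schools=2))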

C. Merrell (deceased)
Durham University, Durham, UK


Participating schools were provided with assessment results from the cognitive development and attitudes to learning parts of the assessment. Each school report contained a description of the assessment, and a 'ladder' for literacy and one for numeracy showing the percentage of pupils in the school at each level of development at both the start and end of the year (for examples of the ladders, see Chap. 8). The reports also gave comparisons of the mean percentage of pupils at each level of the ladder against the averages for the district, the province and other schools teaching in the same language; a summary of pupils' attitudes to learning for the school; individual pupil scores; and recommendations for teaching and learning strategies based upon the school's results.
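As an illustration of the ladder summaries described above, the sketch below computes the percentage of pupils at each developmental level at the start and end of the year; the level labels and pupil counts are invented for the example.

from collections import Counter

LEVELS = ["Getting to know letters", "Knows letters", "Reads words",
          "Reads sentences", "Reads short texts"]

def ladder_percentages(pupil_levels):
    """pupil_levels: one level name per pupil; returns the percentage at each level."""
    counts = Counter(pupil_levels)
    n = len(pupil_levels)
    return {level: round(100 * counts.get(level, 0) / n, 1) for level in LEVELS}

start = ["Knows letters"] * 14 + ["Getting to know letters"] * 8 + ["Reads words"] * 3
end = ["Reads words"] * 10 + ["Reads sentences"] * 9 + ["Knows letters"] * 6

print("Start of year:", ladder_percentages(start))
print("End of year:  ", ladder_percentages(end))

The same function, run on district- or province-wide data, would yield the comparison percentages that the school reports placed alongside each school's own figures.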

15.2 Using iPIPS in School

After the data had been collected in 2016 and reports provided to schools, a series of workshops for teachers and school principals took place in March 2017. The purpose of the workshops was to find out how the assessment results had been used in schools and to provide further training on interpreting the results and linking them to teaching and learning. As discussed elsewhere (see, for example, Chap. 12), we have found that training seems to be a helpful addition to assessment results alone. Of the 112 schools that participated in the project, 39 sent representatives to the workshops, with a total of 60 attendees. Although the research team felt that they had presented the results from the iPIPS assessments in a simple, user-friendly format, seven of the 60 attendees felt that some aspects of the information were difficult to understand. This might be dismissed as a small minority, but it illustrates an important lesson, reflected in the literature, about providing feedback that can be used practically by all teachers in a system. Teachers need to be equipped with skills to understand the purpose of assessments, what they are measuring and how the results can be translated into improved instructional practices (Brookhart, 2011). Datnow and Hubbard (2016) reported that teachers vary in their abilities to use assessment data effectively, with many feeling unprepared. Furthermore, they suggest that teachers vary in their beliefs about the usefulness of various types of assessment data. They recommend extensive training for teachers in assessment literacy, beginning in pre-service training programmes and continuing throughout teachers' careers.

During the workshops, teachers talked about the everyday challenges that they face, including shortages of resources and the impact of gang violence on both teachers and pupils. This gives context to the environments within which some of them work. At the end of the workshop, all teachers reported finding the school reports useful and all but two said they would be using the information to inform their practice. One month later, a random sample of 12 teachers was surveyed about actions that they had taken following the workshop. All said that they had read the school reports thoroughly. However, one teacher reported finding it difficult to understand some aspects of the report, illustrating the need for ongoing training and for building a shared understanding within the school. All the schools
sampled had discussed the findings with school staff and had decided to devise an action plan to improve pupils' educational performance. One respondent stated that their school was planning to start with an action plan "to inform the parents about the results and ask them to be more involved in their kids' study. Interventions with teachers to do activities according to learners' different groups." Further, five of the schools indicated that they had already started new educational programmes or interventions. Half of the respondents indicated that they intended to meet regularly as a staff to discuss learner results and determine a way forward. One of the respondents described a meeting with colleagues to improve teaching and learning: "We decided to meet with the foundation phase teachers quarterly to discuss the results of the learners and ways forward. We are planning to have a quarterly workshop with other two schools to plan ahead to their previous considerations."

These comments suggested that the iPIPS data were useful to teachers and that actions had been taken on their basis. However, the workshops appeared to be an important part of 'getting to know' the data and understanding its potential for practice. It was most encouraging to hear that, in some instances, teachers were forming learning communities to collectively discuss the next steps for their schools at a strategic as well as an individual pupil level.

Acknowledgement Professor Sarah Howie (Stellenbosch University, formerly the Centre for Evaluation and Assessment, University of Pretoria), Celeste Combrinck, Karen Roux and Gabriel Mokoena (Centre for Evaluation & Assessment, University of Pretoria) compiled the report of feedback from the workshops and subsequent interviews with the sample of twelve schools.

References

Brookhart, S. M. (2011). Educational assessment knowledge and skills for teachers. Educational Measurement: Issues and Practice, 30(1), 3–12.
Datnow, A., & Hubbard, L. (2016). Teacher capacity for and beliefs about data-driven decision making: A literature review of international research. Journal of Educational Change, 17, 7–28. https://doi.org/10.1007/s10833-015-9264-2
Department of Basic Education, South Africa. (2020). Admission of learners to public schools. https://www.education.gov.za/Informationfor/ParentsandGuardians/SchoolAdmissions.aspx. Accessed 20 Aug 2020.
Tymms, P., Howie, S., Merrell, C., Combrinck, C., & Copping, L. (2017). The first year at school in the Western Cape: Growth, development and progress. Project report funded by the Nuffield Foundation. https://www.nuffieldfoundation.org/sites/default/files/files/Tymms%2041637%20-%20SouthAfricaFinalReport%20Oct%202017.pdf. Accessed 30 July 2020.

Chapter 16

iPIPS Research Evidence: Case Studies to Promote Data Use

Mariane Koslinski and Tiago Bartholo

The chapter summarises and discusses the strengths and limitations of the strategies undertaken by the iPIPS projects in Brazil, Lesotho, Russia and South Africa to disseminate research findings and promote data-based decision-making.

M. Koslinski (*) · T. Bartholo
Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
e-mail: [email protected]; [email protected]

16.1 Introduction

iPIPS research projects have produced a unique set of evidence about what children know when they start compulsory education and the progress they make in the first year. This information has enabled researchers to develop strategies to promote data use by different stakeholders, such as national and local governments, principals, teachers and parents. Transforming evidence and data produced by research or monitoring systems into effective data use that influences policy decision-making or informs pedagogical/instructional planning is a challenge that has been widely discussed in the literature on data-driven decision making (DDDM) or data-based decision making (DBDM) in education (Mandinach & Gummer, 2016; Marsh, 2012; Marsh et al., 2006; Schildkamp et al., 2017).

This chapter presents key issues raised by this literature regarding the best formats for presenting research evidence and the skills required for effective data use and, thus, for promoting school improvement. Next, based on the DDDM/DBDM literature,
the chapter sums up the strategies undertaken by the iPIPS projects in Brazil, Lesotho, Russia and South Africa, reported in Chaps. 12, 13, 14, and 15, both to disseminate research findings to national and local government and to promote data-informed instructional planning by teachers at the school level. Finally, the chapter discusses the strengths and limitations of the strategies and ways forward to improve and evaluate the effort to enhance data use within iPIPS projects.

16.2 Promoting Evidence into Use: Data Formats and Transfer

Public education is far too important to be guided by low-quality research. High-quality evidence can lead to substantial gains for pupils (Schmidt et al., 2001), as it can prevent ineffective programmes from spreading and wasting large amounts of taxpayers' money. Knowing what does not work is useful and sometimes undervalued. Moreover, the availability of good-quality data can also support teachers' instructional planning. When high-quality evidence indicates the potential effect of a programme, regardless of the outcome itself, policymakers are assisted in making informed decisions that may enhance the quality and equity of the educational system.

Evidence-based decision-making faces two key challenges: enhancing the quality of the research and data produced, and getting them into use. Some factors can lead to better research evidence, such as more open official data collected by governments, and funders and governments pushing for higher-quality research with robust designs to enhance knowledge about causal relationships. There is a growing body of research in the education field that seeks to understand the most effective approaches to getting high-quality evidence and data into use. For instance, Gorard et al. (2020) reviewed studies across areas of public policy, including but not exclusively education, focusing on the most prominent approaches to getting high-quality research findings into use. The studies report different outcomes of attempts to encourage evidence use, such as altering stakeholder attitudes or knowledge about research evidence, changes in behaviour and practice, and improvements in pupils' test scores. They summed up the different approaches to disclosing and transferring knowledge and promoting evidence use along two dimensions: type of evidence and form of transfer. The type of evidence can range from plain or raw evidence (research reports and articles) up to highly modified evidence, engineered into an artefact that is easier for policymakers or practitioners to understand and use. The strategies researchers adopt to transfer knowledge can also vary, from passive transfer (simply making the evidence available) to more active engagement through an intermediary explaining results, or an interactive approach in which users are involved in different phases of the research.

The Gorard et al. (2020) literature review points out that presenting high-quality engineered evidence in a more usable format, combined with active types of transfer, is more likely to succeed. On the other hand, providing raw or slightly simplified evidence
is not an effective approach, as it requires more skill from policymakers and practitioners to understand and use the evidence produced. Moreover, the authors point out that involving users in the research is a promising approach.

There is also a growing body of literature that focuses on schools' data-driven decision-making (DDDM) or data-based decision making (DBDM). The DDDM/DBDM literature in education indicates that schools have experienced a context of access to a growing amount of data (including input, process and, primarily, output/assessment data) and external pressure for its use. However, the educational data available are still underused: this context does not guarantee data use to inform practice and/or improve learning (Marsh et al., 2006; Schildkamp et al., 2017). The DDDM/DBDM literature discusses the conditions, such as data formats, teachers' motivation and skills, and school characteristics, that are more conducive to decision making (Mandinach & Gummer, 2016; Marsh et al., 2006; Schildkamp et al., 2017).

Data are useful when they are not only of high quality but also perceived as accurate and up-to-date. Factors such as accessibility, usefulness, quality, frequency, timeliness, safety and confidentiality are frequently pointed out. The available data are most useful when easy to find, retrieve and understand; they are particularly relevant and valuable for understanding pupil learning, progress and growth, as they can inform the development of lesson plans. Longitudinal data and value-added measures therefore make test scores more informative, as they allow teachers to observe growth and progress in pupil achievement (Schildkamp et al., 2017). Timeliness, or the time elapsed before data are available, is another critical aspect. For example, assessment data only disclosed in the following academic year might not be perceived as helpful in informing meaningful decision-making (Marsh et al., 2006). Finally, data disclosure requirements include norms and structures promoting the safety and confidentiality of data (Marsh, 2012).

Significantly, the effective use of data requires teachers' skills, knowledge and dispositions. The DDDM/DBDM literature describes a cycle for effective data use that involves problem formulation skills, the capacity to collect, analyse/synthesise and interpret data, and the ability to take action and come up with an adequate solution (Marsh et al., 2015; Mandinach & Gummer, 2016; Schildkamp et al., 2017). The first set of skills includes identifying a problem and setting a goal. The second set includes not only the ability to read and interpret tables and charts, but also to understand and recognise data quality criteria (e.g., reliability, missing data, identification of data being used in misleading ways) and to draw correct conclusions (Carlson et al., 2011; Schildkamp et al., 2017). Finally, the third set of skills requires 'pedagogical content knowledge' or 'pedagogical literacy', that is, skills to choose excellent and effective practices and resources to address the problems identified after data analysis (Mandinach, 2012; Marsh et al., 2006; Schildkamp et al., 2017; Schildkamp & Poortman, 2015). Such pedagogical literacy skills allow actions such as changing the curriculum, reallocating resources, targeting pupils who need assistance, determining learning needs and pacing lessons, among others (Marsh et al., 2006; Schildkamp et al., 2017).
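As a hedged illustration of the value-added measures mentioned above, the sketch below regresses end-of-year scores on start-of-year scores and treats the residual as each pupil's progress beyond what the baseline predicts; the scores are invented, and this is only one simple way of defining value added.

import numpy as np

start = np.array([42.0, 55.0, 48.0, 61.0, 39.0, 52.0])   # autumn scores
end   = np.array([50.0, 58.0, 57.0, 66.0, 41.0, 62.0])   # spring scores

# Fit end = a + b * start by ordinary least squares.
b, a = np.polyfit(start, end, deg=1)
predicted = a + b * start
value_added = end - predicted   # positive: more progress than the baseline predicts

for s, e, v in zip(start, end, value_added):
    print(f"start={s:5.1f}  end={e:5.1f}  value-added={v:+5.1f}")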


To sum up, effective data use by teachers and principals requires specific data formats and a significant range of skills. Part of the DBDM/DDDM literature therefore focuses on the impact of interventions (workshops, training programmes, professional development) on outcomes related to changes in teachers' attitudes, beliefs, knowledge, skills and practice and, less frequently, on pupil outputs (Marsh, 2012). For example, some of the literature identifies a bottleneck in using data to inform instructional planning: even in contexts with strong incentives for data use, teachers lacked pedagogical literacy skills and struggled to take instructional steps (Datnow & Hubbard, 2015; Marsh, 2012). However, Marsh (2012) points out that most research conclusions on the impact of data literacy interventions were based on self-reports from data users, and few on observation or assessment of practitioners' knowledge, or adopted research designs from which causal connections could be justifiably drawn. Moreover, some studies found evidence of unintended outcomes of interventions, leading to misuse of data to 'game the system', such as actions to improve test-taking skills or a focus on 'bubble kids' (kids with scores close to the mean). These were more common in high-stakes school accountability contexts (Marsh, 2012; Marsh et al., 2006).

Recent studies with more robust research designs (which include pre-tests, post-tests, propensity score matching and/or randomised experiments) report impacts of data literacy interventions on teachers' skills and actions and on pupil outcomes. For example, Ebbeler et al. (2017) report a positive impact of an intervention on teachers' satisfaction and some positive results on participants' data literacy skills and attitudes towards data use. Poortman and Schildkamp (2016) found mixed results concerning participating teams' ability to solve pupil achievement problems, and Kippers et al. (2018) observed that educators, after a one-year data-use intervention, improved their skills in collecting, analysing and interpreting data and taking instructional action; however, educators continued to struggle with identifying a problem and setting a goal or formulating a question. Recently, studies have observed the overall impact of professional training and data-use interventions on pupil achievement and/or effects on specific groups (low-performing pupils, pupils from high or low SES backgrounds) (Poortman & Schildkamp, 2016; Staman et al., 2017; Visscher, 2021; Wayman et al., 2017).

Although the DBDM studies point out that professional development for data use/data literacy is necessary, and some observed an impact on instructional practices and pupil outcomes, these studies recognise that there are limits to their capacity to improve teachers' knowledge, skills and attitudes, including in the preparation of teachers by teacher education colleges (Schildkamp et al., 2017). The previous chapters described strategies pursued by research teams in four countries to enhance data use by different stakeholders, especially teachers and school principals. Although the case studies employed active data transfer, we can expect limited impact, as they did not include longer-term training or professional development programmes. Nevertheless, the experiences in Brazil, Lesotho, Russia and South Africa bring insights in terms of incentives for schools to use the data and evidence produced by researchers.


16.3 Strategies to Disseminate Data and to Motivate Data Use in Brazil, Lesotho, Russia and South Africa

Considering the key points brought up by the evidence-based decision-making literature, we can sum up the strategies undertaken by the iPIPS research teams in terms of the following elements: the stakeholders on whom the data-use strategies/interventions focus; the type of evidence/data handed to stakeholders; the forms of data/evidence transfer; and the outcomes of the strategies/interventions to incentivise data use. Table 16.1 provides a description of these key points, and Table 16.2 summarises them for each case study; these are discussed in depth in Sects. 16.4 and 16.5 of this chapter.

Table 16.1 Strategies to incentivise data-use

Stakeholder focus of data-use strategy: national and local governments, local authorities, school principals, teachers, parents/caregivers.
Type of evidence/data: accessibility (raw vs. modified or engineered data/evidence); type of information provided (input, process and output); timeliness (how long after collection the data are available); frequency (cross-sectional vs. longitudinal data).
Evidence/data transfer: passive, active or interactive.
Outcomes of data-use strategies: outcomes observed (changes in attitudes/beliefs, understanding, skills/knowledge, behaviour/practice and test scores/outputs); research design used to observe outcomes (practitioners' self-reports, observation, quasi-experimental or experimental designs).

Table 16.2 Strategies to incentivise data-use and their outcomes in Brazil, Lesotho, Russia and South Africa

Brazil. Stakeholders: teachers and school principals; local departments of education and local authorities' staff. Type of evidence/data: modified data (reports); information: input and outputs; timeliness: six months after data collection; frequency: beginning- and end-of-year assessments. Evidence/data transfer: active (workshops and meetings). Outcomes: perceptions, understanding, practice; evidence on outcomes: quasi-experiment, self-report.

Lesotho. Stakeholders: teachers and school principals. Type of evidence/data: modified data (score sheets); information: output; timeliness: immediately after data collection; frequency: beginning- and end-of-year assessments. Evidence/data transfer: active/interactive (booklets with teaching and learning strategies, workshops and teacher involvement in data collection). Outcomes: perceptions/attitudes, understanding, practice; evidence on outcomes: self-report.

Russia. Stakeholders: teachers, school principals, school coordinators and parents; local departments of education. Type of evidence/data: modified data (reports); information: output; timeliness: just after each assessment; frequency: beginning- and end-of-year assessments. Evidence/data transfer: active (reports with information on teaching and learning strategies, a monograph helping to interpret data, online and offline workshops and videos). Outcomes: perceptions/attitudes, practice, achievement; evidence on outcomes: self-report.

South Africa. Stakeholders: teachers and school principals. Type of evidence/data: modified data (reports); information: output; timeliness: the following semester; frequency: beginning- and end-of-year assessments. Evidence/data transfer: active (reports with information on teaching/learning strategies and workshops). Outcomes: perceptions, practice; evidence on outcomes: self-report.

16.4 Strategies to Promote Data-Use: Stakeholders, Type of Evidence and Data Transfer

The iPIPS projects in Brazil and Russia undertook actions to give research results back to local departments of education, through meetings (Brazil) and research reports (Russia), presenting modified, easy-to-read data and research findings. In both cases, these stakeholders only had access to aggregate data and/or anonymised pupil datasets, ruling out the possibility of building league tables or school rankings.

The research teams in Brazil, Lesotho, Russia and South Africa pursued strategies to incentivise data use by teachers and/or school principals using modified data, including school, classroom and/or individual reports or score sheets. Russia was the only case that reported using iPIPS data to inform parents. In all four case studies, a key concern was to generate reports that would facilitate practitioners' understanding (not requiring more elaborate skills for reading and interpreting results) and enhance teachers' and school principals' perception of the usefulness of the data. In Brazil, Russia and South Africa, school and classroom reports presented information about the cognitive assessment (PIPS), including data collected at the beginning and the end of the year, organised in pedagogical ladders showing the percentage of pupils at each development level in mathematics and literacy (see examples of pedagogical ladders in Chaps. 8 and 12). Other output information was also included, such as vocabulary (Brazil, Russia), phonological awareness (Brazil, Russia), attitudes to learning (South Africa), and personal, social and emotional development on the Rules and Communication scales (Russia). The South African and Russian teams also produced reports with individual pupils' scores for teachers. In Lesotho, as teachers administered the assessment to their
pupils and recorded the results in score sheets, they also had access to each pupil's developmental level for each test section at the beginning and end of the year. Moreover, the reports included charts with averages for all schools in the study sample (Brazil, Russia and South Africa) and information on pupils' socio-economic background and school context (Brazil).

As pointed out in the data-based decision-making literature, the data-use cycle depends on data literacy and pedagogical literacy skills (Mandinach, 2012; Marsh et al., 2006; Schildkamp & Poortman, 2015; Schildkamp et al., 2017). The case studies demonstrate strategies not only to facilitate reading but also to encourage action and decision-making, either by producing booklets indicating good or effective practices (Lesotho) or by suggesting possible courses of action in the reports (Russia and South Africa), to enhance teachers' capacity to take appropriate action after the diagnosis. Research teams pursued further efforts to improve the reports and make them easier to read, including consulting teachers and principals about the best formats in which to show the results (see the case studies in Brazil and Russia, Chaps. 12 and 14). Still, the research teams in Brazil, Russia and South Africa noted that some teachers had difficulty understanding the charts and tables, which indicated that active ways to transfer data were crucial to enhance teachers' comprehension and use of the information provided in the reports.

Finally, there was also variation in how quickly teachers and school principals could access the data. In Lesotho and Russia, teachers had access to the data immediately after the assessment was administered, at the start and the end of the year. In South Africa, teachers received the reports with data from the beginning and end of year one at the beginning of the Grade 2 academic year and, in Brazil, later still, at the beginning of the second semester of the second year of compulsory education.

Another critical issue raised by the evidence-based decision-making literature relates to data transfer formats. All four case studies used active data transfer (workshops and training) and, in some cases, combined this with passive or interactive approaches. For example, in Brazil, one-day (6-hour) workshops were conducted with teachers and principals. These focused on explaining the research and the instruments used to collect data, interpreting the information in the reports (indicators, tables and graphs), and group interaction to encourage pedagogical planning using evidence retrieved from the classroom/school reports. Likewise, in South Africa, teachers and school principals were invited to attend workshops providing training on how to interpret results and link them to teaching and learning. However, in both case studies, attendance was voluntary and did not include all schools and teachers in the research sample.

In Lesotho, the data transfer strategy was more encompassing, including a training programme throughout the year (four meetings) attended by all teachers involved in the research. The workshops focused on the rationale of the study, learning how to administer the iPIPS assessment, interpreting the data collected, and using the data to inform pedagogy. In addition, Lesotho was the only case study that used an interactive approach to evidence transfer. Teachers administered the assessment with
their pupils and recorded the results on score sheets indicating pupils' developmental level in each assessment segment. This gave them ready access to the assessment results, as discussed above, and, most probably, a better understanding of what the test measures compared with the other case studies. This might account for the absence of teachers reporting difficulties in reading and interpreting results.

The research team in Russia combined active and passive strategies: workshops, alongside publishing samples of reports, recorded workshops and additional content on the research website. The active strategies involved feedback to school coordinators, who then provided the teacher and parent reports to classroom teachers. The workshops aimed to explain to teachers how to use the reports and to support them in using the assessment results. Moreover, in Krasnoyarsk, the Regional Centre for Education Quality Evaluation delivered a methodological workshop to help teachers make diagnoses using iPIPS, understand the relationship between cognitive and social-emotional development and make forecasts of pupils' educational advancement.

16.5 Results and Impacts of the Data-Use Strategies

The four case studies provide evidence of the results and effects of the strategies employed to promote data use by teachers and school principals, especially in terms of perceptions about the usefulness of the iPIPS assessment. This includes the usefulness of the information provided in the reports for pedagogical planning, the understanding of and capacity to interpret iPIPS data and reports, changes in behaviour and practice, and/or effects on pupils' outcomes.

In Brazil, the iPIPS project collected systematic information about teachers' comprehension of the reports and their views on the usefulness of the data presented for pedagogical planning. Using a quasi-experimental design (with one group that evaluated the reports before the workshop and another group that did so after attending the workshop), the Brazilian team found that the workshops improved teachers' understanding and interpretation of the report tables and charts. Most teachers, in both groups, thought the data presented in the reports were useful for their pedagogical planning. Although this evidence was collected systematically, attendance at the workshops was voluntary; the questionnaires might therefore have captured the views and attitudes of teachers who were more engaged with, or sympathetic to, data use than those who chose not to attend.

The Brazilian team collected less systematic evidence on the use of the reports to inform teachers' practice. One year after teachers and school principals attended the workshops, they were invited to answer a few open-ended questions about the use of the reports at their schools. Reported changes included reflections about the importance of planning using data (resembling the 'cycles of data use' described in the DBDM literature) and changes in planning, adapting projects and activities to improve specific skills such as reading, vocabulary, ideas about mathematics and counting, among others.
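As an illustration of the comparison implied by this quasi-experimental design, the sketch below contrasts the report-comprehension scores of a group assessed before the workshop with those of a group assessed after it, using an independent-samples t-test; the scores and the choice of test are assumptions for illustration, not the Brazilian team's actual analysis.

from scipy import stats

before_workshop = [4, 5, 6, 5, 4, 6, 5, 7, 5, 6]   # comprehension scores (0-10)
after_workshop  = [7, 8, 6, 9, 7, 8, 8, 7, 9, 8]

t_stat, p_value = stats.ttest_ind(after_workshop, before_workshop)
print(f"mean before = {sum(before_workshop) / len(before_workshop):.1f}")
print(f"mean after  = {sum(after_workshop) / len(after_workshop):.1f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")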


The South African research team also asked teachers, at the end of the workshop, whether they had found the report useful for informing their practice. A month later, a random sample of 12 teachers was asked about actions taken after the workshop. All of these teachers reported that they found the information in the reports useful, and 10 said they were using it to inform practice. Although some participants reported difficulties in reading the reports, they indicated actions and decisions based on analysis of the iPIPS data, such as deliberation with school staff, actions to inform parents and seek their involvement with the school, interventions with teachers to focus on different pupil groups, implementing new educational programmes and interventions, fostering regular staff meetings to discuss learners' results and plan accordingly, as well as efforts towards building learning communities. As in Brazil, all the information on teachers' use of the iPIPS evidence was collected through teachers' self-reports. Attendance at the workshops was voluntary and, again, the follow-up strategies might have captured the attitudes, views, difficulties and actions of teachers who were already more engaged with evidence-based decision making.

In Lesotho, during the last workshop, the research team asked teachers for their views on the usefulness of the iPIPS data: whether they had found the data valuable and how they had used the information about the pupils' cognitive development to inform pedagogy. Overall, teachers stated that they found the data valuable and described changes in their comprehension and understanding of individual pupils' abilities and progress, their different paces of progress, and the influence of home. In addition, they reported taking actions based on the iPIPS evidence, which included aligning pedagogy to pupils' abilities, searching for new resources (reading materials and professional training), improving instructional skills, changing classroom seating arrangements (low achievers together with high achievers) and working with staff to develop school-based policies. As in the other case studies, the evidence collected was based on teachers' self-reports gathered during the workshops.

In Russia, in the Krasnoyarsk region, the local research team investigated the challenges of teaching in heterogeneous classes, aiming to capture how teachers clustered pupils and implemented differentiated instruction accordingly. The researchers constructed four pupil clusters based on cognitive and non-cognitive iPIPS data. Teachers discovered that their grouping within the classroom, based on their observations and experience, did not always match the clusters based on iPIPS results. This experience changed teachers' attitudes, as they agreed that their grouping might be biased and that the iPIPS clusters were based on more reliable and objective information. It is interesting to note that, in the case studies in both Lesotho and the Krasnoyarsk region, teachers became aware of their misconceptions about pupils' cognitive development when they compared their perceptions with the iPIPS evidence. Furthermore, after extensive training work with teachers from schools that used iPIPS, the local research team reported that teachers who based their analyses on the data were able to work with the strengths and difficulties of each group of pupils, which, in turn, led to an increase in pupils' educational achievement.


In Russia, the iPIPS evidence was also accompanied by a shift in the approach of local educational authorities in the Krasnoyarsk region, from an emphasis on assessment as control (accountability) to assessment for learning, which later formed the basis of the federal educational standards for primary schools established in 2014. The analysis in the Krasnoyarsk region also indicated that the iPIPS data helped to identify priority areas for curriculum change at the preschool level, including a greater emphasis on mental functions such as attention and social-emotional skills, among others. In Brazil, despite the efforts to disclose the research evidence to local departments of education, the research team was not able to capture decision-making or shifts in attitudes influenced by, or based on, the iPIPS evidence.

16.6 Concluding Remarks and Ways Forward

All four case studies promoted data use employing modified data, which included user-friendly classroom and school reports with a pedagogical ladder showing pupils' progress, charts and tables contextualising school characteristics (input information and comparisons with average scores of the schools/classrooms in the research sample), and more active forms of data/evidence transfer. Despite all the efforts to disseminate data in a user-friendly way requiring fewer skills for interpretation, and the workshops and booklets explaining how to interpret and use data to inform pedagogy, in three of the case studies the research teams reported that teachers and/or school principals still struggled with the interpretation of some aspects of the reports.

The workshops seem to be a crucial part of the strategies to engage schools in using iPIPS data for decision-making and have motivated changes in attitudes and practice. However, they might not be enough to improve data literacy. As the specialised literature points out, developing such skills requires more extended professional development programmes, including pre-service teacher training programmes. In that respect, the Lesotho case study offers a promising approach: it included several workshops throughout the year and an interactive approach in which teachers collected the assessment data for their own pupils and recorded them on score sheets, which helped them to visualise pupils' cognitive abilities.

Still, we need more systematic and robust evidence, especially using experimental designs and observation techniques, to measure the effectiveness of different strategies to incentivise data use and to improve teachers' comprehension and practice, as well as to capture possible impacts on pupils' achievement. On the basis of the research and discussion in the case studies, we recommend that future endeavours continue to develop structured strategies to engage other stakeholders, such as central and local governments, and to pursue active ways of disseminating research evidence to inform policy decision-making.


References

Carlson, J. R., Fosmire, M., Miller, C., & Nelson, M. R. S. (2011). Determining data information literacy needs: A study of students and research faculty. Libraries and the Academy, 11(2), 629–657.
Datnow, A., & Hubbard, L. (2015). Teachers' use of assessment data to inform instruction: Lessons from the past and prospects for the future. Teachers College Record, 117(4), 1–26.
Ebbeler, J., Poortman, C. L., Schildkamp, K., & Pieters, J. M. (2017). The effects of a data use intervention on educators' satisfaction and data literacy. Educational Assessment, Evaluation and Accountability, 29, 83–105.
Gorard, S., See, B. H., & Siddiqui, N. (2020). What we know already about the best ways to get evidence into use in education. In S. Gorard (Ed.), Getting evidence into education: Evaluating the routes to policy and practice (pp. 110–118). Routledge.
Kippers, W. B., Poortman, C. L., Schildkamp, K., & Visscher, A. (2018). Data literacy: What do educators learn and struggle with during a data use intervention? Studies in Educational Evaluation, 56, 21–31.
Mandinach, E. B. (2012). A perfect time for data use: Using data-driven decision making to inform practice. Educational Psychologist, 47(2), 71–85.
Mandinach, E. B., & Gummer, E. (2016). What does it mean for teachers to be data literate: Laying out the skills, knowledge and dispositions. Teaching and Teacher Education, 60, 366–376.
Marsh, J. (2012). Interventions promoting educators' use of data: Research insights and gaps. Teachers College Record, 114(11), 1–48.
Marsh, J., Pane, J. F., & Hamilton, L. (2006). Making sense of data-driven decision making in education: Evidence from recent RAND research. RAND Education Occasional Paper. https://www.rand.org/pubs/occasional_papers/OP170.html
Marsh, J., Bertrand, M., & Huguet, A. (2015). Using data to alter instructional practice: The mediating role of coaches and professional learning communities. Teachers College Record, 117(4), 1–40.
Poortman, C. L., & Schildkamp, K. (2016). Solving student achievement problems with a data use intervention for teachers. Teaching and Teacher Education, 60, 425–433.
Schildkamp, K., & Poortman, C. L. (2015). Factors influencing the functioning of data teams. Teachers College Record, 117(4), 1–42.
Schildkamp, K., Poortman, C. L., Luyten, H., & Ebbeler, J. (2017). Factors promoting and hindering data-based decision making in schools. School Effectiveness and School Improvement, 28(2), 242–258.
Schmidt, W. H., McKnight, C. C., Houang, R. T., Wang, H., Wiley, D., Cogan, L. S., & Wolfe, R. G. (2001). Why schools matter: A cross-national comparison of curriculum and learning. The Jossey-Bass Education Series.
Staman, L., Timmermans, A. C., & Visscher, A. (2017). Effects of a data-based decision-making intervention on student achievement. Studies in Educational Evaluation, 55, 58–67.
Visscher, A. (2021). On the value of data-based decision making in education: The evidence from six intervention studies. Studies in Educational Evaluation, 69, 1–9.
Wayman, J. C., Shaw, S., & Cho, V. (2017). Longitudinal effects of teacher use of a computer data system on student achievement. AERA Open, 3(1), 1–18.

Part VI

Novel and Unexpected Findings from iPIPS

Elena Kardanova

Introduction

As well as providing valid and reliable information about what children know and can do when they start school and the progress they make during the first year of schooling, iPIPS provides unique data on children's cognitive and non-cognitive development that can be accompanied by contextual information about parents and teachers. This information is valuable for teachers, head teachers and educators, but can also be used by researchers to answer different questions about children's development and the factors influencing it.

The many studies based on iPIPS in different countries can be classified into four groups. The first group includes papers that analyse children's basic skills at the start of schooling as predictors of their future attainment, with a theoretical basis (see Demetriou et al., 2017; Tymms et al., 2014; Wildy & Styles, 2008). The second group includes papers that study factors influencing children's achievement, for example family background (see Niklas & Schneider, 2017; Vasilyeva et al., 2018), children's personal characteristics (see Merrell & Bailey, 2012), and teachers' characteristics and practices (see De Haan et al., 2014; Tymms et al., 2018). The third group includes papers devoted to the development and validation of national versions of iPIPS (see Archer et al., 2010; Boereboom & Tymms, 2018; Kardanova, 2018; Styles et al., 2014). Finally, the fourth group includes papers that consider iPIPS in an international perspective and compare children from different countries when they start school and their progress during the first year (see Hawker, 2015; Ivanova et al., 2016; Tymms & Merrell, 2009; Tymms et al., 2004).

E. Kardanova (*) National Research University Higher School of Economics, Moscow, Russia e-mail: [email protected]


This part of the book includes five papers based on iPIPS in different countries. Given the limitations of space, one section cannot embrace all the findings, so we have chosen papers that feature unexpected and interesting findings. Moreover, these findings offer helpful practical applications that can be of interest to those working in these areas.

The first paper, in Chap. 17, considers phonological processing and its effect on the emergence of specific mathematics, or combined mathematics and reading, difficulties during the first year of schooling of Russian first-graders. The results reveal that phonological processing is linked to mathematics difficulties as well as reading difficulties. Advanced phonological processing may prevent typically developing pupils from developing mathematical difficulties. Moreover, advanced phonological processing increases the chance that pupils move into the typically developing group, particularly if they experienced specific mathematics difficulties at the start of schooling.

The second paper, Chap. 18, based on a large sample from England, Scotland and Australia, investigates the predictive validity of the name-writing item in the iPIPS assessment. According to the findings, name-writing ability is a good predictor of future academic outcomes in early reading, phonological awareness and mathematics. Surprisingly, the length of the name appeared not to be related to the ability to write one's own name, and it was hardly predictive of future outcomes.

The third paper, in Chap. 19, features the findings of research with an extensive sample of preschoolers. Its longitudinal design explores the relationship between cognitive development and non-aerobic physical fitness during the first year of compulsory education in Brazil. The findings show that non-aerobic physical fitness is a predictor of mathematics development in children between 4 and 5 years of age.

The fourth paper, Chap. 20, examines the relationship between the manifestation of Attention Deficit Hyperactivity Disorder (ADHD) symptoms in primary school pupils and their reading achievement during the first 3 years of schooling. The findings suggest that inattentiveness, as rated by teachers in the first grade, is a stable negative predictor of pupils' reading outcomes in subsequent years. At the same time, the effect of hyperactivity on reading achievement, while negative at the end of the first grade, becomes negligible by the third grade.

Finally, the fifth paper, in Chap. 21, studies the effects of class composition on first-graders' mathematics and reading results. The paper combines two studies conducted in two countries that differ in language, culture, curriculum and school starting age: the Netherlands and Russia. The results of both studies reveal that disadvantaged children in mixed classrooms gain more than disadvantaged children in targeted classrooms. Additionally, learning in classes with high average abilities might be useful only for pupils with high initial abilities and might have a negative effect on pupils with low initial reading achievement.


References

Archer, E., Scherman, V., Coe, R., & Howie, S. J. (2010). Finding the best fit: The adaptation and translation of the Performance Indicators for Primary Schools (PIPS) for the South African context. Perspectives in Education, 28(1), 77–88.
Boereboom, J., & Tymms, P. (2018). Is there an optimum age for starting school in New Zealand? New Zealand International Research in Early Childhood Education, 21(2), 32–44.
De Haan, A. K. E., Elbers, E., & Leseman, P. P. M. (2014). Teacher- and child-managed academic activities in preschool and kindergarten and their influence on children's gains in emergent academic skills. Journal of Research in Childhood Education, 28(1), 43–58.
Demetriou, A., Merrell, C., & Tymms, P. (2017). Mapping and predicting literacy and reasoning skills from early to later primary school. Learning and Individual Differences, 54, 217–225.
Hawker, D. (2015). Baseline assessment in an international context. In Handbook of international development and education (pp. 305–325). Edward Elgar Publishing.
Ivanova, A., Kardanova, E., Merrell, C., Tymms, P., & Hawker, D. (2016). Checking the possibility of equating a mathematics assessment between Russia, Scotland and England for children starting school. Assessment in Education: Principles, Policy & Practice, 25, 1–19.
Kardanova, E. (2018). Obobshchennye tipy razvitiya pervoklassnikov na vkhode v shkolu. Po materialam issledovaniya iPIPS [Generalized types of development of first-graders at the entrance to the school. According to iPIPS]. Voprosy obrazovaniya [Education issues], 1, 8–37.
Merrell, C., & Bailey, K. (2012). Predicting achievement in the early years: How influential is personal, social and emotional development? Online Educational Research Journal. http://www.oerj.org/View?action=viewPaper&paper=55
Niklas, F., & Schneider, W. (2017). Home learning environment and development of child competencies from kindergarten until the end of elementary school. Contemporary Educational Psychology, 49, 263–274.
Styles, I., Wildy, H., Pepper, V., Faulkner, J., & Berman, Y. (2014). Australian indigenous students' performance on the PIPS-BLA reading and mathematics scales: 2011–2013. International Research in Early Childhood Education, 5(1), 103–123.
Tymms, P., & Merrell, C. (2009). On-entry baseline assessment across cultures. In A. Anning, J. Cullen, & M. Fleer (Eds.), Early childhood education: Society & culture (2nd ed., pp. 117–128). Sage Publications.
Tymms, P., Merrell, C., & Jones, P. (2004). Using baseline assessment data to make international comparisons. British Educational Research Journal, 30(5), 673–689.
Tymms, P., Merrell, C., & Wildy, H. (2014). The progress of pupils in their first school year across classes and educational systems. British Educational Research Journal, 41(3), 365–380.
Tymms, P., Merrell, C., & Bailey, K. (2018). The long-term impact of effective teaching. School Effectiveness and School Improvement, 29(2), 242–261.
Vasilyeva, M., Dearing, E., Ivanova, A., Shen, C., & Kardanova, E. (2018). Testing the family investment model in Russia: Estimating indirect effects of SES and parental beliefs on the literacy skills of first-graders. Early Childhood Research Quarterly, 42, 11–20.
Wildy, H., & Styles, I. (2008). Measuring what students entering school know and can do: PIPS Australia 2006–2007. Australian Journal of Early Childhood, 33(4), 43–52.

Chapter 17: Phonological Processing and Learning Difficulties for Russian First-Graders

Yulia Kuzmina and Natalia Ilyushina

In this chapter, we considered the role of phonological processing in the emergence of mathematics difficulties, or combined mathematics and reading difficulties, during the first year of schooling. We also estimated whether a high level of phonological processing could be a resource for coping with mathematics difficulties.

17.1 Introduction

Different approaches exist to the identification of mathematics difficulties (MD). Early studies distinguished several types of difficulty according to their severity or content. One of the most severe is dyscalculia, which is characterised by persistent difficulties in acquiring mathematical knowledge, operations and concepts in individuals with normal intelligence and working memory (Butterworth et al., 2011). Some pupils who do not have a severe problem such as dyscalculia nevertheless have difficulties with mathematics, resulting in low performance due to a moderate deficit in mathematical knowledge and skills. The majority of pupils with low mathematics performance do not have dyscalculia but rather experience difficulties of this kind (Peard, 2010).

To identify pupils with MD who do not have dyscalculia, researchers have used different assessment instruments. It has been assumed that pupils whose mathematics achievement falls below the 25th percentile of a sample have MD or are predisposed to it (Peng et al., 2012).


Other studies have used different benchmarks for assigning pupils to the MD group, such as the 5th or 10th percentile (Mazzocco, 2001; Shalev et al., 2005). The choice of cut-off criterion may change the perceived differences between children with MD and typically developing children (Murphy et al., 2007).

It is common practice to differentiate between children with only mathematics difficulties (MD), children with both mathematics and reading difficulties (MDRD) and typically developing (TD) children (Peng et al., 2012). Researchers assume that different mechanisms underlie combined mathematics and reading difficulties and mathematics-only difficulties (see Ashkenazi et al., 2013). In particular, children with specific MD show deficits in skills such as magnitude comparison, non-symbolic arithmetic and calculation fluency (Gersten et al., 2005; Jordan et al., 2003). A deficit of spatial ability has also been identified in children with MD but not in children with MDRD (Passolunghi & Mammarella, 2012). There are also cognitive predictors that may be associated with both mathematics and reading difficulties, among them a deficit in phonological processing (Carroll & Snowling, 2004; Ostad, 2013).

Phonological processing refers to an individual's sensitivity to the sounds of a language and the capacity to use those sounds to decode linguistic information (Wagner & Torgesen, 1987). Researchers have identified three main dimensions of phonological processing. The first is phonological awareness, which combines phonological synthesis (the ability to combine speech segments into syllables or words) and phonological analysis (the ability to identify different sounds within words). The second is lexical access, or rapid automatised naming (RAN), which involves recoding a visual symbol into a sound-based representation by retrieving its lexical referent from long-term memory. The third is phonological memory, the temporary storage of phonological information.

The findings regarding the role of phonological processing in the emergence of specific mathematics difficulties, or in dyscalculia, are rather inconsistent. Robinson et al. (2002) proposed that poor phonological processing can be a source of the deficit in arithmetic fact retrieval seen in children with dyscalculia. Other studies have supported this suggestion, demonstrating that children with dyscalculia show poor phonological processing even when IQ, working memory (WM) or reading achievement is controlled for (Vanbinst et al., 2015; Vukovic & Siegel, 2010; Vukovic et al., 2010). Despite these findings, some researchers believe that a phonological processing deficit is not the main source of specific MD but could be an additional risk factor (see De Smedt, 2018). In particular, studies have shown that children with both mathematics and reading difficulties usually demonstrate a deficit in phonological processing, whereas children with specific MD often show no phonological impairment (Moll et al., 2015). In line with these results, a phonological processing deficit has been found to be a unique predictor of reading difficulties but not of specific mathematics difficulties (Passolunghi et al., 2007), although mathematics difficulties might arise as a secondary consequence of reading difficulties (see Jordan et al., 2003).


Thus, the role of a phonological deficit in the emergence of specific MD remains unclear.

Given the evidence of a link between a phonological processing deficit and MD, it is reasonable to assume that a high level of phonological processing could be a resource for improving mathematics performance and overcoming MD. However, we could not find any studies on this question, so little is known about the role of phonological processing in the persistence or elimination of MD in elementary school.

In this chapter, our first aim is to establish the role of phonological processing in the emergence and persistence of specific MD or combined MDRD from the start to the end of the first grade. Our second aim is to ascertain whether a high level of phonological processing could help pupils overcome MD by the end of the first grade.

17.2 Method

Participants

The research was conducted in two stages: the first at the beginning of Grade 1, in October 2017, and the second at the end of Grade 1, in May 2018. The initial sample consisted of 3450 pupils from the Tatar Republic, a region of Russia whose socio-economic characteristics are similar to the Russian average (based on the 2015 census, Russian Federal Statistics Service). The resulting sample consisted of 3296 pupils (49% girls); the reduction was due to pupils moving to another school or being ill on the day of testing. The mean age was 7.3 years at the beginning of the school year and 7.8 years at the end.

The data were collected anonymously, and the pupils' parents gave informed consent before the survey. The Institutional Review Board of the Higher School of Economics approved the study, and data were collected according to the guidelines and principles for research with human subjects.

Instruments and Measures

The results were obtained from baseline and follow-up assessments using the iPIPS instrument. The Russian version was developed and validated between 2013 and 2015 (Ivanova et al., 2018).

Pupils' achievement was estimated with Item Response Theory, using anchor-item equating under the dichotomous Rasch model (Kolen & Brennan, 2004). The items were equated so that pupil achievement could be measured on a continuous scale from the start to the end of Grade 1.
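To make the equating concrete: the dichotomous Rasch model treats the probability of a correct response as a logistic function of the difference between a pupil's ability and an item's difficulty, both in logits. Holding the anchor items' difficulties fixed at their first-wave estimates when calibrating the second wave places both waves on one scale. The sketch below is illustrative only, not the iPIPS calibration code, and all values in it are hypothetical.

import numpy as np

def rasch_p(theta, b):
    """P(correct) under the dichotomous Rasch model: logistic(theta - b)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Hypothetical anchor-item difficulties estimated at the start of Grade 1.
# When calibrating end-of-Grade-1 responses, these values are held fixed,
# which forces both waves onto a common logit scale.
anchor_b = np.array([-1.5, 0.0, 1.2])

for theta in (-2.0, 0.0, 2.0):  # example ability levels, in logits
    print(theta, rasch_p(theta, anchor_b).round(2))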


Mathematics Performance

A total of 19 tasks were presented to estimate mathematics achievement, including word problems and two-digit arithmetic tasks. The scale was unidimensional, with highly correlated items, and test reliability (Cronbach's alpha) was between 0.8 and 0.9 at the beginning and the end of Grade 1.

Phonological Processing

We used two types of task to assess phonological processing: rhyming tasks and word/pseudoword repetition tasks. In the rhyming task, the child had to select, from three options, the word that rhymed with a target word; five target words were presented. In the software, each word was illustrated with a picture and pronounced by a professional narrator. In the word/pseudoword repetition task, the child was asked to repeat a word or pseudoword (for example, "stop" (word) or "frigliyaga" (pseudoword)) immediately after hearing it pronounced by the assessment software. There were five word items and three pseudoword items. The reliability of the combined phonological scale was 0.7 at the beginning of Grade 1 and 0.9 at the end.

Number Recognition

The child was asked to name numbers presented visually. A total of nine one-, two- and three-digit numbers were presented. The scale was unidimensional, with highly correlated items, and test reliability (Cronbach's alpha) was 0.8 at the beginning of Grade 1.

Reading Performance

The reading performance scale was constructed from tasks covering letter recognition, word decoding and reading comprehension. The reliability of the reading scale was above 0.9 at both time points.

Analysis Plan

First, we identified four groups of pupils at the start and at the end of Grade 1, using the 25th percentile as the threshold. Pupils with mathematics achievement below the 25th percentile but reading achievement above it were assigned to the MD group. Pupils with reading achievement below the 25th percentile but mathematics achievement above it were assigned to the RD group. Pupils below the 25th percentile in both mathematics and reading were assigned to the MDRD group. All other pupils were assigned to the TD group.
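A minimal sketch of this grouping rule in Python, assuming a pandas DataFrame with one row per pupil; the column names are hypothetical, not those of the iPIPS data files.

import numpy as np
import pandas as pd

def assign_group(df, maths="maths", reading="reading"):
    """Label each pupil MDRD, MD, RD or TD using the 25th-percentile rule."""
    low_m = df[maths] < df[maths].quantile(0.25)
    low_r = df[reading] < df[reading].quantile(0.25)
    labels = np.select(
        [low_m & low_r, low_m & ~low_r, ~low_m & low_r],
        ["MDRD", "MD", "RD"],
        default="TD",
    )
    return pd.Series(labels, index=df.index, name="group")

# Applied to the scores at each time point, this yields the start- and
# end-of-year statuses whose transitions are summarised in Table 17.2.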


We estimated the proportion of pupils in each group at the start of schooling who moved to another group or stayed in the same group (MD, RD, MDRD or TD) by the end of the school year. We compared the level of phonological processing between groups at the start and at the end of the school year using one-way ANOVA with Bonferroni correction for multiple comparisons.

We also estimated the effect of phonological processing on the probability of moving to a different group by the end of the school year, using multinomial regression with group status at the end of Grade 1 (MD, RD, MDRD, TD) as the dependent variable. To ascertain whether the effect of phonological processing varied for pupils with different group status at the start of Grade 1, we included an interaction between starting group status and phonological processing. Finally, we calculated the predicted probabilities of moving into the MD, MDRD, RD and TD groups by the end of the year for pupils with different levels of phonological processing and different starting group status.
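As a sketch of this regression step, the following fits a multinomial logit with a group-by-phonology interaction. It assumes a DataFrame df with hypothetical columns group_start, group_end and phon (phonological processing in logits); statsmodels is one of several packages that can fit such models, and this is not the authors' actual code.

import pandas as pd
import statsmodels.api as sm

# Dummy-code starting group (TD as the reference category).
X = pd.get_dummies(df["group_start"], prefix="start").drop(columns=["start_TD"])
X["phon"] = df["phon"]
for col in ["start_MD", "start_RD", "start_MDRD"]:
    X[col + "_x_phon"] = X[col] * X["phon"]  # interaction terms
X = sm.add_constant(X).astype(float)

# End-of-year group as integer codes; TD (code 0) is the base outcome.
y = pd.Categorical(df["group_end"], categories=["TD", "MD", "RD", "MDRD"]).codes

fit = sm.MNLogit(y, X).fit(disp=False)
probs = fit.predict(X)  # predicted probability of each end-of-year group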

17.3 Results

Descriptive statistics showed that all measures increased from the start to the end of the first grade (Table 17.1).

We then traced the transitions between groups from the start to the end of the first grade (Table 17.2). The majority of children in the TD group (76%) remained in that group at the end of Grade 1. A considerable proportion of children who were in the MD or RD groups at the beginning of the year moved to the TD group by the end of the year (43% and 44% respectively), whereas only 22% of children in the MDRD group did so. Notably, a substantial proportion of children in the MDRD group stayed in the same group (44%), while the proportions staying in the separate MD and RD groups were lower (30% and 25% respectively), which may indicate that combined difficulties are more stable than separate ones.

Table 17.1  Descriptive statistics for mathematics, reading, phonological achievement and number recognition (scores in logits)

Variable                                          Mean    SD      Min     Max
Maths performance, beginning of Grade 1          −1.06    2.05   −8.01    6.61
Maths performance, end of Grade 1                 0.80    1.93   −6.63    6.63
Reading performance, beginning of Grade 1        −0.02    2.61   −7.01    6.89
Reading performance, end of Grade 1               2.45    2.12   −7.20    6.91
Phonological processing, beginning of Grade 1     0.81    1.45   −5.24    4.36
Phonological processing, end of Grade 1           1.96    1.78   −5.22    4.36
Number recognition, beginning of Grade 1          2.02    4.79   −9.08    8.34


Table 17.2  Transitions between groups from the beginning of Grade 1 to the end of Grade 1 (percentages are of the group at the beginning of Grade 1)

Group status at the       Group status at the end of Grade 1
start of Grade 1          TD            MD            MDRD          RD            Overall
                          N      %      N     %       N     %       N     %       N      %
TD                        1560   76%    162    8%      96    5%     228   11%     2046    62%
MD                         175   43%    125   30%      57   14%      53   13%      410    12%
RD                         168   44%     48   13%      70   18%      93   25%      379    12%
MDRD                       101   22%     92   20%     203   44%      65   14%      461    14%
Overall                   2004   61%    427   13%     426   13%     439   13%     3296   100%

Legend: TD typical development (without any difficulties), MD only mathematics difficulties, MDRD mathematics and reading difficulties, RD only reading difficulties
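For reference, the row percentages in Table 17.2 amount to a cross-tabulation of the start- and end-of-year labels. A minimal pandas sketch, using the same hypothetical column names as above:

import pandas as pd

# df has one row per pupil with group_start and group_end labels.
counts = pd.crosstab(df["group_start"], df["group_end"])
row_pct = counts.div(counts.sum(axis=1), axis=0).round(2)
print(row_pct)  # e.g. the TD row should show about 0.76 under TD (cf. Table 17.2)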

Table 17.3  Level of phonological processing in different groups at the beginning and the end of Grade 1

                          TD                  MD                  MDRD                  RD                  F
Groups                    Mean [95% CI]       Mean [95% CI]       Mean [95% CI]         Mean [95% CI]
Beginning of Grade 1      1.18 [1.12; 1.24]   0.35 [0.22; 0.48]   −0.05 [−0.17; 0.08]   0.33 [0.20; 0.47]   140.01***
End of Grade 1            2.55 [2.49; 2.62]   1.62 [1.47; 1.77]    0.41 [0.26; 0.55]    1.07 [0.92; 1.22]   282.21***
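The F statistics in Table 17.3 come from one-way ANOVAs comparing mean phonological processing across the four groups, followed by Bonferroni-corrected pairwise comparisons. A minimal sketch using scipy, again with hypothetical column names and not the authors' code:

from itertools import combinations
from scipy import stats

# Omnibus test across the four groups at one time point.
groups = [g["phon"].values for _, g in df.groupby("group_start")]
f_stat, p_value = stats.f_oneway(*groups)

# Bonferroni-corrected pairwise t-tests (6 comparisons among 4 groups).
pairs = list(combinations(df["group_start"].unique(), 2))
alpha = 0.05 / len(pairs)
for a, b in pairs:
    t, p = stats.ttest_ind(df.loc[df["group_start"] == a, "phon"],
                           df.loc[df["group_start"] == b, "phon"])
    print(a, b, "significant" if p < alpha else "n.s.")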