The Power of Assessment in the Classroom: Improving Decisions to Promote Learning (Springer Texts in Education) 3031458370, 9783031458378

This textbook addresses the main assessment issues that teachers and educational institutions face in their daily work.


English, 272 pages, 2024


Table of contents :
Preface
Contents
Editor and Contributors
About the Editor
Contributors
1 Teacher Assessment Literacy
1.1 Introduction
1.2 Assessment Literacy
References
2 Learning and Assessment: What Is Not Assessed, Is Not Learnt
2.1 Introduction
2.2 Understanding Learning
2.3 Taxonomies of Progress
2.4 Taxonomy by Bloom et al. (1956) Revised by Anderson et al. (2001)
2.5 Affective Domain of Krathwohl, Bloom, and Masia’s Taxonomy (1964)
2.6 Psychomotor Domain of Simpson’s Taxonomy (1972)
2.7 Integrating the Three Taxonomic Domains in a Task
2.8 Structure of Observed Learning Outcomes (S.O.L.O.) Taxonomy
2.9 Evaluation Indicators and Definition of Assessment Tasks
2.10 Common Problems in the Formulation of Learning Objectives and Outcomes
References
3 Integrated Planning of Teaching and Assessment
3.1 Introduction
3.2 Planning for Learning
3.3 Components of the Assessment Strategy
3.4 Organisation of the Assessment Strategy
References
4 The End Justifies the Means: Purposes of Assessment
4.1 Introduction
4.2 Conceptualizations of Summative and Formative Assessments
4.3 Formative Assessment
4.4 Initial Formative Assessment
4.5 Ongoing Formative Assessment
4.6 Summative Assessment
4.7 Distinctions and Articulations Between Formative and Summative Assessments
4.8 Tensions Between Summative and Formative Assessments
4.9 Good Formative and Summative Assessment Practices in the Classroom
4.10 Resources and Tools to Be Used in Classroom Assessment Practices
References
5 Effective Feedback and Its Potential to Enhance Learning
5.1 Introduction
5.2 Feedback
5.3 From a Teacher-Centred View to a Learner-Centred View
5.4 The Power of Feedback
5.5 Effective Feedback
5.6 The Role of Context
5.7 Essential Components of Effective Feedback
5.8 Types of Feedback and Their Effect on Learning
5.9 Key Aspects Associated with Effective Feedback
5.9.1 When Do I Give Feedback?
5.9.2 What Should Be the Content of Effective Feedback?
5.9.3 How to Develop Effective Feedback?
5.9.4 Who Provides Feedback?
5.10 Practices for Effective Feeding-Back
References
6 Students as Assessment Agents
6.1 Introduction
6.2 Self-Assessment
6.2.1 Why Incorporate Self-assessment as an Assessment Practice?
6.2.2 How to Implement Self-assessment?
6.2.3 What Can Be Assessed Through Self-assessment?
6.2.4 Advantages and Limitations of Self-assessment
6.3 Peer Assessment
6.3.1 Why Incorporate Peer Assessment as an Evaluation Practice?
6.3.2 How to Implement Peer Assessment?
6.3.3 What Can Be Assessed Through Peer Assessment?
6.3.4 Peer Assessment Advantages and Limitations
References
7 Learning Assessment Tools: Which One to Use?
7.1 Introduction
7.2 Test Situations
7.3 Planning the Development of a Test
7.3.1 Close-Ended Response Items
7.3.2 Open-Ended Items
7.3.3 Mixed Items
7.4 Checklists
7.5 Rating Scales
7.6 Types of Scales
7.6.1 Numerical Scales
7.6.2 Graphic Scales
7.6.3 Descriptive Scales
7.7 Recommendations for the Construction of Checklists and Rating Scales
7.8 The Rubric
7.8.1 What Is a Rubric?
7.8.2 Contribution of the Rubrics
7.8.3 Types of Rubrics
7.8.4 Steps for Designing and Developing a Rubric
7.8.5 Rubric Application Process
7.9 Suggestions for the Use of Rubrics in the Classroom
References
8 Assessing with Graphic Organisers: How and When to Use Them
8.1 Introduction
8.2 Theoretical Background
8.3 Graphic Organisers as a Generic Type
8.3.1 Mind Map
8.3.2 Semantic Map
8.3.3 Gowin’s Vee Diagram
8.3.4 Concept Map
8.3.5 Timelines or Technique of Representation and Development in Time
8.3.6 Venn Diagram
8.3.7 Process Flow Diagram or Flowchart
8.3.8 Cause-Effect Diagrams: The Fishbone
8.4 How to Evaluate with Graphic Organisers and Other Forms of Conceptual Representation
8.4.1 What Do I Assess in an Organizer?
8.4.2 With What Assessment Purpose to Use Graphic Organizers?
8.4.3 Which Agents Are Involved in Assessing with Graphic Organisers?
8.4.4 Do I Need an Additional Instrument to Assess Graphic Organisers?
8.4.5 Validity and Reliability in the Assessment Process with Graphic Organisers
References
9 Quality Criteria for Developing Assessment Tools
9.1 Introduction
9.2 Validity
9.2.1 Content Validity
9.2.2 Instructional Validity
9.2.3 Consequential Validity
9.3 Reliability
9.3.1 Objectivity
References
10 A Case Analysis Methodology to Guide Decision Making in the Schooling Context
10.1 Introduction
10.2 Conceptualization of the Case Analysis Methodology
10.2.1 What Is a Case?
10.2.2 What Is a Case Study?
10.2.3 Responsibilities of the Participants in the Case Analysis
10.3 Organisation and Structure of the Case Analysis Method for Decision Making
10.3.1 Relevance of the Case
10.3.2 First Stage: Building the Case
10.3.3 Second Stage: Case Analysis
10.3.4 Third Stage: Proposal for Improvement
10.4 Considerations Regarding the Ethical Aspects of the Case Analysis
10.5 Final Case Report
10.6 Limitations
10.7 Examples of the Application of the Case Analysis Method for Decision Making
10.7.1 Experience of Teachers Who Have Used This Methodology and Comments on Each of Its Parts
References


Springer Texts in Education

Carla E. Förster, Editor

The Power of Assessment in the Classroom: Improving Decisions to Promote Learning


Springer Texts in Education delivers high-quality instructional content for graduates and advanced graduates in all areas of Education and Educational Research. The textbook series is comprised of self-contained books with a broad and comprehensive coverage that are suitable for class as well as for individual self-study. All texts are authored by established experts in their fields and offer a solid methodological background, accompanied by pedagogical materials to serve students, such as practical examples, exercises, and case studies. Textbooks published in the Springer Texts in Education series are addressed to graduate and advanced graduate students, but also to researchers as important resources for their education, knowledge and teaching. Please contact Yoka Janssen at Yoka.Janssen@springer.com or your regular editorial contact person for queries or to submit your book proposal.


Editor
Carla E. Förster
Universidad de Talca
Talca, Chile

ISSN 2366-7672 ISSN 2366-7680 (electronic)
Springer Texts in Education
ISBN 978-3-031-45837-8 ISBN 978-3-031-45838-5 (eBook)
https://doi.org/10.1007/978-3-031-45838-5

Jointly published with Ediciones UC

Translation from the Spanish language edition: "El Poder de la Evaluación en el Aula. Mejores Decisiones para Promover Aprendizajes" by Carla E. Förster, © Ediciones UC 2018. Published by Ediciones UC. All Rights Reserved.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023, 2018, 2019

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publishers, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publishers nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publishers remain neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.

Preface

The UC Education Collection is a contribution of the Faculty of Education of the Pontificia Universidad Católica de Chile and Ediciones UC to the daily work of practicing teachers, born from the conviction that texts are needed to guide the practical work of educators in the field. Each of these titles is the result of up-to-date interdisciplinary research and of the implementation of the concrete proposals they offer.

Assessment is undoubtedly one of the most relevant and necessary areas in which teacher training, both initial and in-service, still falls short. There is much discussion about large-scale, standardized, census- or sample-based assessments, but little focus on classroom assessment seen as part of the teaching and learning processes. This book therefore highlights the crucial role of feedback in the learning process, the considerations required by the different formats in which it is presented, and the involvement of students in understanding what it implies, with a view to reducing the gap between their current state and the goal they are expected to reach.

This is a book designed to support classroom teachers and management teams, with concrete strategies and ideas to implement. It proposes an update of what is understood by assessment in the current context, which in English is captured by the term assessment literacy and assumes an assessment-literate teacher, focusing on the complementarity of formative and summative assessment of learning. At the same time, it assigns students the role of assessment agents rather than that of passive participants, which implies giving them responsibilities in their own learning process and in that of their peers through assessment. The proposal emphasizes the consequences that teachers' assessment practices have on students' school and university trajectories: it is essential that teachers are aware of these consequences, know how to safeguard the ethical principles associated with assessment, and attend to the validity and reliability of the assessment situations they apply. Finally, the text provides a methodological tool for case analysis that allows teachers and management teams to address needs or critical cases that an establishment may have and to work through the background systematically in order to arrive at a proposal for improvement.


Through this collection, the UC makes a new contribution to the school community, as part of its public commitment to education in the country, which we hope teachers will take advantage of to strengthen these areas and their teaching practices in the classroom.

Santiago de Chile, Chile

Lorena Medina Morales
Dean of the UC Faculty of Education

Contents

1 Teacher Assessment Literacy
Carla E. Förster

2 Learning and Assessment: What Is Not Assessed, Is Not Learnt
Carla E. Förster and Cristian A. Rojas-Barahona

3 Integrated Planning of Teaching and Assessment
Sandra C. Zepeda and Carla E. Förster

4 The End Justifies the Means: Purposes of Assessment
Sandra C. Zepeda

5 Effective Feedback and Its Potential to Enhance Learning
Sandra C. Zepeda

6 Students as Assessment Agents
Carla E. Förster

7 Learning Assessment Tools: Which One to Use?
Carla E. Förster, Sandra C. Zepeda, and Claudio Núñez Vega

8 Assessing with Graphic Organisers: How and When to Use Them
Paola Marchant-Araya

9 Quality Criteria for Developing Assessment Tools
Carla E. Förster and Cristian A. Rojas-Barahona

10 A Case Analysis Methodology to Guide Decision Making in the Schooling Context
Paola Marchant-Araya and Carla E. Förster

Editor and Contributors

About the Editor

Carla E. Förster is a marine biologist with a Master's degree in Educational Evaluation and a Doctorate in Educational Sciences from the Pontificia Universidad Católica de Chile. She is a professor at the Universidad de Talca, Chile, Head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. e-mail: carla.forster@utalca.cl

Contributors

Sandra C. Zepeda, Pontificia Universidad Católica de Chile, Santiago, Chile
Carla E. Förster, Universidad de Talca, Talca, Chile
Paola Marchant-Araya, Pontificia Universidad Católica de Chile, Santiago, Chile
Cristian A. Rojas-Barahona, Universidad de Talca, Talca, Chile
Claudio Núñez Vega, Pontificia Universidad Católica de Chile, Santiago, Chile


1 Teacher Assessment Literacy

Carla E. Förster

Abstract

The assessment of and for learning is one of the most critical sticking points for teachers. We know that when an assessment task involves more complex and integrated skills, such as the development of a project, a research assignment, or a portfolio, and teachers carry it out well as an assessment process rather than as isolated tasks, it generates deeper learning in students. Knowledge about assessment, its application in classroom practices, and the assurance of the underlying ethical principles constitute what has been called assessment literacy. In this chapter, the evolution of the conception of classroom assessment is reviewed over time, and the ethical principles of assessment are analysed through examples that help to clarify their meaning in the classroom.

1.1 Introduction

There is a consensus that what and how one assesses conditions student learning and the pedagogical decisions that teachers make (Brown, 2011; Himmel, 2003; Salinas, 2002; Shepard, 2006; Xu & Brown, 2016, among others). All teachers are well aware of the typical questions students ask when faced with a written assessment: Teacher, what topics will it cover? What will the test be like? Will the questions be open-ended or multiple choice? These questions shape the way students approach the assessment: the strategy they use to study will be determined by the "topics to be covered" and "the type of questions". They know that if the test is multiple choice, they do not have to learn the concepts by heart, because the answer is in one of the options; but if it is open-ended, they have to know some key words by heart to be able to explain or argue their answers and understand the content better, although there is more flexibility in the way they answer. In the students' words: "You have to know more when the test is open-ended", "but you can wing it a little and be safe".

We also know that when an assessment task involves more complex and integrated skills, such as the development of a project, a research assignment or a portfolio, if they are carried out well as an assessment process, they generate deeper learning in students (Wiggins & McTighe, 2005); but they require greater assessment mastery by teachers, since teachers must provide feedback on the process (with limited time and much content to address) and develop rubrics to review these tasks.

On the other hand, the school system constantly asks teachers to manage the results of their assessments and to make decisions about the effectiveness of teaching practices based on evidence of student learning; yet the analysis is always based on grades obtained, not on progress or achievement of expected learning. It is common to see school practices triggered by the percentage of students with a grade lower than 7 on an assessment, a situation in which the teacher is expected to apply remedial action, which may range from repeating the assessment, to giving a new one and averaging the grades for those who had a failing grade, to talking to the students and addressing underachievement without modifying the grade.

The specialized literature indicates that teachers with inadequate knowledge of classroom assessment, or of measurement for accountability to the system, are less effective in their teaching practices, which results in lower-quality learning in their students (Popham, 2009; Rogers & Swanson, 2006; Stiggins, 2004). These competencies in assessment have been internationally referred to as assessment literacy.

This chapter is based on the results of the Fondecyt Initiation Project No. 11140713, funded by Conicyt, Chile. C. E. Förster, Universidad de Talca, Talca, Chile. e-mail: [email protected]

1.2 Assessment Literacy

Assessment literacy has been defined as "a teacher's understanding of the fundamental concepts and procedures of assessment, which he or she is likely to consider in the pedagogical decisions he or she makes" (Popham, 2011, p. 267). The concept of "literacy" was adopted from the idea that a "literate" person, in this case the teacher, has knowledge about a subject (educational assessment), understands it, and possesses the skills required to put it into effective operation in their daily work (Adams & Wu, 2003). This is why assessment literacy implies that a teacher is able to construct reliable assessments and then manage and grade them to facilitate valid decisions regarding their teaching, consistent with the educational standards by which the school abides (DeLuca et al., 2016; Popham, 2004, 2008; Stiggins, 2002, 2004).


To better understand why a teacher who is "literate" in assessment is so important today, it is necessary to understand the historical evolution of teachers' conception of assessment. Deneen and Brown (2016) identify three temporal stages. The first, prior to the 1990s, shows that teachers viewed classroom assessment from a logic strongly influenced by educational and psychological measurement theories and practices. Teacher training emphasised the assessment of learning and how to generate tools with high quality standards that would make it possible to distinguish between students who "knew the subject" and those who did not, but gave teachers few tools to conduct formative assessment and to provide their students with quality feedback. The second stage, recognised between the 1990s and the beginning of the twenty-first century, is characterized by teachers focusing on assessment for learning and strengthening the balance between formative and summative assessment; nevertheless, teachers demonstrated a negative predisposition towards measurement, producing a break between teacher training and the requirements of the system related to accountability and standardised tests. The third stage, which represents the present time, shows that today teachers' conception of assessment integrates the priorities of the other two stages, that is, it understands assessment as being of and for learning. It therefore requires mastery of formative assessment, effective feedback, and summative assessment as key components in the classroom to monitor and certify learning, as well as the use of, and willingness to work with, standardised tests, thus negotiating the increasingly complex tensions and demands around assessment (Fig. 1.1).

Fig. 1.1 Evolution of the classroom assessment concept in teacher training


A teacher with adequate assessment literacy requires proficiency in several domains for effective practice: (a) using multiple, high-quality assessments aligned with their learning goals, which must be precisely defined; (b) interpreting student performance in light of specific forms of assessment and hypothesizing about common discipline-specific errors; (c) managing and scoring assessments appropriately; (d) accurately communicating results to stakeholders; and (e) carrying out all assessment responsibilities legally and ethically (Popham, 2009; Stiggins, 1991; Stiggins et al., 2007). We will elaborate on each below.

(a) Valid and Reliable Evaluations

It is common to find guidelines, both in ministerial assessment decrees and in school assessment regulations, that emphasise the idea of using multiple assessment tools; however, this diversity is considerably reduced in practice, since only their use associated with grading is considered (Ravela et al., 2014) and, therefore, many of these assessment tools are not used due to their complexity for grading. If we understand assessment as a constant process of monitoring learning, we could consider other assessment situations that are highly effective for gathering evidence of student achievement, such as answering a written synthesis question at the end of a class or putting together a graphic organiser to close the subject, which require little review time and do not need to be graded.

On the other hand, these tools or assessment situations need to meet quality criteria so that the information we collect with them is reliable and the decisions we make are pertinent to student learning. A fundamental quality criterion is that assessments are aligned with the learning objectives that the task is expected to measure. Often these objectives may be broadly stated or not directly observable, so they need to be reworked to account for concrete student performance (see Example 1).

As can be seen in Example 1, the precision of an objective is related to its concrete operationalization, which makes it possible to gather evidence of that learning and make a correct judgment of its level of development. These assessment indicators do not represent the objective in its entirety but are actions that the student can perform to demonstrate his or her learning. It is important that students know these indicators before being evaluated; in this way we make our expectations explicit as teachers, and the students (and their parents, if they are young) will know how to guide the study or reinforcement of such learning according to what is expected.

Example 1:

A goal set for Language Arts in Grade 4 is "To understand and enjoy complete versions of works of literature, narrated or read by an adult, such as: folktales and short stories, poems, fables and legends, chapters of novels." How can I, as a teacher, know that one of my students understands the works and enjoys them? Both actions are internal to each student and, moreover, of a different nature (understanding them is cognitive and enjoying them is attitudinal), so in order to assess them we need to make this objective more precise.


This implies breaking it down into more specific aspects that will require different assessment tasks. Thus, for the objective associated with the understanding of the works, a proposal of assessment indicators could be:

• Explains the consequences of the actions of certain characters in the texts.
• Supports their position with examples from the works they have read.

And for the objective associated with the enjoyment of reading:

• Asks questions that demonstrate their interest in the work they have read.
• Reads various works in their complete versions without being scheduled to evaluate them.
• Makes comments that demonstrate a positive disposition towards reading.
• Comments on specific aspects of complementary works read outside the context of the class.

It is also necessary to consider that although a learning objective may be precise, it may have particularities that require more than one instance of assessment or more than one instrument to certify that it is actually being achieved. Let's look at Example 2:

Example 2:

If my learning objective is "Identify and compare, through exploration, the solid, liquid and gaseous states of water", it is implicit that students should not be assessed only with a test on the states of matter, specifically water. With such an instrument we will be able to certify whether they identify and compare the states of water, but we will not be assessing the development of the ability to explore in order to understand the phenomenon, a key skill in science. Thus, if we include in our assessment strategy an activity in which students specifically observe the states of water and, using key questions, explain what criteria they use to distinguish between the states (hardness, flexibility with respect to the container, temperature, among others), we can verify their process of exploration of the phenomenon and their observation skills.

This implies understanding teaching and assessment as one and the same activity with two moments, not as two separate actions (Montes, 2014). This coherence between what we teach and what we assess is called "instructional validity". Therefore, when we speak of an instrument that must be valid and reliable, we are assuming that it meets the criteria of technical quality in its development and that it is aligned with the learning objectives it is intended to assess.

The Joint Committee on Standards for Educational Evaluation (2003) defines a set of standards associated with evaluation ethics, which outline the main points to be considered in order to ensure quality practice in this area; these will be discussed in the last section of this chapter.


(b) Interpreting Student Performance

Interpreting student performance in light of specific forms of assessment, and hypothesizing about common discipline-specific errors, involves understanding how each instrument is scored, knowing the limitations and strengths of each assessment task for assessing different types of learning, and knowing how to construct tasks that reveal the errors commonly made in the discipline being taught.

Regarding the strengths and limitations of each instrument or assessment task, it is important to consider that in a written test format, for example, what is collected as evidence is declarative knowledge. This is very efficient for assessing cognitive skills, but not for measuring attitudinal or value-related aspects, or practical procedures that require manipulating materials or implementing procedures or techniques. Similarly, a self-assessment rating scale is very useful for measuring the respondent's perception of a topic, but not necessarily for assessing actual behaviour.

The teacher's mastery of the pedagogical content of the discipline he or she teaches is also fundamental, since in order to address the key ideas with students, the teacher needs to know what those ideas are (Shulman, 1987) and to identify the difficulties in students' reasoning in order to confront them (Zohar & Schwartzer, 2005). This holds not only when teaching but also when assessing (Carr et al., 2005; Mishra & Koehler, 2006), since it has been seen that a teacher with a good command of the pedagogical content interacts with students by encouraging questions in class to collect evidence of how they are thinking and learning about the subject being taught and, in addition, has better assessment practices, incorporating both formative and summative approaches (Jones & Moreland, 2005). The following is an example of this:

Example 3:

A common mistake in mathematics is that students assume rules from their previous experiences. In fractions, for example, when we write 3¼, some students assume that, since there is no mathematical symbol between the 3 and the ¼, a multiplication sign is implied, as is usual when nothing is written between two factors, without understanding that in a mixed number the 3 wholes are in fact added to the ¼. In the end, they solve it correctly in a mechanical way, multiplying the whole number by the denominator and adding the numerator, without any idea of the significance of what they are doing. They therefore reach the solution, but when they have to explain it, they do so incorrectly because they do not understand what lies underneath.
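For reference, the standard arithmetic behind the example (this identity is not quoted from the book) can be written as:

\[
3\tfrac{1}{4} \;=\; 3 + \frac{1}{4} \;=\; \frac{3 \times 4 + 1}{4} \;=\; \frac{13}{4},
\qquad \text{whereas} \qquad
3 \times \frac{1}{4} = \frac{3}{4}.
\]

The mechanical rule "multiply the whole number by the denominator and add the numerator" is just the middle step of this identity, which is why students who misread the notation as an implicit multiplication can still compute the correct value without understanding it.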


As we can see in Example 3, as teachers we must, on the one hand, try to take actions in the classroom that allow us to identify these types of errors, which we will not notice with an exercise guide alone, because students will obtain the right result; on the other hand, we must master the content in order to identify what kinds of mistakes students are making and address them properly.

(c) Managing and Scoring Assessments Appropriately

One of the most common problems in teachers' assessment practices is the way in which the results of an assessment are analysed. Most of the time the result corresponds to the sum of the partial scores obtained in each part of the assessment instrument, without considering the weight each part has or the learning objective to which it responds.

It has been seen that the evaluative judgments teachers make about the performance of their students (grades, classifications, rankings) have a considerable impact on their learning experiences and on their educational trajectories (Südkamp et al., 2012). Negative experiences generate in students, on the one hand, demotivation to learn the discipline and, on the other, decisions based on erroneous assumptions, such as "I am bad at mathematics" or "women are humanists and men are scientists".

These erroneous or unreliable judgments that we make as teachers about the performance of our students are due to the way in which we manage and grade the evidence of learning that we collect. On many occasions we have created a guideline for a task in which the assessment indicators covering formal aspects or the structure of the task end up adding more points than the aspects associated with the content of the discipline; students who are orderly and follow instructions will then get a good grade even when they have not gone deeply into the content, while a student who fails on the structure, even with a better command of the content, will obtain a lower grade (a numerical sketch of this weighting problem is given at the end of this subsection).

Teachers also tend to associate grades with a student's behaviour or with his or her personal and family characteristics, making subjective judgments about performance (Brookhart, 1994; Kaiser et al., 2013; Stiggins et al., 1989; Südkamp et al., 2012). Thus, for example, a teacher correcting the test of a student who exhibits cooperative classroom behaviours and generally performs well tends to be more permissive of errors or inaccuracies than with a student who has disruptive classroom behaviours or whose performance is habitually low; that is, the application of the correction guideline is inconsistent (Dompnier et al., 2006).

The following case exemplifies a typical situation in which the student is at a disadvantage in the teacher's judgment, since her response does not literally conform to the correction guideline, even though her knowledge is equal to or higher than expected.

Example 4:

This is part of a response from a low-performing student in the first grade, who tends to talk a lot and be very restless during class. The purpose of the guide is to assess vocabulary based on graphic interpretation. As can be seen, the word "Saturn" is marked as wrong and corrected with the note that the expected word is "planet", without considering that the child's knowledge is in fact deeper, since the planet that has rings is indeed the one she names. With this correction, the message given to her is that her knowledge is wrong, forcing her to express herself according to what the teacher expects and leaving her no option to display the richness of her own learning, perhaps obtained outside the classroom.

Write the name of the following drawings:

1. Train
2. Flower
3. Saturn (marked wrong and corrected to "planet")

Response guideline: 1. Train, 2. Flower, 3. Planet

It has also been found that teachers have a limited mastery of the use and management of quantitative data (Mandinach & Gummer, 2016). In recent years, the importance of teachers being prepared to use assessment data as part of the "arsenal of tools" of their professional practice has been emphasised for the improvement of teaching practices and, therefore, of student learning (National Council for Accreditation of Teacher Education, 2010). This competency is often confused with a teacher's ability to assess (Greenberg & Walsh, 2012); however, they are different. In practice, a teacher may set very good assessment tasks to monitor student learning, complying with quality criteria in their creation, but if the evidence collected is not correctly analysed, the process is truncated; that is, decisions will be intuitive, based on a global or even biased perception of the results obtained by the students (Feldman & Tung, 2001; Hamilton et al., 2009; Mandinach & Gummer, 2012; Mandinach & Jackson, 2012). Likewise, erroneous practices are evident in the calculation of grades, such as drawing conclusions about the performance of a student or course from the grades obtained, without considering what learning was being assessed, the level of difficulty with which it was assessed, or the way in which topics were worked on in class, among other aspects. The following sketch illustrates the weighting problem described above with invented numbers.
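A minimal sketch in Python; the indicators, point values, and student scores are all invented for illustration and do not come from the book:

```python
# Hypothetical marking scheme: formal/structural indicators are worth
# more raw points (12) than the disciplinary-content indicators (10).
structure_points = {"cover page": 4, "follows format": 4, "handed in on time": 4}
content_points = {"accuracy of concepts": 5, "depth of analysis": 5}

# Two invented students: one tidy but shallow, one untidy but with
# strong command of the content.
tidy_student = {"cover page": 4, "follows format": 4, "handed in on time": 4,
                "accuracy of concepts": 2, "depth of analysis": 2}
deep_student = {"cover page": 0, "follows format": 1, "handed in on time": 2,
                "accuracy of concepts": 5, "depth of analysis": 5}

def raw_total(scores):
    """Plain sum of partial scores: the practice the chapter warns about."""
    return sum(scores.values())

def weighted_mark(scores, content_weight=0.8):
    """Re-weight so the learning objective (content) dominates the judgement."""
    content = sum(scores[k] for k in content_points) / sum(content_points.values())
    structure = sum(scores[k] for k in structure_points) / sum(structure_points.values())
    return 100 * (content_weight * content + (1 - content_weight) * structure)

for name, scores in [("tidy", tidy_student), ("deep", deep_student)]:
    print(f"{name}: raw={raw_total(scores)}, weighted={weighted_mark(scores):.0f}/100")

# Output:
#   tidy: raw=16, weighted=52/100
#   deep: raw=13, weighted=85/100
# The raw sum ranks the tidy student higher; weighting by the learning
# objective reverses the judgement.
```

The design point is simply that the judgement should be driven by the learning objective, not by however many points each indicator happens to add up to.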


(d) Accurately Communicating Results to Stakeholders

Communication of results can be divided into three areas: (1) communication associated with formative assessment, which requires feedback that lets students know how they did and what aspects they need to improve, a topic we will address in detail in Chap. 5; (2) communication in the context of summative assessment, which, as in formative assessment, requires feedback, but with other particularities, since the improvement of student learning takes place outside the planned teaching process and the student bears greater responsibility for it; and (3) the characteristics that make such communication effective, considering the way in which results are communicated to the interested parties, safeguarding their confidentiality, and ensuring they are understandable to the addressee (students, parents, other teachers, or principals) (Stiggins et al., 2007).

Communication of results in the context of summative assessment usually translates into a grade, so we must ensure that our students have a clear understanding of what that number means in relation to their learning. This communication takes place at the end of a term and consists of providing a judgement on the student's performance against the learning objectives. The information is translated into a grade that is communicated to the student, to parents, and to external authorities. It is desirable that such communication be adapted to the recipient, both in the format in which the results are presented and in the evidence provided. For example, parents require the grade and supporting evidence, while education authorities only need the final assessment. There is also a need to clearly define the symbols used in written or verbal reports: whether numbers or letters are used to score and make an overall judgement, users must have a clear understanding of their meaning, and the results must be based on a valid and reliable assessment process (Airasian, 1999). As García et al. (2011) point out, it is important that "the reporting of assessments separates results related to knowledge and skills from participation, effort or other behaviours" (p. 99).

For this communication of results to be effective, the teacher must share with students and parents information about how a particular learning will be assessed and the rationale behind those methods, so that all stakeholders understand the purpose of the assessment process (Rogers & Swanson, 2006). An example of miscommunication is detailed below:

Example 5:

In one school, it was decided to implement an integrated task in Year 9 between the subjects of Language and Visual Arts, for which students would create and stage a play. The play involved working on the learning objectives associated with Unit 3: Human relations in theatre and literature in Language and, in Visual Arts, with Unit 2: Architecture. Both teachers and students were very enthusiastic, worked on the project for eight weeks, made progress, and received feedback. When the work was presented and the students were graded, the parents' expectation was that everyone would get a 10, as it was a "fun activity". Faced with some different results, such as a student who received an 8 in Language and a 7.8 in Arts, the student's father went to complain to the school, insisting that "this would affect his average to enter university" and that he found the grade "unusual". He could understand the Language grade, because his son was studying to become a mathematician, but not the Arts grade, "because everyone does well and the school is not training actors or set designers".


The situation presented in the example shows little or no effective communication of the student's learning results to him and his family, since only the grade was given, without an explanation of the reasons for it and the learning associated with it. Nor was the evidence-gathering process behind it made explicit; if it had been, both the student and his parents would have known what he had failed to improve during the preparation and would have understood that the staging carried a very minor weighting, as the purpose of the activity was not to develop a talent for acting or set design.

In relation to this, the Joint Committee on Standards for Educational Evaluation (2003) defines the communication of results as a critical point from the perspective of evaluation ethics and sets out a standard that includes the main criteria to be considered in order to ensure quality assessment practice in this area. This standard is presented in the following section.

(e) Conduct the Assessment Legally and Ethically

The last aspect that a teacher who is "literate" in assessment must take into account is to consider and apply the legal and ethical standards associated with any assessment process. The legal standards are provided by current assessment decrees, internal school assessment regulations, and the Curriculum Frameworks and Bases; the ethical responsibilities associated with assessment, however, are more diffuse, since they cut across different areas of teaching. As Moreno (2011) states, every evaluation has an ethical dimension, largely related to the question "Why evaluate?"; but in teachers' assessment practice this dimension becomes blurred, as seemingly more pressing questions about what and how to evaluate take precedence, implicitly assuming a conception of assessment as an objective and technical process in which neither the student nor the teacher has any influence, and whose results, obtained through a rigorous "method" of collecting information, are irrefutable regarding the achievement of learning.

Considering that assessment practices are essentially subjective, from the moment we select one content over another to include in a test, or define the format of an assessment, we must take into account the ethical implications of our decisions. Subjectivity does not mean that the assessment is of lower quality or arbitrary, but rather that there is a component in the assessment that is specific to each person; that is, the assessment process combines the perceptions, dispositions, and decisions of the teacher, and it is therefore necessary to regard it as a process that is not aseptic and that always has consequences.


There are numerous texts that have addressed ethics in practices associated with educational assessment (see, for example, Covacevich, 2014; Canadian Psychological Association [CPA], 1987); however, in most cases the focus has been on a psychometric paradigm that places assessment in large-scale measurements, such as national tests, or in individual measurements associated with research conducted at school, and few refer to the ethical principles of assessment in the classroom, where teachers carry out their day-to-day actions. The following is a summary of the ethical principles associated with the assessment of learning in the classroom context, based on the Principles for fair student assessment practices for education in Canada (Rogers, 1993) and the Joint Committee on Standards for Educational Evaluation (2003). It should be noted that these principles and guidelines, while exhaustive and detailed, are not definitive, and there may be others, not made explicit here, that each stakeholder could include. It is also recognised that not all guidelines are equally applicable in all circumstances. Nevertheless, consideration of the set of principles and guidance associated with classroom assessment presented by these standards should help to achieve fairness and equity for students as they are assessed. There are five:

I. Development and choice of assessment methods: Assessment methods [1] should be appropriate and compatible with the purpose and context of the evaluation.
II. Gathering information in assessments: Students should have sufficient and diverse opportunities to demonstrate the knowledge, skills, and attitudes or behaviours they are being assessed on.
III. Scoring and grading student performance: Procedures for scoring or judging student performance must be appropriate to the assessment method used and be consistently applied and monitored.
IV. Synthesis and interpretation of results: Procedures for summarizing and interpreting assessment results should correspond to accurate and informative representations of a student's performance in relation to the learning goals and objectives for the reporting period.
V. Communication of assessment results: Assessment reports should be clear, accurate and of practical value to the intended audiences.

I. Development and Choice of Assessment Methods: Assessment methods should be appropriate and compatible with the purpose and context of the evaluation

1. Assessment methods should be developed or selected so that inferences made about the knowledge, skills, and attitudes or behaviours possessed by individual learners are valid [2] and do not lead to misunderstandings.

[1] The term "assessment method" refers to the various strategies and techniques that teachers can use to obtain information from students about their progress towards achieving the knowledge, skills, and attitudes or behaviours they are expected to learn.
[2] Validity refers to the degree to which inferences drawn from test results are meaningful. Therefore, the development or selection of assessment methods for data collection must be clearly linked to the purposes for which the results are to be used.


2. Assessment methods should be clearly related to the learning goals and objectives and be compatible with the teaching methods used. Planning assessment design at the same time as teaching is planned will help to integrate the two in a meaningful way. Such joint planning provides an overview of the knowledge, skills, and attitudes or behaviours that are expected to be learned and assessed, and the contexts in which they will be learned and assessed.

3. When developing or choosing assessment methods, consider the consequences of the decisions that will be made with the information obtained; the results of some assessments may be more critical than others. For example, misinterpretation of the level of performance on a test may allow a student to pass a grade without having the minimum competencies or learning, which will be detrimental to his or her further progress.

4. More than one method of assessment should be used to ensure that a more complete picture or profile of a student's knowledge, skills, and attitudes or behaviours can be obtained, and that consistent patterns and trends in performance can be discerned. Using more than one method will also help to minimize inconsistency caused by different sources of measurement error, for example, poor performance due to an "off day", a mismatch between the items on a test and the rating scale, or a student's emotional instability over time.

5. Assessment methods need to be appropriate to students' backgrounds and prior experiences. They should be free from bias caused by student factors external to the purpose of the assessment. Such factors include culture, developmental stage, ethnicity, gender, socio-economic status, language, special interests, and special educational needs. For example, students' success in answering questions on a test or oral quiz should not depend on prior cultural knowledge, such as understanding an allusion to a cultural tradition or value, unless such knowledge is within the domain of the content being assessed. All students should have an equal opportunity to demonstrate their strengths.

6. Content and language that is generally considered sensitive, sexist, or offensive should be avoided. Classic examples are the association of women (mothers, aunts, grandmothers) with housework or shopping at the mall, and of men with jobs outside the home, which mark outdated stereotypes.

7. The translation or adaptation of assessment tools into a second language, or their transfer from another context or place, must be accompanied by checks that ensure these tools are valid for the intended purposes. It is often believed that since the original test was in the student's first language, no dialect adaptations are necessary; nevertheless, we find words such as "flat/apartment" or "windscreen/windshield" that are not in everyday use for the target population, which constitutes a problem of semantic validity and may mislead students' responses.


II. Gathering Information on Assessments: Students should have sufficient and diverse opportunities to demonstrate the knowledge, skills, attitudes, or behaviours they are being assessed on

1. Students should know what the purpose of the assessment is, what information is being collected, and how that information will be used. For example, if they know that the purpose is to diagnose strengths and weaknesses rather than to assign a grade, they will be more willing to reveal weaknesses rather than only display strengths.

2. An assessment procedure should be applied under conditions and in ways appropriate to the purpose for which it was constructed. For example, the environmental conditions of the room (light, ventilation, temperature, and noise), the estimated time for the assessment, and the format of the instrument (font size, quality of the images, distribution of the items) must all be considered.

3. In assessments that involve observation and use tools such as checklists or rating scales, the number of characteristics to be assessed in parallel must be sufficiently limited and specific so that observations can be made accurately and do not generate a halo effect (assuming that a characteristic was present, even though we did not see it, because others of the same nature were observed).

4. Instructions given to students should be clear, complete, and appropriate to their ability, age, and grade level.

5. In assessments involving choice items (e.g., true-false, multiple choice), instructions should prompt students to answer all items and make it explicit that there is no penalty for incorrect answers. Sometimes a correction formula ("good/bad discount"; see the formula after this list) is used to discourage "guessing". This practice is intended to encourage students to skip items they do not know the answer to, rather than guess. But the scientific evidence indicates that the expected benefits of this correction do not materialize, and the use of the formula is not recommended; instead, students should be encouraged to use whatever partial knowledge they have when choosing their answers and to respond to all items.

6. When collecting assessment information, interactions with students should be appropriate, fair, and consistent. For example, when assessing oral presentations in groups, questions need to be distributed among everyone so that all have an equal opportunity to demonstrate their knowledge. In a written test, clarification of an ambiguous item should be made for the whole class and not just for the student who raised the question.

7. Unforeseen circumstances that interfere with the collection of assessment information should be recorded and then taken into account when interpreting the information obtained, for example, events such as a fire drill, an unscheduled assembly, insufficient materials, an earthquake, or a power outage.

8. Institutions should have a written policy, developed by the faculty and administrators, to guide decisions about the use of alternative assessment procedures for collecting information from students with special educational needs and from students whose language proficiency is insufficient for them to respond as intended.
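For reference, the "good/bad discount" mentioned in guideline 5 usually takes the form of the classical correction-for-guessing formula (the formula itself is not given in the source; this is its standard form):

\[
S = R - \frac{W}{k - 1}
\]

where R is the number of right answers, W the number of wrong answers (omitted items are not counted), and k the number of options per item. Under purely random guessing, the expected value of W/(k - 1) equals the expected number of lucky hits, so a pure guesser's expected corrected score is zero; as the guideline notes, however, the evidence does not bear out the expected benefits of this penalty in classroom use.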


III. Scoring and Grading Student Performance: Procedures for scoring or judging student performance must be appropriate to the assessment method used and be consistently applied and monitored

1. Before using an assessment method, the scoring procedure to be used to judge the quality of a performance or product, the appropriateness of an attitude or behaviour, or the correctness of a response must be clear. For example, in closed-response tests, the item marking guideline should be available in advance; if the response is open-ended, or what is being assessed is a practical performance, the specification of what is expected as student performance should be included and, as far as possible, descriptions of the different levels of performance or of the quality of a product.

2. Before using an assessment method, students should be told how they will be judged or what the expected performance is for each criterion. This practice helps to ensure that the expectations of the teacher and the students are similar with respect to the desired performance on the assessment activity. For example, if, when explaining an essay writing guide, we present students with some sample essays (at different levels of quality) and discuss their level of performance, explaining why essay A is at the highest level and what errors place essay B at a lower level, students will be clearer about the task and what they are expected to do.

3. Care should be taken that the results of an assessment are not influenced by factors that are not relevant to the purpose of that specific assessment, for example, deducting points for spelling when it is not part of the learning objectives being assessed, or personal biases associated with the classroom behaviour of particular students that distort the judgement of their achievement.

4. Providing feedback through written and oral comments, with information about errors or inconsistencies that the student can correct, should be part of the marking. This feedback needs to be based on the answers given by the students, and the comments should be presented in a way that students can understand and use.

5. Any changes made during the scoring process should be based on a problem with the initial scoring procedure. For example, if the marking guideline for a test assigned a score to a question that turns out to be incorrectly scored and the guideline needs to be adjusted, all students' responses must be rechecked.

6. A process for students to appeal the results of an assessment should be established at the beginning of each school year or course. Situations may arise in which a student believes that a score incorrectly reflects his or her level of performance, and there should be an established and socialized protocol at the school for resolving them with students and teachers. This procedure may range from checking the sum of scores or other grading errors to requesting that the assessment be reviewed and scored by a second qualified person.


IV. Synthesis and Interpretation of Results: Procedures for summarizing and interpreting assessment results should provide accurate and informative representations of a student's performance in relation to the learning goals and objectives for the reporting period

1. Procedures for summarizing and interpreting assessment results for a period (quarter, semester, year) should be guided by written policy. Grades and reports on other aspects of a student's performance serve a variety of functions, such as informing students and their parents/guardians of learning progress, and informing teachers and administrators so they can guide teaching strategies, determine promotion criteria, identify students who require special attention, and help students develop future plans.

2. The manner in which reports on other aspects of performance and grades are formulated and interpreted should be explained to students and their parents/guardians.

3. Individual results, the process followed for calculating grades, and the process for reporting other aspects of student performance should be described in sufficient detail that their meaning is clear to the various stakeholders, indicating the relative emphasis given to each result and the process followed for combining the results.

4. Combining information from disparate sources into a single summary should be done with caution. To the greatest extent possible, academic performance, effort, participation, and other behaviours should be rated separately. In this way, the student can know how he or she is performing in specific areas and improve where necessary.

5. The basis for interpreting the results should be known by the education community. It is advisable to compare assessment information with a standard or expected learning, since having an external standard as a reference makes it possible to assure the achievement of established curricular goals rather than relying on the performance of a particular group of students.

6. Interpretations of assessment results must take into account students' knowledge and learning experiences. This is related to collecting students' prior knowledge and ideas in order to understand their starting point and thus analyse their progress with evidence; it is also related to instructional validity, since the results of a classroom assessment will be conditioned by the process and activities through which students learned what was assessed. Poor performance on an assessment may be attributable to a lack of opportunities to learn.

7. Assessment results must be interpreted in relation to the student's personal and social context. Factors to consider include age, gender, language, motivation, opportunity to learn, self-esteem, socio-economic status, particular interests, special educational needs, and "test-taking" skills.


For example, motivation to do homework, language ability, or family background may influence the learning of assessed concepts. This assumes that the teacher knows his or her students and families, is aware of the social dynamics of the class, and is not only concerned with students' performance in a specific subject area.

8. Assessment results that combine comments and ratings should be stored in a form that ensures their accuracy when they are summarized and interpreted.

9. Interpretations of assessment results should be made considering the constraints of the assessment methods used, problems encountered in the collection of the information, and limitations that may exist in its analysis.

V. Communication of Assessment Results: Assessment reports should be clear, accurate, and of practical value to the intended audiences

1. A school's information management system should be guided by a written policy. Elements to consider include audiences, format, content, level of detail, frequency, and confidentiality. The policy guiding the preparation of school reports should be developed by teachers or management teams, in consultation with representatives of the audiences entitled to receive the reports, such as parent and student associations (e.g., assessment results; termly or half-yearly reports). Co-operative participation not only leads to more appropriate and useful information, but also increases the likelihood that the reports and results will be understood and used by the intended recipients.

2. Written and oral reports should contain a description of the learning objectives and goals to which the assessments refer. A report will be limited by a number of practical considerations, but its core focus should be on the learning objectives and the types of performance that represent the achievement of those objectives.

3. Reports should include descriptions of student strengths and weaknesses in the areas addressed. Accuracy in reporting strengths and weaknesses helps reduce systematic error [3] and is essential to encourage and reinforce improved performance. Reports should contain information that helps and guides students, their parents/guardians, and teachers to take relevant follow-up actions. Aspects of the student that are not amenable to change should not be included in these reports; if they are structural or underlying conditions, they should be worked with and not considered weaknesses in and of themselves.

4. Assessment reports should provide teachers with information to prepare for parent/guardian conferences. Where appropriate, students should also participate in these meetings. They should be scheduled at regular intervals and, if necessary, at the request of parents/guardians, as an opportunity to discuss the assessment procedures used, to clarify and understand assessment results, grades, and reports, or to work on follow-up plans or actions relevant to the student's needs.

[3] "Systematic error" is a type of measurement or data-collection error that affects all students; analysing individual results, including qualitative elements associated with their performance, helps reduce this error.

1 Teacher Assessment Literacy

17

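To make points 3 and 4 above concrete, consider a brief worked example; the weighting policy and scores used here are hypothetical, not a prescription. Suppose the written policy states that the term grade combines tests (50%), a project (30%) and homework (20%), and a student obtained 82, 90 and 75 on those components respectively. The term grade would be:

0.5 × 82 + 0.3 × 90 + 0.2 × 75 = 41 + 27 + 15 = 83

Reporting the weights together with the component scores lets students and their parents/guardians see exactly how the summary was produced and which component offers the most room for improvement, which is the transparency these guidelines call for.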
V. Communication of Assessment Results: Assessment reports should be clear, accurate, and of practical value to the intended audiences.

1. A school's information management system should be guided by a written policy. Elements to consider include audiences, format, content, level of detail, frequency, and confidentiality. The policy guiding the preparation of school reports should be developed by teachers or management teams in consultation with representatives of the audiences entitled to receive the report, such as parent and student centres (e.g., for assessment results and termly or half-yearly reports). Co-operative participation not only leads to more appropriate and useful information, but also increases the likelihood that the reports and results will be understood and used by the intended recipients.

2. Written and oral reports should contain a description of the learning objectives and goals to which the assessments refer. A report will be limited by a number of practical considerations, but the core focus should be on the learning objectives and the types of performances that represent the achievement of these objectives.

3. Reports should include descriptions of student strengths and weaknesses in the areas addressed. Accuracy in reporting strengths and weaknesses helps reduce systematic error³ and is essential to encourage and reinforce improved performance. The reports should contain information that will help and guide students, their parents/guardians, and teachers to take relevant follow-up actions. Aspects of the student that are not amenable to change should not be included in these reports; if they are structural or underlying conditions, they should be worked with and not considered a weakness in and of themselves.

³ "Systematic error" is a type of measurement or data collection error that affects all students; analysing individual results and including qualitative elements associated with their performance helps reduce this error.

4. Assessment reports should provide teachers with information to prepare for parent/guardian conferences. Where appropriate, students should also participate in these meetings. They should be scheduled at regular intervals and, if necessary, at the request of parents/guardians, as an opportunity to discuss the assessment procedures used, to clarify and understand assessment results, grades, and reports, or to work on follow-up plans or actions relevant to the student's needs.

5. Access to assessment information should be governed by a written policy that is consistent with the law and with basic principles of fairness and human rights. A written policy, developed by the school community, should be used to guide decisions regarding the release of student assessment information. This information should be available to those to whom it applies (students and their parents/guardians and teachers), and it should be used constructively on behalf of the students. In addition, information from an assessment could be made available to others who justify their need for access, subject to the consent of the evaluatee and/or his or her proxy (e.g., post-secondary institutions, potential employers, researchers). The issue of informed consent should also be addressed in this written policy.

The literature suggests that an evaluator should not stick only to an ethical code (Moreno, 2011), as there are specific situations that may not be covered, and having different points of view helps complement the knowledge of the ethical implications of the evaluation. As Stake (2003) points out, it is personal and contextual interpretation that conditions decision-making. For this reason, we highlight the contribution made by Beauchamp and Childress (1994), who defined six general ethical principles associated with ethics in medicine but that make a great deal of sense in education and have been adapted to the school context. If we consider the consequences that assessment has on the lives of our students, the responsibility of our actions is as important and delicate as that of a doctor; in this sense, we highlight these principles as complementary to the previous ones, which were adapted to address the ethics of assessment from a professional view. These principles are:

1. Beneficence: this principle is based on the moral norm that the good of the students (or those being assessed) should always be promoted, and that the teacher (or assessor) has an obligation to provide quality service, treat learners well, and respect their conditions, creeds, and ideologies. For this to happen in the classroom, teachers must be prepared and up to date in the content and skills they teach, be competent to carry out teaching and assessment, and know the characteristics of the students they serve. This implies that the student should benefit from any act of assessment and that assessments should be a learning opportunity.

2. Non-maleficence: this principle refers to not harming the student; it is the negative formulation of the principle of beneficence. The teacher must analyse the risks and benefits when making decisions, respecting the physical and psychological integrity of the students. It differs from the principle of beneficence in that sometimes the intention may be to do good to the student, but situations may occur that eventually harm him or her, directly or indirectly. This principle should


be considered in integration with the other principles, since sometimes a global benefit, such as a privilege, may at some point in the process become a discomfort to the student. For example, when a student becomes very nervous when presenting orally, one must analyse how to help him or her develop the ability to present ideas in public without this tension damaging his or her self-esteem and without the assessment being detrimental to him or her.

3. Autonomy: this principle refers to the freedom of students to accept or decline participation in an assessment, and to the obligation of teachers to make the objectives and purposes of the assessments explicit and to request informed consent from participants. At the classroom level, however, autonomy can be understood in a more concrete sense, that is, guaranteeing students timely and truthful access to information about the assessment process. This is evident in situations such as self-explanatory correction guidelines that allow students to understand the score obtained without needing the teacher to explain it.

4. Justice: this principle is based on equality of opportunity, without exclusion or privileges for some. It rests on the moral norm of giving to each one what he or she needs, from which various obligations derive, such as distributing resources adequately, providing each student with an adequate level of attention, and having the indispensable resources to guarantee an appropriate education. Justice and equity are related to education understood as a fundamental right of every human being that should be guaranteed by society. Therefore, teachers have the obligation to provide their students with opportunities according to the needs of each student (diversified or differentiated assessment, when appropriate), to provide timely and quality feedback regarding the strengths and limitations of their learning, and not to benefit or disadvantage some over others. It is common for teachers to have a different attitude towards a student who is well disciplined and actively participates in the course than towards another student who is at the opposite end of the spectrum in these behaviours; but to the extent that we are aware of our limitations, we can correct our actions and comply with the principle of justice.

5. Privacy: this principle refers to confidentiality in the handling of individual student assessment results (reports, tests, certifications, etc.). Even when the principles of beneficence and non-maleficence have been met, the results of an assessment may be complex or difficult for a student to take in, and he or she has the right to keep them confidential. Common classroom practices such as reading students' grades aloud while handing out an assignment, posting them on the classroom door for students to look up how they did, or handing tests back in order from the best grade downwards while making comments such as "this is as far as I've been able to go", "I'm not sure how I've done", "these are the passing ones" or "there are only two with a grade of 7" go significantly against this principle. It is also important to consider that disclosing the result of a particular student's assessment in comments to other teachers or parents, when it is not necessary, is at the very least inappropriate, as it may lead to the stigmatization of that student.


6. Integrity: this principle refers to the rectitude, honesty, and truthfulness that should permeate the assessment process and its participants. It is particularly important not only for compliance with a basic moral norm of healthy coexistence in society, but also because assessment gathers evidence of student learning, and pedagogical decisions are made with this information. If, for example, the information we collect is not reliable because it was the result of plagiarism, we will have poor-quality information and will erroneously assume that our students have learned, which will undoubtedly affect their future learning. This also applies to teachers: when we use texts, images, or ideas from others and present them as our own, we are modelling bad practice for our students.

In summary, for a teacher to carry out quality assessment practices, he or she must be competent not only in the construction of the tools applied, but must also be informed and aware of a series of other elements that make possible a quality assessment that complies with ethical principles. This constitutes a complex task in which the teacher's professionalism and responsibility to teach a discipline and monitor how students are learning become evident. Thus, it is necessary to design assessment practices in the classroom that favour student learning and improve our pedagogical practices.

References

Adams, R., & Wu, M. (Eds.). (2003). Programme for international student assessment (PISA) 2000 technical report. OECD. https://doi.org/10.1787/9789264199521-en
Airasian, P. W. (1999). Assessment in the classroom: A concise approach (2nd ed.). McGraw-Hill.
Beauchamp, T. L., & Childress, J. F. (1994). Principles of biomedical ethics (2nd ed.). Oxford University Press.
Brookhart, S. M. (1994). Teachers' grading: Practice and theory. Applied Measurement in Education, 7(4), 279–301. https://doi.org/10.1207/s15324818ame0704_2
Brown, G. T. L. (2011). Teachers' conceptions of assessment: Comparing primary and secondary teachers in New Zealand. Assessment Matters, 3, 45–70.
Canadian Psychological Association [CPA]. (1987). Guidelines for educational and psychological testing. CPA.
Carr, M., McGee, C., Jones, A., McKinley, E., Bell, B., Barr, H., & Simpson, T. (2005). Strategic research initiatives: The effects of curricula and assessment on pedagogical approaches and on educational outcomes [eBook edition]. Ministry of Education. https://www.educationcounts.govt.nz/__data/assets/pdf_file/0003/9273/The-Effects-of-Curricula-and-Assessment.pdf
Covacevich, C. (2014). How to select an instrument for assessing student learning (Technical note 738) [eBook edition]. Inter-American Development Bank. https://publications.iadb.org/publications/english/viewer/How-to-Select-an-Instrument-for-Assessing-Student-Learning.pdf
DeLuca, C., LaPointe-McEwan, D., & Luhanga, U. (2016). Teacher assessment literacy: A review of international standards and measures. Educational Assessment, Evaluation and Accountability, 28(3), 251–272. https://doi.org/10.1007/s11092-015-9233-6
Deneen, C. C., & Brown, G. T. L. (2016). The impact of conceptions of assessment on assessment literacy in a teacher education program. Cogent Education, 3(1), 1–14. https://doi.org/10.1080/2331186X.2016.1225380
Dompnier, B., Pansu, P., & Bressoux, P. (2006). An integrative model of scholastic judgments: Pupils' characteristics, class context, halo effect and internal attributions. European Journal of Psychology of Education, 21(2), 119–133. https://doi.org/10.1007/BF03173572
Feldman, J., & Tung, R. (2001). Using data-based inquiry and decision making to improve instruction. ERS Spectrum, 19(3), 10–19.
García, A. M., Aguilera, M. A., Pérez, M. G., & Muñoz, G. (2011). Evaluación de los aprendizajes en el aula. Opiniones y prácticas de docentes de primaria en México [Assessment of learning in the classroom: Opinions and practices of primary school teachers in Mexico] [eBook edition]. Instituto Nacional para la Evaluación de la Educación. https://www.inee.edu.mx/wp-content/uploads/2019/01/P1D410.pdf
Greenberg, J., & Walsh, K. (2012). What teacher preparation programs teach about K-12 assessment: A review. ERIC. Retrieved November 25, 2022, from https://files.eric.ed.gov/fulltext/ED532766.pdf
Hamilton, L., Halverson, R., Jackson, S. S., Mandinach, E., Supovitz, J. A., & Wayman, J. C. (2009). Using student achievement data to support instructional decision making [eBook edition]. Institute of Education Sciences. https://ies.ed.gov/ncee/wwc/Docs/PracticeGuide/dddm_pg_092909.pdf
Himmel, E. (2003). Evaluación de aprendizajes en la educación superior: Una reflexión necesaria [Assessment of learning in higher education: A necessary reflection]. Pensamiento Educativo: Revista de Investigación Latinoamericana (PEL), 33(2), 199–211. https://pensamientoeducativo.uc.cl/index.php/pel/article/view/26615
Joint Committee on Standards for Educational Evaluation. (2003). The student evaluation standards: How to improve evaluations of students. Corwin.
Jones, A., & Moreland, J. (2005). The importance of pedagogical content knowledge in assessment for learning practices: A case-study of a whole-school approach. The Curriculum Journal, 16(2), 193–206. https://doi.org/10.1080/09585170500136044
Kaiser, J., Retelsdorf, J., Südkamp, A., & Möller, J. (2013). Achievement and engagement: How student characteristics influence teacher judgments. Learning and Instruction, 28, 73–84. https://doi.org/10.1016/j.learninstruc.2013.06.001
Mandinach, E. B., & Gummer, E. S. (2012). Navigating the landscape of data literacy: It is complex. ERIC. Retrieved November 25, 2022, from https://files.eric.ed.gov/fulltext/ED582807.pdf
Mandinach, E. B., & Gummer, E. S. (2016). What does it mean for teachers to be data literate: Laying out the skills, knowledge and dispositions. Teaching and Teacher Education, 60(1), 366–376. https://doi.org/10.1016/j.tate.2016.07.011
Mandinach, E. B., & Jackson, S. S. (2012). Transforming teaching and learning through data-driven decision making. Corwin. https://doi.org/10.4135/9781506335568
Mishra, P., & Koehler, M. J. (2006). Technological pedagogical content knowledge: A framework for teacher knowledge. Teachers College Record, 108(6), 1017–1054. https://doi.org/10.1111/j.1467-9620.2006.00684.x
Montes, N. (2014). Introducción. Enseñanza y evaluación: Dos caras de la misma moneda [Introduction. Teaching and assessment: Two sides of the same coin]. Propuesta Educativa, 41(1), 6–8.
Moreno, T. (2011). Consideraciones éticas en la evaluación educativa [Ethical considerations in educational evaluation]. REICE: Revista Iberoamericana sobre Calidad, Eficacia y Cambio en Educación, 9(2), 130–144.
National Council for Accreditation of Teacher Education [NCATE]. (2010). Transforming teacher education through clinical practice: A national strategy to prepare effective teachers. ERIC. Retrieved November 25, 2022, from https://files.eric.ed.gov/fulltext/ED512807.pdf
Popham, W. J. (2004). Why assessment illiteracy is professional suicide. Educational Leadership, 62(1), 82–83.
Popham, W. J. (2008). Transformative assessment. Association for Supervision and Curriculum Development.
Popham, W. J. (2009). Unlearned lessons: Six stumbling blocks to our schools' success. Harvard Education Press.
Popham, W. J. (2011). Assessment literacy overlooked: A teacher educator's confession. Teacher Educator, 46(4), 265–273. https://doi.org/10.1080/08878730.2011.605048
Ravela, P., Leymonié, J., Viñas, J., & Haretche, C. (2014). La evaluación en las aulas de secundaria básica en cuatro países de América Latina [Assessment in basic secondary classrooms in four Latin American countries]. Propuesta Educativa, 41(1), 20–45.
Rogers, W. T. (1993). Principles for fair student assessment practices for education in Canada. Canadian Journal of School Psychology, 9(1), 110–127. https://doi.org/10.1177/082957358500900111
Rogers, W. T., & Swanson, M. (2006). Effective student assessment and evaluation in the classroom: Knowledge and skills and attributes. Alberta Education.
Salinas, D. (2002). ¡Mañana examen! La evaluación entre la teoría y la realidad [Tomorrow, exam! Assessment between theory and reality]. Graó.
Shepard, L. A. (2006). Classroom assessment. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 623–646). Praeger.
Shulman, L. S. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Educational Review, 57(1), 1–22.
Stake, R. E. (2003). Standards-based and responsive evaluation. Sage Publications. https://doi.org/10.4135/9781412985932
Stiggins, R. J. (1991). Assessment literacy. Phi Delta Kappan, 72(7), 534–539.
Stiggins, R. J. (2002). Assessment crisis: The absence of assessment for learning. Phi Delta Kappan, 83(10), 758–765. https://doi.org/10.1177/003172170208301010
Stiggins, R. J. (2004). New assessment beliefs for a new school mission. Phi Delta Kappan, 86(1), 22–27. https://doi.org/10.1177/003172170408600106
Stiggins, R. J., Arter, J. A., Chappuis, J., & Chappuis, S. (2007). Classroom assessment for student learning: Doing it right—using it well. Pearson Education.
Stiggins, R. J., Frisbie, D. A., & Griswold, P. A. (1989). Inside high school grading practices: Building a research agenda. Educational Measurement: Issues and Practice, 8(2), 5–14. https://doi.org/10.1111/j.1745-3992.1989.tb00315.x
Südkamp, A., Kaiser, J., & Möller, J. (2012). Accuracy of teachers' judgments of students' academic achievement: A meta-analysis. Journal of Educational Psychology, 104(3), 743–762. https://doi.org/10.1037/a0027627
Wiggins, G., & McTighe, J. (2005). Understanding by design (2nd ed.). Association for Supervision and Curriculum Development.
Xu, Y., & Brown, G. T. L. (2016). Teacher assessment literacy in practice: A reconceptualization. Teaching and Teacher Education, 58, 149–162. https://doi.org/10.1016/j.tate.2016.05.010
Zohar, A., & Schwartzer, N. (2005). Assessing teachers' pedagogical knowledge in the context of teaching higher-order thinking. International Journal of Science Education, 27(13), 1595–1620. https://doi.org/10.1080/09500690500186592

Carla E. Förster is a marine biologist who holds a Master's degree in Educational Evaluation and a Doctorate in Educational Sciences from the Pontificia Universidad Católica de Chile. She is a Professor at the Universidad de Talca, Chile, Head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. email: carla. [email protected]

2 Learning and Assessment: What Is Not Assessed, Is Not Learnt

Carla E. Förster and Cristian A. Rojas-Barahona

Abstract

This chapter presents the conceptual foundations of the relationship between learning and assessment, which underpin decisions about what to assess. The main characteristics of learning progression taxonomies are developed, showing how they help to construct assessment indicators, with practical examples of how the taxonomies are visualized in different subjects. The three domains of Bloom's taxonomy are reviewed: the cognitive domain as revised by Anderson et al. (2001), the affective domain of Krathwohl, Bloom, and Masia's taxonomy (1964), and the psychomotor domain of Simpson's taxonomy (1972). The Structure of Observed Learning Outcomes (S.O.L.O.) taxonomy proposed by Biggs and Collis (1982) is also presented, given its relevance to the progression of competences. Finally, the most common problems in the formulation of objectives and learning outcomes are analysed, along with suggestions to avoid them.

This chapter is based on the results of the Fondecyt Initiation Project No. 11140713, funded by Conicyt, Chile.
C. E. Förster (B) · C. A. Rojas-Barahona, Universidad de Talca, Talca, Chile
e-mail: [email protected]
C. A. Rojas-Barahona, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
C. E. Förster (ed.), The Power of Assessment in the Classroom, Springer Texts in Education, https://doi.org/10.1007/978-3-031-45838-5_2

2.1 Introduction

As we pointed out in the previous chapter, assessment conditions student learning, since students study according to the content and format of the assessment. This is why defining what to assess is key to achieving quality learning. The basis is always given by the learning established in the curriculum, but sometimes these guidelines are not so clear and do not contribute to defining what is relevant and necessary to monitor. What improves clarity is knowledge of how students learn; considering this information together with knowledge of student development, the teacher can generate indicators that are as concrete as possible. Specifically, as will be developed in this chapter, teachers have taxonomies available that facilitate this process. Before presenting the different taxonomies, it is important to highlight the need for the teacher to be aware of both issues mentioned: the learning and the development of their students. The two influence each other: learning will have an impact on the child's development, and the child's development will influence his or her learning. This implies that, if teachers want to promote the learning and development of their students, they must be able to identify the processes involved in the student's learning and development in all its dimensions: physical, affective, cognitive, social, cultural, moral, and spiritual. Thus, it is up to teachers, through assessment, to monitor whether their teaching is promoting the learning and development of their students.

2.2 Understanding Learning

We believe it is necessary, for a better understanding of the taxonomies, to recall central elements of the different forms of learning, in coherence with the major theories on this subject. For practical purposes, we will understand learning as proposed by Schunk (2012): a lasting change in behaviour, or in the ability to behave, resulting from practice or other forms of experience. The problem is not the definition of learning, but the interpretation of its constituent elements. In this sense, it is necessary to present, albeit briefly, some key aspects of the three major visions that provide answers to the question of how learning takes place: behaviourism, cognitivism, and constructivism (for further review, see, for example, Schunk, 2012). The first vision we will address is behaviourism, which states that learning happens by association. From the point of view of classical behaviourism, the key is the association between an unconditioned stimulus and a neutral one: if they are presented together repeatedly (first the neutral one), the neutral stimulus comes to elicit a behaviour like that produced by the unconditioned stimulus. From operant behaviourism, the focus is on a reinforcing stimulus that strengthens the association between a discriminative stimulus and its response or behaviour. In other words, the teacher must be attentive not only to the


expected behaviour of the student, but also to the environmental stimuli, determining which is the cause of that behaviour, in order to associate it with another or so that, when the expected behaviour occurs, the appropriate reinforcing stimulus is delivered to strengthen the association. For a teacher, the determining task is to be aware of the associations that are happening in the classroom and to know his or her students in order to deliver the appropriate reinforcing stimuli. Based on associationism, Bandura proposed a theory (called "social cognitivist" or "social-cognitive") in which he recognises the importance not only of the environment and behaviour (and their reciprocal determination), but also of the person, who observes the environment (and can learn without needing to receive the reinforcing stimulus or punishment directly) and who is capable of self-regulating his or her behaviour (Bandura & Cervone, 1983). Thus, he highlights the importance not only of social processes but also of cognition, emphasizing the active role of the learner. At the same time, he recognises the relevance, in the process of learning by observation, of attention, retention, motor reproduction, and motivation, the last of which is developed from both the emotional and the cognitive standpoint. He also emphasises the importance of the student's self-efficacy (a judgment built from previous experiences in relation to performance) in the achievement or non-achievement of learning. Finally, the author states that it is essential to be clear about the goal and to reach it progressively, especially when it is highly complex. His theory is undoubtedly the bridge between behaviourism and cognitivism. Cognitivism, based mainly on theories of information processing, focuses on the importance of information and its proper encoding, classification, and organisation, which are key to saving information in long-term memory and later retrieving it. Relevant processes in learning are identified, such as attention, the effective use of working memory, and strategies for incorporating and retrieving information. These theories also identify differences between the processing of a novice and an expert: an expert occupies less space in working memory, leaving more room to process information. The teacher has the challenge of working with clearly defined concepts, considering the order of the new knowledge, and seeking to connect it with the knowledge the student already has. From this theoretical approach emerges the well-known and necessary concept of metacognition, highlighting the importance of self-regulation in learning, in which the monitoring and evaluation of new knowledge are crucial to consolidate new learning. Finally, constructivism, an approach represented mainly by Piaget (a more individual view) and Vygotsky (a more social view), centres learning on the student, who actively constructs his or her knowledge. The environment continues to play a relevant role, both for the student to directly experience a cognitive conflict (new knowledge against previous knowledge or schemes) and for mediators to act, through peers or a teacher, facilitating movement in the zone of proximal development (between the real and potential development of each student), that is, stimulating or permitting learning. Here the role of the teacher is key in mediating the interaction of the student with the environment and with people.
The importance of language as a tool that facilitates the transformation of the internal world is also recognised. The teacher must be aware that each student constructs new knowledge based on his or her experiences or previous knowledge. He or she should also know that previous knowledge (which is difficult to change) will influence the construction of new knowledge. As stated at the beginning of the chapter, based on the theories presented, the consideration of development, and the need to generate clear indicators for assessing learning progress, a series of taxonomies and classifications have emerged that help to place learning on a continuum of complexity and, thus, to define whether what we are teaching and assessing is consistent with the age and development of our students.

2.3 Taxonomies of Progress

A taxonomy is a classification framework that reflects the common patterns of one or more aspects of an organism, subject, or object and is constructed through the systematization of multiple observations that allow typologies to be generated from these patterns or characteristics (Enghoff, 2009). Progress taxonomies describe types of behaviours, learning, and performances that we want students to develop or perform; they list generic levels of complexity that can be used to classify and interpret both the requirements of the task that will be asked of the student and the student's response to that task (Hutchinson et al., 2014). Their function is to determine precisely what learning students should achieve, and they are used to identify different stages of development and learning in order to provide a tool to distinguish particular learning outcomes (O'Neill & Murphy, 2010). The main advantage of these taxonomies is that, because of their generic nature, they can be used in different situations; however, they can also be used inappropriately, misinterpreting developmental progression and thus limiting student growth. We will emphasise this aspect later. Different taxonomies have been defined in the field of school and university education. The best known and most widely used in assessment are that of Bloom et al. (1956), which addresses three domains (cognitive, affective, and psychomotor); the revision of its cognitive domain by Anderson et al. (2001); that of Marzano and Kendall (2007, 2008), which considers three domains (knowledge, mental processes, and psychomotor processes); and the S.O.L.O. taxonomy (Biggs & Collis, 1982), which considers a quantitative and qualitative scaling of learning and has been used mostly to assess competencies in the university context.

2.4 Taxonomy by Bloom et al. (1956) Revised by Anderson et al. (2001)

This taxonomy is based on the original proposal by Bloom et al. (1956) and addresses the cognitive domain by dividing it into two dimensions (cognitive processes and knowledge): on the one hand, it presents six levels associated with the cognitive processes that a student executes in an assessment task (see Table 2.1) and, on the other hand, it disaggregates the type of knowledge that is at stake in such a task. This distinction makes it possible to develop more precise learning and assessment objectives and, thus, to design assessment tasks that more closely reflect student learning. Although both dimensions are presented as a continuum of complexity, students' performances may not be linear; that is, a student who performs a task associated with analysis does not necessarily carry out prior application processes. The same holds for the type of knowledge: a student who uses metacognitive knowledge does not necessarily have mastery of the procedural knowledge associated with the task. In the revision carried out by Anderson et al. (2001), it is pointed out that the disaggregation of the cognitive process dimension is not sufficient to correctly classify student performance, and the knowledge dimension is added, which presents four categories: factual, conceptual, procedural, and metacognitive knowledge (Table 2.2). This definition of cognitive processes and types of knowledge allows a more precise formulation of the learning objectives or goals that students are expected to achieve and, with this, facilitates the development of assessment tasks, thus achieving curricular alignment (Anderson, 2002) or constructive alignment (Biggs & Tang, 2011), which consists of coherence between the established objectives, what is taught, and what is evaluated. Table 2.3 presents examples of performances associated with each cognitive dimension and content type. We highlight this because there is a tendency to think that factual knowledge is only associated with the cognitive ability to remember; however, it can be brought into play in more complex cognitive processes. To use the domains correctly, Hutchinson et al. (2014) point out that the teacher must define one or more hypotheses of progression regarding how the student will advance towards the learning goal. These hypotheses correspond to a teacher's definition of what the progression of learning or performance on a given task should look like, based on his or her knowledge of the content, adapting a generic taxonomy to the specific task. The progression can be made using the complete taxonomy or only the part the teacher believes will be observed in student performance. An example of a progression hypothesis for letter writing using Bloom et al.'s (1956) taxonomy as revised by Anderson et al. (2001) is presented in Table 2.4.
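As a brief worked illustration of how the two dimensions sharpen the formulation of an objective (the subject matter here is hypothetical, chosen only for illustration): the vague objective "students know fractions" does not indicate which performance would count as evidence of learning. Reformulated with the taxonomy, it might become "students execute the algorithm for adding fractions with unlike denominators" (the cognitive process Apply combined with procedural knowledge) or "students explain why two fractions are equivalent" (Understand combined with conceptual knowledge). Each version points to a different assessment task, which is precisely the coherence that curricular or constructive alignment requires.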

2.5 Affective Domain of Krathwohl, Bloom, and Masia's Taxonomy (1964)

This domain was developed by Krathwohl et al. (1964) and, despite the time that has passed, it has not been modified. It addresses the socio-affective and value-based learning of students, describing attitudinal stages through which they move when they are acquiring new ideas, habits, or behaviours (Hutchinson et al., 2014). This taxonomy posits a graduation of the internalization of an attitude, value, or appreciation that is evidenced in the change in behaviour of the student him- or herself.


Table 2.1 Description of the cognitive process dimension levels (adapted from Anderson et al., 2001)

Remember: students locate relevant knowledge in their long-term memory. They recognise or retrieve information, ideas, and principles in a similar manner to how they learnt them.
• Recognising. Example: recognising the dates of important events in the history of their country. Verbs: match, distinguish, identify, underline, point out.
• Recalling. Example: recalling the dates of important events in the history of their country. Verbs: recite, remember, recall, reproduce.
• Listing. Example: naming the foods of a category in the food pyramid. Verbs: annotate, appoint, record, enumerate, label, name, order.

Understand: students construct meaning from what is taught, through oral, written, and graphic communication. They translate, understand, and interpret information based on previous learning.
• Interpreting. Example: paraphrasing important speeches and documents. Verbs: clarify, paraphrase, represent, translate.
• Exemplifying. Example: giving examples of various pictorial styles. Verbs: illustrate, exemplify.
• Classifying. Example: classifying types of mental disorders observed or described. Verbs: categorize, subsume.
• Summarising. Example: writing a summary of the events portrayed on some video tapes. Verbs: extract, generalize.
• Inferring. Example: in learning a foreign language, inferring grammatical principles from examples. Verbs: conclude, extrapolate, interpolate, predict.
• Comparing. Example: comparing historical facts with contemporary situations. Verbs: contrast, map, establish correspondences.
• Explaining. Example: explaining the causes of certain important events of the eighteenth century in France. Verbs: explain, relate, formulate, establish, manifest.

Apply: students carry out or use a procedure in a given situation. They select, transfer, and use data and principles to complete a problem or task.
• Executing. Example: dividing one whole number by another, both multi-digit. Verbs: apply, carry out.
• Implementing. Example: determining in which situations Newton's second law is applicable. Verbs: apply, use.

Analyse: students break material down into its constituent parts, determining how they relate to one another and to an overall structure or purpose. They distinguish, classify, and relate the assumptions, hypotheses, evidence, or structure of a statement or question.
• Differentiating. Example: distinguishing between relevant and irrelevant numbers in a mathematical word problem. Verbs: differentiates, discriminates, distinguishes, focuses, selects, separates.
• Organising. Example: structuring the facts of a historical description into facts for and against a particular historical explanation. Verbs: integrates, outlines, structures, parses syntactically, orders.
• Attributing. Example: determining the point of view of the author of an essay based on his political perspective. Verbs: deconstruct, distinguish, determine.

Evaluate: students make judgments based on criteria and standards.
• Checking. Example: determining whether a scientist's conclusions follow from observed facts. Verbs: check, coordinate, detect, monitor, verify, validate, diagnose.
• Critiquing. Example: judging which of two methods is the best to solve a given problem. Verbs: judge, qualify, attribute, argue, criticize, convince.

Create: students put elements together to form a coherent or functional whole; they rearrange elements into a new pattern or structure. They originate, integrate, and combine ideas into a product, plan, or proposal that is new to them.
• Generating. Example: generating hypotheses that explain a certain phenomenon. Verbs: hypothesize, integrate, combine, reorganize.
• Planning. Example: planning a research paper on a given historical topic. Verbs: plan, design, prepare.
• Producing. Example: building habitats for specific species and purposes. Verbs: build, invent, elaborate, produce.


Table 2.2 Description of the categories of the knowledge dimension (adapted from Anderson et al., 2001)

A. Factual knowledge: knowledge of terminology, details, and specific elements. These are isolated pieces of information that do not necessarily "fit" within a broader or more systematic disciplinary perspective.
• A1. Knowledge of terminology. Examples: technical vocabulary; musical symbols; mathematical symbols; the chemical structure of a compound.
• A2. Knowledge of specific details and elements. Examples: reliable sources of information; the author of a work; the name of a character.

B. Conceptual knowledge: it refers to the "what" of a content. Knowledge of classifications, categories, principles and generalisations, theories, models, and structures. These are organised and complex forms of knowledge; the facts or pieces of information are systematically integrated and organised in a disciplinary or useful way. It assumes the interrelation of basic elements in a larger structure that enables them to act functionally.
• B1. Knowledge of classifications and categories. Examples: geological periods; classification of living things; types of texts.
• B2. Knowledge of principles and generalisations. Examples: the Pythagorean theorem; the law of supply and demand; the law of conservation of matter; spelling rules.
• B3. Knowledge of theories, models, and structures. Examples: Darwin's theory of evolution; the structure of Congress; atomic models; the structure of a story or news item.

C. Procedural knowledge: knowledge of a series or sequence of steps to follow to do something. It assumes the "how" of a certain process, discriminating the most appropriate conditions for doing it. It represents only the knowledge of the processes, not their actual implementation.
• Ca. Knowledge of subject-specific skills and algorithms. Examples: steps to paint with watercolour; the sequence to perform a high jump; the algorithm for division by integers.
• Cb. Knowledge of subject-specific techniques and methods. Examples: interview techniques; the scientific method; location of coordinates on a map; analysis of historical sources; production of an argumentative text.
• Cc. Knowledge of criteria for determining when to use appropriate procedures. Examples: criteria used to determine when to apply a procedure related to Newton's second law; criteria to determine what scale or type of graph to use according to the data.

D. Metacognitive knowledge: knowledge of cognition in general, as well as the awareness and knowledge of one's own cognition. It includes knowledge of general strategies for learning, thinking, and problem solving (strategic knowledge), knowledge of the difficulty of a cognitive task and the demands it may make, and knowledge of one's own strengths and weaknesses in relation to cognition and learning (self-knowledge).
• D1. Strategic knowledge. Examples: knowledge of graphic organisers to order information; of various mnemonic strategies to memorize information; of planning strategies and setting goals for reading a book.
• D2. Knowledge about cognitive tasks, including appropriate contextual and conditional knowledge. Examples: knowledge about the memory required by different types of test items; the cognitive demands of an essay or summary; local and general social, conventional, and cultural norms.
• D3. Self-knowledge. Examples: knowledge of one's abilities to perform a specific task according to reality; awareness of the level of self-knowledge; knowledge of personal interest in a task.

This taxonomy assumes that a person acts consistently with internalized values; although we know that this is not always the case in the intermediate stages of acquiring a value, the classification helps to generate assessment tasks in which students demonstrate their level of appropriation. It has been widely described in the literature that affective and motivational components are key in student learning (e.g., Bandura & Cervone, 1983; Morgan et al., 2001); nevertheless, this domain has been relegated to a secondary place in different subjects, reduced to a set of generic skills, also called relational or "soft" skills, whose development is not assessed. The school curriculum proposes, in different subjects, a breakdown of attitudes and affective skills that students are expected to develop.

Table 2.3 Examples of performances or tasks that a student can do integrating a cognitive process and a dimension of knowledge (adapted from Heer, 2012)

The cognitive process dimension: Remember (retrieve relevant knowledge from long-term memory); Understand (construct meaning from instructional messages, including oral, written, and graphic communication); Apply (carry out or use a procedure in a given situation); Analyse (break material into its constituent parts and determine how the parts relate to one another and to an overall structure or purpose); Evaluate (make judgements based on criteria and standards); Create (put elements together to form a coherent whole; reorganize them into a new pattern or structure).

The knowledge dimension: Factual (the basic elements students must know to be acquainted with a discipline or solve problems in it); Conceptual (the interrelationships among the basic elements within a larger structure that enable them to function together); Procedural (how to do something, methods of inquiry, and criteria for using skills, algorithms, techniques, and methods); Metacognitive (knowledge of cognition in general as well as awareness and knowledge of one's own cognition).

• Factual knowledge. Remember: list primary and secondary colors. Understand: associate an author with his work. Apply: respond according to courtesy rules. Analyse: select the list of materials needed to carry out an activity. Evaluate: check consistency between sources. Create: generate a record of one's daily activities.
• Conceptual knowledge. Remember: recognize the characteristics of a mammal. Understand: classify animals according to their habitat. Apply: advise a classmate who knows less about fractions. Analyse: differentiate between a story and a news item. Evaluate: determine the relevance of data to a problem. Create: make a solar system with recycled material.
• Procedural knowledge. Remember: remember how to write a letter. Understand: explain the instructions of a game. Apply: calculate a multiplication of fractions. Analyse: compare a story with a fable. Evaluate: judge a letter made by a classmate. Create: design a work plan to make a classification.
• Metacognitive knowledge. Remember: identify the strategies used to memorize information. Understand: predict one's response to an ecological problem. Apply: use the study techniques that one masters best. Analyse: deconstruct one's own prejudices about citizen participation. Evaluate: reflect on one's own progress in learning. Create: create a portfolio of one's learning.


Table 2.4 Letter writing progression hypothesis using Bloom's taxonomy revised by Anderson et al. (2001). Each level of the cognitive process dimension is paired with a performance demonstrating it, from the most to the least complex.
• Create: writes a letter on a free topic.
• Evaluate: checks the quality of his or her letter based on the guidelines given in class, identifying strengths and weaknesses.
• Analyse: reviews a letter and recognises the structural elements in it.
• Apply: writes a letter on a topic given by the teacher to the class.
• Understand: describes the core elements of the structure of a letter.
• Remember: lists the parts of a letter.

This taxonomy contributes to the operationalization of these skills so that they can be assessed. The authors recognise five levels of progression, which are presented in Table 2.5. The evaluation of the affective domain is not easy, since it is always linked to the cognitive domain through the way in which the student processes the information he or she receives, and the only way the teacher can evaluate it as an external agent is through the observation of behaviours. However, the use of this taxonomy makes it possible to define the progression of behaviours that demonstrate the acquisition and development of the values, habits, and attitudes that students are learning and that guide their actions. Let us analyse an example using the attitudinal objective of Natural Sciences: "Demonstrate curiosity and interest in learning about living beings, objects and/or events that make up the natural environment" (Table 2.6). In it we can see that the taxonomy helps us to define more clearly a progression in terms of the complexity or internalization of the attitude, and the proposed verbs support the definition of the observable behaviours through which we are going to monitor the development of this attitude. It should be emphasised that development may take different amounts of time, so some (or all) students may not reach the highest level. Here it is the teacher who, depending on the complexity of the attitude and the characteristics and interests of the students, must define what the developmental expectations are for a delimited period of time and communicate these goals to the students. One activity that could motivate them is to post the progression on the classroom wall and have each student self-evaluate his or her progress, using a marker with his or her name on it.

2.6 Psychomotor Domain of Simpson's Taxonomy (1972)

Simpson's (1972) taxonomy was created to address educational aspects of the curriculum that were not covered by the cognitive and affective domains and has been widely used in physical education and in technical and vocational education (Seidel et al., 2005). While there is no single definition of what is meant by psychomotor learning or perceptual-motor skill, there is a consensus that it refers to actions that involve the student manipulating or moving objects or controlling body parts, and whose performance is reflected in movements (Adams, 1987; Rosenbaum et al., 2001; Seidel et al., 2005; Singer, 1980).

Table 2.5 Dimensions of the affective domain of Krathwohl et al. (1964)

Receiving: the learner is willing to receive, listen to, and obtain information and ideas but does not make decisions about the value of the information. He or she becomes aware of ideas and phenomena.
• Categories: Awareness (knowing the importance of the topic or situation, having the ability to take it into account, simple notion); Willingness to receive (tolerating or accepting stimuli, willingness to participate); Controlled or selective attention (the student's preference for certain aspects of the stimulus, such as aesthetic, humorous, critical).
• Performance examples: listens passively to the teacher and his or her peers; shows sensitivity or empathy in the face of a problem.
• Verb suggestions: asks, follows, repeats, accepts, prefers, listens, receives, favours.

Responding: the learner actively participates and interacts with new information or procedures without agreeing with or owning these ideas. He or she demonstrates complacency, desire, and satisfaction with ideas.
• Categories: Consent to reply (first level of active student response); Willingness to respond (disposition of the student to respond); Satisfaction for answering (pleasure or enthusiasm in responding).
• Performance examples: participates in class discussions; asks about new ideas and concepts; completes a set of tasks designed to develop a specific attitude; agrees with an idea by responding to it.
• Verb suggestions: responds, narrates, executes, reports, selects, follows, explores, displays, conforms, completes, writes, records, approves, volunteers, spends time on.

Valuing: the learner is able to see the value of new information and procedures, varying from the acceptance of and preference for a value to the acceptance of a commitment. Recognition and motivated attitude.
• Categories: Acceptance of a value (attributing value to a phenomenon, thing, or behaviour); Preference for a value (willingness to seek or desire a value); Commitment (absolute conviction of the value of the phenomenon or thing).
• Performance examples: shares perspectives respecting the diverse opinions of the group and is capable of solving problems; is willing to be seen as a person who "values" an idea or material.
• Verb suggestions: argues, trusts, renounces, abandons, accepts, recognizes, participates, indicates, decides, helps, assists, denies, protests, provides, defends, initiates.

Organisation and conceptualisation: the learner is incorporating new information and procedures into his or her existing structures; the process by which the learner resolves conflicts and begins to internalise values.
• Categories: Conceptualisation of a value (relating values already acquired and those under acquisition); Organisation of a value system (consistent interrelation of a value set).
• Performance examples: balances freedom and responsibility, accepts different points of view, and builds with them to develop new perspectives and understanding of ideas; independently organises time and materials to implement a routine or habit.
• Verb suggestions: organises, values, associates, determines, compares, defines, formulates, abstracts, finds, correlates, integrates, arranges, takes stock, theorizes.

Characterization by a value or value set: the learner begins to support and defend new information and ideas and has a value system related to beliefs, ideas, and attitudes that controls his or her behaviour in a predictable and consistent way.
• Categories: Generalised set (internal consistency with the system of attitudes and values); Characterisation (worldview, philosophy of life, essential attitude).
• Performance examples: shows self-confidence, collaborates positively with other members of the group, and acts ethically, valuing people for who they are and not for how they look; acts consistently with internalized values; voluntarily leads an activity associated with the topic worked on.
• Verb suggestions: is consistent with, is revealed by, sets an example, reviews, changes, avoids, resolves, resists, discriminates, questions, influences.

Adapted from the affective domain of Krathwohl et al.'s taxonomy (1964)


Table 2.6 Example of progression hypothesis of the affective domain for the learning objective attitude: "Showing curiosity and interest in learning about living beings, objects, and/or events that make up the natural environment"
• Receiving: the students listen to their teacher when he or she gives information about habitats and their relationship with animal life.
• Responding: the students fill out the class guides associated with the habitats and ask about specific elements to deepen the information.
• Valuing: the students admit to their classmates that they don't like that humans are destroying the animals' habitat and that they wish they could do something.
• Organisation and conceptualisation: the students discuss with their classmates ways to learn more about the habitat of endangered animals and agree to investigate one of them.
• Characterization by a value or value set: the students voluntarily look for information on foundations for the protection of the chosen animal and invite their classmates to carry out a support activity (collection, letter, etc.).

These actions and movements may involve gross motor skills, that is, large movements on the part of the student, such as hopping on one foot, touching the shoulders with the hands, or running from one place to another; or they may involve fine motor skills, which require coordination and precision in movement, such as fastening a button, painting without going over the edge, measuring a specific amount of liquid with a pipette, or drawing a line on graph paper. In any case, motor skills place a high cognitive demand on information processing while the student is learning and automating them, so the cognitive domain is always present in a psychomotor task (Ackerman & Cianciolo, 2000; Adams, 1987). As the learner automates the skill, the cognitive demand decreases, since he or she will use heuristics¹ relevant to the situation; however, long-term transfer requires the learner to process significant amounts of information to build patterns from experience in diverse situations and multiple contexts (Seidel et al., 2005). The most common example of this long-term transfer is riding a bicycle: in general, one learns as a child and, even though a long time has passed, we get on a bicycle and start pedalling without major complications; yet achieving this required many hours of practice, using different bicycles throughout our lives, and riding on different types of terrain.

¹ A heuristic is a mental rule used to simplify reality, which, in practice, becomes a mental shortcut that saves cognitive processes.


Thus, automating psychomotor learning requires that the student have multiple opportunities to practice and advance in the acquisition of the skill, since each new situation implies returning to the initial level in a circular logic of progression. For example, when a student practices a team sport, each match implies new learning: in training, predefined moves are practiced to make them more and more automatic, but when facing an opponent there are contingent, unpredictable elements, and therefore the student may return to an initial mastery of the skill in that specific situation. The learning of motor skills goes far beyond the school context, and we live with it on a daily basis. When we type on a computer, our typing speed becomes faster the more we practice, to the point of having memorized the keyboard and not needing to look at or think about where to press when typing; but if we change computers and the keyboard has another configuration, we will make many mistakes and need more concentration (cognitive demand) to remember the keys that are now different and to type without major problems.

Simpson's (1972) proposal, with the dimensions of the taxonomy of the psychomotor domain, orders the development of this psychomotor learning in a progression of seven levels associated with the automation and complexity of motor processes (Table 2.7). Based on this taxonomy, Dave (1975) made a proposal reducing it to five levels:
• Imitation: merges the first two dimensions of the original (perception and set) and adds that the performance can be of low quality, extending from partial manipulation to complete accomplishment whose result is still deficient, which places it at an initial level (e.g., copying a work of art).
• Manipulation: similar to the third dimension (guided response), without modifications; it involves performing the task using some support (e.g., creating a work after taking a class or reading instructions).
• Precision: merges the fourth and fifth dimensions (mechanism and complex overt response), referring to the gradual acquisition of accuracy and the decrease of errors (e.g., playing and adjusting a song on the guitar until it is "just right").
• Articulation: similar to the sixth dimension of the original (adaptation); it refers to the coordination of a series of actions that work harmoniously and with internal consistency (e.g., producing a short film that incorporates images such as photos and videos, music, and colour changes).
• Naturalization: similar to the last original dimension (origination); it assumes that the performance becomes natural and the person does not need to think about the motor task being performed (e.g., a professional dancer, a concert guitarist).

Whichever of these taxonomies is used, the aim is to describe a progression of mastery of a psychomotor skill, which allows the teacher to grade a learning goal into intermediate performances and thus monitor students from the time they begin to acquire the skill until they are proficient or have automated it.


Table 2.7 Description of the levels of Simpson's taxonomy (1972) for the psychomotor domain

Perception
Description: The student uses his or her senses to obtain cues that guide action, ranging from awareness of the stimulus to the translation of the perception of the signal into action
Performance examples: Observing a procedure; identifying a tool or material; selecting a material from a group
Verb suggestions: Observes, listens, chooses, describes, detects, differentiates, distinguishes, identifies, isolates, relates, selects, separates

Set
Description: The student is mentally, physically, and emotionally ready to act
Performance examples: Handling the material or tool without using it; performing parts of the action
Verb suggestions: Starts, shows, explains, moves, proceeds, reacts, responds

Guided response
Description: Knowledge of the steps required to perform a task; includes imitation and trial and error
Performance examples: Performing the action following instructions but needing a demonstration or guidance
Verb suggestions: Copies, follows, reproduces, evidences, reacts, responds

Mechanism
Description: The student performs tasks and actions on a daily basis. It is an intermediate stage in which efficiency, skill, and confidence grow
Performance examples: Operating tools or materials competently; assembling laboratory material; organising the materials to make an engraving; performing a jump on a pommel horse with inaccuracies in the run-up
Verb suggestions: Organises, manipulates, assembles, calibrates, builds, dismantles, unfolds, fixes, grinds, heats, measures, repairs, mixes, arranges, sketches

Complex overt response (the difference with the previous stage lies in precision and speed)
Description: The student has developed the skill and the execution is automatic, fast, and precise, with little hesitation. The actions involve complex patterns of movement
Performance examples: Operating tools and materials with precision; performing a jump on the pommel horse correctly
Verb suggestions: Organises, manipulates, assembles, builds, calibrates, dismantles, displays, fixes, grinds, heats, measures, mixes, draws

Adaptation
Description: The student has developed the ability and can modify movement patterns to respond to problematic or new situations without guidance
Performance examples: Mixing colours to generate one he or she doesn't have during an art class; changing or adding an ingredient in a recipe to enhance the taste; planning work for a school play
Verb suggestions: Reorganises, modifies, rearranges, internalises, adapts, alters, revises, varies, plans

Origination
Description: The student has internalised the automatic mastery of the skill, creates new movement patterns to respond to problematic or new situations, and creates new tasks that incorporate what has been learned
Performance examples: Assessing the results of a procedure; incorporating the errors detected in an action for next time; designing a choreography; improvising to solve a problem
Verb suggestions: Composes, builds, designs, initiates, creates, arranges, combines, originates, performs

Adapted from Ferris and Aziz (2005) and Kasilingam et al. (2014)
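The verb suggestions in Table 2.7 can also be read as a small lookup structure. The following sketch, a hypothetical Python helper that is our own illustration and not part of the cited taxonomies, shows how a teacher (or a planning tool) might map the opening verb of a draft objective to the Simpson level(s) it suggests; the verb sets are abbreviated from the table.

```python
# Minimal sketch: abbreviated verb lists from Table 2.7, keyed by Simpson level.
# Function and variable names are illustrative, not from the cited sources.
SIMPSON_VERBS = {
    "perception": {"observes", "listens", "chooses", "detects", "identifies", "selects"},
    "set": {"starts", "shows", "explains", "moves", "proceeds"},
    "guided response": {"copies", "follows", "reproduces", "evidences"},
    "mechanism": {"organises", "manipulates", "assembles", "calibrates", "measures"},
    "complex overt response": {"organises", "manipulates", "builds", "dismantles", "draws"},
    "adaptation": {"modifies", "rearranges", "adapts", "alters", "revises", "varies"},
    "origination": {"composes", "designs", "initiates", "creates", "originates"},
}

def levels_for_verb(verb: str) -> list[str]:
    """Return every level whose suggested verbs include the given verb."""
    verb = verb.lower().strip()
    return [level for level, verbs in SIMPSON_VERBS.items() if verb in verbs]

print(levels_for_verb("adapts"))     # ['adaptation']
print(levels_for_verb("organises"))  # ['mechanism', 'complex overt response']
```

The second call illustrates a point we return to when discussing problems in formulating objectives: some verbs appear at more than one level, so the content accompanying the verb, not the verb alone, decides the placement.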

Let us now look at an example of a progression hypothesis for the Music learning objective "Singing in unison and playing conventional and non-conventional percussion instruments" (Table 2.8).

Table 2.8 Example of psychomotor domain progression hypothesis for music (level: performance demonstrating the development of the ability)

Perception: Visually and auditorily perceives how different percussion instruments are played as a form of vocal accompaniment
Set: Manipulates percussion instruments, conventional and unconventional, without performing a rhythmic execution
Guided response: Follows instructions to play a percussion instrument, conventional or unconventional, and rehearses the use of the instrument according to the indications provided
Mechanism: Plays the percussion instrument and sings at the same time, making mistakes in execution or at times failing to perform one of the actions
Complex overt response: Plays the percussion instrument and sings at the same time, without errors in execution
Adaptation: Makes small arrangements with the percussion instrument and/or the voice, modifying some of the expressive means
Origination: Creates a melodic line and its respective rhythmic accompaniment to play a percussion instrument and sing at the same time

2.7 Integrating the Three Taxonomic Domains in a Task

If we look at the learning objectives of Natural Sciences, we can see that we expect the development of scientific skills that involve putting into play the three domains described above, although not all of them are present with the same weight or the same temporality within a concrete activity. Since knowledge in reality is not separated, and since we teachers expect our students to learn the contents in an integrated manner, the ideal is to design assessment tasks that effectively allow students to demonstrate their level of development or learning from a multidimensional perspective. Figure 2.1 shows the structure of an activity that involves first teaching the basic content of the discipline (the scientific concepts to be worked on) and explaining the laboratory activity to be carried out, in which students must work in pairs. Since teamwork is also intended as learning, the aspects expected to be observed in the work dynamic (affective domain) should be explained. The teacher must monitor that both kinds of learning are understood by the students; otherwise the activity may fail. Students then carry out the actions of preparing the laboratory material and performing the experiment (psychomotor domain) and synthesize their work in a report that they then present to the class. The cognitive domain is present at all stages of the learning process, since students will only be able to apply their manual skills well (psychomotor domain) after they have acquired conceptual knowledge and information (cognitive domain). Similarly, the affective domain associated with the learning objective of teamwork will be present throughout the activity. Regarding evaluation, depending on the stage of the activity, the expected learning should be monitored or certified. Figure 2.1 shows the structure and possible moments of evaluation.

Fig. 2.1 Example of integration of the three domains of the taxonomy by Bloom et al. (1956) in a science laboratory activity

2.8 Structure of Observed Learning Outcomes (S.O.L.O.) Taxonomy

This taxonomy was developed by Biggs and Collis (1982) and has been used principally in higher education; nevertheless, since school education today involves the development of competencies and complex learning, it is also a good option for ordering student performance. The authors state that as students consolidate their learning, they advance through five levels of increasing structural complexity. Biggs (1992) also points out that there are two types of changes that can be observed in the development of learning: quantitative changes, which refer to the increase in the amount of detail that a student includes in his or her response, and qualitative changes, which refer to the way in which the student integrates these details into a structural model. Quantitative changes occur first, and then learning changes qualitatively. Under this premise, the S.O.L.O. (Structure of Observed Learning Outcomes) taxonomy provides, on the one hand, a systematic way to rank in increasing order the performance of a student as he or she progresses in the development of learning and, on the other, a guide for formulating learning objectives for teaching and subsequent assessment. Five levels are presented that describe performance from novice to expert (Fig. 2.2).

Fig. 2.2 Information organization levels according to the S.O.L.O. taxonomy

Table 2.9 shows the description of the levels, examples of a progression of performances, and a proposal of verbs to guide the formulation of objectives or assessment indicators. As we can see in the example of Table 2.9, the progression of complexity in learning refers to the answer that a student gives, in this case, to a question. It is important to keep in mind that such an answer may seem, in structure, to be of a more complex level than it really is: if, during the class, the relationships between concepts were taught explicitly, or an abstraction was made from specific elements to theoretical generalisations or predictions, then we must be especially careful with the assessment task, since students will be repeating an answer that was already elaborated in class and, therefore, the level of complexity will be lower than expected.

2.9 Evaluation Indicators and Definition of Assessment Tasks

Taxonomies do not have meaning in themselves; they were designed to support the formulation of learning objectives and their alignment with the tasks used to teach and assess the achievement of such learning (Anderson, 2005; Biggs & Tang, 2011). Thus, having levels that allow for the development of learning progressions is of great help in formulating objectives, as it allows the teacher to focus on what and how students should learn and not only on the topics they must teach.
This shift towards the expected learning outcome is crucial in education today, as it forces teachers to identify the goal to which they want to lead their students, making explicit the content and the level at which they expect this content to be learned. Intended learning outcomes "are statements, written from the students' perspective, indicating the level of understanding and performance they are expected to achieve as a result of engaging in the teaching and learning experience" (Biggs & Tang, 2011, p. 100). The writing structure of an objective is based on a verb + a content + a condition or context (Anderson, 2005; Goff et al., 2015) (the subject is omitted in the sentence structure, as it is always "the learner"). The verb must be an observable action representing knowledge, skills, or values (Goff et al., 2015) and has the function of informing learners what they are expected to do with the content of the discipline (Biggs & Tang, 2011). The content and condition constitute a specific statement of the learning to be demonstrated (Goff et al., 2015). For example, the objective "Explain that a fraction represents part of a whole in a concrete, pictorial, and symbolic way" assumes that not only must fraction content be taught; students must also be given opportunities to learn how to explain what a fraction represents, so it must be taught in different ways (concrete, pictorial, and symbolic) and then assessed using tasks that allow students to demonstrate that they have learned it. In addition, the verb "explain" falls, according to the cognitive processes proposed by Anderson et al. (2001), into the category "understand", and the content "fraction", with its associated conditions, into "conceptual" knowledge, which allows us to visualise the cognitive complexity of the learning that we want to promote.


Table 2.9 S.O.L.O. taxonomy (Biggs & Collis, 1982)

Prestructural (quantitative change)
Description: The student has not really understood the topic or task, uses an overly simple way of doing it, or performs it inappropriately, without knowledge in the area
Performance example: When asked "What is the difference between the different narrative genres?", the answer is a jumble of unconnected ideas in which he or she uses disciplinary language in an inappropriate or basic way

Unistructural (quantitative change)
Description: The student only knows one relevant aspect of the area or topic addressed
Performance example: When asked "What is the difference between the different narrative genres?", the student only defines what narrative genres are or only refers to one of them (e.g., the novel)
Verb suggestions: List, name, memorize, identify

Multistructural (quantitative change)
Description: The student knows and uses relevant aspects of the task but treats them independently and additively
Performance example: When asked "What is the difference between the different narrative genres?", he or she defines each one, evidencing conceptual mastery, but does not compare them
Verb suggestions: Describe, list, classify, combine

Relational (qualitative change)
Description: The student integrates relevant aspects of the topic or task into a coherent structure. This level is what is normally understood as an adequate understanding of the subject
Performance example: When asked "What is the difference between the different narrative genres?", the student presents characteristics that differentiate them, using comparison criteria and connectors such as "while" and "instead", which denote a relationship between the concepts
Verb suggestions: Analyse, explain, integrate

Extended abstract (qualitative change)
Description: The student integrates the relevant aspects of the task into a structure and conceptualizes it at a higher level of abstraction, generalising to a new topic or content area
Performance example: When asked "What is the difference between the different narrative genres?", the student gives a relational response but raises possible contextual explanations for certain subtypes of narrative genres; e.g., points out that science fiction novels about post-apocalyptic epidemics are related to advances in genetics and climate change, and argues that a new peak could be associated with current technological development after the COVID-19 pandemic
Verb suggestions: Predict, reflect, theorise

Adapted from Biggs and Collis (1982)

Goff et al. (2015, p. 8) state that there are four elements, summarised by the acronym ABCD (Audience, Behaviour, Conditions, Degree), to consider in formulating a learning outcome: (a) Audience: Who are the learners? (b) Behaviour: What will they be able to know, value, or do? (c) Condition: Under what circumstances/context will the learning occur? (d) Degree: How much will be accomplished and to what level?
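To make the structure concrete, here is a minimal sketch in Python of an outcome record whose fields follow the ABCD elements; the class and field names are our own illustrative assumptions, not code from Goff et al.'s handbook.

```python
# Minimal sketch of a learning outcome as verb + content + condition,
# organised around the ABCD elements. All identifiers are illustrative.
from dataclasses import dataclass

@dataclass
class LearningOutcome:
    audience: str   # A: who the learners are
    behaviour: str  # B: observable verb + content
    condition: str  # C: circumstances/context of the learning
    degree: str     # D: how much, and to what level

    def statement(self) -> str:
        # The subject is omitted in the written objective, as it is always "the learner".
        return f"{self.behaviour} {self.condition} ({self.degree})."

fraction = LearningOutcome(
    audience="primary school students",
    behaviour="Explain that a fraction represents part of a whole",
    condition="in a concrete, pictorial, and symbolic way",
    degree="in all three representations",  # illustrative degree, not from the chapter
)
print(fraction.statement())
```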

2.10 Common Problems in the Formulation of Learning Objectives and Outcomes

The objectives or expected learning outcomes, and their breakdown into indicators of achievement, should be written as observable, measurable, and achievable behaviours over a set period. Formulating them is not easy, since they require a specific structure and verbs that meet certain criteria: sufficiently generic to give the teacher flexibility in teaching and subsequent assessment, but not so broad as to be ambiguous.
The following is an analysis of the three main problems observed in their formulation:

Problem 1: Use of ambiguous verbs. The verbs with which learning objectives or learning outcomes begin must be precise. Using verbs such as "understands" or "reflects" does not orient the student regarding the activity to be carried out or the level at which specific learning is expected to take place; furthermore, they are not directly observable, since they allude to internal behaviours, and one would need to read the student's mind in order to make a judgment about his or her learning. When a teacher is asked "How do you know that your student 'understood' a content?", the most common answer is "He/she explains, makes comparisons, infers, describes…". This is where verbs that reflect observable behaviours appear naturally. Potter and Kustra (2012) have defined a group of verbs and phrases that are commonly used in the formulation of learning objectives, but which are ambiguous and unobservable, and which we should eradicate from our repertoire: (1) understand, (2) appreciate, (3) comprehend, (4) grasp, (5) know, (6) see, (7) accept, (8) have a knowledge of, (9) be aware of, (10) be conscious of, (11) learn, (12) perceive, (13) value, (14) get, (15) apprehend, and (16) be familiar with.
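Potter and Kustra's list lends itself to a quick automated check. The sketch below is a hypothetical Python helper of our own construction that flags these ambiguous verbs and phrases in a draft objective; it matches whole words only and does not handle inflections such as "understands".

```python
# Minimal sketch: flag ambiguous, unobservable verbs (Potter & Kustra, 2012)
# in a draft objective. The function name and approach are illustrative.
import re

AMBIGUOUS = [
    "understand", "appreciate", "comprehend", "grasp", "know", "see",
    "accept", "have a knowledge of", "be aware of", "be conscious of",
    "learn", "perceive", "value", "get", "apprehend", "be familiar with",
]

def flag_ambiguous(objective: str) -> list[str]:
    """Return the ambiguous verbs/phrases found in the objective statement."""
    text = objective.lower()
    return [v for v in AMBIGUOUS if re.search(rf"\b{re.escape(v)}\b", text)]

print(flag_ambiguous("Understand the water cycle"))              # ['understand']
print(flag_ambiguous("Explains the stages of the water cycle"))  # []
```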

Each taxonomy proposes a list of verbs that can be used to formulate learning objectives and achievement indicators, which constitutes a good resource. However, some verbs are repeated in more than one level, and it is here that the teacher must determine which level they correspond to according to the complexity of the content that accompanies the verb in the sentence (Anderson et al., 2001; Biggs & Collis, 1982). Problem 2: In the formulation of objectives, the means are confused with the results that students are expected to achieve. In this case, learning objectives are formulated as teaching activities or assessment tasks. A characteristic of a good learning objective or outcome is that it is sufficiently generic and, therefore, the teacher has multiple ways of teaching and assessing it. Table 2.10 presents some examples of learning outcomes formulated as activities, together with a proposed learning objective or achievement indicator for each. In example 1, the Natural Sciences learning outcome "Obtains information on the internet about the contribution of some relevant scientists to the understanding of environmental phenomena" is an activity, because the focus is not on demonstrating what students have learned about the contribution of certain people to the development of science, but on searching for information on the internet. Thus, the means or teaching strategy becomes the end, which detracts from the curricular formative purpose, restricts students' search options to a single digital resource, and reduces the options for assessment tasks.


Table 2.10 Characteristics and examples of an activity and a learning objective

Characteristics
Activity: It is specific; it has a specific moment; it is the means to highlight the indicator
Learning objective, learning outcome, or achievement indicator: It is generic; it is evident at different times; it is the behaviour or performance to be observed

Example 1
Activity: Gets information from the Internet about the contribution of some relevant scientists to the understanding of environmental phenomena
Learning objective: Explains the provisional nature of scientific knowledge, based on classical and contemporary research

Example 2
Activity: Responds to the previously designed observation guideline
Learning objective: Uses different record keeping systems to observe a natural phenomenon

Example 3
Activity: Participates in a conversation in small groups based on information collected on environmental pollution during class
Learning objective: Interacts according to social conventions in different situations using disciplinary content in his or her argumentation

Example 4
Activity: Formulates questions to clarify some elements of ancient Greek civilization
Learning objective: Recognises aspects of the daily life of ancient Greek civilisation

Stating the learning outcome as "Explain the provisional nature of scientific knowledge, based on classical and contemporary research" places student performance at the centre of the subject and is sufficiently generic for the teacher to use a variety of activities to teach it and multiple assessment tasks. Problem 3: Development of learning objectives or outcomes that are too specific. When the learning objective is very specific or restricted, it can meet the criteria of being precise, observable, and measurable, but it can also become a double-edged sword because of the limited flexibility it allows in teaching and then evaluating it, losing the relevance of the learning for the subject. If we formulate a learning objective as specific as "Recognise aspects of the clothing worn by women of the Greek civilization of Antiquity in Corinth", it focuses so much on one aspect, the clothing worn in one place in Greece, that it becomes a fact of little relevance, which unnecessarily complicates teaching and then evaluating it, and rules out any possibility for the student to achieve learning of greater cognitive complexity or develop creativity in the learning process (Potter & Kustra, 2012). To avoid making these or other mistakes when formulating learning objectives, learning outcomes, or assessment indicators, here are five questions with which we can check whether our formulation is well constructed (Potter & Kustra, 2012):

1. Does this learning objective or outcome reflect an external, observable, and measurable behaviour or performance?
2. Is the learning objective or outcome generic enough to be assessed with different tasks?
3. How will the teacher and students know when this learning objective or outcome has been achieved?
4. What evidence given by the learner will I consider minimal to say that this learning objective or outcome has been achieved?
5. What behaviours or performances would I associate with someone who has reached this state?

In summary, this chapter clarifies the importance of knowing the key elements of student learning and development in order to carry out an adequate assessment. To this end, we summarize some of the elements that are currently considered in learning, which are the basic building blocks of the various taxonomies of progress. These classifications were created to operationalise learning in an orderly and progressive way. We conclude by highlighting the role of taxonomies in education and some of the problems that can arise in the formulation of learning objectives and outcomes. In the next chapter we will see how to apply the taxonomies in the planning of teaching and assessment, and the relevance for learning of integrating these pedagogical processes from the beginning.

References

Ackerman, P. L., & Cianciolo, A. T. (2000). Cognitive, perceptual-speed and psychomotor determinants of individual differences during skill acquisition. Journal of Experimental Psychology: Applied, 6(4), 259–290. https://doi.org/10.1037/1076-898X.6.4.259
Adams, J. A. (1987). Historical review and appraisal of research on the learning, retention and transfer of human motor skills. Psychological Bulletin, 101(1), 41–74. https://doi.org/10.1037/0033-2909.101.1.41
Anderson, L. W., Krathwohl, D. R., Airasian, P. W., Cruikshank, K. A., Mayer, R. E., Pintrich, P. R., Raths, J., & Wittrock, M. C. (Eds.). (2001). A revision of Bloom's taxonomy of educational objectives. Longman Publishing.
Anderson, L. W. (2002). Curricular alignment: A re-examination. Theory into Practice, 41, 255–260.
Anderson, L. W. (2005). Objectives, evaluation and the improvement of education. Studies in Educational Evaluation, 31, 102–113. https://doi.org/10.1016/j.stueduc.2005.05.004
Bandura, A., & Cervone, D. (1983). Self-evaluative and self-efficacy mechanisms governing the motivational effects of goal systems. Journal of Personality and Social Psychology, 45, 1017–1028. https://doi.org/10.1037/0022-3514.45.5.1017
Biggs, J. B., & Collis, K. F. (1982). Evaluating the quality of learning: The SOLO taxonomy. Academic Press.
Biggs, J., & Tang, C. (2011). Teaching for quality learning at university (4th ed.). Open University Press.
Biggs, J. B. (1992). A qualitative approach to grading students. HERDSA News, 14(3), 3–6.
Bloom, B., Englehart, M., Furst, E., Hill, W., & Krathwohl, D. (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. Longmans, Green and Co.
Dave, R. H. (1975). Psychomotor levels. In R. J. Armstrong (Ed.), Developing and writing behavioural objectives (pp. 20–21). Educational Innovators Press.
Enghoff, H. (2009). What is taxonomy? An overview with myriapodological examples. Soil Organisms, 81(3), 441–451.
Ferris, T., & Aziz, S. M. (2005). A psychomotor skills extension to Bloom's taxonomy of education objectives for engineering education. Exploring Innovation in Education and Research, 128, 1–5.
Goff, L., Potter, M. K., Pierre, E., Carey, T., Gullage, A., Kustra, E., Lee, R., Lopes, V., Marshall, L., Martin, L., Raffoul, J., Siddiqui, A., & Van Gastel, G. (2015). Learning outcomes assessment: A practitioner's handbook [eBook edition]. Higher Education Quality Council of Ontario. https://www.pathwaysresources.org/wp-content/uploads/2018/04/LearningOutcomesAssessment-A-Practitioners-Handbook.pdf
Heer, R. (2012). A model of learning objectives. Iowa State University Center for Excellence in Learning and Teaching. https://www.celt.iastate.edu/wp-content/uploads/2015/09/RevisedBloomsHandout-1.pdf
Hutchinson, D., Francis, M., & Griffin, P. (2014). Developmental teaching and assessment. In P. Griffin (Ed.), Assessment for teaching (pp. 26–57). Cambridge University Press.
Kasilingam, G., Ramalingam, M., & Chinnavan, E. (2014). Assessment of learning domains to improve student's learning in higher education. Journal of Young Pharmacists, 6(4), 27–33. https://doi.org/10.5530/jyp.2014.1.5
Krathwohl, D. R., Bloom, B. S., & Masia, B. B. (1964). Taxonomy of educational objectives: The classification of educational goals. Handbook II: Affective domain. Longmans, Green and Co.
Marzano, R. J., & Kendall, J. S. (2007). The new taxonomy of educational objectives (2nd ed.). Corwin Press. https://www.ifeet.org/files/The-New-taxonomy-of-Educational-Objectives.pdf
Marzano, R. J., & Kendall, J. S. (2008). Designing and assessing educational objectives: Applying the new taxonomy [eBook edition]. Corwin Press. http://dspace.vnbrims.org:13000/xmlui/bitstream/handle/123456789/4577/Designing%20and%20Assessing%20Educational%20Objectives%20Applying%20the%20New%20Taxonomy.pdf?sequence=1
Morgan, P., Bourke, S., & Thompson, K. (2001). The influence of personal school physical education experiences on non-specialist teacher attitudes and beliefs about physical education. Annual Conference of the Australian Association for Research in Education, Fremantle. https://www.aare.edu.au/data/publications/2001/mor01297.pdf
O'Neill, G., & Murphy, F. (2010). Guide to taxonomies of learning. UCD Teaching and Learning Assessment Resources. http://www.ucd.ie/t4cms/ucdtla0034.pdf
Potter, M. K., & Kustra, E. (2012). A primer on learning outcomes and the SOLO taxonomy. Centre for Teaching and Learning, University of Windsor. https://www.uwindsor.ca/ctl/sites/uwindsor.ca.ctl/files/primer-on-learning-outcomes.pdf
Rosenbaum, D. A., Carlson, R. A., & Gilmore, R. O. (2001). Acquisition of intellectual and perceptual-motor skills. Annual Review of Psychology, 52, 453–470. https://doi.org/10.1146/annurev.psych.52.1.453
Schunk, D. H. (2012). Learning theories: An educational perspective (6th ed.). Pearson.
Seidel, R. J., Perencevich, K. C., & Kett, A. L. (2005). From principles of learning to strategies for instruction. Springer.
Simpson, E. J. (1972). The classification of educational objectives in the psychomotor domain. Gryphon House.
Singer, R. N. (1980). Motor learning and human performance: An application to motor skills and movement behaviors. Macmillan Publishing.

Carla E. Förster is a marine biologist who completed a Master's in Educational Evaluation and a Doctorate in Educational Sciences at the Pontificia Universidad Católica de Chile. She is a Professor at the Universidad de Talca, Chile, Head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. email: [email protected]

Cristian A. Rojas-Barahona is a psychologist who completed a Ph.D. in Psychology at the University of Granada, Spain, and a postdoc at the Developmental Brain Behaviour Laboratory, Academic Unit of Psychology, University of Southampton, UK. He is an Associate Professor at the Faculty of Psychology, Universidad de Talca, Chile, and a specialist in executive functions and cognitive development.

3 Integrated Planning of Teaching and Assessment

Sandra C. Zepeda and Carla E. Förster

Abstract

This chapter discusses the importance of planning teaching and assessment in an integrated manner, emphasizing the need to view assessment and teaching as two sides of the same coin. Planning is approached through the backward design of Wiggins and McTighe (Understanding by design. Association for Supervision and Curriculum Development, 1998), and the components of integrated planning are presented through an assessment strategy that entails pedagogical decisions prior to its implementation. Finally, two examples of planning are presented, one for primary and one for secondary education.

3.1 Introduction

Educational assessment is not something recent; the work of teachers in their interaction with students in the classroom has always included, as a fundamental dimension, assessing the learning progress of each student (Shepard, 2006). What is recent is the concern for enriching this dimension: there is a vast literature arguing that, in order to enrich teachers' pedagogical practices, it is necessary to improve their assessment practices, since the way in which they assess conditions what their students learn (Broadfoot et al., 2002; Deneen & Brown, 2016; Gibbs, 2010; Himmel, 2003; Looney, 2011; Salinas, 2002; Stiggins, 2006).


Several authors, including Gibbs (2010), argue that the design of assessment strategies and assessment situations has a strong influence on the amount of time students spend studying, what they study, and the quality of their engagement with learning. This raises questions about how to achieve student engagement, what strategies facilitate learning, how teaching is prepared and how evidence of learning is collected, at what point to assess, and what role students play in assessment. All of these questions are part of the issues we will examine in this chapter, in which we will discuss the role of assessment strategy planning and the core components that need to be planned in order to develop instruction and ensure student learning achievement.

3.2 Planning for Learning

According to the Ministry of Education's technical guidelines, planning can be understood as "a systemic and flexible process in which teaching and learning processes are organised and anticipated, with the purpose of guiding pedagogical practice in order to support students in moving towards the achievement of the expected learning or learning objectives proposed in the national curriculum" (Mineduc, 2016, p. 3). For their part, the performances associated with Domain A of the Framework for Good Teaching are demonstrated principally through planning and through its effects on the development of the teaching and learning process in the classroom. Domain A indicates that "teachers, based on their pedagogical competencies, knowledge of their students, and mastery of the content they teach, design, select, and organise teaching strategies that give meaning to the content presented; and assessment strategies that make it possible to assess student learning achievement and to provide feedback for their own practices" (Mineduc, 2021). Against this institutional background, planning represents a key professional tool at the service of student learning. Teachers should analyse national and local curriculum learning in depth, in light of the characteristics of their students, and, based on all of these elements, make a set of decisions in order to design a well-founded and pertinent didactic proposal that organises over time both the teaching process and the strategies for monitoring progress in student learning. This vision recognises planning as a professional task inherent to pedagogical work, since it is linked to the reflective work of teachers in designing strategies to develop and monitor student learning. The vision of planning to which we adhere holds that assessment should be an integral part of instructional planning. Planning should start from the assessment criteria, from which learning opportunities and strategies for obtaining information about progress towards the proposed goals are designed. It should also anticipate how students will understand the assessment criteria, how they will receive feedback, how they will self-assess their learning, and how they will be helped to make further progress (Broadfoot et al., 2002).


The main purpose of planning the teaching and assessment process together is to coherently organise classroom practices around a clear learning goal, along with strategies to monitor its achievement (Mineduc, 2006). Therefore, we should consider planning as a propositional or tentative formulation of what will be done in the classroom. Like any proposal, it can be strained by practice itself, both during and after its implementation, which means that we can (and should) modify it if we see that something we planned is not producing the results we expect. Given this background, the importance of planning for developing and evaluating the learning achievement of our students is evident, under the logic that planning should consider not only didactic teaching strategies but also assessment strategies. Planning should take into account, in an integrated way, the design of both. Figure 3.1 represents the steps we should take when planning integrated teaching and assessment, following the backward design model of Wiggins and McTighe (1998): first seeing where we want to get to, and then designing how to get there (teaching strategies). In this model, the first step is to identify the learning that we want students to achieve, which is directly related to the curriculum; the second step is to define how they will demonstrate that they have achieved that learning, which points to the way in which that achievement will be assessed; and the third step is to design the learning experiences or opportunities that will help students develop the defined learning, as sketched after the figure.

Fig. 3.1 Steps for conducting integrated planning of teaching and assessment (adapted from Wiggins & McTighe, 1998): Step 1, identify the learning objectives or expected learning of the students; Step 2, design the assessment strategy and determine the necessary evidence; Step 3, design teaching strategies and determine learning experiences
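As a way of fixing the order of these decisions, the following minimal sketch (a hypothetical Python illustration of backward design, not code from Wiggins and McTighe) fills in the Step 2 evidence before any Step 3 activity is attached; the example strings are taken from the primary-school unit discussed later in this chapter.

```python
# Minimal sketch of backward design: objectives first, then assessment
# evidence, and only then teaching experiences. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class UnitPlan:
    objectives: list[str]                                 # Step 1
    evidence: list[str] = field(default_factory=list)     # Step 2
    experiences: list[str] = field(default_factory=list)  # Step 3

plan = UnitPlan(objectives=["Explain the transport function of the circulatory system"])
plan.evidence.append("Test with multiple choice and open-ended application questions")
plan.experiences.append("Group project: a game prototype of the circulatory system")
print(plan)
```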

3.3 Components of the Assessment Strategy

In general terms, a strategy represents a set of decisions that guide the achievement of a goal or purpose and involves the coordination of actions to achieve that goal. It is a process that can be regulated, governed by criteria that keep the decisions made at each moment of the design of steps and procedures coherent with the purpose pursued, which, in the educational field, is our students' achievement of the learning objectives. In specific terms, an assessment strategy represents a set of criteria for making decisions associated with the process of assessing student learning. It considers the definition of assessment indicators; the selection or design of procedures to obtain information about progress towards learning objectives; the definition of procedures and tools for students to know and understand the assessment indicators and criteria; and the definition of instances and spaces for students to receive timely, task-oriented feedback, have opportunities to self-assess their learning, and receive support to make further progress. Although there are multiple decisions involved in a planning process, we describe in detail below those that we consider essential to designing the assessment strategy.

First Decision: definition of learning objectives and assessment criteria or indicators. The first and most important decision in designing the assessment strategy involves defining the learning objectives and the assessment criteria or indicators that will be the focus of the strategy. The learning objectives are part of the national curricular definition and can be contextualised and enriched at the local level with the details set out in the educational project of each school. Given this prescriptive character of learning, the benchmark against which the assessment strategy is designed should be criterion-referenced. This assumes that, in order to determine the level of achievement obtained, the evidence that we collect during the assessment of each student should be compared with the learning objectives and assessment indicators, and the judgement should not depend on the results of the rest of the students. Assessment indicators are expected to be consistent with the learning objectives. The indicators are "detached" from the objectives and correspond to descriptions that make explicit, in evident or observable terms, the action(s) that a student should carry out in order to consider that he or she has achieved a given learning objective. As we saw in the previous chapter, the formulation of assessment indicators, like that of learning objectives and outcomes, involves a syntactic structure that contains a verb + a content + a condition. Thus, this first decision requires clear knowledge of the learning (cognitive, procedural, and affective) that one wishes to assess in order to define the assessment indicators that will make it possible to collect evidence of its achievement.
In the school curriculum we will find learning objectives that comply with this structure and others that do not fit it completely; we will also find some that are very broad and can therefore be broken down into multiple assessment indicators, and others as narrow as a single indicator, whose disaggregation will not be possible. The important thing for teachers is to be able to recognise this diversity, to distinguish what is well formulated, and to modify what seems improvable in order to use it in planning the strategy. It is suggested to start by working from an existing base, rather than creating indicators from scratch, and to make this decision with technical arguments that allow us to justify it to others, not intuitively. We should remember our responsibility as teachers, since what is at stake in this first decision is which learning we are emphasising for our students to develop. This first component involves not only deciding on the learning objectives and their assessment criteria, but also designing how to ensure that students understand them and orient their actions towards their achievement. An important part of the success of the planning, and especially of the assessment strategy, is the students' understanding and ownership of the proposed learning goals.

Decision Two: defining the purpose or intentionality of the assessment. It is necessary to discern at what point in the teaching process the assessment will have a formative focus, with emphasis on the detection of initial knowledge, monitoring, improvement, and feedback of student learning, or a summative intention, whose focus will be the determination of the level of achievement obtained by students at the end of the teaching process of a learning objective (Hattie, 2003; Shepard, 2006; Wiliam, 2000). Assessment has an initial formative (or diagnostic) intentionality when the teacher uses the evidence collected to detect students' prior knowledge, conceptual errors, valuations, beliefs, and experiences that may need to be retrieved at the beginning of the planned teaching process (Shepard, 2006). Assessment has a formative (process) intentionality when the teacher uses the evidence of student learning to monitor the learning process, provide feedback regarding progress and room for improvement, and consider adjustments in teaching strategies (Black & Wiliam, 1998; Wiliam, 2006). Assessment has a summative intention when the teacher uses evidence of learning to account for students' achievement of the learning objectives developed during a period (a unit, for example). In this case, the teacher must collect sufficient evidence to determine the level of achievement of each of the learning objectives defined and certify them. It is important to note that the same assessment strategy should define more than one intentionality or purpose, since they occur at various points in the teaching process (Looney, 2011). For example, at the beginning of a learning unit, the teacher can define an initial formative assessment, since its purpose is to detect students' prior knowledge of a certain content associated with a learning objective, through an assessment instrument such as a test.
Once the development of the unit is advanced, the same teacher can define a formative assessment, using a graphic organiser as an assessment tool to detect how much the learning in development has improved. In a third phase, the teacher can define a summative assessment to determine the level of achievement of the learning defined for the unit, resorting, for instance, to the creation of a project as an assessment instrument. This temporality can also be associated with a single class, in which the assessment tools or situations will be less identifiable for students than a formal assessment, but which still allow us as teachers to collect evidence of our students' learning. For a class, we establish some learning goals or objectives to achieve. We can make an initial formative assessment, which may be a couple of discussion questions that allow the teacher to account for the learning of the previous class (if it is an intermediate class of the unit) and thus reinforce anything that was not clear. Then, during the class, we carry out different activities to teach the content and monitor how students are developing these activities and whether they understand; for example, we could review the results of some exercises in the guide that demonstrate different levels of understanding (exercises of different levels of complexity). Finally, we certify the achievement of this specific goal through the elaboration of a concept map that we collect at the end of the class.

Third Decision: definition of assessment procedures. The third decision refers to the definition of the assessment procedures and techniques with which evidence of student learning will be collected. The teacher must ensure that the assessment procedures selected are valid for the defined assessment indicators or criteria; this means that they must provide key evidence about the learning that was defined. Along with this, the teacher must verify that the selected procedures provide relevant, sufficient, and varied evidence about the learning of all students. Finally, the accuracy of the information generated from the application of the procedures and tools must be ensured, as its use is key to the viability of the strategy's implementation.

Fourth Decision: assessment agents. The fourth decision involves discerning who will be the evaluator: the teacher, each student, or his or her peers. In answering the question "Who assesses?", basically two stakeholders have been identified in the classroom as being involved in the process: students and teachers. The most traditional form is the assessment conducted by the teacher on the student, which is called heteroevaluation. This type of evaluation is based on the criterion of hierarchical level within the classroom; the evaluator has a higher hierarchical level or authority than the evaluated. Peer assessment, on the other hand, appeals to the same hierarchical level in the classroom, since it is the students themselves who assess their peers. When students evaluate themselves, it is called self-assessment.


Incorporating students as assessment agents represents a very relevant resource for learning because it places them as protagonists not only in the development of learning opportunities, but also as participants in the process of evaluating and monitoring them. The teacher can design the assessment strategy considering a single agent or a variety of them. The important thing is that the decision takes into account, as a criterion, which actor or agent can provide sufficient and varied evidence about the learning described in the assessment indicators defined for the strategy.
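Read together, the four decisions suggest a simple record per assessment instance. The sketch below is a hypothetical Python illustration (the class and field names are ours, not the chapter's): each entry fixes the indicators (Decision 1), the purpose (Decision 2), the procedure (Decision 3), and the agent (Decision 4), and a strategy is simply a list of such entries, as in the matrices of the next section.

```python
# Minimal sketch: one record per assessment instance of the strategy.
# Enum values follow the chapter's terms; all identifiers are illustrative.
from dataclasses import dataclass
from enum import Enum

class Purpose(Enum):
    INITIAL_FORMATIVE = "initial formative (diagnostic)"
    FORMATIVE = "formative (process)"
    SUMMATIVE = "summative"

class Agent(Enum):
    TEACHER = "heteroevaluation"
    PEERS = "peer assessment"
    SELF = "self-assessment"

@dataclass
class AssessmentInstance:
    indicators: list[int]  # achievement indicators covered (Decision 1)
    purpose: Purpose       # Decision 2
    procedure: str         # Decision 3
    agent: Agent           # Decision 4

# Two entries paraphrased from the primary-school example (Table 3.2 below):
strategy = [
    AssessmentInstance([1, 2, 4, 5], Purpose.INITIAL_FORMATIVE,
                       "multiple choice and identification test", Agent.TEACHER),
    AssessmentInstance([11, 13, 14], Purpose.FORMATIVE,
                       "assessment scale", Agent.PEERS),
]
```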

3.4 Organisation of the Assessment Strategy

Since the assessment strategy is an integral part of the planning of actions to develop learning, it is proposed to organise it in conjunction with the teaching strategy, for which a planning matrix can be used that takes a unit (or sub-unit) as the curricular focus and addresses key aspects of planning teaching and assessment for student learning. Table 3.1 presents a planning matrix that contains the design of an assessment strategy and describes the elements to consider in each of its components. An example of a 5th grade planning matrix for the subject of Natural Sciences is presented in Table 3.2. This matrix focuses on Unit 2 and considers the development of a key learning objective of the subject, together with attitudinal objectives that enrich the assessment strategy. The unit will be carried out over 4 weeks, during which the assessment strategy considers an initial formative assessment to detect the students' previous knowledge about the indicators of the learning objective. The strategy also considers two instances of formative assessment during the development of the unit: a test that collects evidence of understanding of key concepts prior to project development, and a formative assessment of the project itself. Finally, two instances of summative assessment were designed: a test used to account for the level of learning of the conceptual aspects of the objective, and the project and the product created (game prototype), to account for the level of learning of the conceptual, procedural, and attitudinal aspects of the unit. In this strategy, assessment tools such as tests, project development, an assessment scale, and a rubric are used. The teacher will evaluate the core indicators of the learning objective, and the peers will evaluate the project and the attitudinal aspects deployed during its development.

Table 3.1 Matrix for planning teaching and assessment for learning

UNIT: Indicate the name and number of the curricular unit. You can consider the same unit proposed in the study programs or a unit (or sub-unit) organized by the teacher
TIMEFRAME: Indicate from and until what date the planned unit will be developed

Learning objectives: They are descriptions of what students must learn in relation to the contents that will be developed in the unit. They represent the knowledge, skills, attitudes, and behaviours that students are expected to achieve during a period of work. They are defined in the study programs, in the official Curricular Bases, under the format of Learning Objectives for each subject and Cross-cutting Learning Objectives that can cover all subjects

Contents (What will be taught): They are the disaggregation of the knowledge expressed in the learning objectives and are expressed as nouns. They are generally disaggregated into the three domains described in Bloom's taxonomy. Conceptual: concepts, principles, or facts that students should know and that are contained in the learning objectives. Procedural: sequences of steps that can be taken to achieve a goal, or techniques that involve a set of ordered actions aimed at achieving a goal; they can be habits, strategies (algorithms and heuristics), methods, and techniques, and they respond to contents that imply knowing how to do something in certain contexts. Attitudinal: values, attitudes, and norms that students will work on along with the other two types of content; they respond to the question: in addition to working on conceptual and procedural content, what formative aspects of my students am I going to promote and develop in my class? They can be associated with a cross-cutting learning objective or with attitudes towards the subject

Learning outcomes, achievement indicators, or evaluation criteria: They are descriptions that explain in demonstrable or observable terms the action or actions that a student must perform to be considered to have achieved a learning objective. They demarcate what is expressed in the learning objective and in the contents, guiding the behaviours that must be observed in the students to determine the level of achievement reached. These specific learnings must be considered both in the process evaluation and in the final evaluation. They are stated in the same way as the learning objectives (verb + content + context or condition) and must be consistent with the learning objectives and with the contents. In the case of conceptual and procedural content, they are expressed through verbs related to the cognitive skills (Anderson et al., 2001) or psychomotor skills (Simpson, 1972) that students are expected to develop

Teaching/learning strategies: They are the strategies, mechanisms, and activities that will be developed to work on the contents and their learning objectives. They can be organized by class or globally by unit

Resources: They are the materials that will be used or the texts (books and documents, sheets, etc.) that students will work on, among others

Assessment strategy: This column describes the core aspects of the unit's assessment strategy, indicating the evaluation indicators, the purpose, the procedures, and the assessment agent. It is assumed that the benchmark is always criterial, since it is associated with the current curriculum and not with course performance. Evaluation indicators: indicates the indicator(s) that will be evaluated. Purpose (formative or summative): explains the purpose(s) of the unit's assessments, indicating when they will be carried out. Type of procedures: indicates the tool(s) to be used for the unit's assessments. Agent: indicates who is going to do the unit's assessments, the teacher or the students (self- or peer assessment)

Table 3.2 Example of planning teaching and assessment for learning in primary education

5th grade, Sciences. Unit: Human body systems. Sub-unit: Circulatory system. Timeframe: 4 weeks

Learning objectives: Disciplinary learning: Explain the transport function of the circulatory system (food substances, oxygen, and carbon dioxide), identifying its basic structures (heart, blood vessels, and blood). Cross-cutting learning: Demonstrates rigorous and collaborative work

Contents: Conceptual: components (heart, blood vessels, and blood); characteristics and components of the heart, blood vessels, and blood; blood components and their function; blood groups; blood pressure; function: blood transport of food substances, oxygen, and carbon dioxide. Procedural: interpretation of diagrams and tables; posing hypotheses about cause and effect in the anomalous functioning of structures of the circulatory system; bibliographic research. Attitudinal: respect for one's own work and opinions and those of others; collaborative work and effective interactions for teamwork; importance of following rules and procedures that protect and promote personal and collective safety

Learning outcomes, achievement indicators, or evaluation criteria:
1. Identifies the functions, structure, and characteristics of the components of the circulatory system
2. Relates muscular structures of the heart with its continuous work of pumping blood throughout the body
3. Infers the relationship between antibodies and blood groups
4. Recognizes the different blood groups and their status as donors and recipients
5. Compares blood components according to their function
6. Interprets diagrams on the levels of blood components or blood pressure
7. Predicts problems that would be caused by a lack of oxygenated blood or nutrients in any organ
8. Illustrates the movement of blood through the circulatory system, in the heart and throughout the body
9. Distinguishes the systemic circulation from the pulmonary one, with their respective functions
10. Hypothesizes about the increase or decrease of normal levels of blood components or blood pressure
11. Designs a project where knowledge about the circulatory system is applied
12. Builds a project (applied to a game) with the knowledge about the circulatory system
13. Applies rules and procedures to safeguard one's own safety and that of the work team in the development of the project
14. Works collaboratively and responsibly with the team in developing the project

Teaching/learning strategies: Students divided into groups must create a project or game prototype that represents the structure, function, and importance of the circulatory system (or of any of its structures or organs), with questions, riddles, predictions, etc. To that end, a 3-week timeframe is defined

Resources: Textbook; guides with activities and content; PPT or papers with images; blackboard; cardboard; scissors; glue; paperboard, etc.

Assessment strategy:
* Benchmark: Achievement indicators 1, 2, 4, and 5. Purpose: Diagnosis (initial formative), at the beginning of the unit. Type of assessment procedure: Multiple choice and identification test. Agent: Teacher assessment
* Benchmark: Achievement indicators 1, 2, 3, 4, 5, 8, and 9. Purpose: Formative, in the middle of the unit (week two). Type of assessment procedure: True or false item test with justification. Agent: Teacher assessment
* Benchmark: Achievement indicators 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. Purpose: Summative, at the end of the unit. Type of assessment procedure: Test with multiple choice and open-ended questions to apply concepts. Agent: Teacher assessment
* Benchmark: Achievement indicators 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, and 14. Purpose: Formative, after the submission of the project (game prototype). Type of assessment procedure: Guide for the development of a project (game prototype) and rubric. Agent: Teacher assessment
* Benchmark: Achievement indicators 11, 13, and 14. Purpose: Formative, after the submission of the corrected project (game prototype), following the previous one. Type of assessment procedure: Assessment scale. Agent: Peer assessment
* Benchmark: Achievement indicators 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, and 14. Purpose: Summative, final submission of the project and its product. Type of assessment procedure: Elaboration of the product, its project, and rubric. Agent: Teacher assessment


In this unit, students are expected to analyse and critically evaluate sources, along with demonstrating attitudes such as peaceful coexistence, typical of the subject. The unit will be developed in 8 to 12 teaching hours, corresponding approximately to two or three weeks. The assessment strategy considers an initial formative assessment that will collect evidence on students' level of knowledge about the First World War. Along with this, the strategy proposes a formative assessment during the development of the unit, which addresses, through group work, core aspects of the critical analysis of sources and the contrast of views on the factors, causes, and effects of the war. This assessment includes the teacher's feedback and a student self-assessment of the aspects covered in the source analysis and of the attitudinal dimension of peaceful relations between countries. Finally, the unit includes a summative assessment, developed in pairs, that draws on source analysis to evaluate the indicators of the unit. The teacher will evaluate this application test through a rubric. Additionally, a self-assessment will be developed, focusing on the ability to critically analyse sources and on the attitudinal dimension of valuing the importance of peaceful relations.

This strategy draws on both teacher and student assessment and uses test-type situations in different application modalities: an open-ended question for the initial formative assessment and an application test developed in pairs for the final assessment. In addition, a guide for the formative assessment, a scale for the intermediate self-assessment, and a rubric for the final self-assessment have been designed.

In summary, throughout this chapter we have examined the weight that planning the assessment strategy carries in the design of the learning opportunities we give to students, and how that strategy is strengthened and oriented towards learning when it is planned in an integrated manner with the teaching strategy. We thus note the key role of assessment as a permanent process closely linked to teaching. For this vision of assessment to unfold, we must adopt an approach that conceives formative and summative assessment as parts of the same assessment cycle, not as opposing assessments. As the examples presented in the chapter show, both belong to the same assessment cycle and both are necessary to ensure student learning. In the next chapter we will examine the purposes of assessments, their rationale and characteristics, and how they can be implemented in the classroom in an integrated manner. This will give us more tools to include these activities naturally when planning.

Table 3.3 Example of planning teaching and assessment for learning in secondary education

9th grade, History, Geography, and Social Sciences. Unit 3: The formation of the European territory and its geographical dynamics: characterization and impacts of state expansion policies. Timeframe: 8 to 12 pedagogical hours

Learning objectives:
• Analyse the First World War considering the economic conflict between England and Germany, the impact of the war on multiple spheres of European society, and the expansion of territories. Evaluate its projection in the relations with neighbouring countries
• Analyse and critically evaluate information from various sources to use it as evidence in arguments about grade-level topics
• Demonstrate appreciation for life in society through active commitment to peaceful coexistence, the common good, equality of men and women, and respect for the fundamental rights of all people

Contents (What will be taught):
• Conceptual: Economic, social, political, and territorial consequences of the First World War
• Procedural: Analysis of various sources
• Attitudinal: Importance of peaceful relationships

Learning outcomes, achievement indicators or evaluation criteria:
1. Explains the factors associated with a war and relates them to the European conflict
2. Identifies the consequences of the First World War in each sector of the country
3. Compares the consequences of the First World War for the countries involved
4. Analyses from various perspectives the causes and consequences of the First World War in the countries involved
5. Contrasts the different views of the sources regarding the causes and consequences of the First World War
6. Evaluates what the war meant for the economic, social, and cultural relationships between the countries involved
7. Recognizes the importance of peaceful relations between countries

Teaching/learning strategies:
• The unit begins with the collection and return of initial information about the First World War
• Subsequently, the students develop group work in class using source analysis guides from various historians
• Explanation, in group and plenary sessions, of the causes and consequences that the First World War brought to the country, after analysing sources
• In the following class, the source analysis concludes and the discussions about the factors, causes, and consequences of the First World War are synthesized. Students examine the treaties signed after the war and the changes they brought about in the countries involved, up to the present day
• Discussion about the importance of the country's international relations with the outside world, especially with neighbouring countries, with special emphasis on peaceful relations

Resources:
• Slideshow
• Sources (primary and secondary)
• Images and maps
• Textbook
• Guides with activities

Assessment strategy:
* Benchmark: Achievement indicators 1 and 2. Purpose: Initial formative. Type of assessment procedure: Test with open-ended questions requiring intermediate-length answers. Agent: Teacher assessment
* Benchmark: Achievement indicators 1 to 7. Purpose: Formative; it will take place in the middle of the development of the unit. Type of assessment procedure: Source analysis guide with rubric; scale for self-assessment. Agent: Teacher assessment and self-assessment (especially for indicators 5, 6, and 7)
* Benchmark: Achievement indicators of the entire unit (1 to 7). Purpose: Summative; it will take place at the end of the unit. Type of assessment procedure: Application test developed in pairs, containing analysis of sources and an argumentative question; rubric for the test and for the self-assessment, along with open-ended questions. Agent: Teacher assessment and self-assessment

References

Anderson, L. W., Krathwohl, D. R., Airasian, P. W., Cruikshank, K. A., Mayer, R. E., Pintrich, P. R., Raths, J., & Wittrock, M. C. (Eds.). (2001). A revision of Bloom's taxonomy of educational objectives. Longman Publishing.
Black, P. J., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
Broadfoot, P. M., Daugherty, R., Gardner, J., Harlen, W., James, M., & Stobart, G. (2002). Assessment for learning: 10 principles. Research-based principles to guide classroom practice. Assessment Reform Group.
Deneen, C. C., & Brown, G. T. L. (2016). The impact of conceptions of assessment on assessment literacy in a teacher education program. Cogent Education, 3(1), 1–14. https://doi.org/10.1080/2331186X.2016.1225380
Gibbs, G. (2010). Using assessment to support student learning. Leeds Met Press. https://core.ac.uk/download/pdf/42413277.pdf
Hattie, J. A. C. (2003). Teachers make a difference: What is the research evidence? [Paper]. Building Teacher Quality: What does the research tell us. ACER Research Conference, Melbourne, Australia. http://research.acer.edu.au/research_conference_2003/4/
Himmel, E. (2003). Evaluación de aprendizajes en la educación superior: una reflexión necesaria [Assessment of learning in higher education: A necessary reflection]. Pensamiento Educativo, Revista de Investigación Latinoamericana (PEL), 33(2), 199–211.
Looney, J. W. (2011). Integrating formative and summative assessment: Progress toward a seamless system? OECD Education Working Papers 58 [eBook edition]. OECD Publishing. https://doi.org/10.1787/5kghx3kbl734-en
Mineduc. (2006). Evaluación para el aprendizaje: Enfoque y materiales prácticos para lograr que sus estudiantes aprendan más y mejor [Assessment for learning: Practical approach and materials to help your students learn more and better] [eBook edition]. Chilean Ministry of Education. https://bibliotecadigital.mineduc.cl/bitstream/handle/20.500.12365/2055/mono-851.pdf
Mineduc. (2016). Orientaciones técnicas: La planificación como un proceso sistémico y flexible [Technical guidelines: Planning as a systemic and flexible process] [eBook edition]. Chilean Ministry of Education. https://bibliotecadigital.mineduc.cl/bitstream/handle/20.500.12365/2161/mono-988.pdf
Mineduc. (2021). Marco para la Buena Enseñanza [Framework for good teaching] [eBook edition]. Chilean Ministry of Education. https://bibliotecadigital.mineduc.cl/bitstream/handle/20.500.12365/545/MONO-463.pdf
Salinas, D. (2002). ¡Mañana examen! La evaluación entre la teoría y la realidad [Tomorrow, exam! Assessment between theory and reality]. Graó.
Shepard, L. A. (2006). Classroom assessment. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 623–646). Praeger.
Simpson, E. J. (1972). The classification of educational objectives in the psychomotor domain. Gryphon House.
Stiggins, R. (2006). Assessment for learning: A key to motivation and achievement. Edge, 2(2), 1–19.
Wiggins, G., & McTighe, J. (1998). Understanding by design. Association for Supervision and Curriculum Development.
Wiliam, D. (2000). Assessment: Social justice and social consequences [Review of Gender and fair assessment; Beyond multiple-choice: Evaluating alternatives to traditional testing for selection; Investigating formative assessment, by S. W. Warren, N. S. Cole, M. D. Hakel, H. Torrance, & J. Pryor]. British Educational Research Journal, 661–663.
Wiliam, D. (2006). Assessment: Learning communities can use it to engineer a bridge connecting teaching and learning. Journal of Staff Development, 27(1), 16–20.


Sandra C. Zepeda is a social worker. She holds a Master's degree in Education Sciences with a specialization in Evaluation from Pontificia Universidad Católica de Chile and is a PhD candidate in Education at Universidad ORT Uruguay. She is a lecturer at the UC Faculty of Education in undergraduate and postgraduate training programs and a specialist in curriculum development and assessment for learning in school and higher education. email: [email protected]

4 The End Justifies the Means: Purposes of Assessment

Sandra C. Zepeda

Abstract

This chapter focuses on the purposes of assessment and presents the main characteristics of formative and summative assessment, emphasizing that they are complementary and that both are necessary to properly monitor student learning. The main tensions between summative and formative assessment are discussed, ending with suggestions for good practices, resources, and instruments that teachers can use in their classrooms to integrate formative and summative assessment into their assessment practices.

4.1 Introduction

Assessment can serve many purposes, and a distinction is often made between formative and summative purposes. Summative assessment focuses primarily on evaluating learning outcomes, while formative assessment aims to support learning through teaching and specific feedback (Stobart, 2008). Within formative assessment, a distinction is made between initial (diagnostic) assessment and in-process (continuous) assessment. These assessment purposes are presented in a cycle, in which a summative instance can serve as the initial instance of another learning cycle (Fig. 4.1). Teachers are more familiar with the traditional form of student assessment and its summative nature, which is linked to tests, final exams, and grades, and is associated with assessment focused on student certification. However, formative classroom assessment has also taken on an increasingly visible role in recent years because of its potential to develop student learning and guide teaching (Looney, 2011; OECD, 2005).



Fig. 4.1 Cycle of assessments according to their purposes

Throughout this chapter we will examine the conceptualizations at the base of formative and summative assessment, their common aspects and those that distinguish them. We will also analyse some of the tensions surrounding both purposes and will present visions that integrate them as a way of recognising the role that each plays in the assessment of classroom learning. Finally, we will present the main uses of formative and summative assessment and recommendations for implementing strategies in the classroom.

4.2 Conceptualizations of Summative and Formative Assessments

In general terms, the concept of “summative assessment” is used to describe assessments that certify student learning achievement, while formative assessment is understood as assessments that provide feedback to students on how to improve their learning, and the concept of “diagnostic” is used for assessments that provide information about students’ prior knowledge and ideas. While the terms ‘diagnostic’, ‘formative’ and ‘summative’ are used to describe different types of assessments, the results of the same assessment may serve more than one function or purpose at different times (Wiliam & Black, 1996). As we can see, formative and summative assessments have different purposes. One makes it possible to monitor the teaching process and enhance learning, and the other illustrates achievement and attainment of a learning goal. In tracing the origins of both concepts, we find that in 1967 Scriven distinguished between formative and summative assessments, but Bloom et al. (1971) were the first to define their most widely accepted meaning in education. These authors defined summative assessment as that which is given at the end of units, or of a course, to judge the degree of student learning, for the purpose of classifying, certifying, and evaluating progress or even to investigate the effectiveness of a curriculum.


4.3 Formative Assessment

Understanding formative assessment as the set of instances in which evidence of student learning is collected to foster students' development and achievement during the teaching/learning process, we can identify two instances: initial formative assessment, traditionally known as diagnostic assessment, and ongoing formative assessment. We prefer to speak of initial formative assessment because the information collected constitutes the first instance of a broader formative process, whereas diagnosis is associated in the school setting with a single instance, generally at the beginning of the year, restricting the potential of the assessment cycle and of the support we can give to students.

4.4 Initial Formative Assessment

Initial formative assessment is based on the conviction that prior knowledge is essential for the development of new learning, since learning involves making connections and integrating new understanding with existing knowledge. This assessment is developed to obtain an overview of what a student knows, thinks, or believes. Gathering evidence about students' prior knowledge includes the formal learning acquired through schooling, as well as the implicit explanations and theories students hold about how the world works. Both can facilitate or hinder the integration of new learning, as happens with conceptual errors (Shepard, 2006). Teachers can use a variety of strategies to gather this evidence. The more information we can gather, the more accurate the initial picture we will have of a student and his or her learning in that unit of subject content.

Effective teaching strategies treat students' prior knowledge as a resource. Therefore, initial formative assessment aims to collect this knowledge at the beginning of new sessions or curricular units, explicitly developing in students the habit of asking themselves what they know, recognise, and have done with respect to a specific learning goal. This type of assessment provides valuable evidence for adapting teaching and support strategies for our students, since it allows us to reveal conceptual errors and learning acquired in a complete, partial, or incorrect manner, together with students' beliefs and views on the subject. Evidence of prior knowledge is not only the formal knowledge that students have acquired; it also includes the patterns of language and ways of thinking that they develop through their social roles and cultural experiences.

There are many strategies that can be used to determine where students stand at the beginning of a session or unit and what needs to happen in the classroom to successfully meet the learning goals; what matters, for this type of assessment to serve its purpose, is that the information gathered allows for timely decision-making by the teacher and the students.


4.5 Ongoing Formative Assessment

Black and Wiliam (2009, p. 6) provide the following definition of formative assessment: "Practice in a classroom is formative to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited." This definition emphasises decision making to support or enhance learning; thus, formative assessment should be ongoing and focused on student progress, in order to identify learning needs and shape instruction (OECD, 2005). Additionally, formative assessment is framed as an active and intentional learning process that brings together teachers and students to collect evidence of learning constantly and systematically, with the explicit goal of improving it (Moss & Brookhart, 2009). This approach emphasises that formative assessment is an integral part of the teaching/learning process, not an adjunct to it (Deneen & Brown, 2016; Shepard, 2006). Thus, the primary purpose of ongoing formative assessment is to enhance learning while it is happening, not to audit it: it is assessment for learning rather than assessment of learning, and it is primarily used for the benefit of students to support their learning (Moss & Brookhart, 2009). The crucial distinction of this assessment is that it can only be categorised as formative if it supports or structures subsequent learning and if the evidence gathered is actually used to adapt teaching to meet students' needs (Black & Wiliam, 1998; Wiliam, 2009).

For ongoing formative assessment to be effectively deployed in the classroom it must: (1) be intrinsically connected to the teaching and learning cycle, allowing for frequent monitoring and adjustment (Sadler, 1989; Shepard, 2000, 2006); (2) share the expected objectives and quality criteria with students; and (3) deliver feedback that allows students to enhance their learning. This third point will be discussed in detail in the next chapter.

The formative assessment approach was further explained by Atkin et al. (2001) on the basis of three key questions (Fig. 4.2). The first aims to identify the learning or goal to be achieved, the second identifies where the student's learning stands in relation to the goal, and the third addresses the gap between the goal and the current learning situation, focusing specifically on what is needed to achieve the goal.

Ongoing formative assessment uses a variety of strategies to reveal students' understanding of a topic, allowing teachers to identify and address difficulties in their students' learning progress. From the formative data, teachers must decide how much and what kind of support and practice a student requires to reach the goal. When formative assessment is used at the beginning, during, and at the end of instruction, both teachers and students can be guided towards improvement, because they have a tangible measure of progress to draw on (Rasmussen, 2017).


Fig. 4.2 Key questions of formative assessment (Atkin et al., 2001)

Fig. 4.3 Key elements of formative evaluation

Initial formative assessment is undoubtedly the first step in ongoing formative assessment to improve learning achievement (Moss & Brookhart, 2009). The following diagram summarizes the key aspects of formative assessment described in this section (Fig. 4.3).

4.6 Summative Assessment

As described above, summative assessments can be conceived as part of a cycle, or as important milestones of ongoing learning that were scaffolded by formative assessments. In this view, summative assessments illustrate a student's accomplishment and achievement of a learning goal (Shepard, 2006).

Summative assessments are intended to account for what students have learned at the end of a unit or teaching period, to ensure that they have met the required standards and thus obtain certification. Such assessment makes it possible to legitimize a learning process by assigning a judgment at the end of a period, with the purpose of promoting students' advancement to the next course or their access to other educational or professional systems (OECD, 2005).

Summative assessment has more than one use, as there are a variety of ways in which information about student achievement at a given point in time is employed. These can be grouped into two main categories: internal and external to the school community. Uses include tracking student progress in the school; informing parents, students, and the next teacher of what has been achieved; and certifying or accrediting learning through the use of regular grades for record keeping. Summative assessment by teachers has been found to have a more positive impact on learning when its results are integrated into the teacher's daily pedagogical practice than when they are concentrated in one-off occasions (Harlen & Winter, 2004).

One of the aspects commonly linked to summative assessment is grading, but relating grading only to summative assessment represents a restricted view of its role in the assessment process. Several authors argue that the grade should be consistent with the entire assessment process, that is, it should represent the learning demonstrated by the student. This implies that, in order to assign a mark that is recognised as final, the student must have been monitored through an assessment that meets the following characteristics: (1) it is linked to the learning objectives (which must be known and understood by the students); (2) valid assessment procedures, aligned with the learning goals, are applied to collect evidence; (3) feedback is given about performance and the student has opportunities to rework or redo the assessment task; and (4) it culminates in certification of learning (Dyer, 2014; Shepard, 2006).

Summative classroom assessment results usually involve some form of data reduction (e.g., averaging over an academic period, or assigning a grade based on weighted marks), but this reduction must still preserve the determination of the level of achievement of each learning objective assessed. Therefore, time should be devoted to designing reports that are useful for teaching and that include qualitative information alongside the scores obtained for each learning objective or assessment indicator: for example, information on correct and incorrect responses for each learning objective or indicator, comments on what an incorrect response implies, and suggestions for the next steps (Perie et al., 2007).
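To make this data-reduction idea concrete, here is a minimal Python sketch (illustrative only; the chapter prescribes no particular tool, and the objective names, weights, achievement thresholds, and the 1.0-7.0 grade scale below are hypothetical choices) of a report that keeps the per-objective achievement levels visible while still producing a single weighted grade:

```python
# Illustrative sketch: reduce marks to a final grade while keeping
# per-objective achievement visible (all names and weights are hypothetical).

# Scores per learning objective, as fractions of the points available.
scores = {
    "LO1: explains circulatory transport": 0.90,
    "LO2: identifies basic structures":    0.55,
    "LO3: works collaboratively":          0.80,
}

# Weight of each objective in the final grade (weights sum to 1).
weights = {
    "LO1: explains circulatory transport": 0.4,
    "LO2: identifies basic structures":    0.4,
    "LO3: works collaboratively":          0.2,
}

def achievement_level(fraction: float) -> str:
    """Map a score fraction to a qualitative achievement level."""
    if fraction >= 0.85:
        return "achieved"
    if fraction >= 0.60:
        return "partially achieved"
    return "not yet achieved"

# Per-objective report: the qualitative information a lone grade would hide.
for objective, fraction in scores.items():
    print(f"{objective}: {fraction:.0%} -> {achievement_level(fraction)}")

# Data reduction: a single weighted final grade on a 1.0-7.0 scale
# (used here only as an example of a school grading scale).
final_fraction = sum(scores[lo] * weights[lo] for lo in scores)
grade = 1.0 + 6.0 * final_fraction
print(f"Weighted final grade: {grade:.1f}")
```

The point of the sketch is the shape of the report: the weighted grade is derived from, and printed alongside, the per-objective levels rather than replacing them.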

4.7 Distinctions and Articulations Between Formative and Summative Assessments

At present, teachers hold contradictory beliefs and practices. On the one hand, the belief that formative and summative assessment should be part of a single assessment system is well established in the discourse; on the other hand, what happens in the classroom is the supremacy of summative assessment, which makes the formative purpose invisible (Deneen & Brown, 2016; Wiliam, 2000).

Several authors (see e.g. Hattie, 2003; Rea-Dickins, 2006; Shepard, 2006; Wiliam, 2000) question the dichotomous view of formative and summative assessment and propose a complementary and integrated view of the two processes. Classroom assessment should be both summative and formative, since it establishes what students can do at a given moment, monitors their progress, and provides feedback to teaching as a means of supporting learning. From this point of view, assessment should not be only formative, as a teaching tool, or only summative, as a tool to document student achievement; both should be used together to reorient both teaching and learning.

The challenge posed by various authors regarding the purposes and uses of assessment (see e.g., Gibbs, 2010; Shepard, 2006; Wiliam, 2000) is that formative and summative classroom assessment practices should be coherent and mutually supportive, and that both should focus on student learning and the improvement of the teaching process. What teachers and students need are formative assessments that monitor partial or ongoing progress in learning, and summative assessments that use the same criteria to verify students' achievement of important milestones in the acquisition of competencies (Shepard, 2006).

For Hattie (2003), the distinction between formative and summative assessment relates more to the timing of the interpretations and decisions that are made than to the form or procedures by which evidence is collected. This author warns that if we conduct formative assessments that are less rigorous than summative ones, both at the beginning of and during the learning process, we are more likely to steer our students in the wrong direction or give them recommendations that are not precise enough to bridge the learning gap, undermining the accuracy of the corrections or adjustments the student can make. Therefore, we call for not minimizing the accuracy of initial and in-process formative assessment, because it is precisely the feedback given at these times that has a high potential to ensure quality learning.

The logic proposed in this section, of integrating formative and summative assessments in the same assessment cycle, seeks to bring teachers and students together with the purpose of creating a different way of learning, since, through the assessment experience and especially its results, both change: students make learning their own and teachers can adjust their teaching strategies to improve learning opportunities (Wiliam, 2000).

4.8 Tensions Between Summative and Formative Assessments

In this section we will describe the main confusions, misunderstandings, and tensions recognised in assessment practices associated with both summative and formative assessment:

1. Summative assessment is generally linked only to grading. This hides, first, the core task of this type of assessment, which is the determination and certification of the level of achievement of learning, and second, it separates this type of assessment from the complete assessment cycle, of which formative assessment is also an integral part.

2. In assessment practice, distortions are recognised in the link between summative assessment and grading, which constitute a serious threat to the conviction that assessment contributes significantly to the achievement of learning objectives.
• First, tests and graded assignments communicate what is relevant to learn; if they are not coherent with the learning goals being assessed, then students will focus their attention and effort only on graded assignments, which can lead to what has been called "curriculum narrowing" (Wiliam, 2000).
• A second practice is the use of grades as a reward or punishment, completely disconnecting them from learning (Shepard, 2006). For example, in situations of misbehaviour, students are threatened with a pop quiz or offered extra points for good behaviour.

3. With regard to formative assessment, teachers often carry out practices that are associated with this purpose but that are distant from its core premises:
• Despite the conviction that feedback can have a substantial impact on learning, a number of studies show feedback practices that are judgmental of the learner, overly directive, dishonest, sloppy, vague, undirected, or focused on non-improvable aspects; these are perceived as irrelevant and frustrating by students and ultimately do not lead to improved learning (Hargreaves, 2013; Kay & Knaack, 2009).
• A very relevant aspect of formative assessment is the use of error as a resource for learning; despite this, studies indicate that teachers do not explicitly address conceptual errors common to a discipline in their feedback (Heitink et al., 2016). In addition, practices are recognised in which error is penalized, with value assigned to it only at the end of the teaching process, when learning is being evaluated in order to certify it (Moss, 2013).
• Regarding the possibility of improving learning, considered one of the key aspects of formative assessment, practices are observed in which teachers analyse test results with their students, especially reviewing incorrect answers, only after the tests have already been graded. With this action, the premise that learning can be improved cannot be applied, since students are not offered the option to re-correct or re-elaborate (Shepard, 2006).
• Regarding the key role of the student in formative assessment, studies report practices in which students "self-grade" rather than self-assess their learning, and practices in which students assess themselves and their peers without having fully understood the assessment criteria or having been prepared to give and receive feedback on their own and their peers' learning (Heitink et al., 2016).
• Finally, in relation to the design of assessment situations, one of the elements expected from formative assessment is that students assume an active role in their learning process. However, studies report assessment practices that do not consider student motivation or involvement, often resorting to decontextualised tests and routine, meaningless, and inauthentic assessment tasks that generate little motivation and low student involvement (Heitink et al., 2016).

4.9 Good Formative and Summative Assessment Practices in the Classroom

Assuming that summative and formative assessment are parts of the same cycle and therefore complement each other, in this section we will examine how classroom assessment practices can be developed that integrate both intentions.

First Practice. Clear goals: the learning objectives and their assessment criteria.
Both summative and formative assessment are guided by the learning objectives set out in the curriculum. Good assessment practice should first consider the teacher's clear definition of the learning objectives and their assessment criteria. This definition is expected to be reflected in the planning of the assessment strategy (discussed in the previous chapter) and to be consistently maintained throughout the assessment cycle. Secondly, students are expected to achieve a clear understanding of the learning objectives and their assessment criteria; in this sense, the teacher is advised to dedicate class time for students to understand and assimilate the learning goal and its quality criteria.

Second Practice. Use of prior knowledge.
For the assessment cycle to be successful, it is necessary to implement the initial formative assessment to collect evidence of students' prior knowledge, both formal knowledge and the implicit explanations they hold about the new learning. This evidence is key for the teacher to validate or rethink the teaching strategies planned for the unit or class (Shepard, 2006). Various assessment strategies and procedures can be used; what matters most is that they allow sufficient and varied evidence of prior learning to be collected and that appropriate decisions are made based on this evidence.

Third Practice. Using mistakes to learn.
Throughout the teaching process, students can make mistakes and learn from them without penalty. Mistakes only become learning opportunities when feedback is provided to close the gap with the goal, and students should have the option to rewrite or re-elaborate their work. Errors should also be thoroughly analysed by the teacher to detect the presence of conceptual, organisational, or execution errors, and appropriate pedagogical decisions should be made from this analysis. This assessment practice is basically formative and prepares students to certify their learning (Dyer, 2014).

Fourth Practice. Diversified assessment.
The cycle should be designed considering that assessment is an ongoing process that permanently gathers evidence on the progress of student learning until certification. For this reason, assessment strategies should be designed and implemented that consider authentic contexts, varied challenges, and different formats, in order to gather valid and sufficient evidence to make judgments about individual student learning and to make it possible both to adjust and to certify it.

Fifth Practice. Effective feedback.
Feedback is expected to be effective, specific, and timely, i.e., to allow the learner to adjust or rework their learning; therefore, it should be given prior to certification and marking. Feedback should provide details on how to improve rather than simply indicating whether a student's work is correct or not (Shute, 2008). General praise or personal comments are often not helpful in improving learning (Rasmussen, 2017).

Sixth Practice. The student as protagonist.
Classroom assessment practice should be designed and implemented so that students have some control over their learning process and are actively engaged in it. In addition to explicit goals, students should know what they are learning, what products or performances they are expected to produce or demonstrate, and how they will be assessed (Dyer, 2014). It is recommended that self-assessment and peer assessment be used especially in the formative part of the cycle, and that self-assessments or reflective records of learning be used in the summative part. This point will be discussed in more detail in the following chapters.

Seventh Practice. Consistency between feedback and summative assessment.
Classroom assessment practice must ensure coherence between the feedback given at the formative moment and the determination of the level of learning achievement reported at the summative moment. In this classroom practice, we must ensure that the certification of the level of learning achievement takes the feedback given to the student as its main input, based on the same learning goals and assessment criteria throughout the entire assessment cycle.

4.10 Resources and Tools to Be Used in Classroom Assessment Practices

Below is a selection of resources and tools that could be used in the assessment practices recommended in the previous section.


First Practice. Clear goals: the learning objectives and their assessment criteria.
Sadler (1989) proposes that, at the beginning of an assessment situation, the teacher should work with students by showing finished products or assessment tasks completed by students in previous years.
• First, the teacher should give the students finished products so that they are familiar with them and do not have to merely imagine what the product of the task they are being asked to do looks like.
• Secondly, the teacher should ask the students to evaluate which products they consider to be of good quality and to distinguish those they consider to be of poor quality. In this way, the students themselves arrive at emerging quality criteria.
• Thirdly, the teacher presents the quality criteria that will be used in this assessment task so that students can understand and visualise them in a concrete situation.
• Fourth, students examine only finished products at the highest level of achievement. The teacher must ensure that the selected products express a variety of ways of approaching the work. The purpose of this exercise is for students to understand that there are many ways to demonstrate learning achievement and that there is no single expected way to achieve maximum achievement in this assessment situation.
This resource allows students to understand the learning goals and their assessment criteria before they begin an assessment situation.

Second Practice. Use of prior knowledge.
(a) The first assessment task is the elaboration of a glossary of concepts related to the learning to be investigated. A glossary is a group of words or concepts from the same discipline or field of study that are defined, explained, or commented on. For this assessment task, the teacher should ask each student to explain in his or her own words what he or she understands by a limited set of concepts associated with the selected topic. The student may also be asked to explain each concept and give an example that illustrates it. This exercise is individual and developed in writing. It is handed in to the teacher so that he or she can analyse the prior knowledge and make decisions based on it. The same initial glossary can be taken up again by the student at a more advanced stage or at the end of the unit, to be reworked according to the learning achieved during the teaching process.
(b) The second assessment task is the creation of a graphic organiser for the learning to be investigated. A graphic organiser is a visual representation of knowledge that shows information about a specific topic, highlighting important aspects of a concept or subject within an outline. This assessment task is individual, and the student is asked to choose from among graphic organisers pre-selected by the teacher (those that best allow the student to represent the learning in question). We will go deeper into this resource in Chap. 8.

Third Practice. Using error for learning.
This assessment task asks the student to create a specific graphic organiser that compares concepts, theories, or approaches: the Venn diagram. It is suggested to apply this assessment task with a formative purpose during teaching, to monitor learning and detect possible conceptual errors or incomplete learning. The Venn diagram is a type of conceptual organiser consisting of two circles that overlap in the middle. It is used to contrast information or points of view and to compare approaches, distinguishing commonalities and differences.

Instructions: The student is asked to draw a Venn diagram based on the discussion and exposition of the unit, specifically about the differences and similarities between approach A and approach B (to be defined by the teacher). The student should complete the Venn diagram, writing the distinguishing characteristics of each approach separately and including the commonalities in the middle section.

Fourth Practice. Diversified assessment.
The following comprehensive assessment strategy exemplifies the practice of diversified assessment.
Presentation: History and Social Sciences and Mathematics teachers have designed an assessment strategy in which 6th grade students will work on a community service group project, in the context of a process evaluation with formative and summative purposes. Both the teacher and the students have been considered assessment agents; therefore, they will conduct hetero-evaluation, peer evaluation, and self-assessments. Learning objectives (LO) for both History and Social Studies (HI) and Mathematics (M) were defined for this assessment strategy, making it an integrated project.
HI06 LO 21: work effectively in a team to carry out research or another project, assigning and assuming roles, fulfilling assigned responsibilities and meeting agreed deadlines, listening to the arguments of others, expressing informed opinions, and reaching a common point of view.
HI06 LO 23: participate, through concrete actions, in projects that involve contributions within the school, community, and society, such as volunteering and social aid, following a plan and its budget.
M LO I: extract information from the environment and represent it mathematically in diagrams, tables, and graphs, interpreting the data.
Instructions: the work will be developed in groups, and the teams will be formed by the students themselves. The project will be developed in three stages, each of which will have a formative and a summative evaluation. A portfolio will be used, with two partial deliveries and one complete delivery. Teams may select a topic to develop from a list proposed by the teacher, related to concepts worked on in Natural Sciences, Arts, or other subjects.
The assessment criteria will be:
• Argumentation of the chosen topic.
• Organisation of tasks, responsibilities, and deadlines.
• Research on the chosen topic: people affected, problem or need, information gathering, representation and systematization of the information.
• Fulfilment of responsibilities and group tasks.
• Reflective peer review at all stages.
• Proposal of effective and feasible solutions to the identified problem.
• Organisation and execution of the project.
• Evaluation of the results of the project and group work.
• Incorporation of feedback into group work.
• Group performance: equal participation, responsibility, group interactions.
• Assessment of the student's learning process.
Stage 1
• The group selects the topic on which it will develop the service project and argues for its choice.
• The group organises and plans its tasks, defining responsibilities, deadlines, and activities to be carried out, according to the guideline given by the teacher.
• The students in the group inquire into the need or problem they will address through the project: they research the issue, identify the people affected by the problem or need, generate interview or survey guidelines and apply them, systematize the information in graphs, diagrams, or tables, and produce a brief report about the need they will address.
• Attendance at the group's meetings, its main agreements, and the topics discussed are recorded.
• Each group member answers the following questions: how effective was the planning we did in the development of this stage? What difficulties did we encounter, and how did we deal with them?
• The group makes a partial formative delivery of the portfolio with the results from Stage 1.


• The group receives the feedback and one week later delivers the portfolio with the results from Stage 1 for summative assessment and grading.
Stage 2
• The group brainstorms possible solutions to the problem or need identified in Stage 1. The team organises the solutions, classifying them according to two criteria: effectiveness and feasibility. Effectiveness means that they respond to the need and/or solve the diagnosed problem. Feasibility means that they are viable given the time and resources available.
• The group presents both the diagnosis (Stage 1) and the selected solution ideas to the class for feedback. The students evaluate one another with a rating scale prepared by the teacher.
• The group selects the strategy or solution and carries out the project using a guideline given by the teacher.
• Attendance at the group's meetings, its main agreements, and the topics discussed are recorded.
• Each group member answers the following questions: what was the main difficulty we had at this stage, and how did we deal with it?
• The group makes a partial formative delivery of the portfolio with the results of Stage 2.
• The group receives the feedback and one week later delivers the portfolio with evidence from Stage 2 for summative assessment and subsequent grading.
Stage 3
• The group presents the project to its audience for feedback, ensuring that the form and content of the presentation are appropriate to the characteristics of the audience.
• The group adjusts the project according to the recommendations of its recipients.
• The group organises the implementation of the project using the teacher's guidelines.
• The group executes the project it designed.
• The group evaluates the results of the project.
• The team evaluates their group work and the execution of the project with an instrument previously given by the teacher.
• The group makes a formative delivery of the complete portfolio with results from Stages 1, 2, and 3.
• The group receives feedback and submits the completed portfolio one week later for summative assessment and grading.

Fifth Practice. Effective feedback.
Below is a series of recommendations for giving effective feedback on an assignment, a partial product delivered, or a student's performance.


The recommendations are based on four principles of effective feedback. Feedback should:
• be descriptive of the learning and non-judgmental;
• consider both the aspects achieved and those to be improved;
• be specific to the task or product it reviews and guide the learner in reducing the gap between the current level of learning and the goal;
• focus on the aspects that can be changed by the learner.

Recommendations for the teacher:
• Provide feedback before grading the student's assessment task.
• In your written comments, avoid including check marks, grades, or associated scores.
• Once the feedback has been given, offer the learner the opportunity to improve some of the aspects reviewed, so that the learner can effectively rework or rewrite the product before the summative assessment.
• Focus on describing the student's work and how it was done, and avoid making judgments such as "incomplete" or "incorrect" without describing what is incomplete or incorrect.
• Consider both the aspects achieved by the student in the work or product reviewed and the aspects to be improved.
• Make the feedback accurate: it must be clear to the student how to improve their product or work, so focus on the specific ideas the student can work on to improve.
• Focus on the aspects of the product or work that the learner can change, not on aspects that are not up to the learner and cannot be changed.

Sixth Practice. The student as protagonist.
1. The first suggested resource is to apply the KWL technique (what I Know, what I Want to know, what I Learned) during a class session. This technique arose as a tool to promote learning in the reading of expository texts (Ogle, 1986), but it can also be adapted to other learning situations.
K: What do I know? The student is asked to indicate and demonstrate what he or she knows about the learning goal of the session. It is applied at the beginning of the class.
W: What do I want to learn? The student is asked to determine what he or she wants to learn, defining his or her own learning goal in his or her own words. It is applied at the beginning of the class, after answering the first question.
L: What did I learn? At the end of the class, the student is asked to identify what he or she has learned during the session in relation to the goal he or she defined.
2. The second suggested resource is to use short tests or constructive quizzes for formative purposes. Short written tests are commonly used for formative assessment, in which students can receive immediate feedback. For this to happen, the teacher must give them the answer key, or use a digital resource that self-corrects the short test; from that moment on, the class should analyse and review the correct and incorrect answers. A first variant of this resource is to ask students to answer the quiz at the beginning of the class and, once they have completed it, invite them to review it and give them the option to improve it. A second variant is to ask students to answer the short test individually and then approach it in pairs, which involves negotiating, arguing, and analysing the test items in light of the learning of each member of the pair.
3. The third suggested resource is to apply a self-assessment with a formative purpose. This assessment consists of the student reflecting on the metacognitive strategies used to approach an assessment situation proposed by the teacher. For example, if the student is to self-assess his or her metacognitive planning strategies, the technique should be applied at the beginning, when the student is presented with the assessment situation to be developed. In that context, the teacher asks the student to plan how he or she will approach the work and then self-assess his or her planning strategies through questions that the student must answer and analyse before the teacher provides feedback. Some of them are:
• Did I realize what kind of task this is?
• Did I understand what the objective of this task is?
• Did I identify what information I need to develop it?
• Do I recognise the problems that might arise while I am working, and do I know how I might handle them?
• Did I identify which strategies can help me develop the work?
• Did I identify the resources I have available for it?
• Did I define how long it will take me to do this task?

Seventh Practice. Consistency between feedback and summative assessment.
The main recommendation for implementing this assessment practice is proposed by Hattie (2003): first, ensure the accuracy and validity of the formative assessment by providing valid feedback, that is, feedback aligned with the learning objectives and their assessment criteria, and accurate in its recommendations for improvement, so that the student reduces the gap with respect to his or her learning goal. Secondly, the teacher must ensure consistency between the feedback given and the subsequent summative assessment, since both are based on the same learning goal and maintain the same assessment criteria.

In summary, throughout the chapter we have developed the proposal of integrating formative and summative assessment in the same assessment cycle, because they feed each other and constitute two sides of the same coin in classroom assessment practice. We argue that this integrated vision brings teachers and students together for the purpose of creating a different way of developing learning. Under this approach, both teachers and students change: students make learning their own and teachers can adjust their strategies to improve learning opportunities (Wiliam, 2000). Finally, we consider it necessary to describe a set of assessment practices that strengthen this alliance, along with sharing a series of resources that enable the implementation of the integrated approach to formative and summative assessment in the classroom. In the next chapter, we will delve into a core process for developing student learning that is closely linked to the formative assessment examined in this chapter: feedback.

References

Atkin, J. M., Black, P. J., & Coffey, J. (2001). Classroom assessment and the national science education standards. National Academy Press. https://nap.nationalacademies.org/read/9847/chapter/1
Black, P. J., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
Black, P. J., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5–31. https://doi.org/10.1007/s11092-008-9068-5
Bloom, B. S., Hastings, J. T., & Madaus, G. F. (Eds.). (1971). Handbook on the formative and summative evaluation of student learning. McGraw-Hill. https://archive.org/details/handbookonformat00bloo/page/n7/mode/2up
Deneen, C. C., & Brown, G. T. L. (2016). The impact of conceptions of assessment on assessment literacy in a teacher education program. Cogent Education, 3(1), 1–14. https://doi.org/10.1080/2331186X.2016.1225380
Dyer, K. (2014). Formative, summative, interim: Putting assessment in context. https://www.nwea.org/blog/2014/formative-summative-interim-putting-assessment-context/
Gibbs, G. (2010). Using assessment to support student learning. Leeds Met Press. https://core.ac.uk/download/pdf/42413277.pdf
Hargreaves, E. (2013). Inquiring into children's experiences of teacher feedback: Reconceptualising assessment for learning. Oxford Review of Education, 39(2), 229–246. https://doi.org/10.1080/03054985.2013.787922
Harlen, W., & Winter, J. (2004). The development of assessment for learning: Learning from the case of science and mathematics. Language Testing, 21(3), 390–408. https://doi.org/10.1191/0265532204lt289oa
Hattie, J. A. C. (2003). Formative and summative interpretations of assessment information. University of Auckland. https://assessment.tki.org.nz/Media/Files/DEF-files/Formative-and-Summative-Interpretations-of-Assessment-Information
Heitink, M. C., Van der Kleij, F. M., Veldkamp, B. P., Schildkamp, K., & Kippers, W. B. (2016). A systematic review of prerequisites for implementing assessment for learning in classroom practice. Educational Research Review, 17, 50–62. https://doi.org/10.1016/j.edurev.2015.12.002
Kay, R., & Knaack, L. (2009). Exploring the use of audience response systems in secondary school science classrooms. Journal of Science Education and Technology, 18(5), 382–392. https://doi.org/10.1007/s10956-009-9153-7
Looney, J. W. (2011). Integrating formative and summative assessment: Progress toward a seamless system? OECD Education Working Papers 58. OECD Publishing. https://doi.org/10.1787/5kghx3kbl734-en
Moss, C. M. (2013). Research on classroom summative assessment. In J. H. McMillan (Ed.), Sage handbook of research on classroom assessment (pp. 235–255). SAGE. http://www.daneshnamehicsa.ir/userfiles/files/1/7-%20SAGE%20Handbook%20of%20Research%20on%20Classroom%20Assessment.pdf
Moss, C. M., & Brookhart, S. M. (2009). Advancing formative assessment in every classroom: A guide for the instructional leader. ASCD. http://www.daneshnamehicsa.ir/userfiles/files/1/7-%20Advancing%20Formative%20Assessment%20in%20Every%20Classroom.pdf
OECD. (2005). Formative assessment: Improving learning in secondary classrooms. OECD Policy Brief. OECD. https://www.oecd.org/education/ceri/35661078.pdf
Ogle, D. (1986). K-W-L: A teaching model that develops active reading of expository text. The Reading Teacher, 39(6), 564–570. https://doi.org/10.1598/RT.39.6.11
Perie, M., Marion, S., Gong, B., & Wurtzel, J. (2007). The role of interim assessments in a comprehensive assessment system: A policy brief. Aspen Institute. https://files.eric.ed.gov/fulltext/ED551318.pdf
Rasmussen, J. B. (2017). Formative assessment strategies. Bethel University. https://www.bethel.edu/faculty-development/files/formative-assessment-strategies.docx
Rea-Dickins, P. (2006). Currents and eddies in the discourse of assessment: A learning-focused interpretation. International Journal of Applied Linguistics, 16(2), 163–188. https://doi.org/10.1111/j.1473-4192.2006.00112.x
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144. https://doi.org/10.1007/BF00117714
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagne, & M. Scriven (Eds.), Perspectives of curriculum evaluation. American Educational Research Association (AERA) Monograph Series on Curriculum Evaluation (pp. 39–83). Rand McNally.
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14. https://doi.org/10.3102/0013189X029007004
Shepard, L. A. (2006). Classroom assessment. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 623–646). Praeger.
Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189. https://doi.org/10.3102/0034654307313795
Stobart, G. (2008). Testing times: The uses and abuses of assessment. Routledge. https://doi.org/10.4324/9780203930502
Wiliam, D. (2000). Integrating formative and summative functions of assessment. Working Group 10 of the International Congress on Mathematics Education, Makuhari, Tokyo. https://www.researchgate.net/publication/311806066_Integrating_summative_and_formative_functions_of_assessment
Wiliam, D. (2009). An integrative summary of the research literature and implications for a new theory of formative assessment. In H. Andrade & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 18–40). Routledge. https://doi.org/10.4324/9780203874851
Wiliam, D., & Black, P. (1996). Meanings and consequences: A basis for distinguishing formative and summative functions of assessment? British Educational Research Journal, 22(5), 537–548. https://doi.org/10.1080/0141192960220502
Wolfel, R. (2009). Classroom assessment: The confusion of many voices. http://www.usma.edu/cfe/Literature/Wolfelre/Wolfel_09.pdf

Sandra C. Zepeda is a social worker and holds a Master's degree in Educational Evaluation from the Pontificia Universidad Católica de Chile. She is a lecturer at the UC Faculty of Education in undergraduate and postgraduate training programmes, and a specialist in curriculum development and evaluation for learning in school and higher education.

5 Effective Feedback and Its Potential to Enhance Learning

Sandra C. Zepeda

Abstract

This chapter takes a broad look at feedback: its conceptualization, the paradigm shift (from the one who teaches to the one who learns), the characteristics of effective feedback and the power it has over student learning. It delves into the types of feedback and key aspects such as when to provide feedback, what it should contain and its modality, among others. Finally, some practices for effective feedback are suggested that will help teachers implement it in their classrooms.

5.1 Introduction

Several researchers argue that feedback is a powerful influence in improving student learning (Black & Wiliam, 1998; Hattie & Jaeger, 1998; Hattie et al., 1996). The conclusion of meta-analytic studies is categorical regarding the effect of feedback: it ranks among the top ten actions that influence learning achievement (Hattie & Jaeger, 1998; Kluger & DeNisi, 1996). However, the variance of the effects found is considerable, indicating that some types of feedback are more powerful than others (Kluger & DeNisi, 1996). Thus, merely prescribing a large number of comments, or increasing their frequency, does not ensure that learning will take place; there are elements associated with those comments that condition their effectiveness for students. There is evidence that although feedback is frequent in the classroom, most of this information is disregarded by students and is scarcely used in the revision of their work, since most of the comments that teachers make to the class are not considered relevant by their students (Carless, 2006). It has been found that teachers value their feedback far more highly than students actually do; students report that the feedback they receive is confusing and does not guide them in what they should do to improve their performance (Goldstein, 2006; Nuthall, 2007).

Based on the above, questions arise as to how feedback should be given to students in order to achieve learning, when it is most effective, on whom it depends, and what should and should not be said to the learner. It seems that we know a lot about the power of feedback to generate effects on student learning, but not how to harness this power and make it work effectively in the classroom in a way that is sustainable over time. Throughout this chapter we will examine these questions and delve into the principles and characteristics of effective feedback and the different levels at which it occurs, critically analysing its contribution to learning. We will review examples to empirically identify effective and ineffective feedback practices.

5.2 Feedback

We will now discuss the core elements of the definition of feedback and, from there, examine the evolution of the concept. Ramaprasad (1983, p. 4) argues that the distinguishing feature of feedback is that information generated within a system must have some effect on that system: "feedback is information about the gap between the actual level and the reference level of a system parameter which is used to alter the gap in some way". Sadler (1989), in turn, notes that an important feature of Ramaprasad's definition is that information about the distance or gap between the current level and the reference level counts as feedback only if it is used to reduce that distance. If the information is merely recorded, or passed on to another person who has neither the knowledge nor the power to change the outcome, or if it is too deeply encoded to lead to appropriate action (e.g., a grade point average), the implicit information cannot be reconstructed and the feedback cannot be effective. Both authors reflect the historical, initial definition of feedback, associated with engineering systems. The strong idea these approaches contribute to the educational context is that feedback cannot be separated from its consequences for learning. For this reason, it is worrying that everyday language has retained only the first part of this historical definition, the information that is given, while the second part, the use of that information to improve the system, has been neglected.


5.3 From a Teacher-Centred View to a Learner-Centred View

The first studies and theories on feedback are almost 100 years old and emerged from the behaviourist psychological perspective (Thorndike, 1913, as cited in Brookhart, 2008, p. 3). In this context, positive feedback was conceived as "positive reinforcement" and negative feedback as "punishment", both considered to affect learning, so the concern focused on studying whether feedback was an effective resource. Within this tradition, some research has addressed the different applications of feedback, focusing on what the teacher does, that is, on when the comments given are effective and when they are ineffective (Bangert-Drowns et al., 1991; Butler & Winne, 1995; Hattie & Timperley, 2007; Kluger & DeNisi, 1996). Other research has focused on characterizing effective feedback, asking about the type of feedback delivered and its consequences for learning (Johnston, 2004; Tunstall & Gipps, 1996).

The theoretical bases of education have changed; they no longer reflect a behaviourist view of learning associated with stimulus–response connections. The most recent studies recognise the centrality of the student in the feedback process and have focused on the type of feedback given, the context in which it takes place, and on investigating how the feedback information reaches the student and how he or she interprets and "filters" it through his or her perception, mediated by prior knowledge, experiences, and motivation. The challenge for learners is to understand the assessment task, to understand and adhere to their learning goal, to receive and interpret the information, and to set themselves on a path to enhance their learning, rather than responding mechanically to particular stimuli. This process is identified with self-regulation of learning. Butler and Winne's (1995) research shows that both external feedback (delivered by the teacher, for example) and internal feedback (the student's own self-assessment) strongly influence the student's knowledge and metacognitive beliefs and skills. Both sources of feedback allow students to develop self-regulation of their own learning process and, therefore, to create or select tactics and strategies to achieve and produce the required task or challenge.

One of the most relevant aspects of the learner-centred focus is recognising that the comments or information provided by the teacher do not ensure that the student does what is necessary to enhance his or her learning, since it is the student who decides and who must mobilize to close the gap. The teacher's commentary is an input and, seen as such, feedback depends centrally on what the student does with this input in order to achieve his or her learning. Returning to the original concept of feedback, we note that it is explicitly linked to the learner-centred approach, as it requires ensuring that the feedback loop has been completed (Boud & Molloy, 2012). As Sadler (1989) argues, without strategies to enhance learning and without tracking how the feedback subsequently influences student performance, feedback can be seen simply as "dangling data": comments that do not lead or guide action.


By way of synthesis, this tour through the concept of feedback shows that it originated in engineering and that, when it reached the field of education, a unique quality was added: the role of the student, who is not a passive object responding to a stimulus but a person who thinks, makes decisions, and acts. In this approach, feedback is positioned not as an episodic act linked to correcting assessed tasks, but as a key element integrated into the design of teaching and the teaching/learning process developed in the classroom (Molloy & Boud, 2012).

5.4 The Power of Feedback

Ramsden (2003) argues that effective feedback on students' work is one of the key features of quality teaching. Hounsell (2003, p. 67) notes: "It has long been recognized, by researchers and practitioners alike, that feedback plays a decisive role in learning and development, within and beyond formal educational settings. We learn faster, and much more effectively, when we have a clear sense of how well we are doing, and what we might need to do in order to improve." In a discussion of the conditions under which assessment supports learning, Gibbs and Simpson (2004) highlight the importance of feedback that is understandable, timely and guides students to action. Yorke (2003), for his part, indicates that despite its importance, the literature reveals that students are often dissatisfied with the feedback they receive: it lacks specific advice for improvement (e.g., Higgins et al., 2001), is difficult to interpret (Chanock, 2000), or has a potentially negative impact on their self-perception and confidence (James, 1992). Shute (2008) describes a controversial aspect of feedback: it is one of the most instrumentally powerful features of instructional design, yet the least understood. Feedback provided during the teaching/learning process, as part of formative assessment, is key, and Wiggins (2012) argues that such ongoing feedback ultimately increases student achievement by giving students opportunities to reshape their performance to better achieve the learning objectives.

5.5 Effective Feedback

In this section we will try to answer the question: when is feedback effective? Studies have shown both positive and negative effects of its use in educational settings. Effective feedback is defined as information communicated to learners with the intention of modifying their thinking in order to improve learning (Shute, 2008); it bridges the gap between the learner's current level of understanding and the desired learning goal, and it helps students understand the relationship between a clearly defined set of criteria or standards and their current level of performance (Clark, 2011).


Although the teacher can also receive information that allows him or her to modify the teaching process, the focus of this feedback is the student, so it is key that it is accurate and enables significant improvement in learning processes (Boud & Molloy, 2012). Wiliam (2012) uses an analogy to explain feedback: just as a thermostat adjusts the room temperature, effective feedback helps maintain a supportive environment for student learning. Winne and Butler (1994) argue that feedback is information with which the learner can confirm, add to, overwrite, refine, or restructure information in their work, whether that information pertains to disciplinary knowledge, metacognitive knowledge, beliefs about themselves and about tasks, or cognitive tactics and strategies. For the effect of feedback to be powerful it has to take place in a learning context: it should be considered an integral part of the teaching process, unfolding temporally after a student has responded to an initial task during teaching (Hattie & Timperley, 2007).

5.6 The Role of Context

Effective feedback must be part of a classroom climate or culture in which students view and receive feedback, criticism, or recommendations as positive and understand that learning cannot occur without practice or action to improve it. In a classroom culture that leaves no room for mistakes and only emphasises doing things right, a task that needs improvement is judged as wrong. In contrast, in a classroom culture that values inquiry, persistent work, and the use of errors to enhance learning, students will be able to use feedback to plan and execute the tasks necessary to enhance their learning (Brookhart, 2008). In such a culture of effective feedback, it is not appropriate for students to be given feedback without opportunities to use it. Nor is it appropriate for students to be given constructive feedback and then for the teacher to use it as a measure for summative assessment by assigning a low final grade.

5.7 Essential Components of Effective Feedback

Sadler (1989) described three essential components of feedback:

(1) Information about the learning objective, its standards and assessment criteria.
(2) Information on the level achieved in the product, performance, or task being executed.
(3) Strategies for addressing the gap between the learning objective and the assessment task being developed.


Feedback is efficient when it focuses on the cognitive nature of the task and its associated metacognitive processes, and is organised to answer three main questions:

(1) Where am I going? Associated with understanding what the learning goal is.
(2) How am I doing, where am I? Linked to the current level of progress of the assessment task and, consequently, of student learning.
(3) How do I keep moving forward? Associated with the strategies or recommendations given to the learner to close the learning gap.

The power of feedback is deployed by the receiver rather than the giver. It is therefore very relevant to examine when feedback should be given, how, and what content it should have in order for the effective learning cycle to unfold from the learner's perspective (Hattie & Gan, 2011).

5.8 Types of Feedback and Their Effect on Learning

The effects of feedback can also be analysed in terms of its nature. Hattie and Timperley (2007) reviewed multiple research studies and synthesized a four-level model of feedback:

(1) The first level is task-focused feedback: providing information about aspects of the task, such as whether the answers were correct or incorrect, giving further instructions, or specifying the steps of the task under evaluation.
(2) The second level is feedback on the processing of the task, linked to the process developed to generate and implement the task. For example, feedback is given on the strategies used to address the task and on whether they were appropriately selected and accurately applied.
(3) The third level is feedback on the student's self-regulation, addressing the metacognitive strategies the student uses to face the task and also appealing to the student's self-concept, confidence in his or her learning ability, and motivation to achieve it.
(4) The fourth level is associated with comments or reinforcement made by the teacher to the student as a person. It includes phrases that approve or disapprove, such as: "you are good", "you are intelligent", "you are careless".

The authors state that effective feedback focuses on the first, second, and third levels, since all three make it possible to shorten the learning gap. As for the fourth level, they are categorical in maintaining that it neither improves learning nor mobilizes students to shorten the distance toward the goal, since it focuses on qualities of the person that are unrelated to the level of current and future learning and its gap. Table 5.1 summarizes these four levels of feedback.


Table 5.1 Feedback levels (Hattie & Timperley, 2007)

First level: focused on the task; effective in supporting learning.
Second level: focused on the process; effective in supporting learning.
Third level: focused on the self-regulation of learning; effective in regulating learning and maintaining motivation towards it.
Fourth level: focused on the person; not effective in guiding learning.

We will now examine a second classification of feedback; in this case the criterion answers the question of the content of the feedback, not its level, as in the previous model. This typology classifies all forms of feedback, both those that are effective and those that are ineffective in enhancing learning. It was developed by Tunstall and Gipps (1996) from a study of the different types of feedback given to primary school children in Mathematics in the United Kingdom. The four types of feedback on the left side of Table 5.2 (A1, A2, B1, B2) are not effective for learning, as they focus on positive or negative reinforcement: they "judge" the learner and do not aim at improving the assessment task. The four types on the right-hand side of the table (C1, C2, D1, D2) are effective in improving learning, as they focus on describing the achieved and unachieved aspects of the assessment task, along with suggesting strategies for improvement: they "describe" the achievements and the gap with respect to the assessment task.

5.9 Key Aspects Associated with Effective Feedback

Below we describe a set of aspects that enable effective feedback to be developed consistently.

5.9.1 When Do I Give Feedback?

5.9.1.1 Before the Summative Assessment

Feedback provided mainly after summative assessments often comes too late in the process, so students may find it meaningless (Huxham, 2007). Accordingly, formative feedback needs to be delivered while students are still aware of the learning outcomes and have time to act on the feedback to improve and rework the assessment task, product, or performance. This may include returning a test or assessment task the next day or giving immediate oral responses to students' errors or inaccurate concepts (Shute, 2008). It is recommended (Rasmussen, 2017; Shute, 2008) that feedback be delivered to the student during formative assessment. It is argued that student motivation and effort also increase when teachers use formative feedback to address the learning gap.


Table 5.2 Typology of teacher feedback to students (Tunstall & Gipps, 1996)

Positive evaluative feedback (+):
A1 Rewarding: positive reinforcement or reward focused entirely on the student's person (e.g., gift giving).
B1 Approving: verbal and non-verbal approval by the teacher of the student's person or work (e.g., "excellent", gestures of approval, "good girl").

Negative evaluative feedback (−):
A2 Punishing: negative reinforcement or comments focused entirely on the student's person (e.g., sending the student out of the room, destroying work).
B2 Disapproving: verbal and non-verbal teacher disapproval of the student's person or work (e.g., crosses and dashes on the work, "I'm disappointed, I expected more from you, but you're lazy").

Descriptive achievement feedback (+):
C1 Specifying attainment: identifies specific aspects of the task that are accomplished, which supports learning ("what you have done … is well accomplished").
D1 Constructing achievement: identifies (along with the student) the metacognitive strategies used to learn.

Descriptive improvement feedback (−):
C2 Specifying improvement: focuses on the achievements and/or errors of the work carried out and their relationship with learning, rather than on the person.
D2 Constructing the way forward: describes (along with the student) future possibilities for the learning achieved and its links with transfer to other situations of the same nature.

In this way, the classroom climate becomes a positive element in learning rather than a source of anxiety about grades (Fluckiger et al., 2010).

5.9.1.2 Timeliness

Feedback has been shown to be most effective when it is timely, linked to expectations, and includes specific suggestions on how to enhance performance on learning goals (Looney, 2011). Some authors argue that feedback is most effective when it is provided immediately or, at most, within a period of days (Wiliam, 2006). For some types of assessment tasks, however, feedback should not be provided too quickly, as the learner is expected to have the opportunity to reflect on his or her own learning process. Regarding timing, some authors indicate that feedback should be immediate for complex tasks, or for students who have demonstrated low levels of performance, and more delayed for tasks that require prior revision or reflection by the student.


In summary, it is recommended that feedback be given before learning is graded, so that the student has an opportunity to improve his or her learning. Feedback is advised to be immediate in the case of practical knowledge, but deferred for conceptual knowledge, in order to allow students to reflect on their learning before receiving feedback from the teacher.

5.9.2 What Should Be the Content of Effective Feedback?

5.9.2.1 Information Needed

A key concept in understanding what the content of feedback should be is scaffolding: providing as much or as little information as learners need to reach the next level of their learning goal. Effective feedback gives students what they need to understand where they are in their learning and leads or guides them on what to do next: the cognitive factor. Once they feel they understand, they develop the perception that they have control over their own learning process, which is associated with a strong motivational factor that keeps them working on the assessment task and persisting until they reach the learning goal (Brookhart, 2008). Effective feedback contains information that learners can use, which means they have to be able to hear it (or read it, if it is written) and understand it.

5.9.2.2 Task-, Process-, and Self-regulation-Focused

From the point of view of the model of feedback levels described in the previous section (Hattie & Timperley, 2007), feedback can focus on the task the student is developing (first level), on the process or strategies that allow the task to be developed (second level), and on the metacognitive strategies that regulate and monitor the improvement of the task and its process (third level). Within feedback developed for these three levels (task, process, self-regulation) appears what Narciss (2008) calls "informative tutoring feedback": feedback strategies that provide elaborated information (guidelines, rubrics, detailed comments, specific recommendations) to guide students towards the successful completion of the assessment task. Elaborated informative feedback can take various forms, for example: (a) task rules, task constraints, and task requirements; (b) conceptual knowledge linked to the task; (c) errors or mistakes detected; (d) procedural knowledge; and (e) knowledge of metacognitive strategies. Feedback as tutoring focuses on guiding students in detecting errors, overcoming obstacles, and applying more effective strategies to complete the learning task (Narciss, 2008).

5.9.2.3 Prioritize Information

Recommendations on what to give feedback on indicate that the important aspects to highlight should be prioritized, choosing those directly related to the main learning goals; finally, the student's level of development should be considered (Brookhart, 2008).


5.9.3 How to Develop Effective Feedback?

5.9.3.1 Modality

Feedback can be delivered in written, oral, or video form. Some assessment tasks lend themselves better to written feedback, for example the review of student-produced texts, where written comments are entirely relevant. Oral feedback is more appropriate for observable performances, such as watching and commenting on how students solve mathematics problems through individual work. Some of the best feedback can result from brief interviews with the student. For example, rather than immediately telling students what they have accomplished and what they need to improve in their work, the teacher can begin by asking questions such as: What was the hardest part of this assignment for you? How did you solve it? Was there anything that surprised you? These reflective questions have the potential to open a productive collaboration in which feedback becomes a space where teacher and students hold a dialogue about learning (Brookhart, 2008).

Decisions about the mode of feedback should first consider the nature of the assessment task, as well as the reading ability of the students and the timing and opportunity for giving feedback. Speaking with students directly is usually better, because a conversation can develop, but there is rarely enough time to talk with each one. Given this restriction, Boud and Molloy (2012) recommend recording short audio or video clips in which the teacher describes to students the achievements and gaps in the assessment task and recommends specific strategies to overcome them.

5.9.4 Who Provides Feedback?

5.9.4.1 Sustainable Feedback, a Look into the Future

The traditional approach to feedback, based on a unilateral notion in which information is transmitted from the teacher to the student, should move towards a more multilateral approach. This places students as active subjects who generate their own judgments about the assessment task, drawing on information from various other actors (teachers, peers, and themselves) (Boud & Molloy, 2012). Multilateral feedback not only provides the learner with an additional source of data to supplement their learning; seeking and making sense of feedback from multiple sources is also a key practice for lifelong learning. This is why this type of feedback has been called "sustainable feedback": it relies not only on the teacher to support learning, but on the subject's ability to self-regulate their own learning (Boud & Molloy, 2013).

For this multilateral approach to sustainable feedback to be deployed in classroom practice, it must be part of the overall course design, and the teacher must organise it so that it has a place in the teaching planning. If feedback rests only on the individual teacher, it can become an episodic mechanism in the development of teaching, given the high demand on time it requires, thus reducing its potential. If, on the other hand, peers and students themselves join this practice, its effect is amplified, developing their self-regulation and making them more aware of their goals and of their learning process. Considerations about timing, content, and the roles of students and teacher should be resolved beforehand, in the design and planning of the teaching/learning process. Feedback should therefore not be an add-on to teaching activities and assessment tasks, but a feature of the planning of teaching and its subsequent implementation in the classroom. The nature of the information provided needs to be judged by its effects on the learner and what they can do with it, rather than by analysing the information itself in terms of decontextualised qualities. The conclusion to be drawn is that no single form, mode, or strategy of feedback is appropriate in itself; it must consider different purposes in different contexts, the expectations for improvement, the complexity of the assessment task and the role that each agent will play in its development.

5.10 Practices for Effective Feeding-Back

Based on a series of studies, Nicol and Macfarlane-Dick (2006) and Wiggins (2012) have shown that feedback can lead to substantial improvements in learning. The authors propose seven good practices associated with effective feedback, which we will examine in detail.

First Good Practice: Helping to clarify what good performance is. This involves explicitly defining the learning objectives and, above all, ensuring that students understand the assessment criteria and standards expected for the completion of the assessment task, product, or performance.

Examples to implement the practice

One strategy shown to be particularly powerful in clarifying learning objectives and criteria has been to provide students with "exemplars" of performance (Orsmond et al., 2002). Exemplars (products produced by students in previous teaching periods) are effective because they make what is required explicit and define a valid standard by which students can understand more empirically what is expected of their work. Other strategies that have proven effective in clarifying criteria, standards and learning objectives for students, while also fostering the development of self-regulation, include:

• Provide a better definition of the requirements of the assessed work or task using carefully constructed criteria, developing written descriptions and definitions of the levels of performance.
• Promote discussion and reflection on the criteria and rules of the class (e.g., before the assessment task).


• Promote student participation in formative assessment exercises, in which they can comment on the work of other students in relation to the defined criteria and standards.
• Develop workshops in which students, in collaboration with the teacher, can design or negotiate their own assessment criteria for a piece of work.

Second Good Practice: Facilitate the development of self-assessment, as an internal reflection by students on the level their product, work or performance is reaching, based on the standards expected for the fulfilment of the learning objective.

Examples to implement the practice

Some examples of structured reflection and student self-assessment:

• Ask students to identify the type of feedback they would like to receive on their work or task.
• Ask students to recognise strengths and weaknesses in their own work in relation to the criteria or standards before submitting it for teacher feedback.
• Ask them to reflect on their achievements and select evidence of their work for inclusion in a portfolio.
• Ask the learner to reflect on performance milestones and progress towards the next stage of the assessment task prior to receiving feedback on it (Cowan, 1998).

An additional method involves providing students with opportunities to assess and give feedback on their peers' work. Peer assessment provides opportunities for students to learn to make objective judgements based on the standards; this learning can be transferred when students produce and regulate their own work (Boud et al., 1999; Gibbs, 1999).

Third Good Practice: Providing high-quality feedback to students about their learning. Research shows that teachers play a key role in developing their students' capacity for self-regulation and are an important source of external feedback. Teacher feedback is a source by which students can assess the progress of their assessment task, allowing them to compare their own internal assessments with the objectives, criteria, and standards. Teachers are much more effective at identifying conceptual errors or mistakes in the work or assessment task than students themselves or their peers, so teacher feedback complements the information obtained through self-assessment processes. By quality external feedback we mean feedback that provides information that helps students address their own performance and self-correction, i.e., that helps them make decisions to reduce the gap between their intentions and the resulting effects.


Examples to implement the practice

Some strategies that increase the quality of external feedback provided by teachers:

• Ensure that the feedback is valid, which implies that it is based on predefined criteria.
• Provide feedback in a timely manner, i.e., before the summative assessment, so that learners have the opportunity to modify or rewrite the assessed work or assignment.
• Provide corrective guidance, not only reporting strengths and weaknesses but also supporting students with strategies to improve the difficulties detected.
• Prioritise and order (e.g., temporally) the aspects to be improved in the assessment task, considering priorities, sequences, and intermediate learning goals.
• Provide online testing in which students can access their test results anytime, anywhere, and as often as they need to.

Fourth Good Practice: Promote dialogues about learning with the teacher and peers. Under a self-regulatory approach, for external feedback to be effective it must be internalized by the learner before it can be used to make productive improvements in performance or on the task. However, the research literature (Chanock, 2000; Hyland, 2000) offers much evidence that students do not understand externally given feedback (e.g., "your argument is incomplete", "this essay is not analytical enough") and therefore cannot use it to take action for improvement, as they do not know what to do to close the gap (in these examples, they do not know what the argument is missing or how to complete it, nor how to craft a more analytical essay). One strategy for increasing the effectiveness of external feedback, and the likelihood that the information provided will be understood and used by students, is to conceive of feedback more as a dialogue than as the transmission of information. Dialogue means that students not only receive initial feedback information but have the opportunity to engage in conversation with the teacher about their learning and their assessment task. Unfortunately, class sizes make this strategy difficult to implement, so the use of information and communication technologies and peer-to-peer dialogue is recommended.

Examples to implement the practice

The following are examples of dialogic feedback strategies that support student self-regulation:

• Provide feedback using the "one-minute response" technique in class (Angelo & Cross, 1993). This is a very short writing activity that can be done at the end of class (taking one minute or less to complete) in which the student responds to a question posed by the teacher. This technique leads students to reflect on the day's session and provides the teacher with very relevant information about the student's understanding, application, or assessment of learning.
• Review external feedback: ask students to read the feedback comments they were given earlier on an assignment and discuss them with their peers (they may also be asked to suggest strategies for improving performance next time).
• Ask learners to review the external feedback they receive, highlight one or two examples of what they found useful in it, and explain how it helped them improve their assessment task.
• Ask students to give each other descriptive peer feedback on their work, based on the previously presented assessment criteria.
• Review an initially requested project as a group, so that students can discuss both the assessment criteria and their standards before the project begins.

Fifth Good Practice: Enhancing and strengthening positive motivational beliefs and self-esteem. As we have already noted, motivation and self-esteem play an important role in learning and assessment. Dweck's (1999) studies show that students have different motivational structures depending on their beliefs about learning. These structures or frames of reference about motivation affect students' responses to external and internal feedback and thus influence their engagement in self-regulated learning. Research in school settings has shown that frequent assessment composed only of marks and grades has a negative impact on motivation for learning and works against preparation for lifelong learning (Harlen & Crick, 2003). Dweck (1999) argues that such assessments cause students to focus only on achieving the specific test goal, rather than on the learning objectives. Learners need to understand that feedback is an evaluation not of the individual but of their performance in a specific context. This is true of feedback derived from external sources as well as of feedback generated through self-assessment.

Examples to implement the practice

Examples of strategies that help foster high levels of motivation and self-esteem in students:

• Provide marks on written work only after students have responded to the feedback comments (Gibbs, 1999).
• Allow time for students to rewrite or rework their assessment task or work, as this will help to change their expectations of the purpose and learning objectives. This timeframe should be realistic.
• Apply automated tests that provide immediate feedback.


Sixth Good Practice: Provide opportunities to close the gap between current and desired performance. This practice is directly linked to self-regulation, as the underlying question is how feedback influences the behaviour and schoolwork produced by the student. According to Yorke (2003), two questions can be distinguished with respect to external feedback: firstly, is the feedback of the best quality? And secondly, are changes in student learning actually being made? Many authors have focused on the first question, but the second is equally important. External feedback provides an opportunity to close the gap between actual performance and the performance the teacher expects. As Boud and Molloy (2012) point out, the only way to know whether the results of feedback enhance learning is for students to display some kind of response that completes the feedback loop. This is one of the most frequently overlooked aspects of formative assessment.

Examples to implement the practice

Examples of strategies that help students use external feedback to regulate their learning and close the achievement gap:

• Provide feedback on work in progress and increase opportunities to rework or rewrite it.
• Introduce assessment tasks in two stages of the evaluation cycle: a first formative stage, with feedback, and a second stage in which students incorporate the recommendations and improve their task, work, or performance (Gibbs & Simpson, 2004).
• Model in class the strategies that should be used to close a performance gap (e.g., the teacher models in front of students how to structure an essay from a question).
• Specifically and explicitly provide some areas for improvement or courses of action, along with the delivery of the usual feedback to the student.
• Engage students in group work aimed at identifying their own action points in class, after they have read the comments on their assessed assignments. This strategy makes it possible to integrate feedback into teaching and to involve students more actively in the generation and intended use of feedback.

Seventh Good Practice: Provide information to teachers that helps enrich the form and strategies of their teaching. This practice seeks to provide good feedback to teachers: as Yorke (2003) points out, assessment has an effect on the teacher, who can adapt their teaching based on the feedback. It enables teachers to discover students' difficulties or errors with learning (e.g., conceptual errors), to analyse the effectiveness of their teaching methods, and to develop frequent assessment tasks, especially initial formative tests, as these give them accumulated information about students' levels of knowledge and ability so that they can adapt their teaching accordingly.


Examples to implement the practice

Some strategies that exemplify how teachers can use this practice to generate and assess the quality of information about their students' learning:

• Have students request the feedback they would like to receive when they turn in an assessment task (e.g., on a chart containing all the assessment criteria provided at the beginning, mark only those for which they require feedback from the teacher).
• Ask students to identify where they are having difficulty with the assessment task.
• Ask them, in groups, to come up with a relevant question based on their previous study of the assessment task, its standards and learning goals, focusing on what they would like to explore in the short term.

In summary, throughout this chapter we have explored the trajectory of the concept of feedback, described its main characteristics and analysed when it is effective in ensuring student learning. In addition, we have reviewed the potential of feedback for the self-regulation of student learning, an aspect that highlights students and their peers as key actors in the development of the metacognitive processes and motivational factors that facilitate the improvement of learning. Finally, we have presented examples of how to implement good assessment and feedback practices in the classroom and, thus, unfold their full formative potential. In the next chapter we will delve into how students are key agents in making formative assessment and feedback sustainable classroom practices, analysing their role in self- and peer assessment and strategies for including them as assessment agents.

References

Angelo, T. A., & Cross, K. P. (1993). Classroom assessment techniques: A handbook for college teachers (2nd ed.). Jossey-Bass.
Bangert-Drowns, R. L., Kulik, C. L. C., Kulik, J. A., & Morgan, M. T. (1991). The instructional effect of feedback in test-like events. Review of Educational Research, 61(2), 213–238. https://doi.org/10.3102/00346543061002213
Black, P. J., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
Boud, D., Cohen, R., & Sampson, J. (1999). Peer learning and assessment. Assessment & Evaluation in Higher Education, 24(4), 413–426. https://doi.org/10.1080/0260293990240405
Boud, D., & Molloy, E. K. (2012). Decision-making for feedback. In D. Boud & E. K. Molloy (Eds.), Feedback in higher and professional education: Understanding it and doing it well (1st ed., pp. 202–217). Routledge. https://doi.org/10.4324/9780203074336
Boud, D., & Molloy, E. K. (2013). Rethinking models of feedback for learning: The challenge of design. Assessment & Evaluation in Higher Education, 38(6), 698–712. https://doi.org/10.1080/02602938.2012.691462
Brookhart, S. M. (2008). How to give effective feedback to your students. Association for Supervision and Curriculum Development [ASCD].
Butler, D. L., & Winne, P. H. (1995). Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65(3), 245–281. https://doi.org/10.3102/00346543065003245
Carless, D. (2006). Differing perceptions in the feedback process. Studies in Higher Education, 31(2), 219–233. https://doi.org/10.1080/03075070600572132
Chanock, K. (2000). Comments on essays: Do students understand what tutors write? Teaching in Higher Education, 5(1), 95–105. https://doi.org/10.1080/135625100114984
Clark, I. (2011). Formative assessment: Policy, perspectives and practice. ERIC. Retrieved November 25, 2022, from https://files.eric.ed.gov/fulltext/EJ931151.pdf
Cowan, J. (1998). On becoming an innovative university teacher. Open University Press.
Dweck, C. S. (1999). Self-theories: Their role in motivation, personality and development (1st ed.). Psychology Press. https://doi.org/10.4324/9781315783048
Fluckiger, J., Tixier y Vigil, Y., Pasco, R., & Danielson, K. (2010). Formative feedback: Involving students as partners in assessment to enhance learning. College Teaching, 58(4), 136–140. https://doi.org/10.1080/87567555.2010.484031
Gibbs, G. (1999). Using assessment strategically to change the way students learn. In S. Brown & A. Glasner (Eds.), Assessment matters in higher education: Choosing and using diverse approaches (pp. 41–53). Open University Press.
Gibbs, G., & Simpson, C. (2004). Conditions under which assessment supports students' learning. Learning and Teaching in Higher Education [LATHE], 1, 3–31. https://eprints.glos.ac.uk/3609/
Goldstein, L. (2006). Feedback and revision in second language writing: Contextual, teacher and student variables. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 185–205). Cambridge University Press. https://doi.org/10.1017/CBO9781139524742.012
Harlen, W., & Crick, R. D. (2003). Testing and motivation for learning. Assessment in Education: Principles, Policy & Practice, 10(2), 169–207. https://doi.org/10.1080/0969594032000121270
Hattie, J., Biggs, J., & Purdie, N. (1996). Effects of learning skills intervention on student learning: A meta-analysis. Review of Educational Research, 66(2), 99–136. https://doi.org/10.2307/1170605
Hattie, J., & Gan, M. (2011). Instruction based on feedback. In R. E. Mayer & P. A. Alexander (Eds.), Handbook of research on learning and instruction (1st ed., pp. 249–271). Routledge. https://doi.org/10.4324/9780203839089
Hattie, J., & Jaeger, R. (1998). Assessment and classroom learning: A deductive approach. Assessment in Education: Principles, Policy & Practice, 5(1), 111–122. https://doi.org/10.1080/0969595980050107
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112. https://doi.org/10.3102/003465430298487
Higgins, R., Hartley, P., & Skelton, A. (2001). Getting the message across: The problem of communicating assessment feedback. Teaching in Higher Education, 6(2), 269–274. https://doi.org/10.1080/13562510120045230
Hounsell, D. (2003). Student feedback, learning and development. In M. Slowey & D. Watson (Eds.), Higher education and the lifecourse (pp. 67–78). SRHE & Open University Press.
Huxham, M. (2007). Fast and effective feedback: Are model answers the answer? Assessment & Evaluation in Higher Education, 32(6), 601–611. https://doi.org/10.1080/02602930601116946
Hyland, P. (2000). Learning from feedback on assessment. In A. Booth & P. Hyland (Eds.), The practice of university history teaching (pp. 233–247). Manchester University Press.
James, M. (1992). Assessment for learning [Paper]. Annual Conference of the Association for Supervision and Curriculum Development, New Orleans. https://www.researchgate.net/publication/272089053_Assessment_for_Learning
Johnston, P. H. (2004). Choice words: How our language affects children's learning. Stenhouse.
Kluger, A. N., & DeNisi, A. S. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119(2), 254–284. https://doi.org/10.1037/0033-2909.119.2.254
Looney, J. W. (2011). Integrating formative and summative assessment: Progress toward a seamless system? OECD Education Working Papers 58 [eBook edition]. OECD Publishing. https://doi.org/10.1787/5kghx3kbl734-en
Molloy, E. K., & Boud, D. (2012). Changing conceptions of feedback. In D. Boud & E. K. Molloy (Eds.), Feedback in higher and professional education: Understanding it and doing it well (1st ed., pp. 11–32). Routledge. https://doi.org/10.4324/9780203074336
Narciss, S. (2008). Feedback strategies for interactive learning tasks. In J. J. G. van Merrienboer, J. M. Spector, M. D. Merrill, & M. P. Driscoll (Eds.), Handbook of research on educational communications and technology (3rd ed., pp. 125–144). Lawrence Erlbaum. https://doi.org/10.4324/9780203880869
Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: A model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199–218. https://doi.org/10.1080/03075070600572090
Nuthall, G. (2007). The hidden lives of learners. NZCER Press.
Orsmond, P., Merry, S., & Reiling, K. (2002). The use of formative feedback when using student derived marking criteria in peer and self-assessment. Assessment & Evaluation in Higher Education, 27(4), 309–323. https://doi.org/10.1080/0260293022000001337
Ramaprasad, A. (1983). On the definition of feedback. Behavioural Science, 28(1), 4–13. https://doi.org/10.1002/bs.3830280103
Ramsden, P. (2003). Learning to teach in higher education (2nd ed.). Routledge.
Rasmussen, J. B. (2017). Formative assessment strategies. Bethel University. https://www.bethel.edu/faculty-development/files/formative-assessment-strategies.docx
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144. https://doi.org/10.1007/BF00117714
Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189. https://doi.org/10.3102/0034654307313795
Tunstall, P., & Gipps, C. V. (1996). Teacher feedback to young children in formative assessment: A typology. British Educational Research Journal, 22(4), 389–404. https://doi.org/10.1080/0141192960220402
Wiggins, G. (2012). Seven keys to effective feedback. Educational Leadership, 70(1), 10–16. https://www.ascd.org/el/articles/seven-keys-to-effective-feedback
Wiliam, D. (2006). Assessment: Learning communities can use it to engineer a bridge connecting teaching and learning. Journal of Staff Development, 27(1), 16–20.
Wiliam, D. (2012). Feedback: Part of a system. Educational Leadership, 70(1), 30–34. https://www.ascd.org/el/articles/feedback-part-of-a-system
Winne, P. H., & Butler, D. L. (1994). Student cognition in learning from teaching. In T. Husen & T. Postlewaite (Eds.), International encyclopedia of education (2nd ed., Vol. 10, pp. 5738–5745). Pergamon.
Yorke, M. (2003). Formative assessment in higher education: Moves towards theory and the enhancement of pedagogic practice. Higher Education, 45, 477–501. https://doi.org/10.1023/A:1023967026413

Sandra C. Zepeda is a social worker and holds a Master's degree in Education Sciences with a specialization in Evaluation from the Pontificia Universidad Católica de Chile; she is a PhD candidate in Education at Universidad ORT Uruguay. She is a lecturer at the Faculty of Education of the Pontificia Universidad Católica de Chile in undergraduate and postgraduate training programmes, and a specialist in curriculum development and evaluation for learning in school and higher education. Email: [email protected]

6 Students as Assessment Agents

Carla E. Förster

This chapter is based on the results of the Fondecyt Initiation Project No. 11140713, funded by Conicyt, Chile.

Abstract

This chapter reviews in detail the involvement of students as assessment agents through self-assessment and peer assessment. For self-assessment, it addresses why to incorporate it as an assessment practice, how to implement it, specific strategies for developing critical and strategic thinking through it, what can and cannot be assessed with it, and its advantages and limitations. For peer assessment, the same topics are developed, with a focus on suggested practices and instruments for carrying out peer assessment in the classroom.

6.1 Introduction

Many times we have heard students say in the hallways: "I don't know why the teacher gave me that grade" or "I never understood what I had to do and obviously I did badly", and we wish it were not an experience in our classroom they were referring to. If we involve our students in the assessment process, they must necessarily know the criteria and what they are expected to do, which brings us closer to two basic elements of a teacher with assessment competency: having clear criteria that are known to the students, and involving them in their learning process. Thus, incorporating agents or actors other than the teacher into the assessments we carry out in the classroom is part of a change that is (slowly) taking place in the conceptions we have as teachers regarding the teaching/assessment process and how students achieve the learning we expect.

Various authors (Boud, 1995; Noonan & Duncan, 2005; Sebba et al., 2008; Spiller, 2012) point out that the new approaches associated with assessment for learning emphasise students' active participation in their own learning, their responsibility in the process, the development of metacognitive skills, and the need to implement a dialogic and collaborative model of teaching and learning. It is increasingly evident that assessment driven solely by the teacher limits the learning potential of classroom actions.

This idea of incorporating students in their assessment is not new, and there have been quite a few reviews addressing the effectiveness of self- and peer assessment on student learning (Boud, 1995; Boud & Falchikov, 1989; Dochy et al., 1999; Falchikov & Goldfinch, 2000; Sebba et al., 2008; Topping, 2009). Three types of impact on students can be distinguished: (1) outcomes related to achievement in the subject, (2) outcomes related to self-esteem, and (3) outcomes related to their learning process (metacognition, self-regulation). It is worth knowing these three impacts because we often expect our innovative actions or changes in assessment strategies to have a direct impact on enhancing the learning taking place at the moment of teaching the subject, which does not always happen. We may feel frustrated because we had high expectations that it would work, we invested time and enthusiasm in changing our assessment practices, and in the end the result was minimal and we would have achieved the same with less effort. What these studies show is that in most cases the results are of types 2 and 3; that is, students' self-esteem and thinking skills improve for facing future tasks, while advances in the specific task of the subject are not so evident. It is in this context that we must be aware of our teaching role in educating well-rounded people (although it may sound cliché) and not only expect immediate results in the subject. Students with better self-esteem will feel more motivated to learn, more capable of learning and of facing challenging tasks; if they develop their thinking skills, they will have a greater capacity to plan, monitor their actions, and pay selective attention to what is relevant to the task at hand; that is, they will be "better" students and this, consequently, will be reflected in the learning results of our subject. The invitation, then, is not to become demotivated and to continue involving students in assessment processes (Pintrich & Schrauben, 1992).

As we have seen in previous chapters, there are two aspects of assessment that define the decisions made in this area. The first is to be clear about what the learning goals are, which in the case of our school system are defined in the national or local curriculum, which makes explicit what students are expected to learn (learning objectives). The second aspect corresponds to making judgments about student achievement that are as reliable as possible with regard to the goal (Boud, 1995). It is on this second point that the incorporation of other actors can make a difference, narrowing the gap between what the student really learned (internal) and what we can see in his or her performance (external). Figure 6.1 shows the different assessment agents, which will be analysed in detail later.

Fig. 6.1 Classification of agents according to their role in the assessment
Fig. 6.1 Classification of agents according to their role in the assessment

There are several reasons for incorporating different assessors into the process. First, it makes the assessment culture much more transparent and gives students a better idea of what is expected of them and where to focus their efforts (Race et al., 2005). Second, when students feel ownership of a teaching/assessment activity, they learn more deeply, because the actions and decisions associated with the task become their own. Third, "analysing" and "judging" are more complex cognitive skills than "reading" or "listening", so a class that exercises these skills using the students' own products or their peers' products as input generates deeper learning experiences (Race et al., 2005). Last but not least, from the point of view of a teacher who incorporates formative assessment as an ongoing classroom process, the active participation of students is essential to make the work sustainable, avoiding the overload that results from multiple revisions and the need for constant feedback (Boud & Falchikov, 2006; Race et al., 2005; Wiliam, 2010).

In this chapter we will review the main characteristics of self-assessment and peer assessment, the elements that should be considered so that assessments incorporating students constitute good practice, and some contextualised examples that can serve as ideas for teachers to use in their classrooms.

6.2 The Self-evaluation

Self-assessment, in general terms, is defined as the process in which students make judgments about their own achievements and learning processes and take part in decisions about actions for further progress (Noonan & Duncan, 2005; Sebba et al., 2008). Andrade and Du (2007, p. 160) add more precise aspects by pointing out that self-assessment "is a process of formative assessment during which students reflect on and evaluate the quality of their work and their learning, judge the degree to which they reflect explicitly stated goals or criteria, identify strengths and weaknesses in their work, and revise accordingly".

6.2.1 Why Incorporate Self-assessment as an Assessment Practice?

Self-assessment puts the focus on students' responsibility and on making judgments about their own performance, a skill they will use throughout their lives (Boud, 1995) and one that prepares them to solve problems when faced with a task (Brew, 1995). It also develops the learner's awareness of which metacognitive strategies to use and when to use them (planning, monitoring, and evaluating responses) (McMillan & Hearn, 2008). In addition, it has been shown that there is a close relationship between students' ability to self-assess and improvement in their performance, as well as in their motivation and persistence to complete a task (Kitsantis et al., 2004).

Self-assessment also provides information that is not always easy to obtain otherwise, such as how much effort a student is expending on a task (Ross, 2006). Perceived effort is known to be associated with self-efficacy, so knowing this information allows a teacher to mediate the task so that the student does not give up or disengage and continues to believe he or she is able to complete it successfully (Pintrich & De Groot, 1990; Schunk et al., 1996).

Spiller (2012) points out that we all have a natural tendency to check our learning progress, and making judgements about our own performance is an integral part of the learning process; self-assessment therefore encourages reflection on one's own learning and can promote learner responsibility and independence, favouring a process approach that is student-centred rather than teacher-centred. Thus, self-assessment tasks promote student ownership of learning and change the idea that it is something imposed by someone else. In carrying out a self-assessment, students also develop complex cognitive skills, since it requires higher-order thinking, such as interpreting their performance against established criteria (Ross, 2006). Another very important element in our classrooms today is our approach to diversity, and self-assessment can accommodate a classroom that is heterogeneous in terms of students' conceptual mastery, experience, and background.

In self-assessment, the distinction between formative and summative purposes is generally blurred, since feedback on performance is immediate: students do not have to wait for someone to tell them how well or badly they have learned. A specific instance of assessment certifies achievement up to that point (summative), but since there is the option of continuing to improve, it becomes formative. However, the magnitude of improvement will be conditioned by students' understanding of, and commitment to, the learning objectives or goals to be achieved and, also, by their ability to identify their needs and take the necessary actions for the next step in their learning. Thus, for self-assessment to be carried out effectively, both the teacher and the learner must have a clear understanding of the learning objectives and of the criteria to be applied in judging the extent to which those objectives are being met.

6.2.2 How to Implement Self-assessment?

In the specialized literature (Boud, 1995; Earl & Katz, 2006; Ross, 2006) it is argued that the way in which self-assessment is implemented is fundamental for students to accept it as valid and for it to fulfil its purpose. Five common elements, suggested by different authors, can be distinguished that must be met for self-assessment to be effective:

(1) There must be a clear reason for doing it. Self-assessment must be a specific activity with explicit purposes; generic purposes will not be meaningful for students (Boud, 1995). For example, the self-assessment that appears at the end of a textbook lesson, with the questions "What did I learn today?" and "What do I still need to learn?", becomes something mechanical that the student completes unless there is an associated pedagogical action that gives it meaning.

(2) Students should be clear about the assessment criteria and what is expected of them. Regardless of how the criteria are established (given by the teacher or with student participation), it is necessary to clarify them so that students can apply them correctly. Students should have an example to guide their reflection on their learning. For the questions "What did I learn today?" and "What do I still need to learn?", students should know what a useful answer looks like: listing the concepts they know and those they still need to learn is not the same as pointing out a specific aspect of the subject; for example, "I don't understand fractions" is less useful than "I don't know how to convert fractions into drawings". The quality and precision of students' answers make it possible to focus on how to achieve the learning that is still pending. Here it is key to model what students are expected to do, because they are not experts; they need training, practice, and support in developing self-assessment skills (Earl & Katz, 2006).

(3) Establish a safe environment in which students can be honest about their own performance without fear of exposing information that can be used against them. Much of the research on self-assessment argues that learning is most effectively enhanced when this practice does not involve grading (Kirby & Downs, 2007). If self-assessment is graded, students will tend to over-rate their learning in order to get a good grade, especially if this has consequences such as rewards, ranking, or promotion: who would be honest about their learning if honesty brings no benefit and may actually harm them? The incentive becomes perverse and costly for the student, and the whole formative purpose of self-assessment is lost. Also of interest is the idea put forward by Earl and Katz (2006), who point out that students should be taught that learning implies at times feeling uncomfortable, uncertain, and insecure about what we are doing, and that the important thing is to resolve this and move forward without becoming paralyzed; understood this way, honest self-assessment will not demotivate the student. When strengths and less achieved aspects are exposed, space should be given for dialogue so that the student receives guidance on how to deal with them, scaffolding his or her autonomy.

(4) Trust that other students will do the same, and that cheating or deception will be detected and discouraged. This principle may seem strange if self-assessment is considered an individual process without consequences for the student, but in the school context there can be a hidden competition among the members of a class, which the teacher can unwittingly reinforce by holding up as an example a student who has already achieved the goal or by pointing out those who have yet to achieve it. In addition, at the school stage students value the teacher's sense of justice, and if we want self-assessment to be recognised for its contribution to learning, we cannot allow dishonesty (cheating or deceit) in the process.

(5) Self-assessment should be designed for specific disciplinary contexts. We often use general guidelines that apparently serve for any task or product; however, if we want self-assessment to help the student improve his or her performance in the subject, it must be specific rather than generic (Boud, 1995). This is one of the main reasons why the observed effects of self-assessment are mostly associated with self-esteem and thinking skills: if what we encourage students to self-assess is precisely that, the exercise is not bad, but it will not help them improve their performance on the specific task. For example, if we want our students to self-assess their production of a narrative text, we should design a guideline that incorporates elements of text production, such as the type of narrator, the time in which the story occurs, the setting, and the characters. If the self-assessment instrument instead focuses on the general thinking process, for example, whether I had the materials (notebook, pencil, eraser), whether I planned the steps I would take before writing the text, or whether I read it after finishing it, the focus shifts to the general learning process and moves away from the learning objective "to produce a narrative text".

We must also consider that, when we decide to incorporate self-assessment into our classroom activities, factors such as the age of the students and their experience with self-assessment practices may influence how successful we are. Implicit in these factors is that effective self-assessment will be conditioned by students' ability to communicate with others, as they must express their thinking verbally or in writing (Towler & Broadfoot, 1992). We also highlight the four phases proposed by Greenway and Crowther (as cited in Munby et al., 1989), in which we can see how to approach these factors. It has been observed that in children under 5 years of age self-assessment skills are difficult to work on, for two reasons: the development of language, which influences their ability to communicate their thinking, and the temporal clarity needed to determine, in retrospect, what they did and thought. This does not mean that children at early ages cannot self-assess, but it is more complex for the teacher to collect reliable evidence of learning through this agent. The phases are as follows:

(1) Knowledge phase: this stage emphasises that students remember the actions they performed in an assessment task and, mainly, differentiate what they "did" from what they "liked". For example, when solving a problem guide in Mathematics, a student who is not used to self-assessing may say: "I liked the balloon problem because it reminded me of my birthday", while one who is more experienced in self-assessment will say: "The first thing I did was read the guide to know what I had to do". Here we can see that the novice learner is not self-assessing their work, so the next time they face a similar task they will not have the accumulated experience to tackle it. The self-assessment questions we give our students often include one about what they liked most, but, as this shows, such questions are not adequate for developing strategic thinking.

(2) Analysis/understanding phase: refers to the search for explanations as to why certain things happened the way they did. It is related to making conscious decisions in the planning and execution of the task. For example, in a Technology task in which students had to design a robot with eyes that lit up (electrical circuit), we can see different responses: (a) a response about what the student learned in terms of positive aspects: "I learned that it is not easy, you have to think about gluing the buttons on the mouth, the box where the batteries go has to be tied with string, to put the battery with the light bulb properly you have to put the wires apart …"; (b) another student analyses his performance from the aspects not achieved: "we needed help from the teacher because we could not cut the cardboard to make the robot, we also had problems because Oliver was asking for paint all the time and he was distracting us because we had our paint, glue and adhesive tape ready …" (Towler & Broadfoot, 1992).

(3) Evaluation phase: in this phase students must make a judgment about the task performed and break it down, explaining the positive aspects and those that did not work, and then judge the result. For example: "The map we made at school before the visit to the museum did not help me because …", or "We had all the materials to build the bridge and the design was good, but it did not turn out as well as we designed it because in the group we had different ways of working, it took us a lot of time to agree, and finally we ran out of time." It has been observed that as students get older they are able to make a better analysis of what they learned and what they missed, especially when the assessment is associated with a task that involves applying knowledge or creating something from what they learned (Towler & Broadfoot, 1992).

(4) Synthesis phase: this is the final stage of the self-assessment process, in which students are expected to consider what they learned, did, or did not do when faced with future situations or other contexts.
For example: "When I do an experiment again, the first thing I have to do is know what the hypothesis is and identify the variables"; "If I had to start again with the same group, in addition to checking the materials and the design, I would agree beforehand on who is going to glue the sticks on the bridge and who is going to give the instructions that are in the design." At this stage, too, the depth of the reflections varies according to age and experience with these types of self-assessment tasks (Towler & Broadfoot, 1992). For example, a 7-year-old student, faced with the design task, concludes: "What I liked most was when the activity ended because I was very tired", and adds, "next time I should make a simpler design."

Some concrete strategies to develop strategic thinking and enhance learning through self-assessment are:

Strategy 1: Ask students what they learned in this class session and have them relate it to what they learned in the last session, in the previous week, or in a past unit. This way they will make sense of what they are learning, and it will not remain isolated from what they have seen in class.

Strategy 2: Try to get students to break down their learning experience into the positive aspects and those that need more support. This can be done by asking which part of the task was difficult or tedious and which part was easier or more enjoyable, always asking them to justify their response and give reasons for their rating. These reasons should focus on elements of the task and the associated learning; if they do not, we as teachers should be attentive and redirect the reflection towards this focus.

Strategy 3: It is also important to make students aware of what they do to learn and to have them share their learning experiences with peers and teachers, so that they can turn to others (people and resources) when they need to. Thus, we can ask them about their behaviours as learners: Do they share their doubts and achievements with their peers? Do they talk with their peers about their work? Do they read more than what they are given in class in order to complete their work or to complement what they have, or have not, learned? These questions should be accompanied by a why, since what makes the difference is the justification they give for their actions.

Strategy 4: Being able to draw on previous successful or unsuccessful experiences allows students to learn from their actions and apply what they have learned in new situations, but for this we as teachers must direct our students to recognise those experiences and the specific elements that made them successful or unsuccessful. Questions such as "Have you identified what went well or not so well, and what were the reasons for this result? What could you improve if you had to do this activity again?" or "What would you do again if you had to do an activity like this one?" serve this purpose. Depending on the task, the answers are expected to have one part associated with generic strategies, such as planning and teamwork, and another related to specific disciplinary aspects of the subject.


Strategy 5: Ask students to do the assignment and then pass them a correct model from a book (if the assignment is to make an outline, for example) or give them a revision guideline, ideally a rubric, so that they can see for themselves what level they are at. This strategy can be used before an assignment is handed in, asking students to include their self-assessment; the teacher then reviews it and discusses any discrepancies with the student.

Strategy 6: Encourage students to project their actions onto future tasks. For example, at the end of their papers or guides, pose questions such as: "What do I need to keep in mind the next time I do a … (paper, guide, experiment, etc.)? What actions should I not repeat if I face a task like this again?"

Strategy 7: Use KPSI (Knowledge and Prior Study Inventory) forms. This instrument consists of a small scale of indicators of expected learning, or of the previous learning required to attain new knowledge, on which students must self-assess their proficiency level according to pre-established categories (Young & Tamir, 1977). The teacher should consider that students sometimes believe they know something, and should help them recognise when their self-assessment does not match their real knowledge (Sanmartí, 2007). Box 1 shows an example of a KPSI form for learning Natural Sciences.

Box 1 Example of KPSI Form for Natural Science Studies

Knowledge and Prior Study Inventory (KPSI) Form
Mark with an X the box that corresponds to the level of knowledge that you think you have for each statement.

Statements:
• Water has different states in nature
• There is a relationship between weather and the water cycle
• Tidal waves are produced by the wind
• The original peoples believed that rain was made by the gods

Response categories (one column per category):
• I could explain it to a classmate
• I understand it well
• I have a fair understanding
• I don't know/I don't understand


Table 6.1 Example of expected learning achievement matrix. For each criterion, the second column lists what the student needs to do to meet the standard; a third column, "Evidence of what I did", is left blank for the student to record the evidence.

Criterion: Scientific communication/using data
To meet the standard, what I need to do:
• My data will be in a chart, table, or graph, and will be labeled
• My data needs to prove my exploration
• Someone can read my explanation and understand it

Criterion: Scientific concepts and related content
To meet the standard, what I need to do:
• Terms I should use and understand are: ………………………
• Things I need to be sure to observe or pay attention to are: ………………………
• A "big idea" that might help me to connect my learning to other things I know or want to learn more about is: ………………………

Criterion: Scientific tools and technologies
To meet the standard, what I need to do:
• These are the tools I need to use to collect data and complete the task (the student must make a list)
• I need to check for mistakes

Criterion: Scientific procedures and reasoning strategies
To meet the standard, what I need to do:
• My hypothesis is: ………………………
• To complete the task, I need to follow these steps: ………………………
• I need to record these data: ………………………

Example adapted from https://exemplars.com/sites/default/files/2020-05/how_do_i_know_i_met_the_standards.pdf

Strategy 8: How do I know that I achieved the expected learning outcome? This strategy is based on a table in which, for each assessment criterion, the student fills in the actions to be carried out (planning) and then records the evidence showing that he/she did everything indicated. Table 6.1 shows an example.

6.2.3 What Can Be Assessed Through Self-assessment?

Understanding that self-assessment is not an instrument in itself, but rather the incorporation of the student as an assessment agent of his or her own learning or performance, it can be integrated into most learning activities. Keep in mind that space and guidelines should be created to give students opportunities to identify and reflect on their progress against the expected learning goals. Assessment instances associated with practical tasks and more complex performances, such as report writing, research projects, source analysis, or project design, lend themselves more easily to structured self-assessment; however, during any class activity students can be invited to monitor their progress towards their learning goals. The important thing is that they are clear about, and understand, the evaluation criteria.


Self-assessment is not recommended as the only assessment instance in summative processes; however, it is a very good option if used in conjunction with peer and/or teacher assessment.

6.2.4 Advantages and Limitations of Self-assessment

6.2.4.1 Advantages

(1) It can provide the teacher with very practical and functional information about what students are learning, the progress they have made, their difficulties, and the actions that have worked for them, making it possible to focus feedback and support on specific aspects.
(2) It helps students think about their own learning, progress, and problems, and then find ways to improve. When students are able to analyse their own progress, they can find ways, methods, or strategies to make changes and become better learners, and it empowers them in their role as assessment agents.
(3) Some students tend to ignore corrections, suggestions, or feedback made by teachers, but when they must correct themselves, it is more likely that those errors will be analysed and addressed.
(4) There is a wide variety of self-assessment techniques, so the teacher can choose the one he/she considers best for the class, taking into account the characteristics of the students.
(5) It helps students have a clearer idea of the goals they are trying to achieve and of what they are expected to accomplish.
(6) It can give the teacher feedback on students' progress without having to correct or check every piece of work or assignment they do, contributing to saving time.
(7) It has a direct impact on the learning process. When it is well designed and focused on specific aspects of the task, it can have a direct effect, as the learner improves his or her immediate performance.

6.2.4.2 Limitations

(1) If students are unclear about the objectives of the task, or are required to self-assess specific aspects of the discipline in which they do not yet have sufficient mastery, the self-assessment process may be invalid, providing erroneous or deficient information to both the student and the teacher.
(2) Students need a very high degree of awareness of the expected standard in order to be able to analyse the mistakes they have made and their performance during the course or unit.
(3) Self-assessment can be time-consuming, as space is needed to clarify the objectives and carry out the self-assessment activities. To counteract this, teachers should plan ahead and design a suitable format that does not take too long and is easy to review.
(4) Depending on the task, it is sometimes only appropriate when students reach intermediate to higher levels of mastery of the subject matter, because only then do they have the skills to analyse their performance easily.
(5) Perhaps the most significant disadvantage is students' lack of maturity to undertake self-assessment seriously. Some are not aware of the seriousness or importance of the process, especially if it is associated with a grade, and therefore tend to assign themselves the maximum grade, which alters the final result. The opposite can also happen: very self-critical students may undervalue their performance, although this situation is less common.
(6) It is complex to "negotiate" a grade for self-assessment. On the one hand, students tend not to engage unless there is an incentive; on the other hand, it is recommended that self-assessment should not carry a grade. How should we proceed, then? As teachers, if we have no choice but to assign a grade, we should make it clear to students that the self-assessment will not weigh heavily in their final grade.
(7) It needs to be integrated with other classroom activities. If it is not planned, it remains decontextualised, and the implicit power of feedback and self-regulation is lost.
(8) Self-assessment only works if students have been "trained" to self-assess, because doing it rigorously involves knowledge of the discipline, clarity about the learning objective, and the communication skills to express their reflection.

6.3 The Peer Assessment

Peer assessment is understood as the evaluation of learning products or outcomes by a person who has the same hierarchical level as the person being assessed. It is considered a core element of formative assessment that incorporates students into the assessment process, giving them collaborative rather than individual responsibility, as in self-assessment (Black & Wiliam, 1998; Panadero & Brown, 2017). There is ample evidence of its benefits for student learning; however, the effectiveness of this assessment agent depends on our mediation as teachers.

In order to delve deeper into the benefits and difficulties associated with peer assessment, it is necessary to be clear about its conceptualization. Different definitions exist, and it is also called "co-assessment"; however, some authors hold that co-assessment can only be carried out within a work group (Casanova, 1995), while others extend this agent to the idea that the evaluator and the evaluated have the same hierarchical status in the classroom (Sanmartí, 2007). In our proposal, we have preferred to speak of peer assessment rather than co-assessment, so as not to restrict student participation to only one type of assessment task (group work), understanding that the suitability of the assessor is given by his or her ability to observe the assessment criteria to be applied; it is therefore the teacher's job to safeguard the validity of the instrument according to who is being assessed. Reinforcing this idea, different authors (see, e.g., Race et al., 2005; Sebba et al., 2008) specify the conditions for peer assessment: when what we want to assess is the process of creating a product, only those who have participated in that process can do so (intra-peer assessment), whereas if we want to assess the finished product, all students can do so (inter-peer assessment), as long as the assessment criteria are explicit and there is clear, observable evidence of the result. Figure 6.1, presented at the beginning of the chapter, shows this distinction.

6.3.1 Why Incorporate Peer Assessment as an Evaluation Practice?

Different authors (Panadero & Brown, 2017; Race et al., 2005; Topping, 2013) list reasons why teachers should integrate peer assessment into their teaching/assessment practices. We reference those that seem to us to illustrate our reality well and give meaning to the incorporation of this assessment agent:

(1) Students are already doing this informally. Every time a student produces a piece of work, friends and classmates give feedback and make comments on it; the problem is that these are not considered authoritative voices (compared to the teacher) because the criteria are not always clearly known. Establishing and facilitating formal instances of peer assessment legitimizes student opinion, adds commitment to their own and others' learning, and increases the feedback a student receives.

(2) We cannot do as many formative assessments as we would like. The growing number of students in the classroom, a teaching load with new challenges, the preparation of materials, and less and less available time mean that the assessment instances for monitoring learning and supporting students' progress are increasingly limited, leaving room only for the graded assessments we must report. In this context the capacity to provide individual student feedback is very limited, but peer assessment, when well prepared, can be an option for providing more feedback.

(3) Peer assessment allows students to learn from each other's successes and mistakes. While evaluating their peers' work, students cannot help but notice the elements that are somehow better than their own. Similarly, they discover all sorts of mistakes in their peers' work, many of which they made themselves, which increases their awareness of what not to do. Thus, when they receive feedback, they also check it against what they saw in their peers' work (positive aspects and mistakes) and adjust their own, with benefits for their own learning.

(4) Peer assessment develops higher-order cognitive skills and metacognitive skills that will serve students well in life beyond school. When students evaluate the work or performance of their peers, they bring into play analytical and evaluative skills that are considered complex cognitive skills. In addition, in order to give feedback to their peers, they must organise the information and check their revision so as not to make mistakes, actions that constitute metacognitive skills. This development of thinking generates benefits for learning the subject, but also creates a mental structure that will serve students in their future actions (Candy et al., 1994).

6.3.2 How to Implement Peer Assessment?

As already mentioned, effective implementation of peer assessment requires our mediation as teachers, and we must include it in the planning of the teaching process as an instance that has a purpose and time within the class. Below is a series of recommendations we suggest taking into account when implementing assessment activities among students.

(1) Providing or agreeing on truly clear assessment criteria. If the criteria are unambiguous and explicit, students are obliged to follow them when evaluating their peers, considerably reducing the bias they may have when evaluating friends or peers with whom they have conflicts. It is also advisable to require that any judgment be justified with concrete evidence from the student's work or performance. This has a double function: it ensures that the judgement is based on reality, and it makes the feedback more accurate.

(2) Modelling through the analysis of evaluation criteria. For peer assessment to be effective, it is recommended to do a class exercise in which the evaluation criteria to be used are clarified. This is not just explaining the criteria orally, which, due to time constraints, is what we do most of the time; the exercise involves showing students examples of work done in previous years that are good, bad, and fair, and analysing the criteria of the review guidelines (ideally a rubric) by judging those examples (Race et al., 2005). This gives all students a chance to see what is really meant by optimal performance (and its descending categories) on each criterion and allows them to adjust their expectations for the product. After doing this exercise, they will be better prepared to evaluate their peers and give them feedback that helps them advance in their work. As we can see, this exercise requires time and resources, which must be planned as part of the teaching and assessment strategy; otherwise it will not work, and the learning experience will not be successful.


(3) Moderating students' judgments and comments to their peers. Race et al. (2005) state that although it is important to involve students in the review of assessment tasks, when these are graded the teacher should moderate the judgments and comments made, since their quality will vary with students' practice in this type of evaluation and with the consequences of the instance for the student being evaluated. These authors nevertheless point out that the benefit is twofold: the student receives more detailed feedback than the teacher (who is always short of time) could provide, and for the teacher it is always easier to review someone else's comments than to start from scratch.

(4) Managing a climate of trust and respect in the classroom. We must foster a classroom climate that allows peer assessment to be a learning opportunity, and this implies minimizing competition among classmates. For example, we should encourage our students not to work to be "the best" in the class, where optimal performance means "beating" the rest, but to foster the spirit that everyone can be "the best" if we support each other; what we are learning is not a "contest", so in the classroom we all win (Race et al., 2005). This may seem utopian in an increasingly competitive world, but if our benchmark is the expected learning (criterion benchmark) and we do not peg it to the performance of the best student in the class (normative benchmark), then students need not feel threatened by supporting their peers.

(5) Introducing it gradually. In some schools the assessment culture is very teacher-centred, so incorporating peer assessment can be difficult to understand, for colleagues and management teams as well as for students and their parents. It is therefore advisable to consider the context and introduce it gradually, on a small scale, until we are sure how it will work.

(6) Communicating your purpose with peer assessment to students and colleagues. Keeping students and teachers informed of what you are doing and why helps them understand the overall strategy and its relationship to the learning process, rather than perceiving it as an isolated or improvised action.

(7) Incorporating peer assessment in the calculation of the final grade. Although there is no agreement in the literature regarding the assignment of a grade to peer assessment, it is suggested that if it is done, it should be clear how the grade will be assigned. For example, an average of all the grades given by classmates can be calculated and weighted into the final grade, say 30%, with the remaining 70% corresponding to the grade given by the teacher (a worked example follows this list).

(8) Providing the possibility to appeal the assessment judgment. Students have the right to appeal if they consider that the evaluation made by one or more peers was unfair. In this case, we as teachers must arrange a meeting and act as mediators: the students involved must be able to present their arguments and agree on the rating of the performance; if no agreement is reached, we give the final resolution.


(9) Allowing enough time for the evaluation. This point is really important because, as teachers, we already know the assessment criteria and performance categories (if we are using a rubric) and can therefore make a judgement quickly; students, by contrast, need to read, analyse, and only then mark the rubric. If we want the peer assessment to be serious and rigorous, we need to allow time for them to do it this way. For example, if we ask them to evaluate each other in groups to generate a single evaluation, they will have to discuss the criteria among themselves to reach a consensus, and that takes more time.
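To make recommendation (7) concrete, here is a worked example of combining a peer-assessed grade with the teacher's grade. The 30/70 split is the one suggested above; the grade values themselves (a peer average of 5.5 and a teacher grade of 6.0, on a hypothetical 1–7 scale) are invented purely for illustration:

$$\text{final grade} = 0.30 \times \text{peer average} + 0.70 \times \text{teacher grade} = 0.30 \times 5.5 + 0.70 \times 6.0 = 1.65 + 4.20 = 5.85$$

Whatever weights are chosen, they should be agreed on and communicated to students before the peer assessment takes place.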

The different ways of engaging the student for which we have argued assume that our classroom activities integrate these actions and that assessment is a fluid process, distinct from grading. Several authors have proposed and disseminated concrete strategies for working with peer assessment, some of which are presented below (Dunn, 2011; Race et al., 2005):

Strategy 1: Two stars and a wish. Students identify two positive aspects of a partner's work and then express a suggestion (wish) about what the partner could do next time to improve a specific aspect of the work (Fig. 6.2a). This is especially recommended for writing and text-production tasks. For example: "I want to give you a star for the beginning of your story because it is entertaining and makes you want to keep reading, and another star for the way you described the house because it meets the fiction criteria. I suggest that you talk more about Rachel because it's not so clear why she eats the cake." Before using this strategy widely, we should model it several times with examples of students' work from other years, and then ask students to apply it in pairs to their own work. When we start using it, we should check that they are doing it well, and we can ask the pairs who have done well to show their example to the whole class.

Strategy 2: Plus, minus, and what next? This strategy involves students commenting to their peers on what they did well against the set assessment criteria and what they could do better, along with suggestions on how to address the weaker aspects. It requires greater mastery of assessment processes, since students analyse their peers' work in terms of specific criteria associated with the discipline and the task; it is therefore suggested as a more advanced replacement for the previous strategy (Two stars and a wish), which fulfils the same function but is less complex. It can also be used as part of a self-assessment, where students use the question "What's next?" to set a personal learning goal.

Strategy 3: Weather gauge. This strategy is based on feedback on three aspects: (1) positive or well achieved aspects, the warm points; (2) deficient or less achieved aspects, the cold points; and (3) advice or suggestions to improve the work, that is, to raise the temperature of the cold points. Here it is suggested to use a booklet format where the evaluation criteria and the writing zones for the student are explicit (Fig. 6.2b).

Fig. 6.2 Examples of strategies for peer assessments

Strategy 4: The traffic light. This strategy is based on making comments or suggestions using the colours of the traffic light: students mark with green the aspects that are achieved, with yellow those that require improvement but are on the right track, and with red what is incorrect and needs significant changes. Following the logic of formative assessment, it is recommended to use this strategy on work that is still in development, so that students can address the improvements in their final product. It can be implemented in different formats; for example, a traffic light is drawn on a sheet of paper and the student writes next to each colour the aspects associated with that level of achievement (Fig. 6.2c). It can also be done with sticky notes in these three colours that the student attaches to the work; this second option requires the peer evaluator to also give oral feedback explaining why he/she placed each mark.

Strategy 5: Peer marking. This strategy is considered more advanced in terms of complexity because the student reviews the assignment as if he/she were the teacher and therefore makes comments and marks throughout the work, on both disciplinary and formal aspects. It requires the group to have an intermediate mastery of the assessment criteria so that they can apply them and so that the feedback received by the assessed student is effective. It is recommended to implement this strategy as an intermediate instance, complementing the teacher's evaluation. It is very useful in the revision of reports or texts.


Strategy 6: The feedback sandwich. This strategy is similar to "Two stars and a wish" and is based on giving three comments: two positive ones, associated with the best achieved aspects (the bread), and one that identifies an aspect to be improved (the filling). It can also be done by using the layers of the sandwich to structure different elements of the feedback: first a general positive comment, then a question that points to some deficient aspect (why did you use …? why did you decide to do … instead of …?), and then a suggestion for improvement (Dunn, 2011) (Fig. 6.2d). Which of the two models is used will depend on the age, peer assessment experience, and analytical skills of the students; the first form is recommended for younger children (first cycle) and the second for students in older grades (second cycle and upwards). The balance and weight of the comments should be taken into account in this strategy: the positive comments could be irrelevant while the negative one is so structural or weighty that it demolishes the student's work; or, vice versa, the negative comment could address a secondary aspect that contributes little to the work's overall improvement.

Strategy 7: Use of a rubric. For larger assessment activities conducted over a period of time, the rubric, discussed with students at the beginning of the assignment, can be used as the basis for discussions in pairs about progress on the assignment. If students are clear about the goals and assessment criteria outlined in the rubric, they can provide useful feedback that enables their peers to improve.

Strategy 8: Plenary of friends. This technique is done as an activity in pairs or small groups, in which a representative is chosen to tell the rest of the class, at the end of the session, the judgment and comments from the evaluation of others' work. For example, students do individual work, the work is exchanged and evaluated among peers, then they get together, talk about their evaluation, and in the plenary present to the rest of the class the elements they found interesting to share. This strategy requires planned space in the class, and resources should be provided so that students presenting in plenary can show examples of the work and their peers can see how the criteria were applied. An interesting and quick way is to use available technology, for example, taking a photo of the work (with a cell phone) and presenting it with a projector, or sending it to the group so they see it directly on their phones. Photocopies of the work could also be handed out; the important thing is that students have access to the evidence in some way. By sharing the comments, this strategy increases the number of performance examples students see and, therefore, gives them more learning opportunities in the same instance.

Strategy 9: Let's talk! This strategy consists of asking students to mark a peer's work by putting only a cross next to any errors, without identifying what is wrong. Upon completing the revision, they meet with their peer to discuss the marks, explaining why they consider each one an error, and the assessed peer can argue back if they disagree with the judgement. These marks should be based on the assessment criteria previously discussed for the task; they are not improvised comments reflecting what each student believes about the other's performance, but judgements anchored in a standard.

Strategy 10: Developing language for peer assessment. It is key to develop in students the skills to carry out peer evaluation. Among these, the language with which they make comments is crucial for generating an atmosphere of harmony and trust that enhances and validates the experience and the feedback received. In this sense, the premise "It is not only what you say, but the way you say it" becomes the central element of any evaluation. One option, when the children are younger, is to put up posters in the room with examples of phrases they should use (Box 2).

Box 2 Examples of sentences to develop appropriate language in peer assessment

“I like this part, but have you thought of …?” “What made you use this word/phrase/connective/simile/metaphor and not another one?” “The best part is when you …” “I think next time you ought to think about …” “I think you’ve achieved these two success criteria, but I’m not sure about the third. What do you think?” Adapted from Dunn (2011)

6.3.3 What Can Be Assessed Through Peer Assessment?

As with self-assessment, virtually all performance tasks that a student carries out can be assessed, since peer assessment involves the incorporation of the student as the assessing agent and is not an instrument or technique in and of itself. What is important is that the teacher selects valid criteria for what is being assessed and that students are clear about these criteria and the performance expectations for the task. Some tasks where peer assessment has been found to benefit student learning, together with considerations for effective implementation, are:

(1) Thematic oral presentations. Students can assess the communicative aspects of a presentation as well as its subject content. To incorporate this second element, those acting as evaluators must have sufficient mastery of the content to be able to make judgements about it.


(2) Reports. Peer assessment of this type of work helps students recognise good and bad practices in report writing, structure, internal coherence, and design. As we saw earlier, it makes them aware of the good and bad aspects addressed by their peers, and when they receive their own feedback they incorporate both the comments made to them and what they observed as evaluators.

(3) Essay design and planning. Before writing an essay, students can be asked to design its structure and content and submit this plan for peer review. The assessed student thus receives feedback on what they are planning to do, and the comments help them refine and adjust their design. This activity has a significant impact in a relatively short time.

(4) Practical work. When students are assessed on tasks involving psychomotor skills or abilities, the presence of the teacher while they are practicing can be intimidating, yet they need feedback to know how they are doing. In this context peer assessment is helpful, as it puts less pressure on the one being assessed.

(5) Oral performances and presentations. Students can learn a lot about their own performance skills by evaluating the performances or presentations of others. Peer assessment helps them better understand the criteria to be used and allows them to watch how others apply them in practice, learning by observation and by becoming more aware of the interpretation and application of those criteria.

6.3.4 Peer Assessment Advantages and Limitations

6.3.4.1 Advantages

(1) When assessing the development of practical skills whose performance requires several trials to achieve, involving students through peer assessment may be less threatening than having the teacher assess (Race et al., 2005). In general, we as teachers cannot stay at a student's side for an extended period to provide feedback as they progress, so the amount of "one-on-one" formative assessment we can do is much more limited than when all the students help by observing each other. Also, having a peer observer creates less anxiety and is less intimidating for the student being assessed.
(2) There is greater student ownership of what is expected of an assignment and of the criteria by which it will be assessed.
(3) The participation and responsibility of students in the learning of their peers is valued.
(4) Students develop higher-order cognitive skills (analysis and evaluation), as they must apply the evaluation criteria to a peer's task or performance. This involves analysing the well-developed and deficient elements of the task, making a judgement, and proposing alternatives for improvement, all complex thinking skills.
(5) It reduces the teacher's workload, as it allows the student to receive timely feedback, so that incorporating the logic of formative assessment in the classroom does not imply more work for the teacher.

6.3.4.2 Limitations

(1) Implementation requires time, which must be planned into the class session. Some teachers may consider this a waste of time or a work overload, as it requires developing the assessment activity and the necessary resources.
(2) Assigning a grade associated with peer evaluation can have reliability problems if the criteria are not explicit and the peer evaluator is not asked to justify his or her judgment, since the grade may be influenced by the degree of friendship he or she has with the student being evaluated.
(3) Students do not always have the knowledge to apply the assessment criteria, so it is the teacher's responsibility to ensure that both the criteria and the students assessing their peers are appropriate.
(4) Successful peer assessment requires teaching students how to do it, teaching the language to use, and creating an atmosphere of respect and trust in the classroom. These basic conditions may be difficult to achieve in some contexts or specific classes, jeopardizing the development of the activity.

In summary, there are many ways for our students to become effective evaluators of their peers and of themselves. It is worth persevering with peers as assessment agents, because the benefits for learning in your classroom can be enormous; it also makes formative assessment sustainable over time without increasing our workload. In the following chapter, we will analyse the technical criteria for developing and selecting quality tools that allow for the collection of evidence of student learning.

References

Andrade, H., & Du, Y. (2007). Student responses to criteria-referenced self-assessment. Assessment & Evaluation in Higher Education, 32(2), 159–181. https://doi.org/10.1080/02602930600801928
Black, P. J., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
Boud, D. (1995). Enhancing learning through self-assessment [eBook edition]. Routledge. https://doi.org/10.4324/9781315041520
Boud, D., & Falchikov, N. (1989). Quantitative studies of student self-assessment in higher education: A critical analysis of findings. Higher Education, 18(5), 529–549.
Boud, D., & Falchikov, N. (2006). Aligning assessment with long-term learning. Assessment and Evaluation in Higher Education, 31(4), 399–413. https://doi.org/10.1080/02602930600679050
Brew, A. (1995). Self assessment in different domains. In D. Boud (Ed.), Enhancing learning through self-assessment (pp. 129–154) [eBook edition]. Routledge. https://doi.org/10.4324/9781315041520
Candy, P. C., Crebert, G., & O'Leary, J. (1994). Developing lifelong learners through undergraduate education [National Board of Employment, Education and Training Commissioned Report, 28]. Australian Government Publishing Service. http://hdl.voced.edu.au/10707/94444
Casanova, M. A. (1995). Manual de evaluación educativa [Educational assessment manual]. La Muralla.
Dochy, F., Segers, M., & Sluijsmans, D. (1999). The use of self-, peer and co-assessment in higher education: A review. Studies in Higher Education, 24(3), 331–350. https://doi.org/10.1080/03075079912331379935
Dunn, D. (2011, September 8). Using peer assessment in the primary classroom. Teach Primary. https://www.teachprimary.com/learning_resources/view/using-peer-assessment-in-the-primary-classroom
Earl, L., & Katz, S. (2006). Rethinking classroom assessment with purpose in mind [eBook edition]. Manitoba Education, Citizenship and Youth. https://www.edu.gov.mb.ca/k12/assess/wncp/full_doc.pdf
Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287–322.
Kirby, N. F., & Downs, C. T. (2007). Self-assessment and the disadvantaged student: Potential for encouraging self-regulated learning? Assessment & Evaluation in Higher Education, 32(4), 475–494. https://doi.org/10.1080/02602930600896464
Kitsantis, A., Reisner, R. A., & Doster, J. (2004). Developing self-regulated learners: Goal setting, self-evaluation and organisational signals during acquisition of procedural skills. The Journal of Experimental Education, 72(4), 269–288. https://doi.org/10.3200/JEXE.72.4.269-287
McMillan, J. H., & Hearn, J. (2008). Student self-assessment: The key to stronger student motivation and higher achievement. Educational Horizons, 87(1), 40–49. https://files.eric.ed.gov/fulltext/EJ815370.pdf
Munby, S., Phillips, P., & Collinson, R. (1989). Assessing and recording achievement [eBook edition]. Blackwell. https://archive.org/details/assessingrecordi0000munb/page/n9/mode/2up
Noonan, B., & Duncan, R. (2005). Peer and self-assessment in high schools. Practical Assessment, Research & Evaluation, 10(17), 1–7. https://doi.org/10.7275/a166-vm41
Panadero, E., & Brown, G. T. (2017). Teachers' reasons for using peer assessment: Positive experience predicts use. European Journal of Psychology of Education, 32(1), 133–156. https://doi.org/10.1007/s10212-015-0282-5
Pintrich, P. R., & De Groot, E. V. (1990). Motivational and self-regulated learning components of classroom academic performance. Journal of Educational Psychology, 82(1), 33–40. https://doi.org/10.1037/0022-0663.82.1.33
Pintrich, P. R., & Schrauben, B. (1992). Students' motivational beliefs and their cognitive engagement in classroom academic tasks. In D. H. Schunk & J. L. Meece (Eds.), Student perceptions in the classroom (pp. 149–183) [eBook edition]. Routledge. https://doi.org/10.4324/9780203052532
Race, P., Brown, S., & Smith, B. (2005). 500 tips on assessment [eBook edition]. Routledge. https://doi.org/10.4324/9780203307359
Ross, J. (2006). The reliability, validity and utility of self-assessment. Practical Assessment, Research & Evaluation, 11(10), 1–13. https://doi.org/10.7275/9wph-vv65
Sanmartí, N. (2007). Evaluar para aprender. 10 ideas clave [Assess to learn: 10 key ideas]. Graó.
Schunk, D. H., Pintrich, P. R., & Meece, J. R. (1996). Motivation in education: Theory, research and applications. Pearson.
Sebba, J., Crick, R. D., Yu, G., Lawson, H., Harlen, W., & Durant, K. (2008). Systematic review of research evidence of the impact on students in secondary schools of self and peer assessment [Technical report 1614T]. EPPI-Centre, University of London. https://eppi.ioe.ac.uk/cms/Portals/0/PDF%20reviews%20and%20summaries/Self%20Assessment%20report.pdf?ver=2008-10-30-130834-050
Spiller, D. (2012). Assessment matters: Self-assessment and peer assessment [eBook edition]. Teaching Development Unit, the University of Waikato. https://cei.hkust.edu.hk/files/public/assessment_matters_self-assessment_peer_assessment.pdf
Topping, K. J. (2009). Peer-assessment. Theory into Practice, 48(1), 20–27. https://doi.org/10.1080/00405840802577569
Topping, K. J. (2013). Peers as a source of formative and summative assessment. In J. H. McMillan (Ed.), SAGE handbook of research on classroom assessment (pp. 395–412) [eBook edition]. Sage. https://doi.org/10.4135/9781452218649
Towler, L., & Broadfoot, P. (1992). Self-assessment in primary school. Educational Review, 44(2), 137–151. https://doi.org/10.1080/0013191920440203
Wiliam, D. (2010). The role of formative assessment in effective learning environments. In H. Dumont, D. Istance, & F. Benavides (Eds.), The nature of learning: Using research to inspire practice (pp. 135–159) [eBook edition]. OECD. https://doi.org/10.1787/9789264086487-en
Young, D., & Tamir, P. (1977). Finding out what students know. The Science Teacher, 44(6), 27–28.

Carla E. Förster is a marine biologist who completed a Master's in Educational Evaluation and a Doctorate in Educational Sciences at the Pontificia Universidad Católica de Chile. She is a professor at the Universidad de Talca, Chile, Head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. email: [email protected]

7 Learning Assessment Tools: Which One to Use?

Carla E. Förster, Sandra C. Zepeda, and Claudio Núñez Vega

Abstract

This chapter addresses the main learning assessment instruments used in the school classroom: tests, checklists, rating scales, and rubrics. For tests, the different types of closed and open response items are presented, pointing out their characteristics, the limitations in the cognitive abilities they can measure, how scores are assigned, and examples that illustrate the construction rules. For performance assessment tools such as checklists, scales, and rubrics, their particularities, construction steps, advantages, and limitations are described.

7.1 Introduction

When planning the assessment strategy and then designing or selecting the assessment tools, we are answering the question: with what or how to assess? The quality and pertinence of these instruments is therefore directly related to the quality of the information we collect about our students' learning. If a tool has construction problems or is poorly aligned with the learning objectives, the decisions we base on its results will not reflect real learning, and our pedagogical strategies will not be as effective as we had hoped. In this chapter we address general aspects of constructing assessment situations in the form of tests, as well as instruments for recording and evaluating performance tasks, such as checklists, rating scales, and rubrics, analysing their uses, advantages, and limitations. We hope that the topics discussed here will provide guidelines that contribute to the development of tools that are valid, reliable, and objective in the context of classroom assessment.

7.2 Test Situations

The most traditional form of summative assessment in the school context is the test, most often written, although oral questioning is also found. Whatever the format, we expect the items that constitute this type of assessment to be well-structured and directly related to the learning objectives. In each section, we will note particularities to consider when the items are used individually (outside of a test) with a formative intention.

7.3 Planning the Development of a Test

To create a test, the first thing we must do is design it. This implies constructing the table of specifications that will guide the construction of the items. This first step is relevant for four reasons: (1) it allows us to identify which learning objectives we are going to assess and what relevance or weight we will give them, (2) it makes visible the cognitive complexity we want to assess in the test, (3) it helps give coherence to the type of item we will create for each learning objective, and (4) it allows us to organise the analysis of the results after applying the test.

Although the table of specifications is intended for designing a test, it is also useful when we build an exercise or practice guide for students to complete in class or at home, since it helps us ensure that the guide includes all the learning objectives we expect to achieve, thus safeguarding instructional validity. Table 7.1 includes an example of a specification table format with a description of what should go in each column.

With the specification table defined, we will have a more precise idea of the test or exercise guide we want to construct, and we will not deviate from our purpose: to collect evidence of our students' learning achievement. Let us now look at the characteristics and construction rules of the different types of items that can be incorporated into a test-type situation. Figure 7.1 shows a classification of them.

7.3.1 Close-Ended Response Items

7.3.1.1 Multiple Choice

Multiple-choice items are those in which the student must select a correct option from among several alternatives provided, which completes the statement or answers the question posed in the item heading.

Table 7.1 Specification table format for a test or exercise guide

Contents: They correspond to the topics addressed in the LO that will be assessed. In general, the conceptual or procedural contents in their cognitive dimension are included here. Attitudinal content could also be included, but we must bear in mind that we will be evaluating it only at a declarative and not a behavioural level.

Learning objective (LO): In this column is the LO(s) that will be evaluated in the test or guide. These LOs are the same ones included in the planning of the unit; they do not correspond to a new elaboration.

Assessment or achievement indicator: They are the indicators defined in the planning of the unit; it is possible to include others or reduce them if the way in which the LO was addressed in class is not consistent with the evaluation indicator.

Cognitive ability: It corresponds to the name of the category of the dimension of cognitive processes (e.g., understand, analyse, create). The purpose of this column is to check that the test or guide is addressing a diversity of cognitive abilities, so the global category is needed and not the verb of the indicator, since different indicators may point to the same cognitive process.

Item type: In this column, you put the type or types of items with which each indicator will be evaluated. Remember that this must be consistent with the taxonomic category; e.g., if my evaluation indicator points to designing something, I cannot choose a multiple-choice item. To that end, it is important to be clear about the characteristics and limitations of each type of item.

Item number: This column indicates how many items I will have of each type and how many are associated with each indicator. Thus, I can keep the assessment emphasis I am making in this test or guide. It is important to remember that the score differs depending on the type of item, so the number itself does not represent the weight; that is what the next column is for.

Weight or percentage: Approximate percentage that the group of items for that indicator represents compared to the total. This helps ensure the relevance I am giving to the indicator in the assessment and check that it is consistent with the emphasis and time allocated during classes.

Item number on the tool: This column is filled in when the tool is already designed and assembled, that is, when the final version is ready to administer. It corresponds to the number and/or letter of each item in the instrument (e.g., I. 1, 1. a, 2. c). Remember that paired terms constitute a single item.
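For readers who keep their planning in a spreadsheet or script, the following minimal sketch shows Table 7.1 as a data structure. All row values (contents, LOs, indicators, item counts) are invented for a hypothetical Natural Science test; they are our own illustrative assumptions, not taken from the chapter.

```python
# Sketch of a specification table as data; every row value is hypothetical.
spec_rows = [
    {"content": "Water cycle", "lo": "LO1",
     "indicator": "Describes the stages of the water cycle",
     "cognitive_ability": "Understand", "item_type": "multiple choice", "n_items": 6},
    {"content": "Water cycle", "lo": "LO1",
     "indicator": "Explains evaporation in everyday situations",
     "cognitive_ability": "Apply", "item_type": "short answer", "n_items": 2},
    {"content": "States of matter", "lo": "LO2",
     "indicator": "Classifies materials according to their state",
     "cognitive_ability": "Analyse", "item_type": "matching", "n_items": 1},
]

total_items = sum(row["n_items"] for row in spec_rows)
for row in spec_rows:
    # Share of items per indicator; as the chapter notes, item counts are not
    # score weights, since different item types carry different scores.
    share = 100 * row["n_items"] / total_items
    print(f'{row["indicator"]}: {row["n_items"]} item(s), {share:.0f}% of the items')
```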


Fig. 7.1 Classification of the different types of assessment tools and techniques

Fig. 7.2 Structure formats of a multiple-choice item

They can have a simple or a compound structure. As shown in the example (Fig. 7.2), the simple structure involves a question and its answer options, while the compound structure has the question, a list of possible answers, and then options that combine the answers presented. Some elements to consider in their construction are:


Grammatical agreement. Grammatical agreement must be maintained between the statement and the options, and we must avoid giving clues that help the student discard options without knowledge of the evaluated topic.

Quality of the question. The statement should be posed as a question. It is often presented as a sentence completion, but this form complicates the student's understanding of what is being requested. For example, with the statement "Pythagoras' theorem is:…," grammatical agreement with the options becomes difficult, and we lose focus on what we want the student to demonstrate as learning. By formulating it as a question, we specify what we want as evidence of learning: "What does the Pythagorean theorem state?"

Quality of the context or stimulus. We can place a stimulus or context before the question: a text or source, a graph, an image. What we must consider is how much this stimulus contributes to the item; sometimes an introduction only adds reading load, or an image only adds distraction, because the question could be answered without that information. The test is simple: cover that part of the item with your hand and leave only the question; if it can still be answered, the stimulus is unnecessary, and you should delete it.

Quality of the options. First, ensure that the item has only one correct answer among the options; if it has more than one correct option, point this out to the student so that he or she can choose the best one. In this case, there should be intermediate scores, where the most correct option has the maximum score and the partially correct ones have intermediate scores different from the incorrect choices. With regard to the wording of the options, some suggestions follow:

• The options should be as short as possible; if the same phrase is repeated in all of them, it should be included in the question to avoid redundancy in the answer options.
• Avoid using the option "all of the above" if it is the correct one. It is enough for the student to identify two options as possible to mark it and, similarly, enough to identify one incorrect option to discard it.
• Avoid using the option "none of the above" if it is the correct one, since we will not know whether the student's answer really reflects mastery of the content. For example, in situations involving a calculation, both those who reach the correct result and those who make a mistake may mark that option because their result is not among the options.
• The options should be homogeneous in terms of the content being evaluated. For example, if the question is about a musical work, all the options should refer to names of musical works and not be mixed with composers.
• If the options are numerical (dates, mathematical calculations, measurements), it is suggested to arrange them in ascending or descending order to make it easier for the student to find the option visually.


• It is suggested to keep approximately the same length in all the options, avoiding the correct answer being the longest one.
• Use only plausible distractors (incorrect options) that reflect conceptual errors or preconceptions about the content being assessed. The student should see each as a real option rather than dismissing it as obvious or too distant from what is being asked.
• Avoid writing statements and options in double negation, e.g., "Which of the following is not false?"

The number of response options. There is a belief that the correct thing to do is to have 4 options in Basic Education and 5 in Secondary Education in order to reduce the probability of answering correctly by chance. This rule comes from a psychometric logic that is not valid for assessing learning at the classroom level: we do not have a team of item writers, nor do we carry out pilot studies before applying a test; we construct it in a short time, and writing a fifth option of good quality can be very difficult or sometimes impossible. For this reason, our proposal is to use 3 or 4 options, depending on the age of the students, and to ensure that these are well constructed (a brief arithmetic illustration follows these recommendations).

Incorporating pictures. Pictures can make a test attractive to young children, especially when they have not fully developed text reading skills, and a picture can increase the cognitive complexity of a close-ended question. But remember that most schools print in black and white, and a photocopied image can end up an unreadable stain. Likewise, do not ask about colours in images that will not show them, e.g., What type of blood circulation does the red part of the diagram represent? What does the author mean by the word marked in green in the text? Why did the poster use purple and not red? Which of the following fruits is red?

Score allocation. In these types of items, the norm is that there is one correct option, which is assigned 1 point, with 0 points if the answer is incorrect. However, when the aim is to measure intermediate developments in learning, the student may be asked to mark the "most appropriate" option for a specific situation, and in this case differentiated scores could be assigned among the partially correct options. Such an allocation implies that these questions carry more weight than those without this logic. What should not happen is for an item to be worth 2 points if correct and 0 if incorrect, as this does not respect the continuity of scores: no student would have the possibility of obtaining the intermediate score (1 point).

Instructions. If several multiple-choice items are presented in a section, common instructions should be given on how to respond. For example: "Read each of the following items carefully and circle the letter that corresponds to the correct choice" or "Below are 15 multiple-choice items. Read each item and select the correct option by marking the appropriate letter in the box on the answer sheet."
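A brief arithmetic illustration of the number-of-options trade-off mentioned above (our addition, not part of the original text): the expected proportion of items answered correctly by blind guessing is 1/k for k options,

$$\frac{1}{3} \approx 0.33, \qquad \frac{1}{4} = 0.25, \qquad \frac{1}{5} = 0.20$$

so a fifth option lowers the per-item guessing rate by only five percentage points, a small gain for the cost of writing another plausible distractor.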


Box 1 Advantages and limitations of multiple-choice items
Advantages
• They allow covering a large number of contents
• The review is easy, fast, and objective (with a low probability of bias) by the teacher and/or student
• The degree of difficulty of the item can be regulated through the similarity or complexity of the options used
• They favour quick and easy analysis of test results
• Patterns of incorrect responses can be analysed, and pedagogical decisions made, if the distractors reflect common errors
• They make it possible to generate a question bank and then combine questions to assemble new tests, discarding those with problems in their application
Limitations
• Asking good questions takes time and skill
• They commonly assess cognitive abilities of a low taxonomic level, like recognising and understanding. If there is a context, they may include the ability to analyse
• They do not allow evaluating the students' ability to organise and express ideas, or their creativity
• It is easy to get side-tracked and ask specific, irrelevant questions about the content
• Results may be biased by the student's reading comprehension ability and expertise with this type of item
• They provide limited feedback about the student's understanding

7.3.1.2 True or False

It consists of a statement or sentence on which the student must make a dichotomous judgement, i.e., answer Yes or No, True or False, Correct or Incorrect.

Writing the statements. Since the item is a sentence, the way it is presented is the key to a good item. Some suggestions for wording are:

• Avoid making the statement long and complex, as this unnecessarily adds to the demand for reading comprehension. For example, "A story is a brief narration of an imaginary event for moral or recreational purposes, with expectant content, whose action is intensified and clarified in its denouement."
• Avoid making the statement trivial or obvious. For example, "Einstein was a scientist"; "fish live in an aquatic environment."
• Make sure that the statement is direct, without double negations (e.g., "Fungi never obtain nutritive substances from other living beings").
• Avoid having two or more assertions within a single statement (e.g., "Tourism activities will soon surpass hospitality activities, which have increased in recent years").
• Ensure that statements do not contain "trap" words that make them false. For example, in the sentence "In the battle of Trafalgar, the French navy had two frigates that became famous: Revenant and Confiance," the word "frigates" is incorrect because these were corvettes.


• The statement must be unambiguous, that is, there must be no room for interpretation.
• Avoid statements that are partially correct or whose answer implies "it depends." For example, with the sentence "Water boils at 100 °C," students who know that altitude influences the boiling temperature of liquids are likely to answer False, while a student with less knowledge will answer True.
• Words like "more," "little," "great," or "good" should be avoided because they are not precise enough.

It should be noted that these suggestions apply when the items are to be incorporated into a test and, therefore, must have only one correct answer. If we wanted to use this format to stimulate discussion and the students' capacity for argumentation, these rules need not be followed; on the contrary, we would expect ambiguous sentences in which both answer options can be correct depending on how they are justified.

Instructions. When this type of item is used, precise instructions for answering must be included, and there must be coherence between the instruction and the format of the item. For example, if we say, "Circle T or F, as appropriate," these letters should appear at the beginning of each statement, even if it seems obvious; sometimes a blank line is provided for the student to fill in with a letter, which creates an inconsistency.

Changing the item from close-ended to mixed. Often the phrase "justify your answer" or "justify the false ones" is added as a way of reducing chance; when we do this, the item is no longer closed and the cognitive ability to which it is associated changes, so make sure that this is reflected in the specification table.

Scoring. This type of item is dichotomous and should be scored 0 if the answer is incorrect and 1 if it is correct. When it becomes mixed, a guideline must be defined for checking the justification, and an additional score must be assigned to it. A problem occurs when only the false statements must be justified: students can see the total score for the section, calculate how many false statements there are, and answer based on that.

Box 2 Advantages and limitations of true or false items
Advantages
• They allow evaluating a large amount of content
• Correction is very easy and makes it possible to obtain and process the information in a short time
• They are a good option in formative assessment situations, since they allow identifying students' learning in specific aspects of a certain content
• The reading load is low, allowing responses within a short time


Limitations
• The items assess cognitive abilities with a low level of complexity
• It is difficult to develop good items; they often evaluate details or specific information of little overall importance
• There is a high probability of answering correctly at random (50%), so students feel confident guessing. To compensate and achieve greater reliability, a larger number of items is needed: more than 20 is suggested if the test includes no other type of item, ensuring that it measures at most one or two learning objectives to guarantee reliability
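To see why a larger number of items protects against guessing, here is a brief worked illustration (our addition, not part of the original text). The probability of answering at least k of n true-false items correctly by blind guessing follows a binomial distribution:

$$P(X \geq k) = \sum_{i=k}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^{n}$$

For a score of 70% or more, this gives 176/1024 ≈ 17% with n = 10 items, but only 60,460/1,048,576 ≈ 6% with n = 20, which is why longer true-false tests give more trustworthy results.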

7.3.1.3 Matching

This type of item has a two-column structure and consists of "a series of statements called premises which are listed down the left-hand column of an exam and a series of responses listed down the right-hand column" (Cunningham, 1998, p. 90). They are useful for assessing students' recall and understanding of a topic through the association between terms and their meanings; the complexity of the cognitive process involved will depend on the type of premises and responses used. Examples include authors and their literary works, dates and facts, scientists and their discoveries, concepts and their definitions, organs and their functions, and names of compounds and their chemical formulas. Cunningham (1998) points out that matching items function like multiple-choice items with more options, in that the correct answers to one pair become distractors for the other pairs.

The item structure includes an instruction, the premise column, and the response column. This format can be adapted for young children so that, instead of filling in the answer space next to the premise, they join the items in both columns with a line. Among the suggestions for their construction are:

Response instructions. The instructions are a key aspect in guiding the student's response; we must indicate what we expect them to associate and how we expect them to respond. For example, "Find the corresponding pair by matching the characters to their story with a line"; "Match each part of the digestive system with its function. To do this, write the corresponding letter (column B) in the space available in column A"; "In the space in parentheses next to each battle, write the corresponding date." Although it is not desirable to have answer options that can be repeated, if we do so we must state it explicitly in the instructions. For example, "Put the number of the artistic movement to which each work belongs in the corresponding space. You may repeat any of the movements."

Information placed in each column
• To identify the pairs, the premises are numbered, and the answers are identified by letters.


• The contents should be homogeneous. Each column must include content of the same type (e.g., authors and their works, scientists and their discoveries, historical facts and their dates, definitions and their concepts, organs and their functions). When the contents are mixed, the focus of the assessment changes, and what we measure is not the student's understanding of specific learning but his or her ability to discriminate and discard matching possibilities.
• The answer column must have more options than the number of premises; this prevents the student from answering randomly or completing a pair by discarding.
• Put a maximum of 10 premises in an item. When more are included, the task becomes unnecessarily complex due to the load on working memory and, in addition, it becomes more difficult to keep the content homogeneous.
• Check that each answer has only one correspondence, or make it explicit in the instructions that answers can be repeated.
• The answers should be short sentences; definitions should preferably not exceed two lines.
• Present the content of the answer column in a logical or chronological order. The effort our students invest in answering should focus on the content and not on secondary aspects such as searching for the answer; for this, you can put names in alphabetical order and dates in chronological order.
• Ensure that the item is complete on the same page and is not cut off in the final version for printing.

Let us look at some examples of items with construction problems (Boxes 3 and 4):

Box 3 Example of matching item
1. Match the clothes with the corresponding season of the year with a line.

Some construction rules that are met in this item are:
• The instructions indicate what is to be done
• The elements are presented in two columns, one grouping the seasons and the other the clothes

The problem arises in the drawings: a prototypical representation is used rather than the knowledge or experiences of the students:
• The flowers and the sun do not sufficiently discriminate between spring and summer
• On the clothing side, the sweater can be worn in all seasons depending on the geographical location; in summer, for example, it is cold on the beach in the afternoon
• The beachwear includes items that are not clothing (buckets and sunglasses)
• The columns do not have titles indicating what they contain
• The columns have the same number of elements, so they can be matched by discarding

Box 4 Example of matching item
Instruction: Place in the spaces of column A the letters of column B that correspond to the definition of each concept.

Column A
1. ___ Milky Way
2. ___ Planet
3. ___ Star
4. ___ Comet
5. ___ Natural satellite

Column B
A. Large spherical bodies that do not have their own light and that revolve around a star
B. Set of planets that revolve around a star
C. Spheres of gases at high temperatures that release enormous amounts of energy in the form of light and heat
D. Place where our solar system is located, and which is shaped like a spiral
E. Small body without its own light that revolves around the planets
F. A body that has a core made up of rocks and ice, and a tail that becomes visible as it approaches the sun

In this example, the elements that are fulfilled are:
• The instruction is clear
• Each column has homogeneous contents
• The definitions are of adequate length
• Column B has one more answer than the premises in column A

What is not fulfilled is:
• The columns do not have a title that indicates what they contain


Box 5 Advantages and limitations of matching items
Advantages
• They allow assessing the degree of association, classification, and relationship that students establish within a certain content
• The items are quick to review
• Wide content coverage can be achieved in a short time
• The review has no evaluator bias
• There is a lower probability of guessing the correct answer compared with other item types
Limitations
• Cognitive skills of a low level of complexity (remembering and understanding) are evaluated
• Construction is limited to contents with enough variability to build both columns
• If too many response options are included, the item may fail to assess the relevant knowledge of the subject and may instead measure the ability to remember what has been read

7.3.2 Open-Ended Items

These are items in which students must construct an answer, usually in written form. They are called open-ended or constructed-response items because, in most cases, there is more than one correct answer, and they require students to "construct" or develop their own answers without the support of response options (Tankersley, 2007). These items can range from asking for a single word or a couple of sentences to requiring an essay. Depending on the type of item, we can have a variety of response alternatives (there can be as many answers as students), so they are considered a good option for assessing both mastery of the discipline and creativity. They can be more difficult and cognitively complex than close-ended items because the student must draw on his or her knowledge to develop an answer, rather than evoking the information or recognising it among options (Popham, 2003). There are two main categories: short-answer items, in which students respond with a word or phrase, and long-answer items, in which students produce a text, diagram, drawing, etc. The latter involve greater cognitive complexity in the response.


7.3.2.1 Short-Answer Items

In general, these are items with a simple structure for which a response of a few words is expected. Although constructed-response, this type of item measures cognitive skills of low complexity and has characteristics similar to close-ended items. Their main advantages are that they avoid excessive writing by the student, which facilitates correction and helps make it more objective; they avoid the "chance" factor, because the student must produce an answer rather than choose it; and they allow evaluating a wide range of content or specific details of it. Among their limitations is the ease of falling into memorisation and repetition of information; therefore, they evaluate low cognitive complexity.

Cunningham (1998) points out that the first step in developing this type of item is to select the learning objectives that can be assessed with it. The questions should be formatted to avoid trivia: the correct answer must reflect some degree of competence in the learning objective, and the item must allow only one possible interpretation by the learners. There are different types: simple, identification, completion, association, ordering, and hierarchy.

Simple: these correspond to simple questions that imply a specific answer, for example, a name, a date, a number. A recommendation given by Cunningham (1998) for questions whose answer is a date is to specify the unit in which the answer is expected; for example, if I expect the year of an event, I should ask "in which year…" instead of "when…," as the answer could otherwise be ambiguous (e.g., "last century" instead of "in 1945"). Some examples of this type of item are:

• What are the chemical elements that make up water?
• Name four artistic expressive media.
• What are the components of a sentence?
• Who was the first Roman emperor?
• What is the sum of the interior and exterior angles of a triangle?
• Name three physicochemical properties of water.

Identification: this type of item consists of presenting the student with information, usually graphic or auditory, and asking him/her to identify what it corresponds to, answering in the space provided.


In these items, the student's response should not exceed two or three words for each element to be identified, and it is necessary to ensure that the space provided is sufficient. Remember that this format evaluates simple cognitive skills, so the concepts selected should be relevant to the learning of the subject matter. One example from Natural Science and one from Language Arts are discussed below (Boxes 6 and 7).

Box 6 Example of identification item
In the following diagram of the human skeleton, indicate the name of the structures that are labeled.

Some errors are evident in this item:
• There is not enough space to answer
• The diagram does not allow clearly visualising which bone is marked
• The instruction is not understandable for the age of the students


Box 7 Example of identification item
Write next to each arrow the name of the corresponding part of the news item:

In this item, the construction criteria are met, allowing us to assess knowledge of the structure of a type of text. Based on this evidence, we can continue advancing toward the elaboration of a text in this format. If we see that students are not clear about the concepts, we should reinforce them so that they understand the instructions of more complex tasks associated with future learning.

Completion: consists of the presentation of sentences in which spaces have been left for the student to complete with a word that gives meaning to the statement. The spaces should correspond to concepts or words that are key to the students' learning and that make it possible to assess whether it has been achieved. If these sentences are well elaborated, they can be a good tool for assessing understanding of concepts and not just recall of information. One consideration is that the instructions should make explicit whether each space corresponds to one word or can be more than one (the latter should be avoided). Cunningham (1998) notes that a common but unacceptable practice with these items is to extract sentences from the textbook and delete one or two words: taken out of context, the sentence tends to lose its meaning and, in addition, this assumes that the student should memorise the text, which makes no sense. The examples below show this difference (Table 7.2).

Association: these are items structured so that students must associate concepts with each other, providing evidence of learning that allows evaluating their ability to relate elements of a specific content. For their construction, items of the same type must be placed in each column. An example is presented below (Fig. 7.3).

Ordering and hierarchy: this consists of a list of elements (concepts, facts, names, characteristics, etc.) that the student is asked to order using a criterion made explicit in the instructions. The requirements and care we must take


Table 7.2 Examples of short open-ended response completion items

A good example (the missing words are associated with key aspects of Byzantine art, and enough information is given in the text to leave no room for another interpretation):
Complete the following statements, writing in each space the missing word or words: Byzantine art developed in the city of (Byzantium) and reached its peak in the (mosaic) technique, the most outstanding examples being those of Justinian and Theodora, which are located in the mausoleum of (San Vitale / Ravenna).

A bad example (the chosen sentence does not reveal the importance of this historical fact in the context of the history of Chile; it should instead leave the date or the event blank (indigenous slavery) or add something associated with the institution that decreed it (Royal Audience)):
Fill in the missing word in the space: The slavery of the natives of (Chile) is established in 1608; this reflects the economic interests woven around the (colony).

Fig. 7.3 Example of association item for Language

in their construction are similar to those already mentioned; some specific points are presented below:

Instruction: the instructions guide the student's response, so we must indicate how we expect the student to respond. For example, "Write the order in which the different stages of mitosis occur in the space provided. 1 is the first stage and 4 is the last stage."

Item construction
• Ensure consistency between the expected learning or indicator of achievement and the item being developed.
• The sequence must have only one possible correct order.
• The list should not exceed 10 elements, as the cognitive load of retaining the information complicates the item, shifting the focus from the disciplinary domain to a general cognitive domain.
• The list should include at least 4 elements to decrease the chance of answering correctly at random.


• If figures are included to be ordered, for example, from larger to smaller, they should be placed on the same line so that the student does not think that a figure is smaller because it is further back.

Box 8 Example of ordering and hierarchy item
Let us look at an example:
1. Order the different political-geographical administrative divisions of Chile according to their level of complexity. Write the number in the corresponding space. 1 is the most complex level and 4 is the simplest level.
a. ___ Region
b. ___ Province
c. ___ District
d. ___ Country

Box 9 Advantages and limitations of ordering and hierarchy items
Advantages
• They allow evaluating the degree of understanding students have of a certain content or procedure
• The items are quick to review
• The review has little evaluator bias
• The writing load on the student is very low
• They are useful for assessing sequence in History and Language
Limitations
• Cognitive skills of a low level of complexity are evaluated
• Construction is limited to contents that allow establishing hierarchies or following a sequence
• If the student gets one element wrong, he or she will get the whole item wrong even though he or she knows most of the sequence

7.3.2.2 Extended Response Questions

Extended response questions, also called "essay" questions, are used to measure students' knowledge and complex reasoning skills (Livingston, 2009). There are no word limits or structural constraints on the student's response. The breadth of the question must be considered so that the answer matches what is desired. We now review some criteria to consider when creating open-ended or constructed-response items:

Instruction: the instructions give general guidelines about how the student is expected to answer. For items whose instruction is included


in the question itself, it is not necessary to write a separate instruction, which would be redundant. The important thing is to make as clear as possible what we expect the student to do to demonstrate what he or she has learned.

Quality of the question
• Developing the items according to the planning made in the specification table gives us clarity about the assessment plan and the cognitive ability to be assessed. Keep in mind that each question evaluates one or more indicators; this makes it easier to know what the item we construct should contain.
• Write the question or task as clearly and specifically as possible with regard to the level of detail expected in the answer. The task should be clearly specified, i.e., it should state the number of arguments, justifications, or aspects required. For example, "Explain 2 effects caused by…" is more precise than "Explain some effects of…"; similarly, "Describe 3 characteristics of…" sets both the number of elements and the depth expected (describe), while "What are the characteristics of…" implies pointing out all of them, but not describing them.
• The wording and vocabulary of the items should be clear and appropriate to the age and characteristics of the students to whom they will be applied.
• Questions should be preferred over sentences to be completed. For example, "What is the main cause of global warming?" is better than "Global warming is caused by:" because it narrows the student's response to what we expect.
• The question should involve higher-order thinking skills. Ensure that the answers are not rote-learned information, as the potential of this type of item is otherwise lost.
• The question should not include biased information or information that leads to an incorrect answer; that deceives the student and there is no gain for anyone: the teacher collects poor-quality information, and the student does not demonstrate what he or she knows about the topic. With such "traps," the students with greater mastery tend to be hurt the most, because they know more about the subject and therefore think about the exceptions that may exist.
• The items should aim at answers elaborated by the students from their learning experiences, information, and available texts. It is unethical, and unsatisfactory for instructional validity, to ask about topics that students did not have the opportunity to learn.
• Avoid starting questions with "who," "what," "when," "where," "mention," or "list," as these terms limit responses to a low cognitive level; prefer starters such as "why," "describe," "explain," "compare," "analyse,"


"critique," or "evaluate." These beginnings should also relate directly to the assessment indicator. Examples of good questions that can be adapted are those of the Socratic method.
• The item should be phrased so that the question is open-ended and has multiple possible answers representing the students' personal elaboration.

Response time
• Open-ended response items are intended to assess higher-order cognitive skills, so we must give students enough time to think and write.
• A student takes about three times longer to complete an assessment than the teacher who developed it. The Eberly Center for Teaching Excellence and Educational Innovation (2022) recommends answering the assessment yourself beforehand and factoring in this time, adjusting the length or difficulty of the items if the calculated time exceeds the available time.
• Letting students know how much time they have will help them use it more efficiently. We can give recommendations on how long each item should take so that they are able to answer everything.

Correction and scoring guideline
• When drafting the items, think about the possible answers you would expect to be correct. This allows you to adjust what you are asking for in the item and to outline the review rubric while writing the question. Both rubrics and other review guidelines should consider the range of possible student responses and be consistent with the context and the statement.
• For the correction, it is suggested to list the aspects expected in the answer and then create a rubric describing the levels of performance. By doing this exercise together with the construction of the items, we can check whether the scores we plan to assign to each criterion are coherent with the depth and number of elements expected in the answer. For example, I may ask the student to describe 3 characteristics of something and assign 4 points to the item; when grading, I realise the division of the score is awkward, since a student who answers 1 or 2 characteristics correctly would receive a score with decimals (1.33 points each). I may then decide to assign 3 or 6 points to the item, according to the weight I want it to have in the test total, generating a better distribution of partial scores with integer values for each description. Another option is to give 2 points for the named characteristics (1 point if not all are correct) and 3 points for the quality of the descriptions (1 point each), making the maximum score 5 points. As can be seen, there is no single way of scoring; the important thing is that the decision is consistent with what the item measures (a brief arithmetic note follows Box 10).
• It is not recommended that students be able to choose which questions to answer, unless what we want to assess requires it. When students can


choose only some of the questions, they will answer only those for which they feel more prepared, and that does not allow us to evaluate their answers comparatively in order to make later pedagogical decisions. The only valid option would be to generate pairs of items considered parallel in terms of what they assess, letting the student choose from each pair, but this means additional effort for the teacher, who must write twice as many items of good quality.

Livingston (2009) suggests some examples of assessment situations in different subjects for which constructed-response items are a good option for assessing complex cognitive skills; they are presented in Box 10.

Box 10 Examples of useful tasks as extended open-ended response items
Language: Write an essay comparing and contrasting two poems, stories, or roles of characters from literary works
Mathematics: Write a mathematical equation to solve a problem presented with words or diagrams
Natural Sciences: Describe how some biological process occurs in a plant, or explain the ability of plants to survive in different environmental conditions
Music: Listen to a melody or chord progression and write it correctly in musical notation
Social Sciences: Write an essay comparing two instances of social or political processes that occurred in different times and regions of the world
Adapted from Livingston (2009)
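To restate the scoring arithmetic from the correction guideline above (a worked note we add here, not part of the original text): with 3 expected characteristics, the points available per characteristic for item totals of 4, 3, and 6 points are

$$\frac{4}{3} \approx 1.33, \qquad \frac{3}{3} = 1, \qquad \frac{6}{3} = 2$$

so totals of 3 or 6 points keep all partial scores as whole numbers.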

Box 11 Advantages and limitations of constructed response items
Advantages
• Creating a response is a task of greater cognitive complexity than selecting it from a list of options
• The learning evidence produced is a closer approximation to the student's actual mastery of the discipline
• They allow students to organise their answers freely, which gives them more flexibility to respond
• They avoid the random factor in students' responses
• They are easier to construct than a close-ended question


Limitations
• Determining whether the answers are correct involves a higher degree of subjectivity. To reduce it, the use of correction guidelines, and especially rubrics, is suggested
• The formulation of the questions can be ambiguous. Requiring a broad response can result in imprecision about what we want the student to answer
• Review requires more time than close-ended or short open-ended items

7.3.3 Mixed Items

These are items that incorporate aspects of both close-ended and open-ended responses. This format commonly appears as an adaptation of True or False or Multiple Choice items in which the student is asked to justify his or her decision or, in Mathematics, to show the procedure used to arrive at the result. Strictly speaking, these items become open-ended responses (brief or extended) depending on what is requested, and the same criteria and suggestions mentioned above should be followed.

7.4 Checklists

They are also called control lists or guidelines. Checklists are an assessment tool consisting of the enumeration of a series of qualitative or quantitative elements whose presence or absence is to be verified. A checklist is a list of characteristics or behaviours that the student is expected to fulfil in the execution or application of a process, skill, or product, through which the achievement of learning is verified. Its purpose is to collect information about the student's behaviour through observation. As a tool in the service of improving our students' learning, checklists focus on:

• Providing a tool for the systematic recording of observations.
• Providing the student with a tool for self-evaluation.
• Providing the learner with the assessment criteria to be considered in the review of their performance or work. These should be provided in advance so the learner can consider them in preparing the work, and not only as a means of feedback on their achievements.
• Clarifying the adjustments needed in the learning opportunities we give our students, based on concrete evidence of their current achievement.


Developing checklists is not easy, since several criteria must be considered for the tool to be of quality and provide reliable evidence of our students' learning. Some of these criteria are shared with rating scales, so below we review the specific ones; the general ones are only named here and developed in more detail after we look at the scales.

Planning the list: this is essential; to make a good list we must be clear about the learning objective we want to assess and the task the student will perform to demonstrate it.

• The performance descriptors included in the checklist should be consistent with the task and observable in it. To ensure this consistency, it is relevant to develop a table of specifications.
• Identifying what kind of information or characteristics will be useful for our purposes helps achieve greater coherence and makes the tool valid. Some typical content incorporated into checklists is: student language use, work habits, learning strategies, and classroom interaction.
• The key to planning is deciding whom we will observe (individual students or groups), how often we will apply the checklist (once or more than once), and when we will do it (in which classes or on which specific occasions).

Characteristics of the descriptors: the descriptors are the list of characteristics that will be checked in the student's performance or product, therefore:

• They must be directly observable.
• The list must be complete, that is, it must include all the observable characteristics of the object to be evaluated, avoiding both an insufficiency of aspects (which reduces reliability) and excessive thoroughness. The latter point is particularly important: when the aspects to be evaluated are very specific, it is difficult for the evaluator to observe them, so he or she tends to mark the same in all similar aspects, which is called the "halo effect" (Corbetta, 2003; Himmel et al., 1999).
• It is suggested that the list be drawn up together with other people and that a consensus be reached on the aspects to be evaluated.
• If lists taken from the Internet are used, it is important to verify the source and check the basic construction criteria: there are many so-called "checklists" that are not, corresponding instead to rating scales, or that lack a theoretical basis and therefore do not constitute a reliable or valid instrument. They can, however, be very useful for extracting ideas and adapting them to one's own requirements and needs.
• Descriptors should use clear and precise language, i.e., avoid leaving room for interpretation by the user. Vague descriptors (e.g., "teamwork") or language inappropriate for the user can lead to interpretations that decrease the reliability and validity of the tool. The expertise or familiarity of the person who will apply the instrument should be considered.


• Each descriptor must evaluate only one aspect; since only presence or absence is marked, including more than one element makes it impossible to tell whether the mark corresponds to the absence of a characteristic or to partial fulfilment of the descriptor.
• The descriptors should be grammatically coherent with the answer options. If descriptors are of different natures, they can be separated into groups, forming mini-lists with different answer options.
• The wording should be as short as possible without sacrificing comprehension of the content.
• The descriptors should follow a logical sequence associated with the temporal observation of the feature in performance; i.e., if I am reviewing a written work, the descriptors should appear in an order similar to their appearance in the text, thus facilitating application.

Regarding application: a checklist can be of high quality in its construction but lose validity and reliability if misapplied. Some suggestions for application are:

• The information collected is individual, as it is expected to capture each student's performance. If the checklist is used to assess group products, the group functions as an individual unit, and the criteria should focus only on the group and not on aspects of individual members.
• Checklists can be used during one session or over several sessions, allowing for systematic, planned, and targeted observation.
• To ensure the validity of the information, it is suggested that the checklist be applied more than once before making an evaluative judgement, since what is observed follows a presence/absence logic.

Presentation formats and response options: there are different formats for checklists, which go beyond the basic three-column structure of (1) descriptors, (2) presence, and (3) absence. Some response options and variations on the traditional format are:

• Response options should always be dichotomous, e.g., Present/Absent, Yes/No, Achieved/Not Achieved, Observed/Not Observed.
• Not Observed column: ticked if it is not possible to observe the aspect at that time.
• Observations or comments column: a space added next to each descriptor or group of descriptors that allows the evaluator to write comments on what he or she is observing. This practice allows more precise feedback to be given to the student and also allows the teacher to reconstruct a past event if necessary.

In Fig. 7.4, we have an example of a checklist created to evaluate the application of the tempera painting technique. As can be seen, descriptor 1 considers for its


Fig. 7.4 Checklist for assessing the application of the tempera painting technique

fulfilment a series of elements that the student must bring. If the student does not bring the materials, the other descriptors cannot be assessed; but what happens if he or she brings only one brush instead of two? Strictly speaking, it should be marked "No," since the criterion is not met in its entirety, but in practice we would very likely mark "Yes," thinking that what was missing is not so relevant to performing the task. This flexibility is what makes the criterion ambiguous, since whether this or another exception is made will depend on the person applying it. That is why descriptors should address only one element each; the same applies to descriptor 2. Descriptor 3, on the other hand, is a good example: although its wording seems ambiguous at first ("Uses well…"), it defines what this means (homogeneous mixing of tempera and water); the same is evident in descriptor 4.

Descriptors 8 and 9 add the phrase "without difficulty," which makes the content ambiguous, since the degree of difficulty is a subjective perception of the person facing a task; it is not appropriate for an external observer, in this case the teacher, to make a judgement about this, and it also implicitly alludes to the student's "talent." If descriptor 8 instead said, "Applies strokes correctly: uses paint diluted in water to draw a sketch of what he or she will paint," it would refer to technique, and this could be assessed. Similarly, in descriptor 9, what we are interested in assessing is that the student manages to mix and obtain the desired colours. What is implicit in this "facility" is the understanding of primary, secondary, and tertiary colours: whoever masters them will arrive more quickly at the colour they want. This descriptor could be "Obtains the intended colours" or "Uses the theoretical concepts to obtain secondary and tertiary colours."


Another element to consider in the analysis of this tool is that the first two descriptors address formal aspects and not the achievement indicator directly, since they are not part of the learning but elements of responsibility. Although they can be included in the list to signal their importance to the student, they should be excluded when making a judgement on the achievement of the indicator, so as not to distort the evidence.
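For teachers who record observations digitally, the following minimal sketch (our own illustration, not from the chapter) captures the checklist structure described above: dichotomous Yes/No marks, an optional "Not observed" state, and a comments column. The descriptor texts are invented examples in the spirit of Fig. 7.4.

```python
from dataclasses import dataclass

@dataclass
class ChecklistEntry:
    descriptor: str            # one directly observable aspect only
    achieved: "bool | None"    # True = Yes, False = No, None = Not observed
    comment: str = ""          # optional observation for later feedback

def summarise(entries: list) -> str:
    # Only descriptors actually observed count toward the judgement.
    observed = [e for e in entries if e.achieved is not None]
    met = sum(1 for e in observed if e.achieved)
    return f"{met}/{len(observed)} observed descriptors achieved"

record = [
    ChecklistEntry("Mixes tempera and water homogeneously", True),
    ChecklistEntry("Cleans the brush before changing colour", False,
                   "Colours mixed on the palette"),
    ChecklistEntry("Covers the whole sheet with paint", None),  # not observable this session
]
print(summarise(record))  # -> "1/2 observed descriptors achieved"
```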

7.5 Rating Scales

Rating or assessment scales are recording tools composed of a coherent set of descriptors considered indicative of a more general concept (Corbetta, 2003). Scales allow teachers to make a judgement about the degree or frequency with which a student exhibits certain behaviours, skills, and strategies. They also establish the criteria to be assessed and generally present between three and five response options to describe the quality or frequency of the student's work. These elements or criteria correspond to an "operationalisation" of an abstract concept that cannot be directly observed, for which the teacher and/or the student indicates the degree or frequency with which each is evidenced in a task or performance, using a pre-established assessment code.

An example of this operationalisation is the way in which "respect for others" is assessed. Respect is an abstract concept that cannot be measured directly; however, there are observable and measurable behaviours that can be considered elements of respect, such as being silent while another person speaks or taking turns in a conversation. These elements do not cover the concept in its entirety, but they allow us to assess it, to a greater or lesser extent, depending on the quality of our descriptors.

Scales arose in response to the need to measure complex concepts that are not always observable, such as love, self-esteem, happiness, religiosity, respect, or anxiety, allowing an approximation to these constructs through the observation of behaviours and opinions, from which it is inferred that these acts depend on people's characteristics (Corbetta, 2003). Although they come from psychology, they were adopted in the educational field and today are a very useful tool for evaluating attitudes, behaviours, and performance on specific tasks.

The purposes of rating scales as an assessment tool in the service of learning improvement are similar to those of the checklists mentioned above:

• They provide a tool for the systematic recording of observations, but with a graded, non-dichotomous judgement.
• They provide the student with a tool for self-evaluation, especially regarding the quality of their work.
• They provide the learner with the assessment criteria to be considered in the review of their performance. These should be provided in advance so that the learner can consider them in preparing the work, and not only as a means of feedback on their achievements.
• They allow us to identify what adjustments need to be made in teaching to generate learning opportunities for our students, based on concrete evidence of their current achievements.

7.6 Types of Scales

Scales can be numerical, graphical, or descriptive depending on how the response categories are expressed.

7.6.1 Numerical scales

In this type of scale, the person indicates the degree or persistence with which the descriptor is expressed by marking a number on a range that usually runs from 3 to 5 points. The instructions should indicate that the evaluated characteristic is expressed in increasing degree, that is, the higher the number on the scale, the more present or relevant the characteristic of the descriptor is. For example, if we wanted to evaluate the general concept of “coexistence,” we could use a scale, observe how the students behave during recess, and generate a profile for each one. In Fig. 7.5 we can see that all the descriptors, with the exception of 1, have the maximum desirable score, since they describe positive characteristics or behaviour, while descriptor 1 has the minimum desirable score, since what it describes is negative behaviour. While there is no standard for this, it is important in the analysis to consider that the ideal behavioural “profile” is not at 5 for the whole scale. It is also important to point out that in these scales the numbers do not have a descriptive meaning: the fact that a student is at 4 does not mean that he or she almost always performs this behaviour, but that on a scale of 1–5 that child is close to the maximum intensity of the evaluated action.
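As a sketch of how such a profile might be tabulated, assuming an invented three-descriptor scale and Python purely for illustration (the chapter itself ties the analysis to no tool):

    # Hypothetical ratings on a 1-5 coexistence scale. Descriptor 1
    # describes a negative behaviour, so its desirable score is the
    # minimum rather than the maximum.
    ratings = {
        "1. Teases classmates during games": 2,
        "2. Invites others to join the game": 4,
        "3. Respects the rules agreed for recess": 5,
    }
    NEGATIVE = {"1. Teases classmates during games"}

    # Build a profile in which 5 is always the desirable pole by
    # reversing negative descriptors (on a 1-5 scale: reversed = 6 - score).
    profile = {d: (6 - s if d in NEGATIVE else s) for d, s in ratings.items()}
    for descriptor, score in profile.items():
        print(f"{descriptor}: {score}/5")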

7.6.2 Graphic scales

Graphic scales have two forms of representation: (1) a graphic profile of a numerical or descriptive scale (Fig. 7.6), or (2) for each descriptor, a scale whose categories are images that represent the gradation, so that the student or teacher evaluates the degree of execution of a task by marking the icon or image that best represents their appreciation (Fig. 7.7).


7.6.3 Descriptive scales

These scales assess the performance of a task or an opinion on a topic using response categories such as “always,” “usually,” “sometimes,” and “never,” relevant to the aspect being assessed. Understanding that there is always room for improvement in learning, descriptive scales are a good tool for students to identify their specific strengths and needs using these categories. The more accurate and descriptive the words are for each item on the scale, the more reliable the tool will be. Rating scales are most effective in providing feedback to the learner when descriptors use clearly understandable measures, such as frequency. When it comes to “quality,” scales are less effective because the categories do not contain enough information about what criteria each of them implies; in that case, it is recommended to use rubrics, which describe these partial performances in detail. Gradation can indicate frequency, intensity, or quality of a given task; some examples of response categories are presented in Table 7.3.

Let us look at the scale presented in Fig. 7.8. The instructions are addressed to elementary school students, but the language used seems adult-oriented. As noted in the instructions, the purpose of this scale is to identify the learning strategies students use when preparing for an assessment, yet descriptors 2 and 5 focus on how students act when faced with an assessment and not on their strategies for studying prior to it. In terms of the construction of the descriptors, descriptor 2 could be stated in fewer words (e.g., “During an assessment I make an effort to get a good grade”), and descriptor 3 alludes to a prior step: saying that one uses the underlined words implies having read the text or notebook beforehand and underlined key words. The response categories are grammatically consistent with the descriptors, but the abbreviations in the table could confuse students or waste their time going back to the instructions, especially in the first years of elementary school.

Fig. 7.5 Numerical scale to assess school coexistence

Fig. 7.6 Graphic profile of the student in the assessment of his or her behaviour in coexistence

Fig. 7.7 Example of graphic scale


Table 7.3 Examples of response categories for rating scales

Frequency:
• Always, almost always, often, almost never, never
• Usually, generally, sporadically
• Often, sometimes, a few times, rarely

Intensity:
• A lot, quite a bit, little, nothing
• Strongly agree, agree, disagree, strongly disagree
• Very satisfied, satisfied, neutral, dissatisfied, very dissatisfied

Task quality:
• Poor, sufficient, good, excellent
• Not achieved, partially achieved, achieved, very well achieved

Fig. 7.8 Descriptive assessment scale to assess study strategies in elementary education students

7.7 Recommendations for the Construction of Checklists and Rating Scales

Develop a table of specifications: just as for the development of a test, these tools require planning of their structure, dimensions, and conceptualization. Unlike tests, the cognitive skills associated with the task are not included here, because that is not the tool’s purpose; it is, however, relevant to define what is understood by each dimension or subdimension. Table 7.4 explains each column and Table 7.5 exemplifies this structure.


Table 7.4 Table of specifications for preparing a checklist or an assessment scale

Dimension: The name of the dimension is placed in this column. It must be manageable, so it is recommended that it be short but representative of the content.

Subdimension: This column is optional; only if the dimension is very broad and it is necessary to specify some distinctions considered relevant can dimension-specific categories be added.

Conceptualization: This column should define briefly but clearly what will be understood under the name of the dimension. Although the name is representative, this column gives the theoretical basis or frame of reference on which the dimension rests.

Descriptor number on the tool: This column is filled in once the tool has been finalized and is ready for application. Here you can put the numbers of the descriptors that belong to each dimension and subdimension, or place the full descriptors. With this column, the original allocation of descriptors to a dimension is preserved, facilitating the analysis of the collected information.

Table 7.5 Table of specifications to develop a scale of perception of violence in adolescents

Dimension | Description | Descriptor number on the tool
Peer violence | Justification of violence between equals as a reaction and courage | 1, 5, 8, 11, 13, 14, 16
Domestic violence | Sexist beliefs and justification of domestic violence | 2, 4, 7, 9, 15, 18, 21, 22
Violence against minorities | Intolerance and justification of violence towards minorities as punishment for being that way | 3, 6, 10, 12, 17, 19, 20

Construct the introductory paragraph and response instructions considering relevance. For this, take into account:

• The indications for the application of the instrument (if performances are to be observed, an application protocol is required).
• The purpose of the application.
• The response instructions.
• How the information will be analysed.
• An explanation of the concepts that make up the answer options (when descriptive).


Develop the descriptors or statements considering the language and wording requirements of each type of tool. Consider here also how it will be applied (self-report, or applied by the teacher or a peer), the purpose of the instrument, and the definitions set out in the specifications table.

Select the response category most appropriate to the purpose and the assessment situation. First define whether it is intensity, frequency, or quality, and then which categories are most appropriate. In general, four categories are used when the central or neutral point does not provide relevant information; this intermediate category is included when we are interested in knowing whether or not the student has a position on a topic. In the case of checklists, there are only two response categories. In both types of tools, the grammatical coherence of the descriptors with the response categories must be safeguarded.

Share your work with others. It is always important to have a peer look at our work and act as an expert judge. When we have worked a great deal on an instrument, we tend to overlook words or details that could be improved. Someone who has not been involved in the development can offer questions or comments that will enrich our tool.

Consider the relevance of the analysis of the information collected. Depending on the type of scale, different analyses can be carried out. For example, if the scale assesses more than one expected learning, the analysis should be done for each learning outcome or dimension, rather than pooling all the responses and analysing the scale as a whole. To avoid “falling into this temptation,” it is recommended to construct unidimensional mini-scales, in which everything assessed corresponds to a single learning outcome; the same can be done with checklists.

Rating scales are very useful for assessing students’ performance of tasks and their attitudes toward a particular subject; they are relatively easy to construct, their application requires little time, and their correction and analysis are of low complexity. However, they have limitations due to their appreciative nature, which necessarily implies subjectivity in the information collected. This drawback must be taken into account when constructing the instrument, striving for objectivity and reliability of the results in order to qualify and make decisions based on them.
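The disaggregated, per-dimension analysis recommended above can be sketched as follows, reusing the descriptor-to-dimension mapping of a specifications table such as Table 7.5 (the coded responses are invented, and Python is used only for illustration):

    # Descriptor-to-dimension mapping, as recorded in the last column of
    # a table of specifications (numbering follows Table 7.5).
    SPEC = {
        "Peer violence":     [1, 5, 8, 11, 13, 14, 16],
        "Domestic violence": [2, 4, 7, 9, 15, 18, 21, 22],
    }

    # Hypothetical coded responses of one student: descriptor number -> 1-4 code.
    responses = {1: 3, 2: 1, 4: 2, 5: 4, 7: 1, 8: 3, 9: 2, 11: 4,
                 13: 3, 14: 4, 15: 1, 16: 3, 18: 2, 21: 1, 22: 2}

    # Analyse each dimension separately instead of pooling the whole scale.
    for dimension, items in SPEC.items():
        scores = [responses[i] for i in items if i in responses]
        print(f"{dimension}: mean = {sum(scores) / len(scores):.2f} (n = {len(scores)})")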

7.8 The Rubric

When we have to evaluate practical performances such as an oral presentation, a work of visual art, or a musical performance, we face the problem of how to safeguard objectivity. We usually use a rating scale or a checklist and assign scores to a set of criteria or aspects to be considered. But what happens when the guideline does not fit what we are observing, or when the final product is more than the parts we are evaluating with the guideline? Another important element, and one that is increasingly intentional, is peer assessment; but how do we ensure that all students are using the same assessment criteria and grading the complexity of the performance equally? Again, the answer is to use a good assessment guideline. However, if an indicator in an oral presentation were “Voice volume is adequate,” I must ensure that all assessors (or at least most) assign the same score, which implies that it is clear to everyone what “adequate” means in that context, for students of a certain age, and so on. If we have to make this clarification for each indicator, the task can be endless and therefore unfeasible for a classroom activity.

In this context, the rubric appears to be a more robust assessment tool than checklists and rating scales, since it explicitly describes the expected performances and their graduation, associating them with a category that can be quantitative (a score) or qualitative (a level). Although the current trend is to treat the rubric as the assessment tool par excellence, given its versatility and pedagogical potential (Blanco, 2008), it is necessary to take its scope into consideration, since it can condition the teacher in terms of the elements evaluated. Thus, a rubric is not a universal instrument; it requires adaptations according to the context of the task to be evaluated and the characteristics of the students to whom it will be applied (Spence, 2010).

7.8.1 What is a Rubric

The term “rubric” comes from the Latin rubrica, red ochre. This colour marked the relevance of an element or aspect that had to be explained because it was considered important (Balch et al., 2016). Thus, a rubric indicates or values that which is most relevant.

Rubrics are evaluation matrices or double-entry tables that present on one axis the criteria to be evaluated and, on the other, the grading of the quality or development of the expected performance for each criterion (Mertler, 2001; Montgomery, 2000). Such grading corresponds to a specific description of student performance in a project, presentation, research, portfolio, or any task that involves observing different levels of student performance. These descriptions serve as a guide for assigning a score or category to the student with respect to his or her performance on a task (Blanco, 2008; Mertler, 2001).

The rubric is a concrete narrative description of the criteria considered in making a judgment about a student’s performance on an assessment task. These criteria are set at various levels of proficiency, ranging from not meeting the requirement or criterion to proficiently demonstrating mastery (Balch et al., 2016). Rubrics are explicit matrix schemes used to classify student products or behaviours (evidence) into categories that vary along a continuum (Allen, 2014). Rubrics provide the criteria for assessing student work. Allen (2014) and Andrade (2000) argue that they can be used to assess a multiplicity of products or performances, including essays, research reports, portfolios, artwork, recitals, oral presentations, performances, and group activities. They can also be applied by the students themselves or by their peers, in addition to the teacher (Andrade, 2000).


Rubrics are versatile instruments because they can be used for a variety of purposes. They can clarify students’ expectations about the learning goals and standards of an assessment task; they can provide formative feedback to students; and they can be used to grade assessment tasks and/or to assess courses and programmes (Allen & Tanner, 2006; Balch et al., 2016; Brookhart, 2013). Churches (2015) suggests that rubrics are tools that can provide students with effective feedback under a formative assessment approach.

Rubrics refer to criteria, not to norms. Those who assess with rubrics ask at what level of achievement the student’s performance or product stands with respect to the defined standard, not how it compares with other students (Allen, 2014). This assessment tool can meet the requirements of reliability (explicit criteria and performance indicators), validity (consistency with stated learning outcomes), and effectiveness (transparency, consistency, and detailed feedback) (Balch et al., 2016).

7.8.2 Contribution of the Rubrics

Throughout this book we have seen that learning unfolds through formative assessment that provides meaningful and effective feedback to the learner. In this context, the rubric is a tool that contributes to building such a learning environment. Below we examine some of its advantages (Allen, 2014; Andrade, 2000; Brookhart, 2013):

• They can be used for formative and summative assessment. In the former, they provide feedback to students to enrich their learning; in the latter, they classify evidence and assign scores in a valid and reliable manner.
• They clarify expectations for students, who learn more when they are clear about what is expected and understand what they must do or learn to reach the satisfactory level.
• They motivate and involve students in self-monitoring their own learning throughout the development of the assessment task.
• They make the process of classifying evidence and assigning scores (and subsequent grading) clearer for both the teacher and the student, so that:
  – They speed up the grading of assessment tasks by teachers.
  – They reduce student complaints about the correction of assessment tasks.
• They help the pedagogical team create better tasks or assessment situations that ensure students show what they are required to display.
• They help teachers adapt teaching to meet the needs and challenges identified in the development of assessment tasks.
• They give each student explicit feedback on the main dimensions of the assessed performance or product, so that he or she knows what is going well and what needs to improve.


In conclusion, we could argue that rubrics represent a powerful tool for teaching and assessment. They contribute to improving student learning, as they help to define the “quality” of the learning goal and lead students towards its achievement; in turn, they provide the teacher with relevant information to guide teaching and the design of assessment tasks.

7.8.3 Types of Rubrics

The literature describes various types of rubrics (Allen & Tanner, 2006; Andrade, 2008; Brookhart, 2013): one typology distinguishes holistic from analytic rubrics, while another distinguishes generic from task-specific rubrics. The choice between them should be coherent with the aspect to be evaluated and its objective, as it depends on what is being evaluated and for what purpose.

7.8.3.1 Generic and Specific Rubrics

A generic rubric contains criteria that are general to a type of assessment task and can be used for similar tasks or performances. Its criteria are evaluated separately and it is therefore associated with an analytic rubric. A task-specific rubric assesses criteria unique to a particular task; its criteria and descriptors reflect the specific characteristics of the performance or product being assessed. The division between generic and specific rubrics is blurred, as task-specific rubrics are often based on generic ones. It is also possible to design hybrid rubrics that combine characteristics or criteria of both types.

Generic rubrics can be applied to a number of different tasks. For example, some are used to assess a global skill in any subject, such as speaking or interpersonal relations; a generic rubric could thus be applied to any assessment task designed for the same skill or learning. Hybrid rubrics, which combine features of generic and task-specific rubrics, are very useful in classroom assessment because they provide feedback to students both on the broad dimensions of learning and on their performance in the specific competencies and knowledge being assessed, aligned in a single tool.

An example of a generic rubric for formative and summative assessment of teamwork is presented below. To develop it, we considered the cross-cutting learning objective set out in the curricular bases from 7th grade of primary to 2nd grade of secondary education: “To work in teams in a responsible manner, building cooperative relationships based on mutual trust, and adequately resolving conflicts.” The rubric contains four criteria or dimensions to be evaluated (which are analytical), each graded into four performance levels with their respective descriptors.


Rubric to assess and provide feedback on teamwork

Criterion: Organization of group work

Excellent: The group members organize the tasks for the fulfilment of the goal.
Satisfying: The group members organize the tasks for the fulfilment of the goal.
Progressing: The group members divide the tasks to be carried out.
Unsatisfactory: The group members do not divide tasks; they all do everything or ask the teacher for help dividing the tasks.

Excellent: The group assigns roles to achieve the goal, ensuring that the workload is equitably defined among its members.
Satisfying: The group assigns roles equally among all its members, depending on the goal to be achieved.
Progressing: The group assigns roles based on the tasks that were divided, without considering the goal to be achieved.
Unsatisfactory: The group assigns roles without considering the defined tasks or the goal to be achieved.

Excellent: The group sets and commits to the deadlines and dates for carrying out each of the tasks and organizes them into stages and partial goals.
Satisfying: The group specifies the deadlines and dates for carrying out each of the tasks based on the goal to be achieved.
Progressing: The group sets the deadlines and dates for completing each of the tasks.
Unsatisfactory: The group does not define the deadlines and dates for carrying out each of the tasks, does so incompletely, or sets dates or deadlines that are not viable.

Criterion: Collective and individual responsibility (compliance with proposals and commitments)

Excellent: Collective responsibility: the work team develops all the planned tasks and meets the defined deadlines.
Satisfying: Collective responsibility: the work team develops the majority (more than 75%) of the planned tasks and meets the defined deadlines.
Progressing: Collective responsibility: the work team develops less than 75% of the planned tasks and meets the defined deadlines most of the time.
Unsatisfactory: Collective responsibility: the work team develops half or less of the planned tasks.

Excellent: Individual responsibility: each student attends the work sessions defined for the team (between 90 and 100%) and shows up on time for all sessions.
Satisfying: Individual responsibility: each student attends the majority (over 75%) of the sessions, although not every student shows up on time for all of them.
Progressing: Individual responsibility: each student attends less than 75% of the sessions.
Unsatisfactory: Individual responsibility: each student attends only half of the sessions.

Excellent: Each student turns in the assigned tasks in full.
Satisfying: Each student completes the assigned tasks most of the time (75%).
Progressing: Each student seldom (less than half the time) completes the assigned tasks.
Unsatisfactory: Each student turns in most assigned tasks incompletely or not at all.

Excellent: Each student submits the assigned tasks in advance.
Satisfying: Each student turns in the assigned tasks on time.
Progressing: Each student submits the assigned tasks on time most of the time (more than 75%).
Unsatisfactory: Each student submits assigned tasks after the deadline or relies on others to develop their work.

Criterion: Conflict resolution (how problems and difficulties are resolved during the development of the group work)

Excellent: Team members recognize a conflict, difficulty, or difference of opinion, and each actively seeks and suggests solutions.
Satisfying: Team members recognize a conflict, difficulty, or difference of opinion, and several of them seek and suggest solutions.
Progressing: Team members do not recognize a conflict, difficulty, or difference of opinion, and very few seek and suggest solutions.
Unsatisfactory: Team members do not recognize a conflict, difficulty, or difference of opinion, and do not seek or suggest solutions to address it.

Excellent: Team members make the decision that generates the greatest adherence and at the same time allows the conflict to be resolved.
Satisfying: Team members make the decision that allows the conflict to be resolved.
Progressing: Team members partially resolve the conflict, using strategies that generate little or no support among members.
Unsatisfactory: Team members do not resolve the conflict but turn to others outside the group to deal with it.

Criterion: Quality of interaction (interpersonal relationships)

Excellent: The student always respects speaking turns, relates respectfully in group interactions, is capable of assessing the different points of view provided, recognizes errors, and accepts suggestions.
Satisfying: The student respects speaking turns and maintains a respectful attitude within the group most of the time; however, he or she has difficulty recognizing errors and accepting suggestions.
Progressing: The student has difficulty respecting speaking turns and maintaining a respectful attitude in group interactions; likewise, he or she fails to assess different points of view, recognize errors, or accept suggestions.
Unsatisfactory: The student does not respect speaking turns, does not relate respectfully in group interactions, and is not able to assess the different points of view provided, recognize errors, or accept suggestions.

7.8.3.2 Holistic Rubrics

Holistic rubrics are matrices in which the description of each grade encompasses a group of specific criteria intended to represent the action as a whole (Mertler, 2001). They are suitable for assessing tasks that are difficult to disaggregate into parts for scoring, and where the grade should reflect overall achievement of the objectives (Truemper, 2004). Holistic rubrics are usually used when errors in some part of the process can be tolerated as long as they do not affect the overall quality (Mertler, 2001). An example of this type of rubric is presented in Fig. 7.9.

The advantage of a holistic rubric is that it focuses on the overall quality of the product, skill, or understanding of a specific content (unidimensional assessment), which results in a faster assessment process, since the teacher examines the student’s product only once to assess it in a general sense (Mertler, 2001). As a result, this type of rubric is mainly used in summative assessments. Its main disadvantage is that an overall description of performance is less precise and may place in the same category students whose performance is near-complete for that category and students whose performance is incipient. In addition, students receive less detailed feedback and therefore have less guidance as to what they need to address to improve their performance on similar tasks.

7.8.3.3 Analytic Rubrics

The analytic rubric is a double-entry matrix in which the product or performance to be evaluated is disaggregated into parts, and the interest lies in gathering evidence of the student’s performance in these specific dimensions, each with its own gradation of descriptions. Thus, there is no global score, but rather a set of partial scores that can be analysed in a disaggregated manner or added together to generate a total score (Blanco, 2008; Truemper, 2004). Analytic rubrics are mainly used when a multidimensional assessment is expected, which occurs in tasks in which the parts of the process are relevant and a limited range of possible answers is expected from students (Mertler, 2001). The teamwork rubric above is an example of this type; another example is presented in Fig. 7.10.

Among the advantages of this type of rubric is the quality of the feedback to students, who can know with greater precision where they are having difficulties and which elements they have already achieved in their work, making it a good instrument for formative or process assessments (Arter & McTighe, 2001). In addition, it is possible to create a “profile” of each student’s specific strengths and weaknesses and work on them in class. The main disadvantages relate to time: the assessment process can be substantially slower in both construction and application, because it requires the teacher to examine the product several times to assess different skills or characteristics individually.
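A sketch of how an analytic rubric’s partial judgements can be kept per dimension and, when a grade is required, added into a total. The point values and the levels chosen per criterion are illustrative assumptions; the book does not fix a scoring scheme:

    # Illustrative point values for the four performance categories used
    # in the teamwork rubric above (the mapping is an assumption).
    POINTS = {"Excellent": 4, "Satisfying": 3, "Progressing": 2, "Unsatisfactory": 1}

    # One student's judged level per criterion of the analytic rubric.
    judgements = {
        "Organization of group work": "Satisfying",
        "Collective and individual responsibility": "Excellent",
        "Conflict resolution": "Progressing",
        "Quality of interaction": "Satisfying",
    }

    # Keep the disaggregated profile (useful for feedback)...
    profile = {criterion: POINTS[level] for criterion, level in judgements.items()}
    # ...and add the partial scores only when a total is actually needed.
    total = sum(profile.values())
    print(profile)
    print(f"total = {total}/{4 * len(profile)}")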

7.8.4 Steps for Designing and Developing a Rubric

This assessment tool requires planning and development time that can be quite long; however, once constructed, it significantly reduces the time needed to correct or assess students’ work. The steps described below are a compilation of the proposals of Mertler (2001) and Mineduc (2006):

Fig. 7.9 Holistic rubric to assess analysis and application skills of a specific disciplinary content


Fig. 7.10 Analytic rubric for assessing learning associated with concept mapping

Step 1: Preliminarily define which dimensions of learning will be assessed or evaluated. This involves re-examining the learning objectives the assignment is expected to achieve, allowing the teacher to match the scoring structure to the objectives and instructions. It is recommended that the dimensions or aspects to be assessed be arranged from most important to least important.

Step 2: Identify the specific observable characteristics that you want (as well as those you do not want) students to demonstrate in their products or processes. Specify the characteristics, skills, or behaviours that should be evident in student performances or products, as well as common errors that should not be present.

Step 3: Brainstorm characteristics that describe each attribute. Identify ways to describe performances above the mean, at the mean, and below the mean for each observable characteristic identified in Step 2.

Step 4a: For holistic rubrics, write detailed narrative descriptions of excellent and poor work, incorporating each characteristic or attribute into the rubric. The highest and lowest levels of performance should be made explicit by combining the descriptions of all characteristics.

Step 4b: For analytic rubrics, write detailed narrative descriptions of excellent and poor work for each individual characteristic or attribute. Describe the highest and lowest levels of performance using the descriptors for each characteristic or attribute separately.

Step 5a: For holistic rubrics, complete the rubric with descriptions of the other levels on the continuum from excellent to poor for the collective characteristics. It is suggested that descriptions be written for all intermediate levels of performance.

Step 5b: For analytic rubrics, complete the rubric by describing the remaining levels of the continuum from excellent to poor for each characteristic. It is recommended that descriptions be written for all intermediate levels of performance and for each characteristic or attribute separately.

Step 6: Choose examples of students’ work that illustrate each level. These help the teacher assign scores, and in the future such work can serve as milestones or exemplifications of the levels.


Step 7: Revise the rubric as needed. The teacher should be prepared to reflect on the effectiveness of the rubric and revise it before implementing it again.

7.8.5 Rubric Application Process

The following are key aspects to consider before using the rubric as an assessment tool:

a. Define and briefly describe the task you want to evaluate with a rubric.
b. State the learning objectives involved in the assignment, considering all the learning that the assignment will assess.
c. Using the construction criteria already described, construct a rubric aligned with the learning objectives.
d. Evaluate the rubric before using it. Brookhart (2013) recommends doing so by considering the following quality criteria:

• When examining the dimensions, assess the following aspects:
  – Does each dimension cover the important elements of student performance?
  – Does the dimension capture key themes in your teaching?
  – Are the dimensions clear?
  – Are the dimensions distinctly different from each other?
  – Do the dimensions represent skills the student already possesses (such as organisation, analysis, using conventions)?
• When examining the descriptors or descriptions of each dimension, evaluate the following aspects:
  – Do the descriptions match the dimensions?
  – Are the descriptions clear and different from each other?
  – If you use points, is there a clear rationale for assigning them to each dimension?
  – If you use a rubric with performance levels, are the descriptions appropriate and equally weighted across those levels?
• When examining the full scale of the rubric, evaluate the following aspects:
  – Do the descriptors under each level truly represent that level of performance?
  – Are the scale labels (such as exemplary, proficient, beginner) encouraging and informative without being negative or demotivating?

7.9 Suggestions for the Use of Rubrics in the Classroom

We highlight the approaches of Brookhart (2013), Andrade (2000), and Allen (2014), who offer a series of recommendations for implementing rubrics in the classroom, increasing their potential to support student learning.

Grade using a rubric. Provide it at the beginning of the assessment situation so that students know what your expectations are and how they will be assessed. Set aside time to discuss the rubric with your students before they begin to develop their work.

Develop a rubric with your students for an assessment task such as a group project. Students can then monitor themselves and their peers using agreed-upon criteria that they helped develop. Many teachers have found that when students develop the assessment criteria, the standards they set for themselves are higher than those the teacher would define.

Ask students to apply the rubric to some sample products before creating their own. Teachers report that students are very accurate in doing this, and the process tends to help them evaluate their own products as they develop them. Their ability to evaluate, edit, and improve drafts is very important.

Ask students to exchange drafts of their work and provide feedback to their peers using the rubric. Then give students a few days to rework their drafts based on the feedback before turning in the final product to the teacher.

Ask students to self-assess their products using the rubric and turn in the self-assessment with the assignment. The teacher then assesses the final work, and the gaps between the two assessments can be compared.

In summary, the type of instrument used must be relevant to the learning objectives to be assessed. As teachers, we must remember that assessment techniques and tools have limitations and advantages, so it is important to know them thoroughly in order to select or construct the one that best suits our needs. Next, we will delve into graphic organisers as an assessment tool for initial and ongoing formative evaluations, integrating students as assessment agents in the learning process.

References

Allen, M. J. (2014). Using rubrics to grade, assess and improve student learning. Strengthening our roots: Quality, opportunity and success professional development day. Miami-Dade College. https://www.academia.edu/39389988/Using_Rubrics_to_Grade_Assess_and_Improve_Student_Learning

Allen, D., & Tanner, K. (2006). Rubrics: Tools for making learning goals and evaluation criteria explicit for both teachers and learners. CBE-Life Sciences Education, 5(3), 197–203. https://doi.org/10.1187/cbe.06-06-0168


Andrade, H. G. (2000). Using rubrics to promote thinking and learning. Educational Leadership, 57(5), 13–18.

Andrade, H. G. (2008). Putting rubrics to the test: The effect of a model, criteria generation, and rubric-referenced self-assessment on elementary school students’ writing. Educational Measurement: Issues and Practice, 27(2), 3–13. https://doi.org/10.1111/j.1745-3992.2008.00118.x

Arter, J. A., & McTighe, J. (2001). Scoring rubrics in the classroom: Using performance criteria for assessing and improving student performance. Corwin Press.

Balch, D. E., Blanck, R., & Balch, D. H. (2016). Rubrics: Sharing the rules of the game. Journal of Instructional Research, 5, 19–49. https://doi.org/10.9743/JIR.2016.4

Blanco, Á. (2008). Las rúbricas: un instrumento útil para la evaluación de competencias [Rubrics: A useful instrument for the assessment of competencies]. In L. Prieto (Ed.), La enseñanza universitaria centrada en el aprendizaje: Estrategias útiles para el profesorado [University education based on learning: Useful strategies for teachers] (pp. 171–188). Octaedro-ICE. https://dialnet.unirioja.es/servlet/libro?codigo=289201

Brookhart, S. M. (2013). How to create and use rubrics for formative assessment and grading. Association for Supervision and Curriculum Development [ASCD].

Churches, A. (2015). A guide to formative and summative assessment and rubric development. In Proceedings of the 21st Century Fluency Project. https://www.sarahnilsson.org/app/download/965095587/formative+v+summative+assessment.pdf

Corbetta, P. (2003). Metodología y técnicas de la investigación social [Social research methodology and techniques]. McGraw Hill.

Cunningham, G. K. (1998). Assessment in the classroom: Constructing and interpreting texts. Falmer Press.

Eberly Center for Teaching Excellence and Educational Innovation. (2022). Creating exams. https://www.cmu.edu/teaching/assessment/assesslearning/creatingexams.html

Himmel, E., Olivares, M. A., & Zabalza, J. (1999). Hacia una evaluación educativa. Aprender para evaluar y evaluar para aprender, vol. I [Towards an educational evaluation. Learning to assess and assessing to learn, vol. I]. Pontificia Universidad Católica de Chile and Chilean Ministry of Education [Mineduc].

Livingston, S. (2009). Constructed-response test questions: Why we use them; how we score them. ERIC. Retrieved November 25, 2022, from http://files.eric.ed.gov/fulltext/ED507802.pdf

Mertler, C. A. (2001). Designing scoring rubrics for your classroom. Practical Assessment, Research and Evaluation, 7(1), 1–10. https://doi.org/10.7275/gcy8-0w24

Mineduc (2006). Evaluación para el aprendizaje: Enfoque y materiales prácticos para lograr que sus estudiantes aprendan más y mejor [Assessment for learning: Practical approach and materials to help your students learn more and better]. Chilean Ministry of Education. https://bibliotecadigital.mineduc.cl/bitstream/handle/20.500.12365/2055/mono-851.pdf

Montgomery, K. (2000). Classroom rubrics: Systematizing what teachers do naturally. The Clearing House, 73(6), 324–328. https://doi.org/10.1080/00098650009599436

Popham, W. J. (2003). Test better, teach better: The instructional role of assessment. Association for Supervision and Curriculum Development [ASCD]. https://www.ascd.org/books/test-better-teach-better?variant=102088E4

Spence, L. K. (2010). Discerning writing assessment: Insights into an analytical rubric. Language Arts, 87(5), 337–347.

Tankersley, K. (2007). Tests that teach: Using standardised tests to improve instruction. Association for Supervision and Curriculum Development [ASCD]. https://www.ascd.org/books/tests-that-teach?variant=107022E4

Truemper, C. M. (2004). Using scoring rubrics to facilitate assessment and evaluation of graduate-level nursing students. Journal of Nursing Education, 43(12), 562–564. https://doi.org/10.3928/01484834-20041201-11


Carla E. Förster is a marine biologist with a Master’s degree in Educational Evaluation and a Doctorate in Educational Sciences from the Pontificia Universidad Católica de Chile. She is a professor at the Universidad de Talca, Chile, head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. email: [email protected]

Sandra C. Zepeda is a social worker with a Master’s degree in Educational Evaluation from the Pontificia Universidad Católica de Chile. She is a lecturer at the UC Faculty of Education in undergraduate and postgraduate training programmes and a specialist in curriculum development and evaluation for learning in school and higher education.

Claudio Núñez Vega is a teacher of Natural Sciences and Biology from the Pontificia Universidad Católica de Chile, with a PhD in Philosophy and Educational Sciences from the Universidad Complutense de Madrid, Spain. He is an associate professor at the Faculty of Education UC and a specialist in learning assessment, evaluation of initial teacher training, and practical teacher training.

8 Assessing with Graphic Organisers: How and When to Use Them

Paola Marchant-Araya

Abstract

This chapter reviews the graphic organizers most commonly used in classroom assessment in schools: mind maps, semantic maps, concept maps, Gowin’s Vee, timelines, Venn diagrams, flowcharts, and the fishbone diagram. For each of them, we describe its characteristics, the elements to take into account for its use, its advantages and limitations, and an example. Finally, we analyse how to use graphic organizers to assess learning, their formative or summative use, the way to assign scores, and the need for auxiliary instruments for their evaluation.

8.1 Introduction

The purpose of this chapter is to present the graphic organisers most commonly used by teachers in the classroom and to discuss their uses in both teaching and assessment. It describes the characteristics, advantages, and limitations of different types of organisers and their ways of representing knowledge. It also analyses the core aspects to consider when assessing with graphic organisers, given that it is often unclear what to assess, for what purpose, in what contexts, and who participates in this assessment, as well as the general implications of assessing students’ learning with graphic organisers. The chapter is organised in two parts: the first describes the conceptual background that allows us to understand where graphic organisers come from at a theoretical level and the main characteristics of each one; the second provides distinctions regarding how to assess learning using graphic organisers and makes recommendations for the classroom context.


8.2 Theoretical Background

When children begin to attend school, they have already acquired certain language rules and some core concepts for later school learning. The learning process builds on this: children make their own certain ways of ordering events and objects, enabling them to see new regularities and to recognise the terms these represent. This process continues throughout life. From the socio-constructivist perspective, learning is a process in which the learner creates an internal representation of knowledge that is constantly open to change as it is constructed from experience. This perspective posits that learners make sense of the world when they establish relationships and correlations between what they are learning and what they know or have experienced (Bronfenbrenner, 1987; Condemarín & Medina, 2000). What is meaningful to students, then, is what they have experienced, combining thoughts, feelings, and activities. Because a person’s sequence of experiences is unique, each person constructs his or her own idiosyncratic meanings,¹ so the more dissimilar two people’s experiences are, the more difficult it will be for them to share meanings (Novak, 1998).

Thus, meaningful learning has, by definition, many personal factors involved; Ausubel et al. (1978) distinguish three types of learning:

• Conceptual learning: corresponds to the learner’s perception of the regularity on which a concept is based (objects, events, situations, or properties), which has common criteria and is designated by means of some symbol. For example, the concept of “the number four” will increase in complexity from a single form to multiple forms of writing it (e.g., 4, IV, “four”) as the learner expands his or her vocabulary and cognitive structures.
• Representational learning: the learner recognises a word, sign, or symbol as a label for a specific object, event, or category of events; for example, nouns are learned through representational learning.
• Propositional learning: this goes beyond the understanding of words in isolation or in combination and requires the learner’s ability to grasp the meaning of ideas expressed in propositions. Propositions are two or more words combined to form a statement about an event, object, or idea, which may be valid, invalid, or meaningless. The wealth of meaning we hold for a concept grows exponentially with the number of valid propositions that relate it to other concepts.

Both conceptual and representational learning are parallel processes, because they are idiosyncratic and context dependent. Conceptual meanings combine to form propositional meaning, and both can be represented through their label or name.

¹ Idiosyncratic meanings are those aspects that are significant to students because of what is particular or characteristic of them. In the case of a school community, they correspond to the common characteristics shared by its members.

In the classroom it is common to use graphic representations to identify students’ knowledge. These representations communicate the conceptual structure of a specific disciplinary domain, containing both the fundamental ideas and the interrelationships that the student makes (Campos, 2005). Graphic representations can reveal students’ spatial reasoning, which is the basis of abstract knowledge and inference (Tversky, 2005). They also allow meanings to be conveyed more directly than a set of words, and they facilitate retention, memorization, making connections between concepts, problem solving, and the communication of information, making it more comprehensible to students (Moreland et al., 1997; Tversky, 2005; Uba et al., 2016, 2017).

The graphic representations that students produce are a source of evidence that we can use to monitor the learning process, and a teaching resource that favours different cognitive processes (Alshatti et al., 2010; Campos, 2005; Preciado, 2008; Tversky, 2005). Some uses in the classroom are described below:

• Identify fundamental ideas and the complexity of the relationships between concepts that students make.
• Drive understanding of concepts and relationships.
• Facilitate the integration and synthesis of information.
• Encourage the representation of problem situations in order to approach them and look for a solution.
• Generate an organisation of the content in a personal or collective way.
• Encourage group discussion and reflection.
• Stimulate metacognition and higher cognitive skills in general.
• Produce mental images that students can use later.
• Integrate students’ prior knowledge with the new things you want them to learn.
• Motivate the conceptual development of the students.

Next, we will review different types of graphic representations of knowledge, highlighting what each of these organisers can contribute to students’ learning processes.

8.3 Graphic Organisers as a Generic Type

This technique of knowledge representation is the oldest found in the literature. It was introduced by Barron (1969), based on his studies of Ausubel’s ideas on meaningful learning and Bruner’s on the forms of knowledge representation. Ausubel (2000; Ausubel et al., 1978) developed the idea of the “advance organiser,” textual and written in prose, which served as scaffolding between new knowledge and that previously acquired by students. This type of organiser sought to activate prior knowledge before a learning task and thus to differentiate new knowledge from similar or contradictory ideas or concepts existing in the student’s cognitive structure. Barron (1969, p. 32) proposed the change from textual to graphic format and a methodology for teaching vocabulary, thus developing the idea of a visual and textual organiser or “structured overview.”


Table 8.1 Classification of graphic organisers according to structure and temporality in the teaching process

Criterion: Temporality of use in the teaching process
• Previous graphic organizer: The student elaborates it before the new knowledge is presented.
• Back graphic organizer: The student elaborates it after receiving the new information.

Criterion: Structure
• Word and line graphic organiser: Regardless of the moment, students define their organiser by combining lines with words.
• Pictorial-type graphic organiser: Regardless of the moment, students define their organiser from drawings or schematics.
• Mixed graphic organiser: Students combine lines, words, and pictures.

Currently, graphic organisers have given rise to a series of other forms of representation, defined as “visual representations of knowledge that present information, highlighting important aspects of a concept or subject within a scheme using labels” (Preciado, 2008, p. 2). There is agreement that graphic organisers are tools that allow students to understand, classify, organise, and connect information, processes that are taught and assessed in the classroom (Alshatti et al., 2010; Bartels, 1995; Campos, 2005; Reyes, 2011; Uba et al., 2017; Villalustre-Martínez & Del Moral-Pérez, 2010). For Campos (2005, p. 30), “the graphic organiser is a schematic representation that presents the hierarchical and parallel relationships between broad and inclusive concepts, and specific details”; it is constructed at the same level as the new material being read, not at a higher and more generic level of abstraction, constituting a teaching tool that favours meaningful learning (Villalustre-Martínez & Del Moral-Pérez, 2010) (Table 8.1).

Although these graphic organisers have a number of advantages that favour their use in the classroom, they also have some limitations:

Box 1 Advantages and limitations of graphic organisers

Strengths

• Their structure and visualization support the assimilation and retention of new content, facilitating students’ understanding
• They allow outlining and organising information or specific content
• They present concise and synthetic information
• They make it possible to show the relationships between different concepts
• They allow recognising students’ prior knowledge and previous experiences
• They allow integrating prior knowledge with what is being learned in order to build new knowledge
• They give evidence of a comprehensive view of new learning
• They can be worked on collaboratively, favouring social interaction
• They provide holistic evidence of how students represent their knowledge


Limitations

• They can force students to follow a structure in a linear fashion by repeating previous models, limiting their creativity, imagination, and independent thinking
• They can decrease students’ intrinsic motivation to write prose if used solely or preferentially
• They can be used to force students to adopt a model of thinking that is not their own, for example, the teacher’s way of thinking, especially in assessment
• Hierarchical formats,² although simple, are rigid and do not allow students to include their own ideas or points of view

Adapted from Daniels et al. (2007)

From this type of knowledge representation, other alternatives emerged in the context of the school classroom, some of which are presented below.

8.3.1 Mind Map

From psychology, Buzan (2002) proposed the idea of the mind map in the early 1970s. It consists of an expression of thought in which “from a central image the main elements of a given topic are branched through a connected nodal structure” (Villalustre-Martínez & Del Moral-Pérez, 2010, p. 18), that is, a natural function of the human mind. According to Buzan and Buzan (1994; Buzan, 2002), mind maps are characterized by the following:

• The theme is crystallized in a central image.
• The central image radiates the main themes or issues in a branching way.
• Branches comprise an image or a word printed on an associated line; points of lesser importance are represented as simpler branches attached to higher-level branches.
• The branches form a connected nodal structure.

In their structure, mind maps usually combine pre-existing images; geometric figures; straight, broken, and curved lines; and words, codes, numbers, and colours. All of these are designed by the student, that is, “handmade” according to the imagination, creativity, and reasoning of the person who creates them. Figure 8.1 shows an example of a mind map. Like all organisers, this one also has strengths and limitations, which are presented below.

² One type of hierarchical format is the “conceptual mindset,” which leads students to represent a structure of propositions (four intellectual operations) that give priority to fundamental ideas and discard secondary ones (Reyes, 2011).


Fig. 8.1 Example of a mind map. Source: Design by Boukobza (2010). Extracted from www.ibermapping.es

Box 2 Strengths and limitations of mind maps

Strengths

• It allows organising and clarifying what is known about a topic or concept
• It allows organising prior knowledge on a certain topic
• It allows reorganising the cognitive structure
• It supports planning, communicating, and solving problems
• It allows studying quickly and efficiently
• It improves recall and memory and gives a global view of the information
• It enhances people’s mental capacity
• It is useful for taking personal notes in class

Limitations

• It can be difficult to use for people who are extremely logical, because they find it hard to trust their creativity and intuition
• Creating a mind map can take a lot of time, which is a scarce commodity
• It may be difficult for others to understand: a personalised map embodies its author’s own thought pattern, so when considering it as an assessment instrument, it is necessary to be cautious about the criteria used in its review
• General patterns do not necessarily help memorization
• It mainly represents hierarchical relationships
• It can become too complex, losing sight of the big picture

Adapted from Eppler (2006) and Agustina (2013)


8.3.2 Semantic Map

It was proposed by Pearson and Johnson in the late 1970s and disseminated by Heimlich and Pittelman (1986). Its purpose is to relate concepts freely. Semantic maps, also called “lexical graphs,” are a strategy that expresses in graphic form the “categorical structure of information or content through relationships and ideas, concepts or fundamental words that integrate a larger concept and that define and explain it” (Campos, 2005, p. 37).

The semantic map is structured from verbal and non-verbal elements. Verbal concepts are presented inside geometric figures that symbolize the nodes containing the words and ideas, and the associations between major and minor concepts are represented through interrelation lines (solid lines or arrows) that join or relate the concepts (Campos, 2005; Sinatra et al., 1986).

There are two types of semantic maps: free and fixed. Free maps are developed by each student without a predefined structure and depend on their creativity. The format usually places a key idea or concept in the centre, inside some geometric figure, and around it the concepts or characteristics associated with the core concept. In the first level of disaggregation, the concepts are written in capital letters, and the following levels in lower case. Fixed maps, on the other hand, have a predetermined structure that reflects a purpose in the organisation of information; four categories can be distinguished (Sinatra et al., 1986) (Fig. 8.2):

(1) Narrative or sequential organisation: students describe each episode in the respective figures, following a sequential structure to show the events that occurred.
(2) Thematic or descriptive: people, places, or things are described in detail around a central theme and the connections of its components. The most important topics or characteristics are placed directly connected to the central theme.
(3) Comparison or contrast: the discourse is presented on the basis of its differentiating and similar elements on each side of the map, in the respective nodes.
(4) Classification: the relationships between concepts, properties, and attributes are shown, presenting macro-concepts and those derived from them.
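To make the fixed, classification-type structure concrete, here is a small sketch that stores such a map as a graph and prints its interrelations; the concepts and the adjacency-mapping representation are illustrative assumptions, not part of the chapter:

    # A classification-type semantic map stored as an adjacency mapping:
    # each core concept points to the minor concepts connected to it.
    semantic_map = {
        "LIVING THINGS": ["ANIMALS", "PLANTS"],       # first level: capitals
        "ANIMALS": ["vertebrates", "invertebrates"],  # lower levels: lower case
        "PLANTS": ["trees", "herbs"],
    }

    # Walk the map and print each interrelation line ("node -> node").
    def print_relations(graph, node, depth=0):
        for child in graph.get(node, []):
            print("  " * depth + f"{node} -> {child}")
            print_relations(graph, child, depth + 1)

    print_relations(semantic_map, "LIVING THINGS")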


Fig. 8.2 Examples of fixed semantic map structure

Box 3 Strengths and limitations of semantic maps

Strengths
• They improve students' vocabulary and their knowledge about the meaning of words
• They favour understanding of new information
• They allow organizing ideas before writing a text
• They allow integrating different parts of a content
• They serve as study material
• They are useful for making the relationships between words visible

Limitations
• They are not easy to apply for those who are learning to use them; they require training, and it is necessary to explicitly teach how to make them
• It can be difficult to discover the relations between ideas
• Top-down structures can make it difficult for non-linear thinkers to understand

Adapted from Eppler (2006) and Agustina (2013)

8.3.3 Gowin's Vee Diagram

This heuristic technique was devised by Gowin and appears as a suitable alternative for identifying the conceptual and methodological elements that are reciprocally related in the construction of knowledge; it is useful for solving a problem or understanding a procedure (Campos, 2005; Guerra & Naranjo, 2016; Novak & Gowin, 1984). The name "Vee" responds to its V-shaped representation, as shown in Fig. 8.3. In this diagram, it is possible to distinguish four core elements of its structure:

(1) Events/objects: the facts that are recorded and the concepts associated with them; they are located at the vertex of the Vee and are the starting point of knowledge construction.
(2) Focus or determining questions: they initiate the activity between the two fields of the Vee; they are located in the centre.
(3) Conceptual aspects: what is known, applicable, useful and necessary to understand the events and answer the questions; they are located on the left side of the diagram.
(4) Methodological aspects: the methods used, which complete the description. The value judgments involved are described here; they are located on the right-hand side of the diagram.

The purpose of Gowin's Vee is to generate meaningful learning based on the relationship between what students already know and the new knowledge they are developing, understanding and integrating into their cognitive structure. In this sense, the Vee allows students to recognise:

Fig. 8.3 Example of Gowin’s Vee diagram


(1) What events or objects are being observed.
(2) Which of the concepts they already know can be related to these objects and events.
(3) What kinds of data records are worth making.

In the school system, this type of knowledge representation is used in different disciplines; however, it is mostly used in science, stimulating students to pose a question or questions that guide the development of a scientific inquiry. In the context of evaluation, Novak and Gowin (1984) observed that students who had constructed a Vee diagram before entering the science lab made better use of their time in activities related to the lab task than those who had not. The main advantages and limitations of this diagram are presented below:

Box 4 Strengths and limitations of Gowin's Vee diagram

Strengths

• It facilitates building new knowledge
• It allows the integration of previous knowledge with new learning
• It promotes understanding in problem-solving processes
• It allows interrelating conceptual and methodological aspects with the questions that can be raised on a topic
• It requires the student to put into action complex cognitive skills (interpretation, analysis, synthesis, and assessment of knowledge over memorization)

Limitations
• To evaluate with the Vee, students must be trained; otherwise it turns out to be a very complex technique for them, given their familiarity with traditional formats
• Concepts are often confused with theories and principles
• Requiring students to put complex cognitive skills into play can lead them to a feeling of insurmountable difficulty and to abandon the task, so monitoring of this aspect by the teacher is required

8.3.4 Concept Map

The origins of concept maps can be found in the approaches of Novak and Gowin (1984), who, inspired by Ausubel et al. (1978) and the theory of meaningful learning, sought to propose approaches that would give meaning to students' learning, stimulating reflection on the structure and process of knowledge production, or meta-knowledge. Concepts and propositional learning are thus conceived as a basis on which one builds one's own idiosyncratic meanings (Novak, 1998). Concept maps appear, in this sense, as a mediation tool between the one who teaches, the one who learns and the one who evaluates, so that learning is effectively meaningful and its importance and contribution to teaching is recognised


(Alshatti et al., 2010; Campos, 2005; Eppler, 2006; Rovira, 2016). Concept maps have been defined as:

• "A schematic device for representing a set of concept meanings embedded in a framework of propositions" (Novak & Gowin, 1984, p. 33).
• "Graphical representations of concepts of a specific domain of knowledge, constructed in such a way that the interrelationships between concepts are evident" (Cañas et al., 1997, p. 2).
• "Hierarchical diagrams that reflect the conceptual organisation of a discipline or part of it (…) can be understood as a strategy to help students to learn and teachers to organise the work material" (Campos, 2005, p. 23).

Escaño and Gil de la Serna (1999) add that this representation, in addition to being graphic, schematic, and fluid, promotes and translates the organisation of ideas about a certain content. It "promotes" because its elaboration (and even its reading) requires focusing on the essentials, organising the ideas and establishing relations between them. It "translates" because it reflects the way knowledge is stored in our minds. Concept maps are also effective in favouring retention and memorization through the connections that are established, since they allow us to apply knowledge in an organised way (Moreland et al., 1997).

For Novak (1998), concept mapping is a personal process based on individual knowledge and experiences or, at least, one that carries its own meaning. For this reason, the final maps of two people can be very different. In the same sense, the concept map that someone creates does not represent in an identical way (as if it were a photograph) the structure of learning and knowledge in his or her mind (Novak & Gowin, 1984), so it should not be assessed for summative or certification purposes, much less graded. Its use should be processual, so that each student can provide varied evidence at different moments of his or her learning.

The technique for elaborating concept maps is based on concepts or "terms," which must be extracted from a source (text, teacher's speech, fact, etc.) and visually organised through lines that signify relationships, indicating on each one the linking words corresponding to the type of relationship. According to Novak and Gowin (1984), three characteristics differentiate concept maps from other graphic resources:

(1) Hierarchization: the concepts are arranged in order of importance or "inclusiveness." The more general or inclusive ideas occupy the top of the structure and the more specific ones, together with examples, occupy the bottom. Concepts that have the same hierarchy level must be in the same row or height.
(2) Selection: they are a synthesis or summary that contains the most significant aspects of a topic. Sub-maps can be developed to expand on different parts or sub-topics of the main topic.


(3) Visual impact: "A good concept map is concise and shows the relationships between the main ideas in a simple and eye-catching way, taking advantage of the remarkable human capacity for visual representation" (Novak, 1998, p. 106). The more visual the map, the more content is memorized and the longer the memorization lasts, since perception is engaged; this particularly benefits students with attention problems through visualization activities.

The basic elements of a concept map are:

• Concepts: regularities in events or objects that are designated by a term; that is, ideas or words that refer to events.
• Propositions: they constitute the smallest semantic unit that has truth value. They are formed when concepts are joined with linking words.
• Linking words: words that link concepts and indicate the types of relationships that exist between them. They are also called "connectors," because of their function of connecting concepts.

The map organises these elements by relating them graphically and forming semantic chains, i.e., chains with meaning. In its simplest form, a concept map is made up of two concepts connected by a linking word (Novak & Gowin, 1984). Concepts are never repeated; they go inside ovals or rectangles, and the linking words are placed near the lines of relation. It is convenient to write the concepts in capital letters and the linking words in lowercase letters; these can differ from those used in the source text as long as the meaning of the proposition is maintained. For linking words, verbs, prepositions, conjunctions, or any other type of conceptual nexus can be used. These words are the ones that give meaning to the map, even for people who do not know much about the subject. Figure 8.4 shows the structure of a concept map.

Fig. 8.4 Example of a concept map (adapted from Guerra & Naranjo, 2016)


It is essential to consider that in the construction of a concept map what matters are the relationships established between the concepts through the linking words, which allow configuring a true meaning ("truth value") for the topic studied; that is, if we are building a concept map on "the cell," its structure and relationships should lead to representing that concept and not another. It is important to ensure that students have had experience in constructing concept maps before being assessed with this technique (Eppler, 2006) and that the text or content used to construct the map is meaningful to the student. In Mathematics, concept maps help to visually account for the connections that occur, and thus to assess how students are understanding a concept (Bartels, 1995). The strengths and limitations of concept maps are presented below:

Box 5 Strengths and limitations of concept maps

Strengths

• They allow teaching to be planned by identifying which concepts are relevant to address and help students to understand the global overview of what they are learning
• They constitute a tool that serves to illustrate the cognitive or meaning structure that students have
• They allow the student's conceptual errors to be worked on and given feedback, as well as facilitating the connection of the information with other relevant concepts
• They allow the progressive differentiation between concepts, especially if they are elaborated at different moments of the development of the topic
• They favour the integration or assimilation of new relations between concepts
• They provide a logical and structured organisation of the contents
• They favour creativity, problem solving and autonomy
• They allow interrelated learning to be achieved by not isolating the concepts, the ideas of the students and the structure of the subject
• They encourage negotiation by sharing and discussing meanings when they are made in a group
• They are a reference, a good graphic element when you want to remember a concept or a theme
• They allow the parts (the whole) to relate to each other

Limitations
• The richness of the concepts depends in part on the sociocultural capital of the student
• To elaborate concept maps, students must master the information and knowledge (concepts) with which they will work
• The time invested in the assessment of concept maps is greater, in part because students must be explicitly taught how to make them and because there is no expected common reference (single guideline) when they are reviewed; they must be understood individually
• When the format is non-directive (such as handing students a blank sheet of paper), students find it difficult to situate themselves without a given structure
• Concept map assessment is an eminently subjective and arbitrary scoring process and should not be graded
• Top-down structures may not be suitable for representing or structuring sequential content such as processes or timelines
• They have a medium to high level of difficulty for students, due to the complexity of the network of relations that they imply; because of this, they require a lot of training
• They tend to be idiosyncratic, making them difficult to understand and evaluate in the classroom
• General patterns do not necessarily help memorization

8.3.5 Timelines or Technique of Representation and Development in Time

This type of visual representation is a technique used to show chronological sequences, events, milestones, or steps ordered temporally. It was created with the purpose of facilitating description and understanding, since it shows historical facts and moments as they occurred in time (Preciado, 2008; Villalustre-Martínez & Del Moral-Pérez, 2010). Its structure comprises a straight horizontal line graduated in time units. Time can be represented in a classical way, with exact dates, or through drawings, figures, photos, icons, or any other form of graphic representation that the author chooses. An example of a timeline is presented in Fig. 8.5. A timeline comprises the following elements:

• The theme represented, which always corresponds to a clearly defined title.
• A straight timeline, graduated in both directions.
• Important events associated with the theme that have developed over time.
• The dates on which events occurred (exact or approximate).
• An adequate time interval to understand the topic.

The strengths and limitations of the timelines are presented below:


Box 6 Strengths and limitations of timelines

Strengths
• They allow working with time conventions
• They allow reinforcing information patterns
• They favour the retention and understanding of information
• They favour the idea of simultaneity in the learning of History
• They allow organising memories and the connections between them

Limitations
• Students must know the units of time measurement and how time divisions and time conventions (old, new, etc.) are established so that they can make the most of the tool
• They must know how to make visible the duration of processes and the succession of events
• If design or creativity in construction is assessed, students must be informed in advance and given the respective criteria

8.3.6 Venn Diagram

Developed in 1880 by Venn (Campos, 2005), it is a scheme used to represent relationships between sets of information (intersection, inclusion, and disjunction). It is widely used in the subjects of Mathematics, Science and Language and allows students to contrast ideas represented in different texts, look for patterns to connect what they already know with the new information they are learning and deepen their process of understanding and analysis (Dreher & Gray, 2009). By connecting circles, it allows the learner to establish similarities and differences between concepts, ideas, or texts. In the school system, the Venn diagram is widely used to represent groups of ideas that share or do not share common properties.

Fig. 8.5 Example of a timeline. Source: Extracted from https://www.portaleducativo.net/quinto-basico/507/Que-es-una-linea-de-tiempo-como-se-organizan


Section 8.3.6.1 discusses some examples associated with specific tasks in which this graphic organiser is used (Figs. 8.6, 8.7 and 8.8). In order to use a Venn diagram as an assessment tool, it is key to determine which concepts will be compared or related. One homework option is to give the concepts to the students and have them research the topic and arrange the information in a diagram; another option is for the teacher to give the concepts and a list of unseparated elements and have the students classify them in the appropriate part of the circles.

Box 7 Strengths and limitations of the Venn diagram

Strengths

• It allows students to systematize and synthesize their knowledge on a subject
• It is useful for comparing and contrasting two or three sets of items
• It allows illustrating situations of intersection, inclusion, and disjunction of elements
• Texts can be translated into a summary format that facilitates understanding, especially in Language and Mathematics

Limitations
• With more than three groups of elements, it is difficult to manage the information, especially at the intersections
• Its effectiveness depends on the ability of teachers to create challenging tasks in this format
• Its effectiveness will depend on how teachers understand its use and potential to guide the students' learning process
• It does not allow evidencing relations between concepts

8.3.6.1 Examples of the use of the Venn diagram in different subjects

Classification task of living beings
Jorge and Anita have a set of images of animals that they must classify and separate into three groups using criteria they define themselves. The result of their work is presented in Fig. 8.6. As can be seen, what makes a set unique is written in the part of its circle that does not overlap with the others; in this case, the children defined as groups the animals that are birds, those that fly, and those that swim but are not birds. What the groups share is written in the area where two or more circles join.
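For teachers who want to check such classifications quickly, the same region logic can be expressed with set operations. The following is a minimal sketch in Python, with hypothetical animals standing in for the children's images:

```python
# Hypothetical sets standing in for the three criteria chosen by the children
birds = {"hen", "eagle", "duck", "penguin"}
fliers = {"eagle", "duck", "bat", "butterfly"}
swimmers = {"duck", "penguin", "whale", "seal"}

# Unique to one circle: members of a set that appear in no other set
print(birds - fliers - swimmers)    # {'hen'}
# Shared by exactly two circles: written where two circles join
print((birds & fliers) - swimmers)  # {'eagle'}
# Shared by all three circles: the centre of the diagram
print(birds & fliers & swimmers)    # {'duck'}
```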


Fig. 8.6 Venn diagram on classification of living beings

Fig. 8.7 Example of solving a mathematical problem

Problem-solving task
Of a total of 99 people, 5 speak English and Spanish, 7 Spanish and German, and 8 English and German. If the numbers of people who speak German, Spanish, and English are 1, 2, and 3 times greater, respectively, than the number of people who speak the 3 languages, how many people speak Spanish?

These types of problems, which demand substantial reading comprehension, can be translated into a diagram that helps to understand graphically what is being proposed and what is being asked for. Figure 8.7 shows the development of the task. Depending on how students place the information in the diagram, execution errors can be detected easily and quickly by the teacher.

Problem statement as equations (where X is the number of people who speak all three languages):
GO + 8 + 7 + X = 2X
SO + 5 + 7 + X = 3X
EO + 5 + 8 + X = 4X
GO + SO + EO + 40 + 3X = 9X
(GO: German only; SO: Spanish only; EO: English only)

Task: Compare the characteristics of a discursive and argumentative text
In general, for this type of question a table is requested. However, for a student, the diagram can be more visually significant and, for the teacher, faster to review (Fig. 8.8).
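Returning to the language problem: a possible resolution, consistent with the equations above (and assuming the pairwise figures count people who speak exactly two of the languages), is as follows. From the first three equations, GO = X − 15, SO = 2X − 12 and EO = 3X − 13. Since the 99 people fall into exactly one region of the diagram, GO + SO + EO + 5 + 7 + 8 + X = 99, that is, 7X − 20 = 99, so X = 17. The number of Spanish speakers is therefore 3X = 51 (check: 2 + 22 + 38 + 5 + 7 + 8 + 17 = 99).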

8.3.7 Process Flow Diagram or Flowchart

Fig. 8.8 Diagram example to compare discursive and argumentative texts

Flowcharts are graphic representations of operations and activities that develop as a sequence of steps in a procedure, following a logical path; they were first used by Gilbreth in the United States (Sáenz & Martínez, 2014). For Preciado (2008), diagrams are flowcharts when "the symbols used are connected in a sequence of instructions or steps indicated by arrows." Studies confirm the effectiveness of this type of organiser in developing skills in text translation and schematic representation, and flowcharts allow students to show evidence of their analytical and processing skills (Alshatti et al., 2010). The flowchart is widely used to represent procedures and to make students develop logical thinking through questions and answers, Yes or No statements, or other elements that connect relationships and hierarchies. In every flowchart we can find the following elements (Sáenz & Martínez, 2014):

• Start of the process.
• Specification of the process steps.
• Actions or activities of each stage.
• Results.
• End of the process.

For each of these activities, specific symbols are used to denote the elements and actions to be considered in the process. The most characteristic symbols are shown in Table 8.2. Figure 8.9 shows an example of a flowchart. While this organiser is very useful for assessing the understanding of procedures or generating protocols for working with students, like any graphic organiser, it has its strengths and limitations:


Table 8.2 Symbols used in a flowchart

• Start/end symbol: represents something general; it is used to start and end a flowchart.
• Input/output or process symbol: used to represent a result or an input; it is also used to indicate an operation or action performed.
• Decision symbol: represents decision making and ramifications.
• Connector symbol: used to connect or join the flow to another part of the diagram.
• Flow lines: the symbols are joined with lines that have an arrow at the end, indicating the direction in which the information of the process goes.

Fig. 8.9 Flowchart. Source: Example extracted from http://bernalunea.blogspot.cl/


Box 8 Strengths and limitations of the flowchart

Strengths
• They favour logical and process reasoning in students
• They help in understanding a process, its components, relationships, and interactions
• They allow diagnosing failures in processes or in the development of procedures
• They favour the understanding of a phenomenon from its causes and effects
• They favour the comprehension and abstraction of information; they can be used in any field of teaching, especially to explain processes
• They allow identifying problems and opportunities for improving a process

Limitations
• Their use is widespread in Higher Education, at the professional, technical and university levels, but their application in school classrooms must be repeatedly mediated by the teacher
• Using many levels between the concepts can complicate the flow of information and students' understanding of the processes involved
• They can be difficult to follow if there are different paths
• They illustrate the flow of a process, but its structure is not always evident

8.3.8 Cause-Effect Diagrams: The Fishbone

Also called the "Ishikawa diagram" after the surname of its creator, the fishbone owes its name to the visual form this representation of knowledge takes, organising the causes that a specific problem may have. Its purpose is to identify the factors, variables or aspects that affect a result, in order to understand a problem and then propose a solution. In a study of this type of organiser in the subject of science, its usefulness was evidenced for both teachers and students: both showed greater understanding of the content after learning to use the diagram and greater ability to implement it in school tasks (Lu et al., 2008). The key elements of the structure of this graphic organiser are (Campos, 2005):

• Causes: the factors that influence, determine, or explain a result. Causes have different hierarchical levels according to their proximity to the centre. The main ones appear in boxes and are linked by horizontal lines with solid arrows that connect to the effects. Each cause or factor can be explained by another, second-order cause, which is linked to the first, and so on. As can be seen in Fig. 8.10, the arrows vary according to the level of disaggregation at which they are found.
• Effects: consequences and results attributed to causes, described in a box with a minimal semantic term or unit.


Fig. 8.10 Example of the fishbone as a graphic representation

• Problem: the main effect or problem is identified in the head, while the possible causes are located in the spines, as shown in Fig. 8.10.

Box 9 Strengths and limitations of the fishbone diagram

Strengths
• It encourages discussions based on specific problems
• It allows identifying the true causes that affect the results
• It is a good strategy for general problem solving
• It promotes reflective thinking in students
• It favours the organization of thought to identify causes and main and secondary effects
• It develops assessment skills, information synthesis, and decision making

Limitations
• It considers only linear relations
• It assumes prior knowledge and training of students before being used in teaching and assessment

The graphic organisers described so far are the most common ones in the school setting; however, there are many others that can be used, some of which are presented in Fig. 8.11.


Fig. 8.11 Examples of other types of graphic organizers. Source: Adapted from http://www.cast.org/ncac/GraphicOrganizers3015.cfm

Although the contribution of graphic organisers to learning is recognised, it should be kept in mind that their effectiveness will depend on the academic level of the students, the complexity of the organiser and the moment at which it is applied (Alshatti et al., 2010). A combination of them in the classroom can be very useful and, although the cognitive load is high for students, they are low-cost tools for identifying prior knowledge and ideas, monitoring the understanding of concepts during the development of a unit, and developing cross-cutting skills of application, synthesis, and integration of information in one or more subjects (Eppler, 2006).

8.4 How to Evaluate with Graphic Organisers and Other Forms of Conceptual Representation

As we have seen in previous chapters, assessment for learning focuses on the importance of collecting evidence that allows us to know the process and progress of learning, with the purpose of providing feedback and motivating students to reach the final goal (Wiliam, 2011). Thus, students should be given multiple opportunities to demonstrate their learning, reflect and learn from this evidence. Graphic organisers, as a form of knowledge representation, are a type of learning opportunity that can challenge, motivate, entertain and promote new learning at the very moment of their execution. This is why they are considered an effective form not only of teaching but also of assessment (Preciado, 2008). When elaborating a graphic representation, students put intuition and analysis into action, connecting their product to a sense of reality, since the relationships established in these organisers follow a logic that is meaningful to them (Mayer, 1989).

Given the above, it is essential to teach students this type of representation and to model it through ready-made examples, as well as to build it together, so that we can be sure that the representation evidences learning and that the elaboration of the assessment task is not a limitation for the student. It is also important to consider that the representation of knowledge is personal and specific to each student, so seeking to standardize knowledge through graphic organisers or other forms of representation reduces the potential of these tools and of the students themselves, limiting their ability to develop higher cognitive skills. This is the main reason why it is not recommended to use any type of graphic organiser as a summative assessment.

8.4.1 What Do I Assess in an Organizer?

If we consider that the representation of knowledge is personal and idiosyncratic, what is evaluable, depending on the type of graphic organiser used, should consider aspects such as:

• Correct use of words or concepts.
• Assimilation of key ideas.
• Relevance of definitions, if any.
• Clarity of the structure in global terms.
• Consistency, relevance, and veracity of relationships.
• Organisation, classification, and synthesis of concepts.
• Integration of new information.


• Ability to communicate information (concepts and relationships).

An assessment experience in the subject of English showed that, through graphic organisers, not only can the organisation and classification of information be assessed, but also how previous knowledge is connected to new knowledge and how information is reconstructed as a basis for writing texts (Reyes, 2011). Other research indicates that declarative and procedural knowledge, as well as cognitive analysis skills, can be assessed (Rice et al., 1998).

8.4.2 With What Assessment Purpose to Use Graphic Organizers?

In the specialized literature, there are several experiences of assessment with concept maps and other graphic organisers, with various assessment intentions in which distinctive elements are recognised (Rovira, 2016):

• Formative purpose: the best and most recommendable option for assessment with graphic organisers is formative (Villalustre-Martínez & Del Moral-Pérez, 2010), whether for initial formative purposes, so that the teacher can recognise and explore previous knowledge, or to monitor and provide feedback on the teaching and learning process, favouring students' autonomy and metacognition (Preciado, 2008).
• Summative purpose: in the school system, concept maps and other graphic organisers are often used for certification purposes and, therefore, students are graded on their productions. However, given what has been described in this chapter, this is not recommended. Using them for summative purposes can serve to close a process of previous elaborations and revisions, although it is not advisable to grade what students do, since the grade leads students to respond to the teacher's expectations rather than to reflect their own ideas, concepts, and thought structure, which affects the evidence they report.

8.4.3 Which Agents Are Involved in Assessing with Graphic Organisers?

In any assessment, in addition to the teacher, students should be included as active assessment agents, understanding that they have different roles and functions in the learning process.

The teacher: has the responsibility to lead and provide feedback on student learning and should collect valid and reliable evidence of that learning. A concrete action in which organisers can be implemented is, for example, for the teacher to ask students to develop an organiser on a specific theme that he or she is about to teach, in order to identify prior knowledge about the content. After working on the concepts, the teacher asks the students to analyse their diagram and make the changes they consider pertinent to represent their current knowledge of the topic. Finally, the teacher assesses the students' understanding and retention of the topic.

The learners themselves: each learner is the key actor in the process and therefore has the responsibility and the right to monitor and self-regulate his or her learning. For example, the teacher asks students to choose a key concept in science and develop a map showing all the associated concepts and their relationships. These concepts and relationships are then discussed in class, and students are asked metacognitive questions about the decisions made, such as: Why did you decide to use this key concept? What skills and knowledge did you have that allowed you to construct your map? Would you make changes to your map now that we have seen the concepts in more detail? Students write down their reflections and adjust their map.

Peers or groups of students: collaborative learning is an opportunity to generate new knowledge, which has been described as an advantage of most graphic organisers, so peer assessment should be used to enhance learning. For example, the teacher gives students a list of concepts and asks them to construct a graphic organiser. Subsequently, the teacher collects the organisers and redistributes them among the students in the room along with a rubric or checklist in which the correct connections and relationships between the concepts are presented. The idea is that each student has a paper from another classmate. Each one analyses and evaluates the respective organiser with the rubric, and then, in pairs, the students talk about their evaluations and give feedback to the partner who authored the map, with the purpose of correcting the relationships together and thus improving their understanding of them. In another example, the teacher asks groups of students to create semantic maps collaboratively and freely from a text they have read. Each group then presents its map to the class and explains the concepts and relationships established. The groups must write an assessment commentary about the work of their classmates, using criteria previously given by the teacher, and deliver it once the presentation is finished. At the end, the groups analyse the comments received and decide whether or not to make changes to their representation, justifying the modifications.

It is very important that any task in which a graphic organiser is used be accompanied by dialogue and feedback; otherwise, it will be seen as a mere activity with no connection to the teaching and learning process, wasting its potential.

8.4.4 Do I Need an Additional Instrument to Assess Graphic Organisers?

If yes, which one would be the most suitable?

(1) Checklist: an instrument of a dichotomous nature, usually used to record the presence or absence of the characteristics described in the indicators, which in this case would be the elements requested in a graphic organiser. With this type of instrument, care must be taken to ensure that the indicators are not merely a set of formal aspects and that the assessment of the subject content is not lost.
(2) Rating scale: if you want to give feedback to students on the progress or quality of their work, a scale is a good resource. It records the qualitative (very satisfactory, satisfactory, unsatisfactory, or very unsatisfactory) or quantitative (1, 2, 3, 4, 5) level at which the graphic organiser complies with the requested aspects. As in the previous point, it is necessary to ensure that the criteria are relevant and not only formal or structural.
(3) Rubric: this instrument is the most widely used in the literature to evaluate graphic organisers, and the one we recommend, because it presents the aspects to be evaluated and the performance levels in greater detail and, therefore, provides better feedback for the students. An example of a rubric for evaluating concept maps in Mathematics is presented in Table 8.3. Bartels (1995) suggests that, in order to evaluate concept maps, it is advisable for the teacher to provide comments on each dimension evaluated, especially because the map has several hierarchical levels and the student needs to know where there could be errors in the relationships or in the understanding of the concepts.

Following the proposal of Novak and Gowin (1984) for assessments with concept maps, it is recommended to keep in mind the following aspects when evaluating with any of the proposed graphic organisers:

(1) Define a task. Through this, students are invited to represent the organisation of their knowledge of a specific topic. The task has a determined demand, restrictions, and a content structure of what the students are expected to do. In general, it corresponds to the guide that accompanies the task, or the instruction given to perform it.
(2) A response format. The assessment task should include the answer format, which will give the student guidelines on how to complete the task. It is common for the teacher to provide a more directive format, consisting of a scheme with nodes and linking lines that must be completed by the student, including in some cases a list of words or concepts that must be ordered. If we want to follow the original logic of the authors, we should consider a less directive format that, unlike the previous one, does not contain any outline, nor are the concepts to be included in the map provided; rather, students receive a blank sheet of paper on which they must construct a map based on content (Ruiz-Primo, 2000).


Table 8.3 Analytic rubric for assessment with concept maps (Bartels, 1995). An "Observations" column is left blank for the teacher's comments on each dimension.

Dimension: Concepts and terminology
• 3 points: Shows an understanding of the topic's concepts and principles and uses appropriate terminology and notations
• 2 points: Makes some mistakes in terminology or shows a few misunderstandings of concepts
• 1 point: Makes many mistakes in terminology and shows a lack of understanding of many concepts
• 0 points: Shows no understanding of the topic's concepts and principles

Dimension: Knowledge of the relationships among concepts
• 3 points: Identifies all the important concepts and shows an understanding of the relationships among them
• 2 points: Identifies important concepts but makes some incorrect connections
• 1 point: Makes many incorrect connections
• 0 points: Fails to use any appropriate concepts or appropriate connections

Dimension: Ability to communicate through concept maps
• 3 points: Constructs an appropriate and complete concept map and includes examples; places concepts in an appropriate hierarchy and places linking words on all connections; produces a concept map that is easy to interpret
• 2 points: Places almost all concepts in an appropriate hierarchy and assigns linking words to most connections; produces a concept map that is easy to interpret
• 1 point: Places only a few concepts in an appropriate hierarchy or uses only a few linking words; produces a concept map that is difficult to interpret
• 0 points: Produces a final product that is not a concept map

(3) A scoring system. Depending on the type of graphic organiser you are using, you can consider aspects such as:

(a) Hierarchical organisation of the cognitive structure: there are no "correct" organisers, but rather hierarchies of relationships made by individuals in terms of the meanings that are established and that motivate these relationships. The teacher can identify whether the structures are too general or too concrete; either alternative indicates an error of understanding or of integration of the subordinate concepts (Novak, 1998).
(b) Progressive differentiation: this element refers to what Ausubel et al. (1978) established as meaningful learning, in which new concepts attain greater meaning as new propositional relationships or links are acquired. In this sense, concepts are never fully learned, but are constantly being relearned, modified, or made more explicit.
(c) Integrative reconciliation: this concept corresponds to the improvement in the student's meaningful learning, in which new relationships between sets of concepts or propositions are recognised. The new relationship is grasped when there is an insight-like alteration in the meaning of a concept. Conceptual errors are displaced by new knowledge.

The same authors, starting from the previous concepts as a fundamental basis, propose the following scoring criteria for concept maps, which also apply to other types of graphic organisers at a formative level:

(a) The propositions, i.e., the concepts with the appropriate linking words, which will indicate the valid or erroneous relationships. Assign 1 point for each valid and meaningful proposition.
(b) The hierarchy, always in the sense that the more general concepts include the more specific ones. Assign 5 points for each valid level of the hierarchy.
(c) Cross-linkages, which show connections between concepts belonging to different parts of the concept map. Allow 10 points for each cross-linkage that is both valid and meaningful, and assign 2 points for each cross-linkage that is valid but does not illustrate a synthesis between related sets of propositions. These cross-connections show creativity on the part of the students.
(d) Examples, in certain cases, to make sure students have understood what the concept is and is not. Assign 1 point each.

Ontoria et al. (1992) suggest that each teacher should build his or her own numerical scale and scoring criteria, since the existing ones have complicated criteria and are based on a score of 100.
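As a minimal sketch of how Novak and Gowin's point values combine into a total, a teacher comfortable with a spreadsheet or a few lines of code might tally them as follows (the function and the example counts are illustrative, not part of the original proposal):

```python
def concept_map_score(valid_propositions: int,
                      hierarchy_levels: int,
                      meaningful_cross_links: int,
                      valid_only_cross_links: int,
                      examples: int) -> int:
    """Tally Novak and Gowin's (1984) point values for a concept map."""
    return (1 * valid_propositions         # 1 point per valid, meaningful proposition
            + 5 * hierarchy_levels         # 5 points per valid hierarchy level
            + 10 * meaningful_cross_links  # 10 points per valid AND meaningful cross-link
            + 2 * valid_only_cross_links   # 2 points per valid but non-synthesising cross-link
            + 1 * examples)                # 1 point per example

# A map with 12 valid propositions, 4 hierarchy levels, 1 meaningful
# cross-link, 1 merely valid cross-link and 3 examples:
print(concept_map_score(12, 4, 1, 1, 3))  # 12 + 20 + 10 + 2 + 3 = 47
```

In line with Ontoria et al.'s advice, the weights can simply be edited to suit each teacher's own scale.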

8.4.5 Validity and Reliability in the Assessment Process with Graphic Organisers

Assessing with concept maps or other ways of representing knowledge is not an easy task, and there are multiple ways of scoring. In some studies, the scoring system involved only counting the number of certain components of a map. In others, the assessors only scored a part of the concept maps. Thus, the reliability of such tools will depend on a variety of factors (Klein et al., 2002; Rice et al., 1998; Ruiz-Primo, 2000; Ruiz-Primo et al., 1997).


The main factors that can be a source of error in an assessment process with graphic organisers are, for example: variations in the student's competence in representing knowledge with graphic organisers; variations in the content mastery of those who assess with this technique; and consistency in assigning scores, which will depend on the rubric or other system used to assess. Ensuring consistency is one of the most difficult aspects, since the teacher has the task of assessing many maps and runs the risk that his or her evaluative judgment will drift, affecting the reliability of the process (McClure et al., 1999).

It is frequent to find in the literature (Bartels, 1995; McClure et al., 1999; Ruiz-Primo et al., 1997) the use of master maps to assess graphic organisers: this has the advantage of making the process more reliable, but the disadvantage of limiting students' potential to develop higher skills, creativity, and intuition, as we have already pointed out in this chapter. What is important is that teachers keep this information in mind when making assessment decisions in the classroom. In some cases, teachers may use model maps that have been validated by other teachers in the same discipline or by colleagues who have experience in the construction of graphic organisers, resembling a process of "expert validation." Although these models are widely used in the literature, it is worth asking: do all experts in an area share the same knowledge structure? Acton et al. (1994) showed that experts' structures are highly variable, and that the differences in their maps are due to the fact that the knowledge structure reflects not only the domain (content) but also a personal schema of thinking and cognition (problem-solving strategies), so asking students to generate a unique map, or a map similar to the model map, does not seem to be the most appropriate.

In the case of concept maps, we agree with Novak and Gowin (1984, p. 132) that they have construct validity in terms of assessment theory. The authors state that "there is a correspondence between the assessment of cognitive functioning and what our theory predicts about what the cognitive organisation resulting from meaningful learning should be". An important criterion for assessing content validity is the expert's judgement of the representativeness of the concepts used in the assessment. Another criterion is the evaluator's judgment of the accuracy of the students' organisers within a domain. In the classroom context, it is recommended that teachers collectively and collaboratively design the teaching and assessment task, as well as jointly review and agree on what and how to provide feedback to students.
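Returning to the issue of consistency in assigning scores: a simple, practical check is for two teachers to score the same set of organisers with the same rubric and compare their ratings, for example through their percentage of exact agreement and Cohen's kappa, which corrects that percentage for the agreement expected by chance. A minimal sketch, with invented scores for illustration:

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of items on which two raters gave the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa for two raters (undefined if chance agreement is 1)."""
    n = len(a)
    observed = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same level at random
    expected = sum((ca[k] / n) * (cb[k] / n) for k in set(a) | set(b))
    return (observed - expected) / (1 - expected)

# Two teachers' rubric levels (0-3) for the same ten concept maps (invented data)
teacher_1 = [3, 2, 2, 1, 3, 0, 2, 1, 3, 2]
teacher_2 = [3, 2, 1, 1, 3, 0, 2, 2, 3, 2]
print(percent_agreement(teacher_1, teacher_2))        # 0.8
print(round(cohens_kappa(teacher_1, teacher_2), 2))   # 0.71
```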


Finally, the importance of using a variety of assessment tools to collect evidence, exploring new and more meaningful ways for students to demonstrate learning, such as graphic organisers in general, has been demonstrated. Some studies have even shown consistent relationships between concept map scores and students' academic achievement (e.g., Agustina, 2013; Ruiz-Primo et al., 1997); i.e., students who perform better in a specific area would be able to provide more relevant and sufficient evidence of their understanding of what is being assessed than those who perform poorly. Other studies reinforce the positive relationship between student performance and the use of graphic organisers in teaching and learning in Science, Mathematics and especially Language (Agustina, 2013; Lu et al., 2008; Uba et al., 2017), which encourages their use in the school classroom.

In summary, in order to assess with graphic organisers, the teacher must have experience in their use, especially if they will be graded, since there is a direct impact on the learning opportunities that students have. In the case of concept maps, a directive format takes longer to evaluate than a non-directive one. In general, teachers take longer to review students' maps when they compare them with a model map than when they work from a direct reading of the map the student created; in the latter case, however, there is a risk of not applying the same review criteria to all. Given the above, it is important that the instruction guide for a task with graphic organisers is clear, while the format itself is left free so as not to restrict the students and to favour learning.

In the following chapter, we will delve deeper into the criteria of validity and reliability that must be safeguarded in the design of an assessment strategy and in the development of assessment tools, to ensure that the evidence of learning we collect accounts for students' progress and the achievement of the goals we have set for them.

References

Acton, W. H., Johnson, P. J., & Goldsmith, T. E. (1994). Structural knowledge assessment: Comparison of referent structures. Journal of Educational Psychology, 86(2), 303–311. https://doi.org/10.1037/0022-0663.86.2.303
Agustina, Y. (2013). The effectiveness of semantic mapping to teach reading viewed from students' intelligence (UNS-Pascasarjana Prodi. English Education-S891108124-2013) [Master's thesis, Sebelas Maret University, UNS-Pascasarjana]. Institutional Repository. https://digilib.uns.ac.id/dokumen/detail/29836/The-Effectiveness-Of-Semantic-Mapping-To-Teach-Reading-Viewed-From-Students-Intelligence
Alshatti, S., Watters, J., & Kidman, G. (2010). Enhancing the teaching of family and consumer sciences: The role of graphic organisers. Journal of Family and Consumer Sciences Education, 28(2), 14–35. https://www.researchgate.net/publication/277754454_Enhancing_the_teaching_of_family_and_consumer_sciences_the_role_of_graphic_organisers
Ausubel, D. P. (2000). The acquisition and retention of knowledge: A cognitive view [eBook edition]. Springer Dordrecht. https://doi.org/10.1007/978-94-015-9454-7
Ausubel, D. P., Novak, J. D., & Hanesian, H. (1978). Educational psychology: A cognitive view (2nd ed.). Holt.
Barron, A. (1969). The use of vocabulary as an advance organiser. In H. L. Herber & P. L. Sanders (Eds.), Research in reading in the content areas: First year report. Syracuse University, Reading and Language Arts Center. ERIC. Retrieved November 25, 2022, from http://files.eric.ed.gov/fulltext/ED037305.pdf#page=34
Bartels, B. (1995). Promoting mathematics connections with concept mapping. Mathematics Teaching in the Middle School, 1(7), 542–549.
Boukobza, P. (2010, January 24). Mapa: Las leyes de los mapas mentales [Map: The laws of mind maps]. https://visual-mapping.es/mapa-las-leyes-de-los-mapas-mentales/
Bronfenbrenner, U. (1987). Ecology of human development [eBook edition]. Harvard University Press. https://khoerulanwarbk.files.wordpress.com/2015/08/urie_bronfenbrenner_the_ecology_of_human_developbokos-z1.pdf
Buzan, T. (2002). How to mind map: The ultimate thinking tool that will change your life [eBook edition]. HarperCollins Publishers. https://www.perlego.com/book/829089/how-to-mind-map-the-ultimate-thinking-tool-that-will-change-your-life-pdf
Buzan, T., & Buzan, B. (1994). The mind map book: How to use radiant thinking to maximize your brain's untapped potential [eBook edition]. Dutton. https://www.academia.edu/42863502/The_Mind_Map_Book_Tony_Buzan
Campos, A. (2005). Mapas conceptuales, mapas mentales y otras formas de representación del conocimiento [Concept maps, mind maps and other forms of knowledge representation] [eBook edition]. Colección Aula Abierta. Magisterio. https://unidaddegenerosgg.edomex.gob.mx/sites/unidaddegenerosgg.edomex.gob.mx/files/files/Biblioteca%202022/Metodolog%C3%ADa%20para%20la%20Investigaci%C3%B3n%20Social/MIS-12%20Mapas%20conceptuales,%20mapas%20mentales%20y%20otras%20formas%20de%20representacio%CC%81n%20del%20conocimiento.%20Agusti%CC%81n%20Campos%20Arenas.pdf
Cañas, A., Ford, K., Hayes, P., Reichherzer, T., Suri, N., Coffey, J., Carff, R., & Hill, G. (1997). Colaboración en la construcción de conocimiento mediante mapas conceptuales [Collaborative knowledge construction using concept maps] [Paper]. Institute for Human and Machine Cognition, University of West Florida. https://www.ihmc.us/users/acanas/ColabCon.pdf
Condemarín, M., & Medina, A. (2000). Evaluación de los aprendizajes: Un medio para mejorar las competencias lingüísticas y comunicativas [Learning assessments: A means to improve language and communication competencies] [eBook edition]. Editorial Andrés Bello. https://www.rmm.cl/sites/default/files/usuarios/mcocha/doc/201011141500430.libro_mabel_condemarin_evaluacion_aprendizajes.pdf
Daniels, H., Zemelman, S., & Steineke, N. (2007). Content-area writing: Every teacher's guide. Heinemann Educational Books.
Dreher, M., & Gray, J. (2009). Compare, contrast, comprehend: Using compare-contrast text structures with ELLs in K-3 classrooms. The Reading Teacher, 63(2), 132–141.
Eppler, M. (2006). A comparison between concept maps, mind maps, conceptual diagrams, and visual metaphors as complementary tools for knowledge construction and sharing. Information Visualization, 5, 202–210. https://doi.org/10.1057/palgrave.ivs.9500131
Escaño, J., & Gil de la Serna, M. (1999). Los mapas conceptuales, un recurso para ser feliz [Concept maps, a resource for being happy]. Revista Aula de Innovación Educativa, 78, 48–57. https://www.grao.com/es/producto/los-mapas-conceptuales-un-recurso-para-ser-feliz-au0785985
Guerra, F., & Naranjo, M. (2016). Estado del arte en el tema de los organizadores gráficos en la representación de esquemas y diagramas [State of the art on the topic of graphic organisers in the representation of schemes and diagrams]. UNIMAR, 34(2), 43–60. https://revistas.umariana.edu.co/index.php/unimar/article/view/1240
Heimlich, J. E., & Pittelman, S. D. (1986). Semantic maps: Classroom application. Reading Aids Series, IRA Service Bulletin. International Reading Association.
Klein, D., Chung, G., Osmundson, E., & Herl, H. (2002). The validity of knowledge mapping as a measure of elementary students' scientific understanding. National Center for Research on Evaluation, Standards, and Student Testing (CRESST), University of California. https://cresst.org/wp-content/uploads/TR557.pdf
Lu, C., Tsai, C., & Hong, J. (2008). Use root cause analysis teaching strategy to train primary preservice science teachers. ERIC. Retrieved November 25, 2022, from https://files.eric.ed.gov/fulltext/ED503886.pdf
Mayer, R. (1989). Models for understanding. Review of Educational Research, 59(1), 43–64. https://www.jstor.org/stable/1170446
McClure, J. R., Sonak, B., & Suen, H. K. (1999). Concept map assessment of classroom learning: Reliability, validity and logistical practicality. Journal of Research in Science Teaching, 36(4), 475–492. https://doi.org/10.1002/(SICI)1098-2736(199904)36:4%3C475::AID-TEA5%3E3.0.CO;2-O

Moreland, J. L., Dansereau, D. F., & Chmielewski, T. L. (1997). Recall of descriptive information: The roles of presentation format, annotation strategy and individual differences. Contemporary Educational Psychology, 22(4), 521–533. https://doi.org/10.1006/ceps.1997.0950
Novak, J. D. (1998). Learning, creating and using knowledge: Concept maps as facilitative tools in school and corporations (1st ed.). Lawrence Erlbaum Associates.
Novak, J. D., & Gowin, D. B. (1984). Learning how to learn. Cambridge University Press.
Ontoria, A., Ballesteros, A., Cuevas, M. C., Giraldo, L., Martín, I., Molina, A., Rodríguez, A., & Vélez, U. (1992). Mapas conceptuales, una técnica para aprender [Conceptual maps, a technique for learning]. Narcea.
Preciado, G. (2008). Recopilación: organizadores gráficos. Orientación Educativa [Compilation: Graphic organisers. Educational guidance]. Retrieved from http://prepajocotepec.sems.udg.mx/sites/default/files/organizadores_graficos_preciado.pdf
Reyes, E. C. (2011). Connecting knowledge for text construction through the use of graphic organisers. Colombian Applied Linguistics Journal, 13(1), 7–19. https://doi.org/10.14483/22487085.2928
Rice, D. C., Ryan, J. M., & Samson, S. M. (1998). Using concept maps to assess student learning in the science classroom: Must different methods compete? Journal of Research in Science Teaching, 35(10), 1103–1127. https://doi.org/10.1002/(SICI)1098-2736(199812)35:10%3C1103::AID-TEA4%3E3.0.CO;2-P
Rovira, C. (2016). Theoretical foundation and literature review of the study of concept maps using eye tracking methodology. El Profesional de la Información, 25(1), 59–74. https://doi.org/10.3145/epi.2016.ene.07
Ruiz-Primo, M. A. (2000). On the use of concept maps as an assessment tool in science: What we have learned so far. Revista Electrónica de Investigación Educativa (REDIE), 2(1), 29–53. https://www.redalyc.org/pdf/155/15502103.pdf
Ruiz-Primo, M. A., Shultz, S. E., & Shavelson, R. J. (1997). Concept map-based assessment in science: Two exploratory studies [CSE Technical Report 436]. National Center for Research on Evaluation, Standards, and Student Testing (CRESST), University of California. https://cresst.org/publications/cresst-publication-2808/
Sáenz, B., & Martínez, L. (2014). Diagramas de flujo [Flow diagrams]. In L. Martínez, P. Ceceñas & V. Ontiveros (Eds.), Lo que sé de: mapas mentales, mapas conceptuales, diagramas de flujo y esquemas [What I know about: Mind maps, concept maps, flowcharts and diagrams] (pp. 116–131). Red Durango de Investigadores Educativos.
Sinatra, R., Stahl-Gemake, J., & Wyche, N. (1986). Using semantic mapping after reading to organise and write original discourse. Journal of Reading, 30(1), 4–13. https://www.jstor.org/stable/40011116
Tversky, B. (2005). Visuospatial reasoning. In K. J. Holyoak & R. G. Morrison (Eds.), The Cambridge handbook of thinking and reasoning (pp. 209–240). Cambridge University Press.
Uba, E., Oteiku, E., & Abiodun-Eniayekan, E. (2016). Towards embedding graphics in the teaching of reading and literature in Nigeria. International Journal of Education Investigations, 3(7), 106–119. https://eprints.covenantuniversity.edu.ng/7599/1/IJEI.Vol.3.No.7.09.pdf
Uba, E., Oteiku, E., Onwuka, E., & Abiodun-Eniayekan, E. (2017). A research-based evidence of the effect of graphic organizers on the understanding of prose fiction in ESL classroom. SAGE Open, 7(2), 1–9. https://doi.org/10.1177/2158244017709506
Villalustre-Martínez, L., & Del Moral-Pérez, M. E. (2010). Mapas conceptuales, mapas mentales y líneas temporales: Objetos "de" aprendizaje y "para" el aprendizaje en Ruralnet [Concept maps, mind maps and timelines: Objects "of" learning and "for" learning in Ruralnet]. Revista Latinoamericana de Tecnología Educativa [RELATEC], 9(1), 15–27.
Wiliam, D. (2011). What is assessment for learning? Studies in Educational Evaluation, 37(1), 3–14. https://doi.org/10.1016/j.stueduc.2011.03.001


Paola Marchant-Araya is a social worker. She holds a Master's degree in Psychology, with a specialisation in Educational Psychology, and a Doctorate in Educational Sciences from the Pontificia Universidad Católica de Chile. She is an Assistant Professor at the School of Social Work, Faculty of Social Sciences UC, and a specialist in curriculum design and evaluation, and in training in higher education.

9 Quality Criteria for Developing Assessment Tools

Carla E. Förster and Cristian A. Rojas-Barahona

This chapter was originally published in Pensamiento Educativo. Revista de Investigación Educacional Latinoamericana, Vol. 43, No. 2, 2008 (Förster & Rojas-Barahona, 2008). Reproduced with permission of the publisher © 2017.

Abstract

This chapter addresses the concepts of validity and reliability of an assessment from a theoretical-practical perspective and makes suggestions that allow teachers to safeguard these quality criteria in their assessments. The lack of rigour and quality in assessments is analysed through the voices of students, and these examples are related to the underlying theoretical concepts. Specifically, the chapter delves into content validity, instructional validity and consequential validity, and into reliability from the perspectives of information sufficiency and objectivity. The selection of these rigour criteria for the construction of instruments is based on their frequency and usefulness in the classroom context.

9.1 Introduction

As teachers, we must constantly make judgments regarding the progress and learning achievement of our students in the different subjects we teach. From the point of view of assessment presented in this chapter, this judgment assumes that we have previously collected information in a systematic and reliable manner, that we have analysed it correctly, and that we have contrasted it with a previously established reference (such as the learning objectives and assessment indicators that we have defined). However, on more than one occasion we will surely be left with the feeling that students knew more than the results of a test reflect, or that these results are not consistent with what we observed in class. It should be borne in mind that the evidence of learning we collect is always permeated by the psychological, sociocultural, economic, and physical characteristics of each student, and may be influenced by transient difficulties, particular circumstances of the context, or situations specific to the application of the tools with which the information is collected (Luckett & Sutherland, 2000; Salinas, 2002; Sanmartí, 2007). In this sense, the quality of an assessment (that is, the consistency, adequacy, and relevance of the situations, procedures, and tools used to assess student performance) requires considering validity, reliability, and objectivity as essential elements when interpreting results, and thus making appropriate decisions about what and how to teach and how to provide feedback to the student (Brookhart, 2003; Hattie, 2015; Himmel et al., 1999; McMillan, 2003). The purpose of this chapter is to analyse the concepts of validity and reliability of an assessment from a theoretical and practical perspective, and to propose suggestions that will allow teachers to safeguard these quality criteria in future assessments in their classrooms (Fig. 9.1).

Fig. 9.1 Organisation of the rigour criteria when devising assessment tools and tasks


9.2 Validity

The first element to consider in the quality of an assessment is the validity of the information collected with a tool or in an assessment situation. This concept has its origins in measurement and has been widely developed within the psychometric approach, in which it is stated that a test is valid “if it measures what it was intended to measure”; however, this generic definition must be complemented with the purpose for which the test was constructed, that is, whether the interpretation of a given score correctly points to a conclusion regarding the purpose or construct that the test measures (Hogan, 2002). Consistent with this idea, Messick (1989, p. 13) defines validity as “an overall evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions based on test scores or other modes of assessment”. Thus, for example, the question “Is the University Selection Test (Prueba de Selección Universitaria, PSU by its acronym in Spanish) valid?” is not a good question; it would be more appropriate to ask “Is the PSU valid for predicting the results of students during their first year of university?” or, better yet, “To what degree is the PSU score valid for predicting the results of students in their first year of university?”. In other words, validity is not an intrinsic property of a tool or assessment situation, but rather a property of the interpretations and uses that are made of the information obtained from the tool (Valverde, 2006). The concept of validity used in the psychometric approach makes sense only for large-scale tests, applied to a considerable number of students, in which the results are expected to generate a normal distribution, that is, the majority of people obtain average scores and a smaller proportion are located at the extremes. But in the classroom, the aim is not for students' results to be distributed like this; on the contrary, an asymmetric curve is expected in which the majority (or all) are above the passing score for a given learning task, since this would indicate that they managed to learn what was expected for that moment (age, course, period of the year, etc.). For this reason, Stiggins (2001) avoids speaking of “validity” in the classroom context and replaces this term with “high-quality classroom assessment”. Various authors state that in the classroom the meaning of validity is not the traditional one, since statistical evidence is very difficult to obtain (Luckett & Sutherland, 2000; McMillan, 2003; Moss, 2003; Stiggins, 2001). This is even more the case because most teachers have little knowledge of measurement theory, owing to the scarce presence of this discipline in initial teacher training, and therefore have very few tools for making adaptations and recognising the limitations of “psychometric validity” in the classroom (Brookhart, 2003). Similarly, Moss (2003) questions the extent to which psychometric measurement's own notion of validity ensures a focus on the relevant issues in the classroom, and how other theoretical perspectives could contribute to a sound framework for valid assessments in that context. Brookhart (2003), by means of a comparative table, clearly differentiates the validity of an assessment according to the context for which it was designed (Table 9.1); regarding validity, she mainly emphasises the purpose of the assessment and the importance of the context in which it is applied. At the large scale, the purpose is to draw conclusions from the performance of a large number of students for the specific purposes of the assessment designed (in general, decision-making at the institutional or national level), while at the classroom level, the purpose of the assessment is to monitor and certify the performance of students by providing evidence regarding their progress towards learning objectives and the effectiveness of teaching. In terms of context, large-scale assessments do not consider it a relevant variable, while at the classroom level it is a factor to be considered, since students' prior knowledge, their sociocultural context, and the learning opportunities they have had prior to the assessment all matter.

Table 9.1 Comparison of validity in large-scale and classroom assessments

Criterion: Interpretation of data and actions taken
  Large-scale assessment: They are external to the measurement process
  Classroom-level assessment: They are internal to the measurement process

Criterion: Student engagement
  Large-scale assessment: The students are the “object” on which the observations are made; they have no interference in the evaluation
  Classroom-level assessment: Students are observers along with teachers; the measurements correspond to formative assessment processes

Criterion: Benefits for students
  Large-scale assessment: They do not receive a direct benefit from the evaluation; the analyses are not individual
  Classroom-level assessment: They receive direct benefits, since the assessment information is individual

Criterion: Assessment purpose
  Large-scale assessment: The goal is to make a statistically significant inference about student performance and/or the effective use of that information for a specific purpose
  Classroom-level assessment: The goal is to understand how students are performing compared to the “ideal” (defined in the learning objectives) and/or the effective use of that information for each student's learning

Criterion: Context
  Large-scale assessment: The context of the measurement is an irrelevant construct
  Classroom-level assessment: The context of the measurement is a relevant construct

Criterion: Assessed content
  Large-scale assessment: Content specifications describe a disciplinary domain
  Classroom-level assessment: The content specifications reflect the domain (learning objective) and instruction (modes, activities)

Criterion: Preparation and administration of the test
  Large-scale assessment: The preparation and administration of the test is standardised, and the effect of cultural practices and differences is controlled
  Classroom-level assessment: Test construction and administration is mediated by teachers' beliefs, practices, and content and student knowledge (e.g., cultural and linguistic differences)

Criterion: Relationship with teaching
  Large-scale assessment: The assessment is independent of teaching; it can be related to different contexts through data crossing
  Classroom-level assessment: Assessment is part of teaching; a good assessment is a “genuine episode of learning”

Adapted from Brookhart (2003)


Although it is clear that the validity of a classroom assessment does not obey a psychometric logic (Brookhart, 2003), it is necessary to highlight some elements of the different types, categories, or aspects related to validity, which have been described in various classifications; the most classic ones are construct, content, and criterion-related validity (Allen & Yen, 2002; Gorin, 2007; Hogan, 2002; Moss, 2007; Sireci, 1998). Construct validity is the degree to which a tool measures a theoretical construct (e.g., intelligence, creativity, reading comprehension, or mathematical competence); content validity refers to the degree to which a measurement represents the content domain for which it was designed; and criterion validity corresponds to the capacity of a test to predict students' outcomes in a future situation (Allen & Yen, 2002). There is no agreement in the scientific community as to whether there is one type of validity (construct validity) that includes the others or whether they are of a different and exclusive nature and, therefore, should be kept separate (Borsboom et al., 2004; Gorin, 2007; Moss, 2007; Sireci, 1998). In this sense, we highlight the words of Valverde (2006, p. 26), who points out that “There is a large number of options regarding the type of evidence that can be accumulated and reported. Each type of evidence illuminates and supports different facets of validity”. Thus, this chapter does not intend to deepen the discussion or adhere to a particular classification, but rather to consider which of these categories or aspects are relevant for safeguarding the quality of assessment in the educational context, specifically in the classroom. What follows is a conceptual review of the types of validity fundamental to classroom assessment, together with some examples of assessment situations described by students in which the lack of each type of validity is evident.

9.2.1 Content Validity

Content validity is a concept that has been controversial over time (see a summary of the evolution of the concept in Sireci, 1998). Theorists such as Messick (1989) point out that it is part of construct validity and that, given its “qualitative” nature, it is technically incorrect to call it “validity”, since it does not correspond to a psychometric nomenclature; they propose instead to speak of relevance, representation, or content coverage. Gorin (2007), on the other hand, states that a construct remains independent of the assessment context, while content is dependent on that context. Hogan (2002) indicates that to determine the construct validity of a tool it is necessary to analyse its internal structure (its internal consistency or reliability, calculated statistically with KR-20 or Cronbach's alpha, and the common dimensions underlying the model, identified through factor analysis) and the response processes of each item; since both procedures are very difficult to carry out in the classroom, and the context is an important element that conditions assessment situations, we
have decided to present content validity as a key element to consider in order to safeguard the quality of an assessment. This validity refers to the correspondence that exists between the content, skills, or learning objectives assessed by the tool and the field of knowledge to which such content is attributed (Brualdi, 1999; García, 2002; Hogan, 2002; Lukas & Santiago, 2004). In the educational field, extensive work has been done to generate content validity indices in large-scale tests, considering variables such as: (a) the opinion of experts (teachers, researchers, curriculum specialists, directors) regarding the curricular coverage of the tests and the teaching of such content, (b) the sample of students to whom the test was applied, (c) the number of items of each content, among others (Crocker et al., 1988). However, these statistical models designed to calculate content validity indexes are not applicable to assessment situations that take place in the classroom, since they do not meet the basic requirements of these models. In this sense, for a teacher, ensuring that his or her assessments have content validity implies guaranteeing that the assessment situations include the knowledge and skills corresponding to the learning objective for which he or she intends to collect evidence. The most relevant content and skills should be considered in order to account for the achievement of learning; that is, a pertinent and representative sample should be selected from all content covered in a course, a class, or a set of classes. Sireci (1998), after an exhaustive literature review, states that there is consensus on at least four elements that characterize the concept of content validity: (1) the definition of a disciplinary domain, (2) the relevance of said domain, (3) the representativeness of the domain, and (4) the construction procedures of the tool or assessment situation. This is particularly important because the conclusions that can be drawn from an assessment are only valid for what was assessed; that is, the scores or categories obtained by each student should come from quality tasks and stimuli, coherent with the conceptual purposes for which they were developed. Therefore, certifying what a student knows and can do or not during a given school period is directly related to the skills and abilities that were assessed. The following are some examples of assessment experiences that are considered negative by students and that are related to the lack of content validity: “During one course, the teacher gave quite a few readings for a test and in it he asked for overly specific data, exact words quoted in the texts to identify which author they might be by and names and dates that, while interesting or important, for an assessment of so much content were of very little relevance.” “For me, all those evaluations in which textual quotations from the bibliography were asked, which are not related to the general theoretical framework of the course, were negative. For example, what does the author refer to with the phrase …?”


“I remember a test that involved a lot of content, but that we had been told would not be about memorisation; however, when it came to answering, it was evident that the answers had to be literal, because the questions were very specific and did not point to what had been considered relevant during the course.” “I have had to answer evaluations in which the questions have nothing to do with the contents that were covered; they seem to evaluate anything but what they should.” “I remember in Mathematics when they used to give us exercises that could only be solved if you could think of the right trick. I think it's negative because learning doesn't depend on whether or not you know the trick, but on solving and posing a problem well.” “In one class we were asked to make some graphs of the economy of the region and then the professor evaluated the aesthetics more than whether or not they were well elaborated, and the classmates who made the graphs with colours did better than my group, who had the graphs in black and white … there I learned to worry more about aesthetics than about the contents.” As can be seen in these accounts, students feel that they are assessed on content that is too specific, lacking in meaning, or unrelated to the expected disciplinary domain. The following are some suggestions for safeguarding the content validity of an assessment situation, which correspond to procedures and actions that a teacher can carry out when developing his or her own assessments. Although there are also suggestions of a statistical nature, they have not been considered here: (1) Develop a table of specifications for the tool: as we saw in previous chapters, the table of specifications for the development of a test, a checklist, or an assessment scale makes it possible to visualize whether all the content or elements that are expected to be addressed are being covered, whether the skills associated with these aspects are coherent with the learning goals set, and whether the items and descriptors correspond to the content and skills that were expected (a hypothetical fragment is sketched after these suggestions). From this analysis it is also possible to establish whether there is a gradation of complexity in terms of the skills assessed and whether the items and descriptors are the most appropriate for measuring a given skill. (2) Verify through the criteria of judges: given that in the classroom we are the ones who construct or adapt the assessment tools and situations, we often overlook aspects that are more evident to those who have not seen the assessment before. For this reason, it is recommended that we show other teachers the tool with its correction guideline and the table of specifications so that they can give us their opinion regarding whether the assessment situation is adequate for measuring the learning and indicators defined. It may be that we have problems in the wording of an item or a descriptor and that the learning we want to assess is not reflected in our tool.
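To make suggestion (1) concrete, the fragment below sketches a table of specifications for a hypothetical short test on fractions; the learning objective, contents, skills, and item numbers are invented purely for illustration.

Learning objective (hypothetical): Solve problems involving the addition of fractions.

Content                          Skill (cognitive level)   Items    Points
Equivalent fractions             Understand                1–4      4
Addition of like fractions       Apply                     5–8      8
Word problems with fractions     Analyse                   9–10     6

Reading across each row shows whether the items match the intended content and skill; reading down the columns shows the coverage of the objective and whether the score weighting reflects the relevance and complexity assigned to each content.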

9.2.2 Instructional Validity

According to Hogan (2002), this validity corresponds to a particular application of content validity and is also known as “curricular validity”. It relates to what students have had the opportunity to learn during classes in order to respond correctly in an assessment (Crocker et al., 1988). It should be noted that the name “instructional” refers to teaching and not, as is often interpreted, to the instructions for an assessment task. In education, this type of validity is key, given that it represents the degree of relationship between what is taught and what is evaluated. When this relationship is weak, two problems arise: on the one hand, students do not have the possibility of demonstrating what they learned during classes, and on the other, they are evaluated on aspects that were not taught to them (Himmel et al., 1999; McMillan, 2003). This last problem appears especially when the emphasis of what was worked on in class is changed in the assessment. For example, in classes, concepts and their definitions are taught, and then in the assessment students are asked to apply these concepts in situations that were never worked on during class, on the grounds that they are expected to be able to do so as part of the “construction” of their own learning. Thus, we will say that an assessment has instructional validity when it contains assessment situations that are coherent with the learning activities carried out by the students. When this criterion is not met, there are also ethical problems related to the interpretation of results, such as holding students responsible for not knowing content that they have not had the opportunity to learn, or holding teachers responsible for the low achievement of their students without providing them with the conditions, materials, or training necessary to teach the content (Valverde, 2006). Although this problem occurs principally in large-scale tests, it also occurs in the interpretations that management teams make of the results of a placement test or of an external test applied to various classes. This section also includes what has been called “semantic validity”, which requires that assessment situations contain terms whose meaning is known and shared between the constructor of the tool (in the classroom, the teacher) and the students. Often content is taught using one term, and then in the assessment a “synonymous” term that has not been worked on in class is used. Strictly speaking, the content was taught, but when faced with an erroneous response from the students, it is worth asking whether they have not mastered the content or whether it was the word they did not understand that provoked the response. Below we present some assessment situations perceived as negative by students that allude to instructional validity. “Many times, the teachers evaluated subject matter or content that did not correspond to what we had seen in class. In the tests he used strange words that we hadn't seen or were not in the book and when someone asked what
they meant, he humiliated you by saying that you should know because it was general culture.” “It was negative for me when the theoretical classes did not go according to the laboratory evaluations and we were evaluated on material that we had not yet seen; there are many laboratory activities that I still do not fully understand.” “The tests in one course did not correspond to what the professor taught in class; he taught badly, created a climate of terror in the room and then asked us things that he had not taught or that had happened in the course next door.” “The first test they gave us did not correspond to what we were working on and learning in class. We only did theory, and the test was of application, with many details, and the teacher did not do exercises or highlight those elements. In short, what I learned did not help me for the test.” “I have had professors who evaluate in their tests things (content) that they have taught badly, quickly or simply have not dealt with either in classes or in the texts referred to, assuming that we saw them in other courses.” As can be observed, the students' opinions directly allude to the teacher's professional performance; therefore, the actions suggested below to safeguard instructional validity relate to habitual teaching practices: (1) Ensure coherence between what is taught and what is assessed: this means ensuring that the assessment situations contain the content seen in the learning activities carried out. It is essential to keep a record of what was taught in each class. To develop assessment situations, it is not enough to use the timetable or planning, unless it is continually updated. Normally, the dynamics of a group of students generate differences between courses at the same level; with some, one can go deeper or revisit a subject, and not with others; there may also be unplanned activities that produce time lags, and if we are not clear about these events, we could assess subjects that have not yet been taught or that we addressed in one course but not in another. (2) Ensure that assessment situations are equivalent to the learning activities carried out: this does not mean repeating the guide that was done in class and now giving it a grade; on the contrary, this suggestion aims to generate assessments that consider skills similar to those already worked on. If, during the teaching process, activities were intended to achieve the expected learning or the learning goals determined for these students, then, in the assessment, the aim is to monitor or certify whether or not they were achieved, in order to provide feedback to the student and begin a new cycle. It is suggested, on the one hand, to make use of class planning, treating it as a flexible guide to support teaching preparation, and, on the other, to use the table of specifications, emphasizing that assessment situations be coherent with what each teacher worked on with his or her students. On this last point, the “expert judges”
have no competence, since they were not present during the process, so that the coherence between what was taught and what was assessed only depends on the self-assessment that the teacher carries out in regard to his or her own practice. (3) Ensure that the language used in assessment situations is familiar to students: this means taking special care to use terms that are familiar to students, both in instructions and in the technical language of the discipline to be assessed. It is important to note that the purpose of an assessment is to obtain information on student learning, and “trick words” only generate poor quality information with respect to that purpose.

9.2.3 Consequential Validity

This type of validity is related to the intended and unintended consequences of the uses and interpretations that will be made of the information obtained in the assessment (Hogan, 2002; McMillan, 2003; Moss, 1997). Although there is no agreement among authors as to whether this validity is relevant to the psychometric domain (Borsboom et al., 2004; Hogan, 2002; McMillan, 2003), in the opinion of Moss (2003) and McMillan (2003) it should be the primary consideration when making decisions regarding classroom assessment. Consequential aspects of validity are especially important when interpretations of the information may involve adverse consequences for the participants (Brualdi, 1999); therefore, recognising this validity could help to control these aspects. As we saw in Table 9.1, Brookhart (2003) argues that this validity is associated with the benefits of the assessment for students. In large-scale assessments, the consequences are framed in terms of decision-making and public policy reformulation (Schutz & Moss, 2004) and have ethical and social implications (Borsboom et al., 2004) that make it relevant to look at validity from this point of view, but they do not affect individual students. In the classroom, on the other hand, consequences are directly related to the teaching/learning dynamics that occur between teachers and students, since there are individual benefits or adverse consequences. In this sense, Brookhart (2003) points out that the integration of teaching and assessment should be taken seriously, and that an assessment situation should be understood as one more opportunity to achieve learning. Therefore, it could be said that consequential validity at the classroom level refers to the effects of the assessment on teaching and student learning (Himmel et al., 1999) and, consequently, is related to the purposes for which the assessment was designed. McMillan (2003) poses a series of questions that, if answered positively, would indicate that an assessment has consequential validity:

• Do students gain a deep understanding of what they are learning as they prepare for the assessment?
• Do students believe they are able to learn new knowledge after self-assessing in practice exercises on an assessment?
• Are students able to transfer their knowledge to new situations?
• Does the decision to conduct additional formative reviews prior to the submission of an assessment task lead to increased student learning?

These questions have different purposes and will not always be asked at the same time. The important thing to keep in mind is that classroom assessment always has consequences and, therefore, teachers should think about them before, during, and after conducting an assessment, in order to get the most out of it and minimize the negative effects that can be generated. Among the unintended consequences of an assessment, which detract from its validity, are students' lack of motivation to participate in an assessment activity, the generation of an atmosphere of competition among students, student resentment of the format of the assessment, and the internalization of a conceptual error due to the task performed (Brookhart, 2003; McMillan, 2003). Some situations perceived as negative by the students, in which the lack of consequential validity is evident, are presented below: “In a group work, in which everyone was evaluated equally, one of my classmates did not prepare well, he got nervous, and the presentation was not the best, but the other three classmates did their presentation well. For the above reasons, the group grade was a 5 and this rewarded the deficient colleague and harmed the others who did well.” “My dad is a marine biologist and he saw that the book was wrong, we told the teacher, and I explained it correctly in my research, but I got a 5 because I didn't write it as the book said. What bothers me the most is that they forced us to learn something that was wrong because the teacher didn't want to admit her mistake.” “In Mathematics, they made us answer an exercise guide for homework that would have a grade in the course. I answered it a bit quickly and I had a 70% achievement. Then the school came up with the idea that those who got more than 80% would be in an advanced group to prepare for the University Selection Test, which I thought was super unfair because they never said they would use it for that. If I had known that, I would have answered it more conscientiously; I needed that intensive course to enter Engineering and I had to pay for a separate one.” “I remember tests that were too long where the questions were about memorising content and there was no opportunity to think and analyse anything. I learned things and I did well, but I don't remember anything; it's as if I didn't do the course, and now I'm going to need it for my future job.” “In History the tests were all multiple choice and had huge paragraphs to read, it didn't matter whether I studied or not because what they asked for
was reading comprehension. I'm slow at reading, so as I always did badly, I stopped studying for that course.” Since the consequences of an assessment can be negative and long-lasting for our students, the suggestions given to safeguard consequential validity are mostly based on actions related to the definition and planning of an assessment: (1) Clearly define the purposes and uses of the assessment: as this validity is based on the implications that an assessment process has for students, it is important to define the purpose of the assessment beforehand, i.e., what is the assessment for? In this sense, there are three main purposes: (1) formative diagnostic (e.g., to identify students' prior knowledge, recognise entry behaviours, adjust planning to the context), (2) formative process (e.g., to enhance students' learning and motivation, monitor their progress in the defined learning, improve the teaching process), and (3) summative (e.g., to certify students' learning, inform parents of their children's performance at the end of the year). It is important to keep in mind that the same tool may be perfectly valid for one purpose and not be valid at all for another. (2) Clearly identify the evidence that will account for the purposes of the assessment: as noted above, consequential validity is determined by the correspondence between the information that is collected and the purposes for which it will be used. In this sense, the suggestion is to have good planning that allows us to safeguard this correspondence, which implies not making an assessment first and only then deciding what it can be used for. An example of this is the “unannounced tests” carried out in a class, whose motivation is more behaviour control or punishment than the assessment of expected learning. These assessments are usually improvised and added as a grade, and their consequences are, for the most part, negative for students.
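As a sketch of how suggestions (1) and (2) can be recorded together, a planning note for a hypothetical writing unit might map each purpose to its evidence and use in advance:

Purpose: formative process (monitor progress in writing argumentative paragraphs)
Evidence: first draft with peer comments; checklist applied during the lesson
Use: feedback and adjustment of teaching; not graded

Purpose: summative (certify achievement of the unit's learning objective)
Evidence: final essay assessed with the rubric shared at the start of the unit
Use: grade reported to students and parents

Deciding these elements before the assessment, rather than after it, is precisely what prevents the “unannounced test” problem described above.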

9.3 Reliability

Another element to consider when analysing the quality of an assessment is the reliability, or precision, of the information obtained from it. Unlike validity, reliability relates only to the consistency of the measurement, regardless of what exactly is measured (Florez, 1999; Hogan, 2002); thus, one can have a very reliable instrument that nevertheless has no validity. The concept of reliability refers to the consistency, accuracy, and stability of the results and of the inferences that can be made from them, and it is directly related to the conclusions and subsequent decision-making (Luckett & Sutherland, 2000; McMillan, 2003). Reliability, in psychometric logic, implies that the tool delivers similar results when its application is repeated in the same circumstances and with the same people. It can be verified in various ways, but the most common are Cronbach's alpha coefficient, which provides the internal consistency of an instrument; the test–retest correlation, which evaluates the consistency between two measurements applied to the same subjects at different times; and the parallel forms, which evaluate the degree of correlation between two versions of the same test. This quality criterion has its origins in psychometrics: a test is said to be reliable if it consistently generates the same or a similar score when administered to an individual, and its score is replicable with a small margin of error. Moss (2003) points out that in classroom assessments (specifically tests), unlike large-scale assessments, reliability is not a relevant issue and is rarely considered, since in the classroom teachers have multiple instances to collect complementary information on learning. Smith (2003) reaffirms this idea by pointing out that teachers do not calculate alpha coefficients, test–retest correlations, or parallel forms, on the one hand, because they do not have a sufficient number of cases (more than 100 students in a single application) and, on the other, because the assessments are given to a student only once at a specific time. What is fundamental in this case, however, is that the student is expected to change his or her learning from one week to the next, which goes against what psychometrics expects with respect to the stability of the traits being measured. Therefore, in the classroom context, where there is no interest in generating an order or hierarchy of students in the activities being assessed, reliability from a psychometric perspective is not relevant (Brookhart, 2003; Smith, 2003). In addition, many assessments are graded with categories (e.g., excellent, satisfactory, fair, unsatisfactory), which would create a problem for making any “traditional” reliability calculations. Moss (2003) states that the correct thing to do at the classroom level is to analyse “sufficiency of information”; that is, reliability is represented by having sufficient evidence of learning that allows decisions to be made with the least margin of error. This evidence should not only correspond to the same content, but also have the same level of demand, which seems to be the most difficult to fulfil (Smith, 2003). Brookhart (2003) is very clear in differentiating between large-scale and classroom reliability (Table 9.2). At the large scale, reliability is understood as the consistency between the dimensions that make up a measurement tool, while at the classroom level it is seen as sufficiency or saturation of information. In the first case, a stable classification is sought along a continuum, while at the classroom level the aim is to obtain stable information about the gap between the ideal of learning (what the teacher expects from his or her students) and what is actually achieved. Popham (2014, pp. 88–89) is clear in pointing out that “if you construct your own classroom tests with care, those tests will be sufficiently reliable for the decisions you will base on the tests' results (…) you need to be at least knowledgeable about the fundamental meaning of reliability, but I do not suggest you make your own classroom tests pass any sort of reliability muster”.

Table 9.2 Comparison of reliability of large-scale and classroom assessments

Criterion: Definition
  Large-scale assessments: It is the internal consistency between relevant dimensions of a tool
  Classroom assessments: It is the sufficiency of information about a student's learning in an assessment strategy

Criterion: Purpose
  Large-scale assessments: To have a stable classification of students on a rating scale or a stable categorization along a continuum of progress
  Classroom assessments: To have consistent information about the difference between a student's performance and the “ideal”, as defined in the Learning Objectives

Adapted from Brookhart (2003)
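For readers curious about what the psychometric coefficients mentioned above involve, the sketch below shows, with invented toy data, how Cronbach's alpha and a test–retest correlation could be computed in Python. It is purely illustrative: as the authors note, classroom teachers are not expected to perform these calculations.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha: internal consistency of a test.

    scores: 2-D array, one row per student, one column per item.
    """
    k = scores.shape[1]                               # number of items
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical results of six students on a four-item quiz (1 = correct, 0 = incorrect).
# With dichotomous items like these, alpha coincides with KR-20.
scores = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
])
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")

# Test-retest reliability: correlation between two applications of the same test.
first = scores.sum(axis=1)              # total scores on the first application
second = np.array([4, 3, 3, 1, 3, 1])   # hypothetical scores on a later application
print(f"Test-retest correlation: {np.corrcoef(first, second)[0, 1]:.2f}")
```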
Among the factors that can influence the reliability of an assessment are: (a) the number of observations or pieces of evidence of learning, since the more instances of assessment of the same learning there are (number of items in a test, formative situations, etc.), the greater the “internal consistency” of the assessment process and the more reliable the conclusion regarding student achievement; (b) the characteristics of the application, referring to the clarity of the instructions, the time allotted to respond, and the physical space available; and (c) the precision of the review and scoring of the assessment task (Himmel et al., 1999; Hogan, 2002). This last point will be dealt with separately below under the heading of “objectivity”, given its relevance in the educational setting.

Faced with this quality criterion, students point out that their bad assessment experiences mainly concern the lack of opportunities to demonstrate their learning in different instances. Some examples given by them are presented below: “In a workshop at the university there was only one test at the end and with that the grade for the course was at stake. I think that's a bad evaluation, because there are many things that can affect the answer you give and just that day something can happen that makes you give a bad answer, even if you have studied.” “A negative assessment situation was when I had to give a dissertation and I was sick with fever and I felt very bad; the teacher challenged me because my presentation was very bad. It made me very angry, because I knew the subject and I didn't have another chance to prove it.” “In a course where the tests were one question for each topic, if you didn't know that specific one, you lost with that question, even if you knew the topic.” In order to ensure that assessments are reliable, it should be borne in mind that the ultimate aim of this criterion is that conclusions and subsequent pedagogical decisions are based on sufficient evidence of progress and learning achievement, in order to avoid making mistakes or at least to minimize them. In this sense, the actions suggested below provide a guide for achieving reliable assessments: (1) Apply to the same student several assessment situations that measure the same learning: giving the student the possibility of demonstrating his or her performance in the same learning through repeated opportunities allows us to be
certain that the learning was or was not achieved (Riggan & Oláh, 2011). This can be done with different assessment tasks or by using the same tool in different situations that assess the same thing. (2) Ensure the clarity of the items and instructions: this suggestion is related to the quality of the information we collect, because if the tool is not adequate, our evidence will not be reliable and the decisions we make may be biased or erroneous. There are occasions in which a test item or the descriptor of a scale is ambiguous, or we produce an assignment guide that turns out to be confusing, and the students do not understand it or understand something else. If this happens, the information will not reflect the learning that students actually have, and we must take this into account when reviewing, scoring, and interpreting the information, with possible decisions ranging from eliminating the item to expanding the set of acceptable answers or changing the assessment criteria for the work. The important thing is not to lose sight of the learning objective being assessed. (3) Ensure that the application environment is adequate in terms of resources, space, and time: it is very important to keep this suggestion in mind when conducting assessments in environments that are not optimal (for example, there is music in the courtyard, it is very hot in the room, there is a recreational activity scheduled afterwards, the student is sick, etc.), because the results of the assessment will be influenced by these aspects and may not represent the real learning of the students. (4) Ensure the accuracy of the review: this suggestion refers to the objectivity of the review and scoring of the assessment task and will be discussed in detail below.
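Factor (a) above, the number of observations, can be illustrated with a small simulation: if each assessment is treated as a noisy observation of a student's actual achievement, then a judgment based on several assessments lands much closer to that achievement than one based on a single test. All numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

true_achievement = 75.0   # a student's actual level, on a 0-100 scale (hypothetical)
noise_sd = 10.0           # day-to-day variation: fatigue, ambiguous items, context, etc.

for n_assessments in (1, 3, 6):
    # Simulate many teachers, each of whom judges the student
    # from the average of n independent assessments.
    judgments = rng.normal(true_achievement, noise_sd,
                           size=(10_000, n_assessments)).mean(axis=1)
    typical_error = np.abs(judgments - true_achievement).mean()
    print(f"{n_assessments} assessment(s): typical error = {typical_error:4.1f} points")

# The error shrinks roughly with the square root of the number of
# assessments (about 8.0, 4.6, and 3.3 points in this simulation).
```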

9.3.1 Objectivity

The objectivity, or accuracy of the review, in an assessment process is a key element associated with the reliability of an evaluation; however, we have decided to treat it separately, given its relevance and impact within the classroom. This does not imply that it is independent of reliability. Generally, objectivity is understood as the quality of an object in itself, independent of personal considerations or judgments. If we take objectivity to the assessment sphere, this supposes that both the tools and the judgment issued from the information collected with them are impartial. In relation to this issue, Calatayud (1999) states that whoever believes that student assessment is an objective action is embarking on an impossible task. The author states that assessment is a complex practice that not only involves the mastery of a technique but also carries an important moral and value load and is, therefore, an activity linked to the personal beliefs of teachers, whether they want it or not. From this perspective, impartiality is not possible, and each judgment that is made has a component of
the judgment of the person who expresses it. Concern for the objectivity of assessment is not a new issue. Studies associated with teacher expectations, such as Rosenthal and Jacobson's (1968) Pygmalion-effect study and those of Spear (1984) and Rubie-Davies (2006), show that students considered “low achievers” have less time to answer an oral question, receive less help or fewer “hints” when they ask a question on a test, and obtain less feedback on their performance than students considered “high achievers”, for whom teachers have higher expectations. It should be noted that these expectations are generally conditioned by students' personal characteristics, such as gender, socio-economic status, race or ethnicity, physical appearance, and oral language patterns. Studies such as that of López et al. (1983) show how different Mathematics or Physics teachers assign different scores or grades to the same exercise, which could be attributed to different levels of demand in the correction, something that should not affect students if all exercises are reviewed by the same person (Gil-Pérez & Vilches, 2008), while other evidence indicates that the same teacher can score or grade a student's response differently depending on the moment (Hotyat, 1965). Although it is assumed that revision will never be one hundred percent objective, it is common knowledge that teachers have little time to correct their students' work and tests, and that they do so in the few physical and temporal spaces they have available; some begin at school in the morning but may finish the next day at home in the early hours of the morning. If concrete actions are not taken to avoid distorting the correction criteria, the possibilities of the demands and the expected responses changing along the way are enormous. These problems of objectivity are reflected in the opinions of students regarding assessment experiences that turned out to be negative for their learning process: “In a workshop there were reading quizzes with 3 questions without further explanation. When I received my marked quiz, I observed that I did not have the full score in any of the questions; when I asked why, the teacher only gave me the correct answer, and to my mind it was the same as mine but in other words.” “It was a test in a class where almost the whole class failed. No one understood why it was all wrong. The questions were very specific, and the guideline was very strict. It wasn't clear what was wrong and what was right.” “In the Art class, in my case I presented a painting that was well evaluated; I did it quickly and it didn't take me much work. On the other hand, one of my classmates was evaluated very badly, when his painting was much more elaborate and innovative and he had followed all the instructions. As there was no guideline, he couldn't argue anything.” “In Secondary Education, all History evaluations were done with questionnaires; the problem was that the tests remained in the teacher's possession and there was no right to correction.”
“A final paper: even though my paper was considered one of the best in the course at the presentation, it was poorly graded because I was on bad terms with the professor.” “One teacher rated my work as unsatisfactory, but when we compared it with others, we noticed that it was better than those with satisfactory grades. This experience was negative because the teacher's criteria, which are not known or understood by the students, weighed more.” Given that the objectivity of an assessment consists of safeguarding the absence of bias or subjective appreciation in the interpretation of the evidence and/or the processes that generated it, we suggest some actions that a teacher can easily carry out in the classroom: (1) Inform students of the purpose of the assessment and the learning that will be assessed: this action has a dual function: on the one hand, the teacher must make the purpose of their assessments explicit in the planning and, on the other, for the students there is no ambiguity about what will be assessed and for what purpose; the assessment thus ceases to be a “black box” and becomes a learning tool with clear and well-defined standards. (2) Make the assessment criteria known to the students: if we understand that assessment is part of the teaching/learning process and, therefore, a tool for improving student learning, students should know clearly what criteria they are going to be assessed on. Preparing an oral presentation to be assessed on the use of communicative skills is very different from preparing it to be assessed on conceptual mastery, so the emphasis that a student places and the type of learning that he or she develops will be directly related to the assessment guideline that has been given. If the guideline and its criteria are ambiguous or unknown, the student will do what they think they should do, and if that does not match what the teacher assesses, the evidence will be unreliable and the conclusions will be wrong. (3) Develop response or review guidelines: it is common to develop tests or assessment situations and then, when reviewing them, to make a checklist of the ideas that should be present in the student's response or performance; but it is a fact that the criteria and requirements vary for different reasons (e.g., the time of day, the context in which the work is corrected, etc.), so having a clear and precise guideline helps to maintain consistency in its application and to avoid the biases inherent in a correction process. In addition, it allows students to know these criteria and to contrast them with their results, making the assessment process transparent and allowing them to self-evaluate, recognise their progress, and know what was expected in each case. (4) Establish in advance the criteria for assigning scores, based on the relevance and level of complexity of the learning: this element is key in two ways: first, it allows the teacher to ensure, prior to the application of the assessment, that the items reflect, in terms of extension and complexity, the relevance assigned
to them with a given score in relation to the total score of the assessment; secondly, students, when answering the assessment, make decisions regarding the dedication and order with which they face the items according to the weight that each one has. The absence of this assignment generates ambiguity for students regarding the emphasis that the teacher gives to each aspect being evaluated. We know that our criteria and scoring can change as we work through a pile of assessments: an answer worth, for example, 0.75 points when we begin may become 0.5 or 0.8 points by the time we are finishing if we do not have a clear and precise guideline to help us maintain consistency.
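One informal way to check the accuracy of a review, assuming a colleague is willing to re-mark a small sample of work with the same guideline, is to compare the two sets of scores. The sketch below, with invented scores, computes the exact-agreement rate and the average size of the disagreements; a low agreement rate suggests the guideline's criteria need sharpening before the rest of the pile is corrected.

```python
import numpy as np

# Hypothetical scores given by two reviewers to the same ten student answers,
# each marked independently with the same review guideline (0-4 points).
reviewer_a = np.array([4, 3, 2, 4, 1, 3, 2, 4, 0, 3])
reviewer_b = np.array([4, 2, 2, 4, 1, 3, 3, 4, 1, 3])

exact_agreement = np.mean(reviewer_a == reviewer_b)       # share of identical scores
mean_abs_diff = np.mean(np.abs(reviewer_a - reviewer_b))  # average size of disagreement

print(f"Exact agreement: {exact_agreement:.0%}")              # 70% for these data
print(f"Mean absolute difference: {mean_abs_diff:.2f} points")  # 0.30 points
```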

Assessment is part of the teaching and learning process within the classroom and, as such, can mark a student both positively and negatively. This review has presented how the absence of validity and reliability in an assessment situation can generate demotivation, erroneous learning, and feelings of injustice in students, none of which favours a learning environment that is meaningful to them. In general, teachers do not calculate the reliability, the standard error of measurement, or the validity and discrimination coefficients of an assessment. These techniques are typical of the development of standardised, large-scale tests but have limited importance in classroom assessment. Here we have highlighted the elements that are key and necessary to safeguard in order to make assessments that contribute to student learning: content validity, instructional validity, consequential validity, reliability, and objectivity. As discussed above, validity is not stated in absolute terms, but in degrees, and depends on the purpose for which the assessment was created. Thus, the results will be more or less valid for drawing conclusions about that purpose. Both validity and reliability increase or decrease according to the quality of the learning evidence that supports them. Although the assessment of learning within the classroom is a complex task that requires time and effort, it is essential to understand it as an integral part of the teaching process and not to see it as a separate instance that comes at the end. If this position is assumed, there will be coherence between planning and practice in the selection of the content and skills that are taught and assessed, improving those assessment practices in which validity and reliability are scarce. In summary, the quality of the evidence we collect to monitor and certify our students' learning depends on the quality of the assessment tools and situations we use. The more closely an assessment is aligned with the Learning Objectives and with the opportunities the student had to learn them, the better the evidence will be, and the better it will allow us to make pedagogical decisions that are effective in improving learning. We should not lose sight of the fact that review instruments, such as rubrics, give consistency to the review and help us to maintain the same criteria, making our assessments more objective. In the final chapter, we will deal with a methodological proposal of case analysis that allows us to systematise evidence obtained from different sources that configure a problematic situation or a need at the school level, and to contrast it with theoretical references in order to make a proposal for improvement.

References

Allen, M. J., & Yen, W. M. (2002). Introduction to measurement theory. Waveland Press.
Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071. https://doi.org/10.1037/0033-295X.111.4.1061
Brookhart, S. M. (2003). Developing measurement theory for classroom assessment purposes and uses. Educational Measurement: Issues and Practice, 22(4), 5–12. https://doi.org/10.1111/j.1745-3992.2003.tb00139.x
Brualdi, A. (1999). Traditional and modern concept of validity. ERIC. Retrieved November 25, 2022, from http://files.eric.ed.gov/fulltext/ED435714.pdf
Calatayud, M. A. (1999). La creencia en la objetividad de la evaluación: Una ilusión imposible [Belief in the objectivity of evaluation: An impossible illusion]. Aula Abierta, 73, 205–221. https://dialnet.unirioja.es/descarga/articulo/45456.pdf
Crocker, L., Llabre, M., & Miller, M. D. (1988). The generalizability of content validity ratings. Journal of Educational Measurement, 25(4), 287–299. https://www.jstor.org/stable/1434962
Florez, R. (1999). Evaluación pedagógica y cognición [Pedagogical evaluation and cognition]. McGraw Hill.
Förster, C. E., & Rojas-Barahona, C. A. (2008). Evaluación al interior del aula: una mirada desde la validez, confiabilidad y objetividad [Assessment inside the classroom: A look at validity, reliability and objectivity]. Pensamiento Educativo, Revista de Investigación Latinoamericana (PEL), 43(2), 285–305. https://pensamientoeducativo.uc.cl/index.php/pel/article/view/25759
García, S. (2002). La validez y la confiabilidad en la evaluación del aprendizaje desde la perspectiva hermenéutica [Validity and reliability in learning assessment from a hermeneutic perspective]. Revista de Pedagogía, 23(67), 297–318.
Gil-Pérez, D., & Vilches, A. (2008). Que deben saber e saber facer os profesores universitarios? [What should university teachers know and know how to do?]. In M. I. Cebreiros & N. Casado (Eds.), Novos enfoques no ensino universitario [New approaches in university education] (pp. 25–43). Universidad de Vigo.
Gorin, J. S. (2007). Reconsidering issues in validity theory. Educational Researcher, 36(8), 456–462. https://doi.org/10.3102/0013189X07311607
Hattie, J. (2015, October 27). We aren't using assessments correctly: There's a distinction between formative and summative assessments. Education Week. https://www.edweek.org/policy-politics/opinion-we-arent-using-assessments-correctly/2015/10
Himmel, E., Olivares, M. A., & Zabalza, J. (1999). Hacia una evaluación educativa. Aprender para evaluar y evaluar para aprender, vol. I [Towards an educational evaluation. Learning to assess and assessing to learn, vol. I]. Pontificia Universidad Católica de Chile and Chilean Ministry of Education [Mineduc].
Hogan, T. P. (2002). Psychological tests: A practical introduction. Wiley.
Hotyat, F. (1965). Los exámenes: los medios de evaluación en la enseñanza [Exams: Evaluation tools in teaching]. Instituto de la UNESCO para la educación. Kapelusz. https://unesdoc.unesco.org/ark:/48223/pf0000221704
López, N., Llopis, R., Llorens, J. A., Salinas, B., & Soler, J. (1983). Análisis de dos modelos evaluativos referidos a la química de Curso de Orientación Universitaria (COU) y selectividad [Analysis of two evaluative models referring to the University Guidance Course on chemistry and selectivity]. Enseñanza de las Ciencias, 1(1), 21–25.
Luckett, K., & Sutherland, L. (2000). Assessment practices that improve teaching and learning. In S. Makoni (Ed.), Improving teaching and learning in higher education: A handbook for Southern Africa (pp. 98–130). Witwatersrand University Press.
Lukas, J. F., & Santiago, K. (2004). Evaluación educativa [Educational evaluation]. Alianza.
McMillan, J. H. (2003). Understanding and improving teachers' classroom assessment decision making: Implications for theory and practice. Educational Measurement: Issues and Practice, 22(4), 34–43. https://doi.org/10.1111/j.1745-3992.2003.tb00142.x
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (pp. 13–103). Macmillan.
Moss, P. A. (1997). The role of consequences in validity theory. Educational Measurement: Issues and Practice, 17(2), 6–12. https://doi.org/10.1111/j.1745-3992.1998.tb00826.x
Moss, P. A. (2003). Reconceptualizing validity for classroom assessment. Educational Measurement: Issues and Practice, 22(4), 13–25. https://doi.org/10.1111/j.1745-3992.2003.tb00140.x
Moss, P. A. (2007). Reconstructing validity. Educational Researcher, 36(8), 470–476. https://doi.org/10.3102/0013189X07311608
Popham, W. J. (2014). Classroom assessment: What teachers need to know (7th ed.). Pearson Education.
Riggan, M., & Oláh, L. N. (2011). Locating interim assessments within teachers' assessment practice. Educational Assessment, 16(1), 1–14. https://doi.org/10.1080/10627197.2011.551085
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom. Rinehart & Winston.
Rubie-Davies, C. M. (2006). Teacher expectations and student self-perceptions: Exploring relationships. Psychology in the Schools, 43(5), 537–552. https://doi.org/10.1002/pits.20169
Salinas, D. (2002). ¡Mañana examen! La evaluación entre la teoría y la realidad [Tomorrow's exam! Assessment between theory and reality]. Graó.
Sanmartí, N. (2007). 10 ideas clave. Evaluar para aprender [10 key ideas. Assess to learn]. Graó.
Schutz, A., & Moss, P. A. (2004). Reasonable decisions in portfolio assessment: Evaluating complex evidence of teaching. Education Policy Analysis Archives, 12(33). https://doi.org/10.14507/epaa.v12n33.2004
Sireci, S. G. (1998). The construct of content validity. Social Indicators Research, 45, 83–117. https://doi.org/10.1023/A:1006985528729
Smith, J. K. (2003). Reconsidering reliability in classroom assessment and grading. Educational Measurement: Issues and Practice, 22(4), 26–33. https://doi.org/10.1111/j.1745-3992.2003.tb00141.x
Spear, M. G. (1984). Sex bias in science teachers' ratings of work and pupil characteristics. European Journal of Science Education, 6(4), 369–377. https://doi.org/10.1080/0140528840060407
Stiggins, R. J. (2001). Student-involved classroom assessment. Prentice-Hall.
Valverde, G. (2006). Los próximos pasos: ¿cómo avanzar en la evaluación de aprendizajes en América Latina?: La interpretación justificada y el uso apropiado de los resultados de las mediciones [The next steps: How to progress in learning assessments within Latin America?: The justified interpretation and appropriate use of measurement results]. In P. Arregui (Ed.), Sobre estándares y evaluaciones en América Latina [On standards and assessments in Latin America] (pp. 70–81). PREAL. https://www.academia.edu/1414613/Los_pr%C3%B3ximos_pasos_c%C3%B3mo_avanzar_en_la_evaluaci%C3%B3n_de_aprendizajes_en_Am%C3%A9rica_Latina

Carla E. Förster is a marine biologist who completed a Master's degree in Educational Evaluation and a Doctorate in Educational Sciences at the Pontificia Universidad Católica de Chile. She is a professor at the Universidad de Talca, Chile, Head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. email: [email protected]

Cristian A. Rojas-Barahona is a psychologist who earned his Ph.D. in Psychology at the University of Granada, Spain, and completed a postdoc at the Developmental Brain Behaviour Laboratory, Academic Unit of Psychology, University of Southampton, UK.


He is an Associate Professor in the Faculty of Psychology at the Universidad de Talca, Chile, and a specialist in executive functions and cognitive development.

10 A Case Analysis Methodology to Guide Decision Making in the Schooling Context

Paola Marchant-Araya and Carla E. Förster

Abstract

This chapter presents an approach to the analysis of problems in educational contexts oriented towards evidence-based decision making. Its purpose is to provide management and research teams with a tool to systematize information about an institution, programme, or project; to articulate that information with the conceptual frameworks associated with the subject; to identify critical issues through reflective analysis; and to propose solutions and actions for change. The method comprises three stages: the construction of the case, the analysis, and the improvement proposal. For each stage, its characteristics are described, together with how to carry it out and a real example. Finally, the chapter indicates the ethical considerations that must be safeguarded, the elements the report must contain, and some limitations to consider.

10.1 Introduction

The purpose of this chapter is to present a method of analysis that, applied to real contexts and following a series of stages, allows both the teaching and technical teams of school organisations and those who carry out external assessments of educational institutions to understand a particular reality holistically and from multiple perspectives, in order to propose paths of action that guide improvement and/or favour the sustainability of their decisions. Through this analysis, educational institutions, programmes, or projects can recognise and assess how their actions and decisions have shaped the reality experienced every day in the classroom, along with its causes, consequences, relationships, and tensions. The method proposes an exercise of articulation between conceptual frameworks, evidence, and critical reflection, in which the participation of the educational stakeholders is essential, since the pedagogical decisions for change will emerge collectively. This method of case analysis is flexible and fosters higher cognitive skills in the teams, such as reflective and analytical thinking and the capacities for synthesis, evaluation, and creation. It also promotes evidence-based decision-making, which is much needed in today's schools (Mendoza, 2006).

10.2 Conceptualization of the Case Analysis Methodology

10.2.1 What Is a Case?

In the literature we find different definitions of what is meant by a case, but there is consensus that it is not necessarily equivalent to an individual or a qualitative study per se (Coller, 2000; Ragin, 1992); "a good case presents an interest-provoking issue and promotes empathy with the central characters" (Boehrer & Linsky, p. 45, as cited in Center for Teaching and Learning, Stanford University, 1994, p. 1). A case corresponds to a singular unit of analysis: an organisation, programme, or educational project; the experience of a group or of a person who has lived through something singular; present or past events. A case can be an experience of organisational change, the implementation of a new policy, or an incident or problem experienced. The curriculum of an educational establishment can also be a case, as can the learning assessment methods used by the teachers of a department or cycle. In other words, a case is a real situation experienced at the micro, meso, or macro level (Mendoza, 2006; Simons, 2009; Stake, 1995). For Coller (2000, p. 29), a case is "an object of study with more or less clear boundaries that is analysed in context and considered relevant either to test, illustrate or construct a theory or a part of it, or for its intrinsic value". Stake (1995), for his part, understands the case as an integrated system: something specific, complex, and in operation; a singularity or a particularity. In other words, a phenomenon that needs to be understood in depth constitutes a case study. For Rusque and Castillo (2009, p. 53), a case "is the narration of a real or simulated situation that is presented in the context in which it occurs and is analysed on the basis of pre-established criteria".


Therefore, in this chapter we will understand a case as: the exposition of the facts, data, circumstances, opposing points of view, and context of a successful or problematic situation that an organisation, programme, or educational or social project has faced or is facing; the experience of a group of people in an institution, be they students, teachers, and/or the management team; the responses, tensions, and actions that have occurred in the situation; and the effects or implications that arise from those responses. That is, a case comprises facts of a certain complexity and in movement, given the decisions and actions that have been and are being taken in the specific context. A case is a singularity with clear limits for three reasons: because of its own nature, that is, its particularities in context; because it refers to a moment in time and a place; and because it involves a group of people who share an experience based on the relationships they have in common.

10.2.2 What Is a Case Study?

Case studies are methodological approaches that have been developed for decades in different modalities (Fig. 10.1) and with different purposes, to contribute to individual, group, organisational, social, and political knowledge related to a particular phenomenon (Mendoza, 2006; Mertens, 2015; Stake, 1995; Yin, 2009). There are case studies in the context of research (Covarrubias-Papahiu, 2016; Guizardi & Garcés, 2014; Rambla, 2013); as a teaching and learning methodology in school and university classrooms (Brooke, 2006; Coll et al., 2006; Davis & Wilcock, 2003; Golich et al., 2000; Instituto Tecnológico y de Estudios Superiores de Monterrey, n.d.; Revel, 2013; Servicio de Innovación Educativa, 2008); in the field of Law (Goyas & Monzón, 2016); in business and business management (Ellet, 2007; Farhoomand, 2004; Richardson, 2013; Rivera, 2000); in Medicine (Edelstein et al., 2014; Herreid, 1997); and in the development of organisational studies and the analysis of public policies (Mendoza, 2006). The origins of case studies lie in Anthropology, Sociology, Social Work, Psychology, History, and the fields of Law and Medicine. They have also undergone great development in Education and educational evaluation, in the context of curricular innovations, programmes, and projects, as a means to evaluate their effectiveness (Coller, 2000; Simons, 2009; Yin, 2009). Figure 10.1 presents the three contexts that group case studies in the educational field.

Fig. 10.1 Schematic characterizing the variety of case studies

In the context of research, case studies emerged in the mid-nineteenth century in Europe, with the purpose of learning about the living conditions of groups in vulnerable contexts. From this knowledge, the aim was to gather evidence that, complementing data from surveys and measurements, would serve as a basis for social change (Coller, 2000; Ottenberger, 2000; Yin, 2009). These studies used


in-depth interviews, observation, and fieldwork as the main techniques for collecting information. Later, at the beginning of the twentieth century, sociologists of the Chicago School developed case studies to address issues such as crime and migrant groups, expanding the range of information sources used in the construction of case studies and consolidating the case study as a research method that endures to this day (Coller, 2000; Mertens, 2015; Stake, 1995; Yin, 2009). The case study is a holistic and complex approach to the singular, the particular, the unique in the context of reality (Stake, 1995). Its purpose is to construct, describe, explain, and communicate the history of a case that is the object of research. It is an exercise in documenting and interpreting an experience lived by an institution, programme, project, system, group, or individual, and it provides information for decision-making (Ottenberger, 2000; Simons, 2009; Yin, 2009). It analyses from different perspectives how phenomena or situations occur naturally, without the intervention of a third party; that is, events occur in action and movement, in a real context (Simons, 2009). The interpretation of the facts is always in context and, given its methodological characteristics and purposes, a case study does not seek statistical generalisation. In research, case studies are not exclusive to qualitative approaches; there are also case studies that involve quantitative or mixed approaches. In all of them, what is fundamental is the question we seek to answer and whether the focus is contemporary or not (Coller, 2000; Stake, 1995; Yin, 2009). Regardless of the above, case study research needs to meet criteria of rigour and precision in the handling of the empirical data used (Yin, 2009). Misinterpretations and misrepresentations should therefore be avoided, so as to ensure the validity of the results and conclusions. For this, the triangulation technique¹ and the public discussion technique are the most recommended (Coller, 2000; Stake, 1995). Both help to verify and contrast the information gathered from different sources (informants and documents).

¹ The triangulation technique is explained in Stake (1995, pp. 107–116).

In the context of teaching and learning, case studies date back to 1880 at Harvard Law School. This type of study, also called "case analysis", has been used in teaching to develop students' ability to think and learn by themselves; to promote metacognition and self-reflection, analysis, synthesis, abstraction, reasoning, and teamwork; and to develop a multidimensional view of reality and promote active learning (Davis & Wilcock, 2003; Flyvbjerg, 2004; Instituto Tecnológico y de Estudios Superiores de Monterrey, n.d.; Mendoza, 2006). With a focus on the achievement of specific learning, case analysis, or the teaching case, has been used to analyse facts, problems, or situations that occur in reality or that have been constructed fictitiously, with the purpose of having students discuss and debate them (Lane, 2007), recognise their causes and effects, apply to their own lives the skills and knowledge that arise from group reflection, and propose relevant solutions (Mendoza, 2006; Yin, 2009). Given the group dynamics of the analysis, no single solution is expected; rather, emphasis is placed on the students' ability to decompose a situation into its multiple parts in order to then organise them and shape a new understanding of the given situation, that is, to achieve learning and the development of complex cognitive skills (Flyvbjerg, 2004; Morra & Friedlander, 1999).

It should be noted that in a qualitative case study approach, validity and probative character do not depend on statistical representativeness, but on the case's own reality and authenticity rather than its frequency; that is, the aim is the analytical generalisation and particularisation of the case (Coller, 2000; Flyvbjerg, 2004; Stake, 1995; Yin, 2009). In this regard, Stake (1995, p. 85) states that the generalisations made in case studies, also called "naturalistic generalisations", are "conclusions arrived at through personal engagement in life's affairs or by vicarious experience so well constructed that the person feels as if it happened to themselves".

In the context of organisations, we can find different types of case studies. First and foremost, and the most classic, are programme evaluations, which, according to Stake (1995), are themselves case studies, since they address a particularity in a real context with a critical eye and a variety of sources. For Morra and Friedlander (1999), evaluations with case studies emphasise both the implementation of the programme and its effects, promoting participation and collaboration, with the purpose of strengthening identification with an improvement project. Another type of case study is "Data Wise" (Bambrick-Santoyo, 2010; Boudett et al., 2005; Ebbeler et al., 2017; Parker-Boudett et al., 2006). This type of analysis seeks to encourage the constructive and regular use of assessment data by school leaders and teachers, to support decision-making and the ongoing improvement of teaching and learning. Both the assessment and the methodology of the wise use of data seek to involve the school in the processes of


analysis and generation of results, which allow the community to learn and make the decisions necessary to continue its development and promote student learning. As Santos-Guerra (2001) has argued, a learning school does not consider its members as a sum of individualities with specific tasks, but as a learning community working on a common project. It is in this context that the case analysis methodology for decision-making presented in this chapter arises as a real possibility for the community as a whole to analyse the history of its actions, its results, achievements, failures, critical aspects, and tensions, in order to identify a problem and plan lines of action to address and improve it. There is thus an opportunity for the education community to work together and reflect effectively, overcoming the routinization of practices, excessive centralisation, fear of supervision, teacher demotivation, the absence of self-criticism, and the rejection of criticism (Santos-Guerra, 2001).

The case analysis methodology for educational decision-making that we present will be understood as: a type of case study that seeks, through a series of stages, the collective reflection of educational teams and the use of mixed methodological tools, to analyse the particularities of a real case, with its causes, consequences or effects, and singularities, based on existing and/or constructed knowledge and evidence, in order to understand the case in depth, learn from it, and systematically propose strategies that allow for the improvement and/or sustainability of the actions undertaken within the school context.

Analysing a case in its context involves a holistic observation exercise to understand it as a whole and not as a sum of its parts (Ragin, 1992). For Schramm (1971, as cited in Yin, 2009), a case study attempts to illuminate a decision or a set of decisions: why were they made, how were they implemented, and with what result? Case analysis is an opportunity for the development of communication skills, teamwork, problem solving, decision-making, and assessment (Davis & Wilcock, 2003; Rusque & Castillo, 2009). It stimulates critical reflection in the teams and self-reflection among the participants. The case analysis methodology makes it possible to detect particular needs of school organisations and to make links between reality and the conceptual frameworks or knowledge possessed by those who participate, moving towards the generation of new knowledge and the collective learning that arises in the analysis process itself. It also involves the stakeholders in defining actions for improvement and in solving problems based on the evidence collected and analysed.

The focus of this analysis is on answering two key questions, How? and Why?, rather than How many? and How often? This is because the spirit of the analysis is to approach the case from a comprehensive and interpretative perspective, which emphasises the experience of the subjects and the in-depth exploration of that experience. It considers the shared history and the patterns of continuity and change in the collective experience. This obviously does not rule out the use of quantitative


data or measurement results; on the contrary, these are valued as evidence that enriches the understanding of a particular phenomenon or reality.

10.2.3 Responsibilities of the Participants in the Case Analysis

The participation of the different stakeholders (external analyst, internal analyst, education community) should be promoted during each of the stages of the case analysis. We are not referring only to participation in the provision of information, which is the norm, but to effective participation in the sense of taking part in the analysis, planning, and implementation of subsequent actions. Stimulating the self-knowledge and self-reflection of the participating individuals and teams will favour not only the understanding of the case but also the commitment to the sustainability of the actions resulting from the analysis.

The case analysis can be guided or directed by either an internal or an external analyst. The internal analyst is a member of the community who, whether assigned this responsibility by a superior or granted it by a collective, is responsible for directing the analysis and the proposal for improvement. This role involves great challenges, since the internal analyst must manage the different interests and positions that exist within the organisation, maintaining integrity and ensuring compliance with ethical principles such as the dignity of people, justice, and equity, as well as the values of trust and respect. The internal analyst should guide each stage of the analysis, gathering all the necessary evidence from different sources to build the case study and then stimulating critical reflection by the respective group or team. Assuming the role of internal analyst has the advantage of closeness to colleagues' needs, backgrounds, and trust, but it also presents difficulties: the analyst is personally involved in the experiences shared in the organisation, and the other members know his or her position and ideas and may therefore question the objectivity or impartiality of the analysis. For this reason, internal analysts must remain vigilant about their own beliefs, preconceptions, affiliations, and commitments, so as to take distance and contribute effectively to decision making. In this sense, it is recommended to have a small, heterogeneous support team with whom to discuss the development of the analysis, or the support of key stakeholders, specialists in the core theme of the analysis, who can be asked for an external opinion once the case has been built and at stages prior to the analysis.

The external analyst is a professional specialist in the case analysis methodology for decision-making who has been asked to address a problem or issue of interest to the organisation, management teams, or teachers, among others. The analyst and his or her team will be in charge of directing both the analysis process and the design of the improvement proposal, so it is necessary to gather, in a first exploratory interview, the aspects that will illuminate the search for evidence from different sources.


Assuming the role of external analyst has the advantage of approaching the case with impartiality, collaborating with the people in the organisation to understand the situation and to discover the central focus of the case beyond the symptoms or secondary aspects, which are usually the ones most clearly visible. However, the difficulty of this role is that the analyst starts without the organisation's trust and without knowledge of its problems and situations, so the challenge is to build trust and openness while respecting the organisation's internal dynamics and relational codes without imposing one's own. In addition, it is the role of the external analyst to promote learning and the creation of internal capacities in the teams and stakeholders of the establishments, programmes, or projects, so as to avoid the threat of dependence and of regression to previous stages once the analyst leaves the organisation.

10.3 Organisation and Structure of the Case Analysis Method for Decision Making

10.3.1 Relevance of the Case

Before starting a case analysis, it should be made explicit, in discussion with the members of the institution and in written form, why the analysis is being carried out and what we want to know or change. The need or concern to which the analysis responds, that is, the horizon in which it is carried out, must be determined. It is then necessary to be certain that we are dealing with a case, in the terms previously described, and that the following aspects, which denote its relevance for analysis, are met:

(1) It is a singular case, a particularity that must be understood in its totality and complexity, in a systematic way, through motivated sampling (Coller, 2000; Stake, 1995).
(2) The interest of the analysis does not rest with a particular subject, be it a researcher or a member of the organisation, programme, or project; the case in itself constitutes a phenomenon of interest because of its scope, the stakeholders involved, its effects or consequences, and its exemplary nature or typicality for other cases.
(3) From the analysis, decisions for improvement or change can be made and implemented in the short or medium term.

If the case already exists, certain baseline conditions should be kept in mind before implementing the case analysis methodology, such as:

• A management team with willingness and commitment.
• Access to information from different sources, available for analysis.
• Active stakeholders participating in the processes of analysis and improvement actions.


• Analysts who know the methodology and the critical issues addressed in the analysis, so that the discussion goes beyond common sense or participants' everyday experiences.

With these conditions in place, it is possible to move towards the construction of the case itself, following the stages below.

10.3.2 First Stage: Building the Case

10.3.2.1 Contextualisation

The first aspect to describe is the contextualisation, which characterises the environment in which the case to be analysed is embedded. The context includes information such as: type of institution, geographic location, resources, number of students and teachers, infrastructure, levels of learning achievement in standardised measurements, internal and/or external assessments, school climate, organisational structure, and history, among other aspects. These are aspects that allow the reader to situate the case but are external to it. The description also covers the social, economic, and political context of the moment, both in the organisation and in the country, if applicable, as well as the characteristics of the neighbourhood or locality in which the case unfolds. The information is presented from the general to the specific, leaving for the end of the context what can be directly related to the case, in order to connect both parts: context and case. An example of context information is presented below:

The case study takes place in a full-day elementary school, which serves 359 students from kindergarten through 8th grade. The school is located in the commune of Valparaíso and is part of an educational foundation. This foundation also has two full-day schools located in other municipalities in the region. The socio-economic composition of the school's students is primarily lower middle class, with a vulnerability index of over 83.8%. In addition, the school is located in a commune with a high crime rate. The school receives funding under the Preferential School Subsidy (Subvención Escolar Preferencial, SEP by its acronym in Spanish) law, which classifies it as an emerging school because it has not shown "systematically good educational results" according to the instruments designed by the Ministry of Education (Holz, 2019, p. 3). Furthermore, standardised evaluations indicate that the school's students have very low results compared to students from a similar Socioeconomic Group (Grupo Socioeconómico, GSE by its acronym in Spanish).

Contextual information can be obtained from a preliminary interview with stakeholders in the school organisation, in which the need, interest, and motivation to


carry out the case analysis are explained. General background information, reports, or documents can also be requested, allowing the analyst to become involved little by little in the shared experiences. The context can be complemented and enriched once all the information on the case is available.

10.3.2.2 Information Gathering Process

In order to build a case and describe it in its entirety, it is necessary to plan how the information will be collected. As a starting point, the need that led to the case analysis and the core theme or topic involved in that need must be recognised. Subsequently, the dimensions and the sub-dimensions or criteria that characterise each dimension are defined, which makes it possible to organise the collection in an exploratory manner and prevent potentially relevant information from being left out. The disaggregation of a dimension into sub-dimensions or criteria will depend on the complexity of the aspects to be addressed; some dimensions may be disaggregated and others not. The sources of information and the instruments and/or procedures to be used to collect the data are then defined. In the construction of a case, various quantitative and qualitative tools and techniques are used, such as individual and group interviews, questionnaires, and observation record guidelines, among others. The more varied the evidence collected, the greater the wealth of information available. In case analyses, the most appropriate techniques are direct observation with its guidelines or recording procedures, interviews, document analysis, and questionnaires or rating scales.

If you want to carry out a case analysis in a school where there is a concern about the assessment of student learning and its measurement results, you should plan to collect the information in a matrix such as the following (Table 10.1). Once the collection of information has been planned, the instruments can be designed and applied. In this process, the quality criteria specific to instrument construction, which are described in the specialised literature,² must be respected. Just as important is to submit both the data collection matrix and the constructed tools, before they are applied, to the judgement of an expert (a professional specialist in the thematic area and in its measurement or assessment), in order to adjust whatever is necessary to ensure that the information is collected at an optimal level of quality. When applying these instruments or procedures, it is important to consider ethical aspects that safeguard the confidentiality of the information, taking care to "do no harm" to the person providing it, both in the interview process itself and in the subsequent handling of the information.

² There is a variety of educational research manuals that describe the elements to be considered in the construction of data collection instruments or techniques. This information can also be found in social research methodology texts, such as Hernández et al. (2006), Mella (2003), and Corbetta (2003).

Table 10.1 Information collection matrix (adapted from Espinoza, 2013)

Purpose or need: analysing the school's assessment practices

Dimension 1: Planning of the assessment strategy
Operational definition (how the dimension is understood): the definition of what is being evaluated, the purpose of the assessment process, the moments, who participates in the assessment, and the situations and instruments selected.
Subdimensions and criteria that characterize the dimension:
• 1.1 Learning objectives: 1.1.1 Specification of expected learning: conceptual, procedural, and/or attitudinal knowledge; 1.1.2 Declaration of assessment indicators; 1.1.3 Consistency between learning objectives and indicators.
• 1.2 Purpose of the assessment: 1.2.1 Presence in the didactic unit of the initial formative, process formative, and summative purposes.
• 1.3 Assessment agents: 1.3.1 Type of agents participating in the assessment according to its purpose; 1.3.2 Reasons for the selection of assessment agents.
• 1.4 Situations and assessment instruments: 1.4.1 Types of situations and assessment tools used; 1.4.2 Frequency in the use of situations and tools; 1.4.3 Relevance of the situations and instruments to the learning objectives and indicators to be evaluated.
Information sources: teachers; unit planning; assessment regulation or similar.
Information collection techniques or procedures: individual and/or group interviews; questionnaires; documentary evidence information matrix.
Information analysis techniques: content analysis; descriptive statistical analysis; documentary analysis.

Dimension 2: Implementation of the assessment
Operational definition: the set of actions carried out throughout the teaching/learning process to collect evidence of the level of progress and achievement of student learning.
Subdimensions and criteria that characterize the dimension:
• 2.1 Design of assessment situations and tools: 2.1.1 Presence of planning of the assessment situation and/or tool; 2.1.2 Internal consistency in the planning of situations and/or tools; 2.1.3 Presence of procedures associated with the design.
• 2.2 Application of situations and tools: 2.2.1 Existence of administration protocols; 2.2.2 Clarity of administration procedures and/or protocols.
Information sources: teachers; head of the pedagogical technical unit (UTP by its acronym in Spanish) or similar; specification tables; situations and tools; protocols or application procedures; assessment regulation.
Information collection techniques or procedures: individual and/or group interviews; questionnaires; documentary evidence information matrix.
Information analysis techniques: content analysis; descriptive statistical analysis; documentary analysis.

Dimension 3: Assessment decisions for the achievement and improvement of learning
Operational definition: the interpretation, judgment, and decision making about student learning.
Subdimensions and criteria that characterize the dimension:
• 3.1 Interpretation and judgment about the results of the assessments: 3.1.1 Types of results analysis; 3.1.2 Types of judgments (qualitative-quantitative); 3.1.3 Reasons for the choice of analyses and type of judgments made.
• 3.2 Communication of assessment results: 3.2.1 Means for communicating results; 3.2.2 Determination of informed educational stakeholders.
• 3.3 Pedagogical decision making: 3.3.1 Decisions focused on pedagogical matters; 3.3.2 Decisions focused on administrative matters.
Information sources: teachers; correction guidelines; reviewed papers; feedback examples; assessment regulation.
Information collection techniques or procedures: individual and/or group interviews; questionnaires; documentary evidence information matrix.
Information analysis techniques: content analysis; descriptive statistical analysis; documentary analysis.


At the end of the data collection process, the data are analysed by means of statistical procedures or qualitative techniques that make it possible to systematise and organise them appropriately. For this purpose, it is suggested to involve specialists in this type of analysis if the teams do not have these capacities in-house.³ We are now in a position to begin configuring the case.

10.3.2.3 Case Setting or Framing

The configuration of the case itself is a process that seeks to determine the structure and underlying meaning of the constituent elements of the case, following the logic of a historical narrative, not in the chronological sense but in the sense of an order and sequence of events that tell a coherent story (Simons, 2009). As noted above, it has a holistic sense that incorporates multiple sources in a real context. In the explanation of the case, the opposing points of view, the dynamics of change, and the successes and failures are made visible, written in accessible language so that everyone can understand what is being discussed and the subsequent analysis is facilitated. Include observations, incidents, and scenarios that make it easy for the reader to become involved in the situation (Farhoomand, 2004). It is advisable to include in the narrative textual quotations from the voices of the participants, that is, the words of the people themselves, indicating a generic characteristic that represents them, such as teacher, student, or director, without giving the name of the person speaking. Documentary information of an official nature from the establishments or programmes, duly cited, can also be considered. The evidence included in this section should be relevant and focused, meeting criteria such as:

• Descriptive and explanatory character, without interpretation of the facts: the objective facts are described as they happened, explaining from the evidence obtained the decisions and actions taken, the reasons that people give, and their perceptions, trust, and distrust, as they are expressed.
• Valid and reliable evidence: the evidence obtained is free of bias or error; it refers truthfully to what happened and is not due to chance. It can be verified in different ways.
• No common-sense judgments or opinions: the information is supported by evidence, not just one person's unsubstantiated opinion.
• A logical thread is followed, allowing a line of argument to be sustained.
• The information is sufficient for analysis.

An example of a case narrative is presented below:

³ For the analyses, it is recommended to use the texts by Hernández et al. (2006) and/or Ritchey (2007).


This year, for the first time in 8 years, the school obtained good results in the International English Test, which proves that the school does indeed educate bilingual students. Parents and guardians had made many complaints, and enrolment had been declining until now. After the results were announced, demand for the following year has already risen, and when parents were interviewed, one of the reasons that emerged was this improvement. The management, assuming that the success is due to the work of one teacher, named her the best teacher in the school (an award given annually) and proposed that she be a role model for her colleagues, especially in the area of assessment.

Among the teacher's notable practices is that she aligns her assessments with the activities they do in class. She notes that the assessments used are similar to the activities that appear in the lesson plans. Although she still uses multiple-choice questions, she also incorporates many authentic activities to assess students, such as grading matrices or guides (checklists, rating scales, holistic and analytical rubrics) to grade students' performances and give them feedback, though she points out that this is very time-consuming and has to be done at home. She also uses these same guides for self-assessment and co-assessment. The teacher continuously assesses her students through informal questions in the classroom and by observing them as they work in class; she designs her own assessment tools and adopted grading matrices (rubrics) that she found on the internet to grade and interpret the results of the assessments and to give feedback to the students. As these instruments come from the "educamejor" website, she has complete confidence in their validity. She emphasises that the rubrics are in English because she works in a bilingual school, and although the students' level of English is not very high, they must learn to read them; that is the school's regulation. One aspect the teacher mentions having incorporated more and more strongly is the use of assessment criteria or achievement indicators; in this regard she says: "my students clearly know the learning objectives and the expectations they have. This helps them to guide their learning process".

Although school officials are enthusiastic about the opportunity presented by the teacher's achievements and vision, the recognition has brought her some problems with her colleagues, who find her too structured and individualistic in her approach. But since her students are happy, she does not mind those comments. After a few weeks, the school's academic coordinator begins a process of classroom observation and visits the model teacher's room. As a result, she informs her of some elements that stand out and others that could be improved. Among those that stand out are the use of didactic materials to encourage student participation and the closing of the class. The lack of permanent monitoring of the students is presented as a weakness; the coordinator points


out that it is not very evident and that she should try to achieve it through more structured and systematised formative assessment activities.

In a meeting with all teachers of the first and second basic cycles, a synthesis document was presented, based on the systematised teaching experience of the model teacher. Some consider it positive, although they would have liked a different form of feedback on their own work: "for sure now the assessment will be in line with what they say". Others believe it is not appropriate to give them "models" when they have had good Education Quality Measurement System (Sistema de Medición de la Calidad de la Educación, SIMCE by its acronym in Spanish) results for years and parents do not complain about what their children learn. There are doubts about how and by whom the teachers will be assessed, and they decide to hold a meeting outside the school to agree on a strategy for confronting the innovations and the procedure being installed, especially since there are no clear criteria for classroom observation, nor clarity about the consequences that an unfavourable assessment would have.

At the end of the case setup, it is suggested to answer the following question: Does the information presented allow for a complete understanding of the situation, without leaving ambiguous aspects or gaps in the evidence? If the answer is yes, you are ready to move on to the next stage.

10.3.3 Second Stage: Case Analysis

In order to develop the analysis, a series of personal and collective skills must be brought into play by the teams, such as the capacity for reflection, abstraction, synthesis, self-criticism, self-monitoring, and critical thinking. This stage considers four components that contribute to the analytical decomposition of the case:

(1) SWOT analysis.
(2) Questions to guide the discussion.
(3) Discussion of the evidence in relation to the conceptual aspects or theoretical references relevant to the subject of the analysis.
(4) Definition of the problematic core or latent conflict.

Each of these aspects is described below with examples:


(a) SWOT analysis: this analytical resource makes it possible to describe the internal and external situation of a case. Those who have used this methodology recognise it as positive, since it forces team participants to focus not only on weaknesses but also on strengths and opportunities, which always exist in any educational reality.

• Strengths (S) are positive aspects or facts: advantages that take the form of resources, abilities, and even attitudes of the team participants. The formulation of strengths must meet the following quality criteria:
1. They are based on the information of the case, not inferred.
2. Why a fact constitutes a strength is made explicit; merely naming it is not enough, because what is positive for one person can be a weakness for another. When constructing a strength, it is necessary to state the fact extracted from the case and justify why that specific fact constitutes a strength.
3. There must be coherence between the fact that is raised and its justification; that is, it cannot be justified by raising another strength or by reflecting on it.

Example of strength

There are various assessment tools that make it possible to collect different kinds of evidence of student learning, thus favouring the objectivity of the judgments that teachers make about the levels of achievement attained.

It is key at this point that the elements selected are actually occurring in the real context of the case.

• Opportunities (O) are favourable aspects, both internal to the case and in its environment or context, that could be taken advantage of to maintain or improve the conditions of the case. They are events or scenarios that, if present in the future, would be positive for the case in question. The formulation of opportunities must meet the following quality criteria:
1. They arise from the information of the case, but making an inference towards the future.
2. Why they constitute an opportunity is made explicit; merely naming the fact is not a benefit in itself. The idea should be developed, justifying why that specific fact constitutes an opportunity in that context and at that time. It must be evident that it has not already been considered a positive aspect of the problem, because that would be a strength.
3. There must be coherence between the fact that is raised and its justification; that is, it cannot be justified by raising another opportunity or by reflecting on it.


Example of opportunity

The existence at the present time of new initiatives carried out by the municipality (the school's supporter) regarding the assessment of learning may constitute an opportunity to obtain resources through projects aimed at supporting teachers in assessment. The experience the school has acquired in the use of standardised test data is an opportunity to develop effective results management, allowing the design and implementation of coordinated actions aimed at a qualitative improvement in the learning that students are able to achieve.

• Weaknesses (W) are "negative" aspects, areas, or weak points that interfere with the achievement of the objectives of the organisation or of the people involved. They are barriers that prevent progress or improvement. The formulation of weaknesses must meet the following quality criteria:
1. They are drawn from the case information, not inferred.
2. Why a fact constitutes a weakness is made explicit; merely naming it is not enough, because what may be a weakness for one person can be a threat or a strength for another. When constructing a weakness, it is necessary to state the fact extracted from the case and justify why that specific fact constitutes a negative or detrimental aspect.
3. There must be coherence between the fact that is raised and its justification; that is, it cannot be justified by raising another weakness or by reflecting on it.

Example of weakness

Teachers have a punitive view of assessment, which generates a climate of fear among students, affecting their desire to attend classes and to learn.

It is key at this point that the elements selected are actually occurring in the real context of the case.

• Threats (T) correspond to unfavourable aspects generated around the case that could threaten the situation, put it at risk, or aggravate its problems. Given this, it is necessary to diminish them or make them disappear, as with the weaknesses. The formulation of threats must meet the following quality criteria:
1. They arise from the information in the case, reasonably inferring their consequences. That is, they are not invented, but arise from the known evidence.
2. Why they constitute a threat is made explicit; merely naming the fact does not constitute a harm in itself. It is necessary to develop the idea


and justify why that specific fact is a threat in the context of the case. It should be evident that it has not already been considered a negative aspect of the problem, because that would be a weakness.
3. There must be coherence between the fact that is posed and its justification; that is, it cannot be justified by posing another threat or by reflecting on it.

Example of a threat

If the school continues to use assessment results as a measure of comparison and competition among students, the quality of student learning will be put at risk, deteriorating the formative process.

(b) Questions to guide the discussion: once the SWOT has been completed, a series of questions are posed to guide the subsequent discussion. They serve as a means to generate critical questioning of, and reflection on, the evidence previously collected. To this end, the questions should make it possible to move from the concrete and evident to the latent and abstract, which is not always described but is discovered by the analyst and his or her team, given their experience and theoretical command. Such questions must meet the following criteria:

• The answer to the question involves a complex elaboration; it cannot be answered with a simple Yes or No.
• They should be answerable with the information presented in the case.
• As written, they should stimulate critical reflection and not merely describe the case; they should create productive tension for the analysis.

Without seeking to mechanise their construction, but only to guide it, it is recommended to build the questions around What?, How?, and Why?, placing greater emphasis on explaining rather than describing the phenomena (Yin, 2009), as in the following examples:

(1) How are teachers' decisions regarding the assessment tools selected and applied in the classroom justified?
(2) How does the planning of the assessment strategy relate to its implementation by the teacher, given the context of the children in the class?
(3) Why can assessment decisions made by a principal, despite being focused on results and not considering processes, enhance student achievement and generate teacher and student satisfaction?
(4) What aspects of the school's culture could have favoured the assessment change installed by the school's management and slowed down the resistance inherent in a process of change?


(5) To what extent do the decisions made by the teacher reflect a planned learning assessment strategy, or are they just a sequence of activities to be carried out with the children at this level?
(6) How do we explain the effects that the standardised assessment implemented in the school has had on the school culture and on the dynamics of strategic decisions regarding the development of student learning?
(7) Why does the recognition by an educational institution of a teacher as a "model in evaluation" generate tensions among her colleagues and resistance to being evaluated?

(c) Analysis itself, or discussion: analysis is a higher-order cognitive skill that consists of breaking down an object of study in search of an understanding or explanation of it in its particular context. In this process, two higher cognitive skills come into play with particular emphasis: analysing and evaluating. Recalling Anderson et al. (2001), the ability to analyse involves breaking down a phenomenon or situation into its parts and determining how they relate to each other. This in turn involves differentiating the elements involved in the phenomenon under different categories, for example, the relevant from the less relevant, the characteristic from the uncharacteristic, the frequent from the infrequent, the critical from the non-critical. It then involves organising, that is, re-structuring the facts described into a new explanation of the relationships that occur in the phenomenon or situation, taking into account the tensions or opposing points of view. And finally attributing, that is, determining what is essential or the focus of the issue or situation analysed. Likewise, in a case analysis process the information obtained is evaluated; that is, evidence-based judgment is made, determining a problem to be addressed or the real effects, consequences, or results of the decisions taken.

This is the critical moment of the methodology; all the previous activities and stages were developed for this point, in which the evidence collected, the guiding questions, the SWOT, and the conceptual and theoretical aspects are put in relation so that, through critical reflection, the elements of the case are discussed. The tensions, conflicts, decisions, and results of success and failure, among others already mentioned, should all appear in the narrative. Theory plays an important role in this phase of the analysis, as in the previous ones, so that each phenomenon is understood in its complexity. It is necessary to understand how the subject of analysis has been approached in the literature, its characteristics, its components, how the phenomenon has been explained in the educational field, and its conceptual relations. The analysis should be written as a descriptive, integrated text, substantiated and supported by theoretical references that ensure the discussion goes beyond one's own experience, the specific situation, or mere unfounded opinion or judgement. It is suggested to describe the case by answering the questions posed previously, gradually integrating the SWOT aspects into the response. To support the analysis, explanations, relationships, and judgments should be grounded in the literature.


An example of a discussion text is presented below. Let us return to question 7 from the sample questions: Why does the recognition by an educational institution of a teacher as a "model in evaluation" generate tensions among her colleagues and resistance to being evaluated?

Analysis itself, or discussion: From the information and evidence of the case presented, one can observe the emphasis that the school organisation, through its directors, has placed on results rather than on the students' learning process (Santos-Guerra, 2001), and how this emphasis is also connected to the demands of parents, who expect results in order to validate the quality of the school. The results of the Education Quality Measurement System (Sistema de Medición de la Calidad de la Educación, SIMCE by its acronym in Spanish) test and the International English Test to which the school was subjected are proof of this. There are suggestions of a concern for assessment, follow-up, and monitoring of learning, based, for example, on the observation by the academic coordinator; however, the pressure from the directors on the teachers concerns the children's results. There is a lack of in-depth analysis of the learning achieved by students to guide decision-making, and the responsibility for modelling is placed on one teacher of the team, running the risk of missing a broader view of the students' learning processes (Brown, 2004; Black & Wiliam, 2009).

Presenting a teacher as a model of teaching and assessment in the context of the organisation is seen as confusing and worrying by the other teachers, who are unaware of the criteria and antecedents under which this recognition was attributed to her. The only evidence that seems visible is the results of one evaluation. According to Taras (2002), assessment is recognised as one of the key elements of the teaching/learning process, due to the volume of information it provides to the teacher and the consequences it has for the teacher, the students, the education system in which it is integrated, and society. However, assessment is not the only relevant factor; curricular management, educational leadership, and climate are also relevant to ensuring the success of the school's purposes. Therefore, schools are expected to develop systemic strategies to face the challenges of ongoing improvement, rather than relying on models that, if they do not prove paradigmatic, can provoke feelings of anger, frustration, competition, and envy, among others, all of which threaten the willingness of teachers to become involved in improvement processes.

(d) Definition of the problematic core or latent conflict


At the end of the analysis, the problematic focus of the case should be described: what needs to be addressed, improved, repaired, or solved, or, on the contrary, the main explanation for its success. The problematic focus does not refer to obvious aspects already described in the case; it is not a revealed weakness or a sequence of facts, but rather that which is latent in the case and constitutes its main explanation. This explanation reflects the view of the expert, who integrates his or her experience, the evidence, and conceptual command into the definition. The focus should also be accompanied by a precise description of the stakeholders involved and the consequences caused by the problem detected. The wording of the problem is presented separately from, and after, the analysis. The latent problem or central focus corresponding to the example presented is:

The central focus of the discussion, and the problem to be addressed at the school, relates to a restricted view of assessment and its effects on the school held by directors and some teachers, centred on certification and the summative dimension rather than on a vision of assessment as a process of a formative nature. This restricted conception permeates management decisions and proposals for action, with negative effects on the climate and on the disposition of the teaching teams.

10.3.4 Third Stage: Proposal for Improvement

The improvement proposal is an alternative presented by the analyst to address the issue of the case or to ensure the sustainability of its positive effects. It is translated into a matrix with different components that guide the reader through the proposal’s course of action. Organisations value an analysis process that does not merely pose a problem but is oriented towards future decision making, with concrete and organised actions to ensure its implementation. In this construction process, two cognitive skills of importance for the work of the teams come into play: evaluating and creating. When evaluating, teams, together with the advising specialist (internal or external), determine the best pathways to address the solution of a problem, or to sustain an improvement over time, based on evidence from the case analysis. Creating comes into play when designing and planning the chosen pathway, ensuring that it serves the purpose for which it was intended (Anderson et al., 2001). The proposal must meet the following quality criteria:


(1) Feasibility: it can be carried out in the given establishment or classroom.
(2) Validity: it corresponds to the core issue of the analysis and not to a collateral or non-existent one.
(3) Novelty: it has not been implemented, partially or totally, before. It is therefore necessary to investigate the history of the establishment and to design the proposal for this case, in its particular context.

The components of a quality improvement proposal, and the questions each seeks to answer, are:

(a) Purpose: What does the proposal seek to solve? What are its objectives?
• This section establishes what is sought as a result of the proposal; in other words, the objectives of the proposal are set out here.
• The objectives guide the actions to be taken to achieve the purpose.

(b) Activities: What activities make it possible to achieve the objectives set, and does their level of complexity make it possible to achieve the desired behaviour?
• They must be in line with the objectives set.
• There can be more than one action per objective.
• They must be feasible to carry out in the specific context of the case.

(c) Means of verification: What evidence can attest to the implementation of the planned actions?
• These have to do with the fidelity of the implementation of the proposal.
• They are concrete pieces of evidence that show that the activity was carried out, and the quality of that realisation.

(d) Resources: What material means are required for the development of the activities?
• A list of the resources required to carry out each proposed activity.

(e) Responsible: Which stakeholders involved in the problematic situation lead the implementation of the activity?
• For each action, it is essential to define a person who will be in charge of managing it and responsible for its implementation.

(f) Indicators: How can we see that the defined purpose is achieved?
• Measurable or observable aspects that will be taken into account to assess the level of achievement of the proposed actions; that is, outcome indicators.
• Each indicator (there may be more than one per action) sets out in concrete form some behaviour or aspect to be observed.
• They are directly related to the stated purpose of the component under study.


Table 10.2 Presentation format for an improvement proposal

Purposes/objectives: No more than three objectives are recommended, since the proposal must be limited in order to be feasible.
Activities: One or more activities for each objective.
Means of verification: Concrete evidence that the activities were carried out.
Resources: The resources needed to achieve each objective.
Responsible: Specific people in charge of achieving the objectives; this does not refer to the highest authority of the establishment.
Indicators⁴: A breakdown into specific goals or purposes associated with each objective.

⁴ A good text for learning how to construct indicators is Martinic (1997).

It is important that all these aspects are consistent with each other and respond to the core problem or issue of the case. The improvement proposal is presented in a table format in order to ensure coherence between each of its aspects (Table 10.2). The participation of the members of the organisation in the design of the proposal is very important to secure the commitment and adherence of the entire community, or at least of its majority, during implementation. The worst thing that could happen is that, after all the effort of analysis and design, the proposal remains on paper or on the desk of a manager of the establishment.
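As a minimal illustration only, and not part of the methodology itself, the matrix in Table 10.2 can be represented as a simple data structure so that a team can mechanically check the coherence required between its components: every objective has at least one activity, and every activity has means of verification, a responsible person, and outcome indicators. All class and field names below are hypothetical sketches, not prescribed by the authors.

```python
# A minimal sketch, assuming a team records the Table 10.2 matrix digitally;
# names are hypothetical illustrations, not the authors' terminology.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    description: str
    means_of_verification: List[str]  # concrete evidence of implementation
    resources: List[str]              # material means required
    responsible: str                  # a specific person, not the top authority
    indicators: List[str]             # observable/measurable outcome indicators


@dataclass
class Objective:
    purpose: str
    activities: List[Activity] = field(default_factory=list)


def check_coherence(objectives: List[Objective]) -> List[str]:
    """Flag gaps that would break the coherence Table 10.2 is meant to ensure."""
    issues: List[str] = []
    if len(objectives) > 3:
        issues.append("More than three objectives: the proposal may not be feasible.")
    for obj in objectives:
        if not obj.activities:
            issues.append(f"Objective '{obj.purpose}' has no activities.")
        for act in obj.activities:
            if not act.means_of_verification:
                issues.append(f"Activity '{act.description}' lacks means of verification.")
            if not act.responsible:
                issues.append(f"Activity '{act.description}' has no responsible person.")
            if not act.indicators:
                issues.append(f"Activity '{act.description}' lacks outcome indicators.")
    return issues


if __name__ == "__main__":
    proposal = [
        Objective(
            purpose="Broaden the formative dimension of classroom assessment",
            activities=[
                Activity(
                    description="Monthly workshop on formative feedback",
                    means_of_verification=["attendance lists", "workshop materials"],
                    resources=["meeting room", "printed guides"],
                    responsible="Academic coordinator",
                    indicators=["80% of teachers apply feedback routines"],
                )
            ],
        )
    ]
    for issue in check_coherence(proposal):
        print(issue)  # empty output means the matrix is internally coherent
```

Run against a draft proposal, an empty output indicates that each component of the matrix is filled in consistently; any message points to the component that still needs work.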

10.4 Considerations Regarding the Ethical Aspects of the Case Analysis

The case analysis assumes human dignity and integrity as a core value. Participants should be treated fairly, promoting dialogue, mutual understanding, respect, trust, and confidentiality. The analyst (internal or external) must ensure that people do not feel that their rights have been violated or that they are likely to suffer harm, let alone cause harm (House & Howe, 2003; Simons, 2009). If ethical dilemmas arise, it is suggested to resort to the ethical principles and practices agreed upon as a community or establishment; if these do not exist, to agree together on how to proceed by appealing to universal values and principles or to relational and situated ethics. These dilemmas often arise when interests and positions of power are at stake, so the possible tensions in the process should be explored from the outset, recognising the ethical stance of the analyst and of the institution, programmes, projects, and/or people involved in the case (House & Howe, 1999; Simons, 2009).

One option that may help is to adopt a democratic ethic (House & Howe, 1999) that recognises and values the principles of fairness, justice, and equity, along with confidentiality, negotiation, and accessibility as ethical processes of action. Agreeing on who owns the extracted data is a decision that can avoid later conflicts. Recognising and acting in a timely manner on how to proceed ethically at the methodological level is also important, especially in the process of data collection for setting up the case, in aspects such as using informed consent, respect in the interview process, and the confidentiality and anonymity of sensitive information, whether extracted from documentation or relating to individuals (Denzin & Lincoln, 2011). The final report and the decisions that emanate from it can also be a source of ethical dilemmas, so it is necessary to generate instances of dialogue and reflection in order not to lose sight of the people who may be involved (Simons, 2009).

10.5 Final Case Report

Case writing involves complex and reflective effort. The realistic account of the case is described by following the chronology and/or biography of the case, issues, and solutions, rather than problems or hypotheses (Stake, 1995). Verbatim quotations of stakeholders’ voices, graphic information, tables, or diagrams that organise statistical or documentary information may be included. Writing any report involves four macro-activities: planning, organising, writing, and revising (Lane, 2007). In this work it is important to describe the case in terms of its stages of development and to take care that the writing is clear; a good strategy for this is to ask someone else to read it and comment on possible improvements in accuracy and style. The formalities call for clear writing, in the third person, highlighting relevant information and maintaining a logical thread; the report can be structured as a narrative of events, thematic areas, or processes addressed in the organisation, helping the reader to understand how the facts and the analysis unfolded.

10.6 Limitations

The case analysis methodology for decision making, although a very effective tool for guiding decisions based on evidence and critical reflection, has some limitations that are important to highlight:

• It is circumscribed to the specific time of the observation and to the participants’ account, so it cannot be extrapolated to other stages not experienced, be they future or past.
• It does not allow generalisation at a statistical level: given the singularity and particularity of the case and the context in which it occurs, it cannot be used to generalise and infer to other contexts from a quantitative logic.

10.7 Examples of the Application of the Case Analysis Method for Decision Making

Espinoza, L. (2013). Análisis de caso de las prácticas evaluativas de las docentes de lenguaje y comunicación de enseñanza básica en un colegio bilingüe [Case analysis of the assessment practices of elementary education language and communication teachers in a bilingual school] [Master’s degree thesis in Education, with a minor in Evaluation of Learning]. Pontificia Universidad Católica de Chile.

Gaete, J. (2016). Las prácticas de uso de la información que realiza el equipo directivo y docentes de lenguaje y matemática de segundo ciclo básico de un colegio particular pagado, a partir del Sistema de Evaluación de Progreso del Aprendizaje (SEPA) [Information use practices by the management team and second-cycle elementary language and mathematics teachers in a private fee-paying school, based on the System of Evaluation of Learning Progress (SEPA by its acronym in Spanish)] [Master’s degree thesis in Education, with a minor in Evaluation of Learning]. Pontificia Universidad Católica de Chile.

10.7.1 Experience of Teachers Who Have Used This Methodology and Comments on Each of Its Parts

Methodology in general

“It is a good methodology; it allows one to develop the stages following a logical order and growing in complexity” (language teacher).

“The methodology allowed me to carry out an analysis in an orderly and systematic way, insofar as it let me build the case step by step, from the context and its configuration, questions, and SWOT, until arriving at the analysis itself and proposing an improvement or solution to the problem” (early childhood educator).

“The case analysis methodology is a very useful professional strategy and tool; it allows us to have a model to identify problems associated with the evaluation of learning, to analyse a case by combining theory and assessment practices and, together with this, it facilitates the creation of improvement proposals” (early childhood educator).

“The structure is very favourable because it is a totally viable tool for identifying the critical nodes of any process in any organisation and thus making decisions for improvement. For me, this methodology is a more practical and feasible way of working, since the changes are feasible, small but important, and are born of the organisation’s own analysis. In addition, the answers lie within the organisation itself. It is rarely necessary to look for them outside; the proposals are in the participants themselves, so this method allows them, together, to analyse problematic situations and give them solutions” (school principal).


From the case context

“It is important to be able to understand how the problem was framed and to be able to think of a viable solution” (language teacher).

“It helps to the extent that it allows us to situate ourselves in the reality in which the problem occurs, in the educational community to be analysed. It allows us to visualise other variables involved in the problem, which may not be determinant, but which are present and worthy of consideration at the time of analysis” (early childhood educator).

“I think it serves to give an orientation to the rest of the steps of the methodology. However, I think that more attention should be devoted to the SWOT, the evidence/theory contrast and, of course, the improvement proposal” (school principal).

From the configuration of the case

“I think the easiest aspect of case design is the organisation of the information: deciding what goes first and what goes second. One of the most complex aspects is handling assumptions or inferences; removing judgements when writing the case design is somewhat complex” (early childhood educator).

“At this stage it is difficult to put aside personal judgements when you are a member of the establishment, but when you guide the analysis as a consultant, it is easier” (school principal).

From the realisation of the SWOT and the definition of guiding questions

“SWOT is a good strategy to start the analysis; it helps one to think not only about the weaknesses but also about the positive aspects, which are always difficult to see because one always goes to the bad things. The questions make it possible to raise the really relevant problems and not to get lost in details or in problems derived from the real problem. In any case, it is not easy to construct the SWOT or the questions, and it requires a lot of support from the specialists” (language teacher).

“The contribution of the SWOT is that it allows the identification of weaknesses and threats, aspects that help in the analysis of the case and in the proposal for improvement (that is, what is not so good or what should be improved or changed). On the other hand, the strengths and opportunities make it possible to highlight the positive aspects that the community has to face the problem, and also give a hopeful message to those who read the report: ‘not everything is so bad in our reality’ or ‘oh, I hadn’t thought of this as an opportunity’” (early childhood educator).

“The guiding questions allowed me to connect the SWOT to the analysis, as I answered them under the core aspects of the SWOT while I was analysing the case” (kindergarten educator).

“It is a fundamental tool, because that is where the material for discussion/tension is taken from” (school principal).

From the case analysis and discussion

“The analysis itself forced me to put into action my reflective, analytical, synthesis, evaluation, description and writing skills” (language teacher). “The process allowed me to deploy skills related to inference: by having to detect the problem; analysis skills: by ‘breaking down’ the problem in the analysis; synthesis skills: by linking all the aspects to make an improvement proposal; writing skills: by having to write clearly and precisely the configuration, the SWOT, the problem definition, and the improvement proposal. Ability to relate aspects: by having to link all the sections together. In terms of knowledge: all the knowledge acquired so far in different training instances” (early childhood educator). “At this stage, I have to know the theory well before guiding the analysis, to have updated knowledge of the subject matter, since many times the participants don’t have time to read” (school principal).

Improvement strategy

“I think it is the most important part of the case, since it is what we are looking for with all the previous work” (language teacher).

“This is important, since the case analysis does not just stop at ‘this is the problem: this is good or this is not so good for these reasons’ but goes a step further and creates a strategy that allows a solution to be found to what was previously stated. Not only is a situation discussed, but there is also a real contribution in suggesting what should be done and how” (early childhood educator).

“This is fundamental, because improvement is precisely the objective of the analysis; it helps to improve the processes and to obtain good results. Especially because, in the instances of reflection, they receive immediate feedback from their work teams” (school principal).

In summary, as we have seen throughout this book, assessment of and for learning is a very powerful tool that we teachers can use to develop the full potential of our students and motivate them to learn and integrate the knowledge of different subjects. However, it can also be a devastating weapon for them if we do not keep in mind ethical principles and quality criteria in the construction of our instruments, given the consequences such faults can have. Throughout our journey, we have highlighted the importance of assessment literacy for our professional development and the quality of our teaching. If we make poor assessments, we will collect evidence of poor quality and our pedagogical decisions will be wrong. The invitation is to plan and define what we expect our students to learn and to design teaching and assessment tasks that lead to those goals; to involve students in their own learning process through self-assessment and peer assessment; to set aside planned spaces in our classes to give effective and timely feedback; to diversify the ways in which we collect evidence of learning, considering the richness of diversity in the classroom; and to work with other members of the school community to enhance our assessment practices and address institutional needs, generating proposals for improvement that are viable and effective.

References

Anderson, L. W., Krathwohl, D. R., Airasian, P. W., Cruikshank, K. A., Mayer, R. E., Pintrich, P. R., Raths, J., & Wittrock, M. C. (Eds.). (2001). A revision of Bloom’s taxonomy of educational objectives. Longman Publishing.
Bambrick-Santoyo, P. (2010). Driven by data. Jossey-Bass.
Black, P., & Wiliam, D. (2009). Developing a theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21, 5–31. https://doi.org/10.1007/s11092-008-9068-5
Boudett, K. P., City, E. A., & Murnane, R. J. (2005). Data wise: A step-by-step guide to using assessment results to improve teaching and learning. Harvard Education Press.
Brooke, S. (2006). Using the case method to teach online classes: Promoting Socratic dialogue and critical thinking skills. ERIC. Retrieved November 25, 2022, from https://files.eric.ed.gov/fulltext/EJ1068074.pdf
Brown, H. D. (2004). Language assessment: Principles and classroom practices. Pearson Education.
Center for Teaching and Learning, Stanford University. (1994). Teaching with case studies. Stanford University Newsletter on Teaching, 5(2), 1–4. https://valenciacollege.edu/academics/academic-affairs/learning-assessment/state-assessment-meeting/documents/2016/01stanford19994casestudies-method.pdf
Coll, C., Mauri, T., & Onrubia, J. (2006). Análisis y resolución de casos-problema mediante el aprendizaje colaborativo [Analysis and resolution of case-problems using a collaborative learning approach]. Revista de Universidad y Sociedad del Conocimiento, 3(2), 29–41. https://doi.org/10.7238/rusc.v3i2.285
Coller, X. (2000). Estudio de casos [Case studies]. Centro de Investigaciones Sociológicas.
Corbetta, P. (2003). Metodología y técnicas de la investigación social [Social research methodology and techniques]. McGraw Hill.
Covarrubias-Papahiu, P. (2016). Representaciones docentes de la educación basada en competencias. Un estudio de caso [Teachers’ representations of competency-based education. A case study]. Propósitos y Representaciones, 4(2), 73–132. https://doi.org/10.20511/pyr2016.v4n2.120
Davis, C., & Wilcock, E. (2003). Developing, implementing and evaluating case studies in material science. European Journal of Engineering Education, 30(1), 59–69. https://doi.org/10.1080/03043790410001711261
Denzin, N. K., & Lincoln, Y. S. (2011). The Sage handbook of qualitative research (4th ed.). Sage.
Ebbeler, J., Poortman, C. L., Schildkamp, K., & Pieters, J. M. (2017). The effects of a data use intervention on educators’ satisfaction and data literacy. Educational Assessment, Evaluation and Accountability, 29(1), 83–105. https://doi.org/10.1007/s11092-016-9251-z
Edelstein, M., Wallensten, A., & Kühlmann-Berenzon, S. (2014). Is case-chaos methodology an appropriate alternative to conventional case-control studies for investigating outbreaks? American Journal of Epidemiology, 180(4), 406–411. https://doi.org/10.1093/aje/kwu123
Ellet, W. (2007). The case study handbook: How to read, discuss and write persuasively about cases. Harvard Business Review Press.
Espinoza, L. (2013). Análisis de caso de las prácticas evaluativas de las docentes de lenguaje y comunicación de enseñanza básica en un colegio bilingüe [Case analysis of the assessment practices of elementary education language and communication teachers in a bilingual school] [Master’s degree thesis in Education, with a minor in Evaluation of Learning]. Pontificia Universidad Católica de Chile.
Farhoomand, A. (2004). Writing teaching cases: A quick reference guide. Communications of the Association for Information Systems, 13, 103–107. https://doi.org/10.17705/1CAIS.01309
Flyvbjerg, B. (2004). Cinco malentendidos acerca de la investigación mediante estudios de caso [Five misunderstandings about case study research]. Revista Española de Investigaciones Sociológicas, 106(1), 33–62. https://www.redalyc.org/articulo.oa?id=99717667002
Gaete, J. (2016). Las prácticas de uso de la información que realiza el equipo directivo y docentes de lenguaje y matemática de segundo ciclo básico de un colegio particular pagado, a partir del Sistema de Evaluación de Progreso del Aprendizaje (SEPA) [Information use practices by the management team and second-cycle elementary language and mathematics teachers in a private fee-paying school, based on the System of Evaluation of Learning Progress (SEPA by its acronym in Spanish)] [Master’s degree thesis in Education, with a minor in Evaluation of Learning]. Pontificia Universidad Católica de Chile.
Golich, V. L., Boyer, M., Franko, P., & Lamy, S. (2000). The ABCs of case teaching: Pew case studies in international affairs. Institute for the Study of Diplomacy.
Goyas, L., & Monzón, Y. (2016). Análisis de casos como modalidad de titulación en la carrera de Derecho de la Universidad Metropolitana del Ecuador [Case analysis as a degree modality for the Law Major at the Metropolitan University of Ecuador]. Revista Conrado, 12(53), 125–130. https://conrado.ucf.edu.cu/index.php/conrado/article/view/322
Guizardi, M. L., & Garcés, A. (2014). Estudios de caso de la migración peruana “en Chile”: Un análisis crítico de las distorsiones de representación y representatividad en los recortes espaciales [Case studies of Peruvian migration “in Chile”: A critical analysis of the distortions of representation and representativeness in spatial cut-offs]. Revista de Geografía Norte Grande, 58, 223–240. https://doi.org/10.4067/S0718-34022014000200012
Hernández, R., Fernández-Collado, C., & Baptista, P. (2006). Metodología de la investigación [Research methodology] (4th ed.). McGraw Hill.
Herreid, C. F. (1997). What is a case? Bringing to science education the established teaching tool of law and medicine. Journal of College Science Teaching, 27(2), 92–94.
Holz, M. (2019). Ley 20.248, de 2008, Subvención Escolar Preferencial. Análisis del contenido original y sus modificaciones (actualización a 2019) [Law 20.248 of 2008, Preferential School Subsidy. Analysis of the original content and its amendments (updated to 2019)]. Biblioteca del Congreso Nacional de Chile. https://obtienearchivo.bcn.cl/obtienearchivo?id=repositorio/10221/28046/2/BCN_Ley_SEP_actualizacion_modificaciones_Final.pdf
House, E., & Howe, K. (1999). Values in evaluation and social research. Sage.
House, E. R., & Howe, K. R. (2003). Deliberative democratic evaluation. In T. Kellaghan & D. L. Stufflebeam (Eds.), International handbook of educational evaluation (Kluwer International Handbooks of Education, Vol. 9). Springer. https://doi.org/10.1007/978-94-010-0309-4_7
Instituto Tecnológico y de Estudios Superiores de Monterrey. (n.d.). Las estrategias y técnicas didácticas en el rediseño: El estudio de casos como técnica didáctica [Didactic strategies and techniques in redesign: The case study as a didactic technique]. Dirección de Investigación y Desarrollo Educativo, Vicerrectoría Académica. http://sitios.itesm.mx/va/dide2/tecnicas_didacticas/casos/casos.pdf
Lane, J. L. (2007, October 7). Case writing guide. Schreyer Institute for Teaching Excellence, Pennsylvania State. https://www.schreyerinstitute.psu.edu/pdf/CaseWritingGuide.pdf
Martinic, S. (1997). Diseño y evaluación de proyectos sociales [Design and evaluation of social projects]. Comexani/CEJUV.
Mella, O. (2003). Metodología cualitativa en ciencias sociales y educación: Orientaciones teórico-metodológicas y técnicas de investigación [Qualitative methodology in social sciences and education: Theoretical-methodological guidelines and research techniques]. Primus.
Mendoza, A. (2006). El estudio de casos: Un enfoque cognitivo [The case study: A cognitive approach]. Editorial Trillas.
Mertens, D. M. (2015). Research and evaluation in education and psychology: Integrating diversity with quantitative, qualitative and mixed methods (4th ed.). Sage.
Morra, L., & Friedlander, A. C. (1999). Case study evaluations. World Bank Operations Evaluation Department [OED]. https://documents.worldbank.org/en/publication/documents-reports/documentdetail/323981468753297361/case-study-evaluations
Ottenberger, A. (2000). El estudio de casos en la investigación social [Case studies in social research]. Universidad Tecnológica Metropolitana.
Parker-Boudett, K., City, E. A., & Murnane, R. J. (2006). The “data wise” improvement process: Eight steps for using test data to improve teaching and learning. Harvard Education Letter, 22(1), 53–56. https://www.hepg.org/hel-home/issues/22_1/helarticle/the-data-wise”-improvement-process_297#home
Ragin, C. (1992). Introduction: Cases of “what is a case?”. In C. C. Ragin & H. S. Becker (Eds.), What is a case? Exploring the foundations of social inquiry (pp. 1–15). Cambridge University Press.
Rambla, X. (2013). Las complejas geografías de la política educativa: Tres estudios de caso [The complex geographies of educational policy: Three case studies]. Educação & Sociedade, 34(125), 1229–1249. https://doi.org/10.1590/S0101-73302013000400011
Revel, A. (2013). Estudios de caso en la enseñanza de la biología y en la educación para la salud en la escuela media [Case studies on teaching biology and health education in secondary school]. Biografía, 6(10), 42–49. https://doi.org/10.17227/20271034.10biografia42.49
Richardson, A. (2013). Effective case analysis: Techniques for success in case-based learning and examinations. Captus Press.
Ritchey, F. (2007). The statistical imagination: Elementary statistics for the social sciences (2nd ed.). McGraw Hill.
Rivera, J. (Ed.). (2000). Casos de empresas: Estrategia, finanzas, marketing, recursos humanos [Business cases: Strategy, finances, marketing, human resources]. Ediciones UC.
Rusque, A. M., & Castillo, C. (2009). Método de caso: Su construcción y animación [The case method: Its construction and animation]. University of Santiago de Chile.
Santos-Guerra, M. A. (2001). La escuela que aprende [The school that learns] (2nd ed.). Morata.
Servicio de Innovación Educativa. (2008). El método del caso [The case method]. Universidad Politécnica de Madrid. https://innovacioneducativa.upm.es/sites/default/files/guias/MdC.pdf
Simons, H. (2009). Case study research in practice. Sage.
Stake, R. E. (1995). The art of case study research. Sage.
Taras, M. (2002). Using assessment for learning and learning from assessment. Assessment & Evaluation in Higher Education, 27(6), 501–510. https://doi.org/10.1080/0260293022000020273
Yin, R. K. (2009). Case study research. Sage.

Paola Marchant-Araya is a social worker. She holds a Master’s degree in Psychology, with a mention in Educational Psychology, and a Doctorate in Educational Sciences from the Pontificia Universidad Católica de Chile. She is an Assistant Professor at the School of Social Work, Faculty of Social Sciences UC, and a specialist in curriculum design, evaluation, and training in higher education.

Carla E. Förster is a marine biologist. She holds a Master’s degree in Educational Evaluation and a Doctorate in Educational Sciences from the Pontificia Universidad Católica de Chile. She is a Professor at the Universidad de Talca, Chile, Head of the Department of Evaluation and Quality of Teaching, and a specialist in assessment for learning, metacognition, and initial teacher training. email: [email protected]