
EDUCATION IN A COMPETITIVE AND GLOBALIZING WORLD


STANDARDIZED ASSESSMENT AND TEST CONSTRUCTION WITHOUT ANGUISH THE COMPLETE STEP-BY-STEP GUIDE TO TEST DESIGN, ADMINISTRATION, SCORING, ANALYSIS, AND INTERPRETATION

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.


Additional books in this series can be found on Nova's website under the Series tab.


Additional E-books in this series can be found on Nova's website under the E-book tab.


SAAD F. SHAWER

Nova Science Publishers, Inc. New York


Copyright © 2012 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us: Telephone 631-231-7269; Fax 631-231-8175; Web Site: http://www.novapublishers.com

NOTICE TO THE READER

The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers' use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works.


Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication. This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

Additional color graphics may be available in the e-book version of this book.

LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA

Shawer, Saad F. (Saad Fathy)
Standardized assessment and test construction without anguish : the complete step-by-step guide to test design, administration, scoring, analysis, and interpretation / Saad F. Shawer.
p. cm.
Includes index.
ISBN 978-1-61122-387-3 (e-book)
1. Educational tests and measurements--Design and construction--Handbooks, manuals, etc. I. Title.
LB3051.S45 2010
371.26'2--dc22
2010036161

Published by Nova Science Publishers, Inc. † New York


CONTENTS

Preface
Part I: Background and Layout
  Chapter 1: Understanding Assessment and Evaluation
Part II: Test Construction: Test Aim, Objectives and Type
  Chapter 2: Stage 1: Writing Aims and Objectives
  Chapter 3: Stage 1 (Continued): Writing Cognitive Objectives
  Chapter 4: Stage 1 (Continued): Writing Affective Objectives
  Chapter 5: Stage 1 (Continued): Writing Psychomotor Objectives
  Chapter 6: Stage 2: Write the Test Type
Part III: Test Construction: Test Table of Specifications and Type of Items
  Chapter 7: Stage 3: Create a Table of Specifications
  Chapter 8: Stage 3 (Continued): Create a Table of Specifications
  Chapter 9: Stage 4: Determine Type of Items
Part IV: Test Construction: Test Validity and Trial
  Chapter 10: Stage 5: Validate Content
  Chapter 11: Stage 6: Test Trial
Part V: Test Construction: Test Administration, Scoring, Analysis and Interpretation
  Chapter 12: Stage 7: Administer the Test
  Chapter 13: Stage 8: Score, Analyze and Interpret the Test
  Chapter 14: Stage 8: Score, Analyze and Interpret the Test (Continued)
  Chapter 15: Stage 8: Score, Analyze and Interpret the Test (Continued)
  Chapter 16: Stage 8: Score, Analyze and Interpret the Curriculum Test
Appendix A: The Curriculum Achievement Test
Appendix B: Alignment Matrix of Test Items with the Six Levels of Cognitive Objectives
Appendix C: Answer Key
Appendix D: Answer Sheet
References
Index


PREFACE

This practical book has been written to address a need among undergraduate and graduate students, teachers, social researchers and professors. It proposes a simple, step-by-step process of test construction in eight stages, with each chapter concerned with only one stage; in some cases, two or more chapters discuss a single stage. Each chapter starts with pre-reading advance-organizer questions that also act as hands-on exercises about the main issues the chapter will raise. Moreover, the aim and objectives are stated at the beginning of each chapter to remind readers of what they are expected to achieve by the end of that chapter. The book focuses on practice rather than theory, since an actual test is used as a reference example to show how each step and stage of the test construction process can be put into action.

The book is divided into 5 parts with 16 chapters covering the 8 stages of the test construction process. Part I comprises chapter 1 (Understanding assessment and evaluation), which discusses testing, assessment, measurement and evaluation in addition to outlining the test construction stages. Part II discusses stages 1 and 2 of test construction in chapters 2, 3, 4, 5 and 6. Stage 1 focuses on writing the test aim and objectives in chapters 2 (Writing aims and objectives), 3 (Writing cognitive objectives), 4 (Writing affective objectives) and 5 (Writing psychomotor objectives). Stage 2 discusses types of tests in chapter 6 (Write the test type). Part III is concerned with stages 3 and 4 of test construction, chapters 7, 8 and 9. Stage 3 shows the process of creating a test table of specifications in chapters 7 (Create the table of specifications) and 8 (Create the table of specifications (continued)). Stage 4 focuses on the type of test items in chapter 9 (Determine type of items).

Part IV is concerned with stages 5 and 6, chapters 10 and 11. Stage 5 discusses test validation in chapter 10 (Validate content), whereas Stage 6 explains test reliability in chapter 11 (Test trial). Part V covers stages 7 and 8 of test construction in chapters 12, 13, 14, 15 and 16. Stage 7 discusses test administration in chapter 12 (Administer the test). Stage 8 focuses on test scoring, analysis and interpretation in chapters 13 (Score the test), 14 (Analyze the test), 15 (Interpret the test) and 16 (A curriculum test scoring, analysis and interpretation). This book may be consulted by undergraduate and graduate students as well as teachers, educational researchers and professors. Here is the book layout:

Part I: Background and Layout
  Chapter 1: Understanding Assessment and Evaluation
Part II: Test Construction: Test Aim, Objectives and Type
  Stage 1: Test Aims and Objectives
    Chapter 2: Writing Aims and Objectives
    Chapter 3: Writing Cognitive Objectives
    Chapter 4: Writing Affective Objectives
    Chapter 5: Writing Psychomotor Objectives
  Stage 2: The Test Type
    Chapter 6: Write the Test Type
Part III: Test Construction: Test Table of Specifications and Type of Items
  Stage 3: Test Table of Specifications
    Chapter 7: Create the Table of Specifications
    Chapter 8: Create the Table of Specifications (Continued)
  Stage 4: Test Type of Items
    Chapter 9: Determine Type of Items
Part IV: Test Construction: Test Validity and Trial
  Stage 5: Test Validation
    Chapter 10: Validate Content
  Stage 6: Test Trial
    Chapter 11: Test Trial
Part V: Test Construction: Test Administration, Scoring, Analysis and Interpretation
  Stage 7: Test Administration
    Chapter 12: Administer the Test
  Stage 8: Test Scoring, Analysis and Interpretation
    Chapter 13: Score the Test
    Chapter 14: Analyze the Test
    Chapter 15: Interpret the Test
    Chapter 16: A Curriculum Test Scoring, Analysis and Interpretation

PART I


BACKGROUND AND LAYOUT


Chapter 1

UNDERSTANDING ASSESSMENT AND EVALUATION


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

- Are there differences between a test, assessment, measurement and evaluation?
- Define program, program outputs and program objectives.
- What is the impact of assessment-driven contexts on teachers, students and principals?
- Mention three uses and three objects of evaluation.
- Show the differences and similarities between teacher, student, program and curriculum evaluation.
- What are the stages of achievement test construction?

AIM(S) OF THE CHAPTER

To understand the basic terms of assessment and evaluation.


OBJECTIVES OF THE CHAPTER

- To define a test, assessment, measurement and evaluation.
- To list at least one difference between testing and assessment, between measurement and assessment, and between evaluation and assessment.
- To state the stages of achievement test construction.
- To write three uses and three objects of evaluation.
- To list four implications of assessment-driven contexts for teachers, students and school principals.
- To define a program, program outputs and program objectives.
- To write a definition of student evaluation.

INTRODUCTION


This chapter discusses a number of basic terms relevant to understanding measurement, assessment and evaluation, in the following order:

1. Assessment
   1.1. Testing and Formative and Summative Assessment
   1.2. Range and Uses of Assessments
   1.3. Impact of Assessment-Focused Contexts on Teachers, Students and Principals
2. Evaluation
   2.1. Program, Program Outputs, Program Objectives
   2.2. Curriculum, Student and Teacher Evaluation
   2.3. Uses and Objects of Evaluation
   2.4. Evaluation, Testing, Assessment and Measurement
3. Stages of Test Construction

1. ASSESSMENT

Assessment is a process of gathering information about individuals in terms of ability, attributes and other characteristics by means of several data collection instruments (Hughes, 2003; Shawer, Gilmore, & Banks-Joseph, 2008; Shawer, Gilmore, & Banks-Joseph, 2009). This involves a collection of evidence about the progress that students, teachers and schools have made. Assessment has traditionally been confined to standardized testing in the form of multiple-choice and other close-ended question techniques. Standardized testing has been a powerful assessment tool because it determines classroom content, pedagogy (procedures and activities) and learning outcomes (target behaviors, skills and knowledge to be changed). Although standardized testing is a common student evaluation method, other methods exist. For example, observation of actual performance is also a powerful assessment tool. Moreover, interviews, assignments, projects and portfolios are also useful tools for assessment and evaluation.


1.1. Testing and Formative and Summative Assessment

Testing is a tool for gathering information about individuals' ability (e.g., language ability). Being so, it is a useful tool for formative and summative assessment. Formative assessment is a process of checking on individuals' progress during a learning program or course by gathering information about the extent to which the target subjects have mastered what they have been expected to learn, and using this information to modify their future learning plans. Informal tests, learning portfolios and quizzes are useful tools for gathering information in formative assessment. In contrast, summative assessment is a process of gathering information at the end of a learning program or course to measure what the target subjects have achieved. Formal tests are the common tools used to gather information in summative assessment (Adkins, 1974; Frederiksen, Mislevy & Bejar, 1993; Gall, Borg & Gall, 1996; Shawer, 2000).

1.2. Range and Uses of Assessments

Students, teachers and schools come across a variety of assessments and evaluations, which include national, state, school and classroom assessments. National assessments involve a set of standards developed by national professional organizations; these include tests like the SAT (Scholastic Aptitude Test). Although state assessments also involve standards set by professional organizations, these are defined by professional organizations in a specific state or province, like state mastery tests. Moreover, school and classroom assessments involve the use of a variety of assessment tools, like observations, tests, portfolios and projects, to check the extent to which the national and state standards are met. Assessments are used for a variety of purposes. For example, assessments could be used to measure how schools meet national and state standards, to promote students from grade to grade, and for graduation. Assessments are also used to secure funds for schools on the basis of student achievement of the standards, in addition to comparing performance across schools in a specific state, province or region (Assessment, Evaluation & Curriculum Re-design Workshop, 2009).


1.3. Impact of Assessment-Focused Contexts on Teachers, Students and Principals

In contexts where schooling is assessment-driven, teachers are expected to determine the targets of assessment and set their curriculum, lesson plans and methods of teaching in the light of the outcomes to be assessed. They design lesson plans and content in the light of national, state and local standards. In addition, these teachers develop their assessment rubrics or criteria in the light of national and state standards to assess students' progress, and use the results of their assessments to adapt teaching to the set standards (Assessment, Evaluation & Curriculum Re-design Workshop, 2009).

In such contexts, the students will also be clear about what they are expected to accomplish. They plan their learning to make use of content and teaching in ways that help them to achieve the target skills. For example, if students are assessed through only multiple-answer questions, they will focus on breaking down content and skills in ways that help them to choose correct answers. If they have to create answers, they are expected to invest their learning in making inferences and conclusions and collecting evidence that help them to develop the skills and knowledge they will be tested on. Moreover, the students will constantly assess themselves to make sure they are working enough to attain the target skills. They are expected to use critical thinking and reflection to develop their knowledge and skills to a level capable of addressing what they would be asked to meet (Assessment, Evaluation & Curriculum Re-design Workshop, 2009).

School principals are no different when leading and managing in such assessment-focused contexts. They will be influenced more than their students and teachers by national and state assessments. Not only are they expected to draw the teachers' attention to achieving the set standards, but they will also define the desired performance that teachers and students have to achieve. Furthermore, principals will organize their resources in ways conducive to achieving the set standards (Assessment, Evaluation & Curriculum Re-design Workshop, 2009). For example, school principals may purchase specific software materials, initiate specific teacher development programs, or direct teachers to develop small units to address certain pedagogical deficiencies. By the same token, school principals monitor the consonance between classroom teaching and expected outcomes in the light of the national and state standards. Moreover, they establish direct links between teachers who teach a specific subject. For example, teachers who teach the same subject in lower and higher grades are asked to coordinate with colleagues so that they focus on specific things or show the link between the topics in these different grades. In addition, principals would urge teachers to revise their curriculum and teaching practice and evaluate the whole curriculum and teacher performance in the light of the set standards (Assessment, Evaluation & Curriculum Re-design Workshop, 2009).


2. EVALUATION

Evaluation is a word that covers judgments of many kinds. Evaluation can involve informal subjective assessments made by ordinary people to judge the value, merit or worth of something (Shawer, 2000). In contrast, it may mean a formal and systematic examination of a planned social intervention conducted by a professional evaluator. Formal evaluation is 'a form of disciplined inquiry that applies scientific procedures to the collection and analysis of information about the content, structure and outcomes of programs, projects and planned interventions' (Clarke, 1999, p. 1). Moreover, evaluation research is 'a systematic application of social research procedures in assessing the conceptualization and design, implementation and utility of social intervention programs' (Rossi & Freeman, 1982, p. 20). According to Struening and Guttentag (1975), evaluation is conducted to examine the effects of a program on individuals, groups, institutions and communities in the light of program objectives. Broadly speaking, evaluation involves the collection, analysis and interpretation of information about something to determine how well it is performing, and using the resulting information to make decisions about its improvement or change.

2.1. Program, Program Outputs, Program Objectives

A program is an intervention introduced to achieve specific objectives so that it can address a social need or solve an identified problem. In other words, a program involves a specific number of steps that we take to achieve a particular aim. In addition, program outputs are the services that a program offers, whereas program objectives are the formal goals to which all program resources are directed (Patton, 1990; Rutman, 1984; Shawer, 2000).


2.2. Curriculum, Student and Teacher Evaluation

As indicated above, evaluation involves the collection, analysis and interpretation of information about something to determine how well it is performing, and using the resulting information to make decisions about its improvement or change. This can simply be applied to all objects of evaluation. For example, curriculum evaluation refers to the collection, analysis and interpretation of information about a curriculum to determine how well the curriculum is performing, and using the resulting information to make decisions about curriculum improvement or change. Similarly, student evaluation refers to the collection, analysis and interpretation of information about student progress, and using the resulting information to make decisions about student learning. Moreover, teacher evaluation or appraisal is the collection, analysis and interpretation of information about teacher performance, and using the resulting information to make decisions about the teacher's career (Shawer, 2010a).

2.3. Uses and Objects of Evaluation

According to Rossi and Freeman (1982), Gall et al. (1996) and Shawer (2010b), evaluation can be used for:

- management and administrative purposes, to assess the appropriateness of program changes;
- identifying ways to improve the delivery of interventions;
- accountability and identifying funding requirements;
- planning and policy purposes, to test innovative ideas on how to deal with human problems;
- deciding whether to expand or curtail a program;
- evaluating instructional methods (e.g., lecture, inquiry teaching);
- evaluating curriculum materials (e.g., textbooks, multi-media packages);
- evaluating programs (e.g., language arts programs, teacher education programs);
- evaluating organizations (e.g., kindergartens, alternative schools);
- evaluating educators (e.g., in-service teachers, teacher aides, school principals);
- evaluating students (e.g., elementary students, college students).


2.4. Evaluation, Testing, Assessment and Measurement

Although evaluation, testing, assessment and measurement are closely connected terms, they are different. Testing is only one of many data collection methods used in assessment, measurement and evaluation. Measurement refers to the collection of data about a specific phenomenon, which may involve several instruments, including tests, observations, interviews and others. However, assessment does not mean measurement, because assessment involves both measurement and testing, being a process of collecting evidence about the progress that students, teachers and schools have made and assigning values to performance. Moreover, assessment does not mean evaluation either, because assessment does not involve making judgments; it just provides crude values to be used for evaluation purposes. It is evaluation that involves assessment, measurement and testing alongside making judgments for taking future actions about the targets of evaluation (Alderson, Clapham, & Wall, 1995; Clarke, 1999; Shawer, 2000).


TEST CONSTRUCTION STAGES

STAGE 1: WRITE TEST AIM & OBJECTIVES
- Aim: assess mastery of course content
- Objectives: assess attainment of course-specific learning outcomes

STAGE 2: WRITE TEST TYPE
- Performance: achievement & criterion-referenced

STAGE 3: CREATE A TABLE OF SPECIFICATIONS
- Step 1: determine test content
- Step 2: determine relative weight of content
- Step 3: determine relative weight of objectives
- Step 4: distribute items among themes (units) of content
- Step 5: distribute items among levels of cognitive objectives

STAGE 4: DETERMINE ITEM TYPE
- Objective techniques: multiple-choice questions

STAGE 5: VALIDATE TEST CONTENT
- Content validity: a jury of content experts

STAGE 6: TRY OUT THE TEST
- Internal consistency: Cronbach's alpha
- Item analysis: single out weak items
- Test length: average time of test-takers

STAGE 7: ADMINISTER THE TEST
- Proctors: who watches the examinees
- Instructions: oral and/or written
- Resources: venue, clock and other equipment

STAGE 8: SCORE, ANALYZE & INTERPRET THE TEST
- Scoring: reliable answer key, answer sheet, scorer agreement
- Analysis (descriptive statistics): data entry into computer; means, standard deviations and percentages
- Analysis (inferential statistics): one-way planned and post-hoc ANOVAs, post-hoc multiple comparisons (Scheffe & Tukey HSD)
- Interpretation: criterion-referenced, comparing scores against cut-off criteria rather than with other examinees' scores

Figure 1.1. Stages of test construction (Source: Shawer (2010c, p. 212)).

3. STAGES OF TEST CONSTRUCTION

As shown in figure 1.1, this book adopts the following stages and steps proposed by Shawer (2010c) for achievement test construction:

1. Stage 1: Write the test aim and objectives (what abilities are to be tested)
2. Stage 2: Write the test type (achievement, aptitude, etc.)
3. Stage 3: Create a table of specifications:
   - Step 1: determine test content
   - Step 2: determine relative weight of content
   - Step 3: determine relative weight of objectives
   - Step 4: distribute items among themes of content
   - Step 5: distribute items among levels of objectives
4. Stage 4: Determine type of items/techniques (e.g., multiple-choice)
5. Stage 5: Validate content
6. Stage 6: Test trial (a computational sketch of these steps follows this list):
   - Step 1: determine test reliability
   - Step 2: calculate item analysis (coefficient of difficulty/facility, item discrimination)
   - Step 3: determine test length (overall time of the test)
7. Stage 7: Administer the test
8. Stage 8: Score, analyze and interpret the test
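To make the Stage 6 and Stage 8 quantities concrete, here is a minimal computational sketch in Python. It is not from the book: the response data, the cut-off score and the 27% grouping fraction are invented for illustration, and the functions are my own names.

import numpy as np

def item_difficulty(responses):
    # Facility: proportion of examinees answering each item correctly.
    return responses.mean(axis=0)

def item_discrimination(responses, fraction=0.27):
    # Upper-lower index: item facility in the top-scoring group minus the
    # bottom-scoring group; 27% is a conventional (assumed) grouping cut.
    totals = responses.sum(axis=1)
    order = np.argsort(totals)
    n = max(1, int(round(fraction * responses.shape[0])))
    lower, upper = responses[order[:n]], responses[order[-n:]]
    return upper.mean(axis=0) - lower.mean(axis=0)

def cronbach_alpha(responses):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    k = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)
    total_var = responses.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented data: ten examinees by five items, 1 = correct, 0 = incorrect.
scores = np.array([
    [1, 1, 1, 0, 1], [1, 0, 1, 0, 0], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1], [1, 0, 1, 0, 1], [0, 1, 0, 0, 0], [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1], [0, 0, 0, 0, 0],
])
print("difficulty:", item_difficulty(scores))
print("discrimination:", item_discrimination(scores))
print("alpha:", round(float(cronbach_alpha(scores)), 3))

# Criterion-referenced interpretation (Stage 8): compare each total score
# against an assumed cut-off rather than against other examinees' scores.
cutoff = 3
print("pass rate:", (scores.sum(axis=1) >= cutoff).mean())

Flat or negative discrimination values flag the "weak items" that Stage 6 singles out for revision or removal.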


Each of these stages will be discussed in a separate chapter. We will deal with achievement testing through an actual test to make the process easy to follow and comprehend. We will consider a 'curriculum' course the subject matter of our testing, in order that we can rely on actual subject matter.


PART II


TEST CONSTRUCTION: TEST AIM, OBJECTIVES AND TYPE

Stage 1: Test Aims and Objectives
  Chapter 2: Writing Aims and Objectives
  Chapter 3: Writing Cognitive Objectives
  Chapter 4: Writing Affective Objectives
  Chapter 5: Writing Psychomotor Objectives
Stage 2: The Test Type
  Chapter 6: Write the Test Type


Chapter 2

STAGE 1: WRITING AIMS AND OBJECTIVES


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

- Define 'aim' and 'objective'.
- Mention the characterizing features of a learning aim.
- Draw sharp differences between an 'aim' and an 'objective'.
- Mention the characterizing features of a learning objective.
- What are the criteria for writing a precise objective?
- What are process objectives and outcome objectives?
- Say whether this is an aim or an objective: 'the students will be able to answer first degree division problems'.
- Why is writing clear aims and objectives vital to teaching and learning?


AIM(S) OF THE CHAPTER

- The students will formulate clear aims.
- The students will formulate precise objectives.

OBJECTIVES OF THE CHAPTER

- To write a clear definition of a general aim.
- To write a clear definition of a learning aim.
- To mention the characterizing features of a learning aim.
- To translate a learning aim into a set of specific objectives.
- To write a clear definition of a learning objective.
- To mention the characterizing features of a learning objective.
- To recognize learning objectives from a list of aims and objectives.


INTRODUCTION

This chapter throws much light on the skills involved in writing test aims. It focuses on the macro skills of writing objectives by explaining the meaning and scope of aims. It also discusses the skills involved in writing lesson objectives, focusing on the micro skills of writing objectives and explaining the meaning, scope and classification of objectives. Chapters 3, 4 and 5 will continue to highlight the process of writing objectives: chapter 3 focuses on the micro skills of writing cognitive objectives, chapter 4 discusses writing affective objectives, and chapter 5 explains the processes involved in writing psychomotor objectives. As for this chapter, it focuses on the following points, in this order:

1. Skill 1: Defining an Aim
2. Skill 2: Recognizing the Defining Characteristics of an Aim
3. Skill 3: Defining an Objective
4. Skill 4: Recognizing the Defining Characteristics of an Objective
5. Skill 5: Writing Precise Aims and Objectives
6. Tips to Avoid Problems of Writing Objectives
7. Importance of Writing a Clear Aim and Objectives
8. Classification of Objectives


1. SKILL 1: DEFINING AN AIM

Imagine you are holding your suitcase at an airport. You meet a friend who asks where you are heading, and you answer 'I do not know'. Imagine the impact of that response on your friend. This is the situation of a teacher who enters a classroom or constructs a test without having a clear idea of what they want to achieve. Generally speaking, an aim is what someone wants to achieve or get. In other words, an aim is the final destination someone aspires to reach. However, defining an aim depends on what level of objectives we deal with (i.e., 'aims' or 'objectives'). Specifically speaking, and from a pedagogical perspective, an aim is the general and positive cognitive, affective and psychomotor changes teachers intend to make in students as a result of going through a lesson's or course's experiences. In other words, an aim assists teachers to compare the differences made in student learning before and after taking a target lesson or course (Shawer, 2006). From a testing viewpoint, an aim is the general and positive cognitive, affective or psychomotor changes examiners wish to ascertain that examinees have achieved. In other words, an aim assists examiners to compare the differences made in student learning before and after taking a target lesson or course by means of a data collection method like tests. For example, the students before taking a social studies class did not know the capitals of four European countries (cognitive), four Asian countries and four African countries. After finishing this class, the students could recall the capitals of all or most of these countries. A pre- and post-lesson comparison is the result of stating our aim (Shawer, 2006). Now that you have developed the skill of defining an aim, how can you make sure that what you wrote was an aim, not an objective? This is the second skill you need to develop, as discussed in section 2 below.

2. SKILL 2: RECOGNIZING THE DEFINING CHARACTERISTICS OF AN AIM

To be able to write or recognize aims, you need to make sure that the aim you write or come across possesses the following defining characteristics:

- An aim just gives the direction to a specific destination rather than others, but it does not tell how we can reach that destination.
- The scope of an aim is broad; therefore it is difficult to measure it as it is.
- A set of objectives is needed to cover the different dimensions of one aim.
- An aim must be translated into specific steps (objectives) to be measured.

In the above-mentioned social studies example, the aim only directed the teacher and students to the social studies domain rather than science and other disciplines. Within the social studies domain, the aim directed the teacher and students to teach and learn about some capital cities rather than others. This aim, however, did not tell the teacher and students which capital cities to teach and learn (Shawer, 2006).


3. SKILL 3: DEFINING OBJECTIVES

Contrary to an aim, an objective is the specific and positive cognitive, affective and psychomotor changes teachers intend to make in students as a result of going through a lesson's experiences. In other words, an objective assists teachers to compare the specific differences made in student learning before and after taking a target lesson. From a testing standpoint, an objective is the specific and positive cognitive, affective or psychomotor changes examiners wish to ascertain that examinees have achieved. In other words, objectives assist examiners to compare the specific differences made in student learning before and after taking a target lesson or course by means of a data collection method like tests (Shawer, 2006). An objective is what the students will be able to do by the end of a lesson (Pollard & Triggs, 1997). For example, the students before entering a social studies class do not know (cognitive) the capitals of four European countries (London, Paris, Rome and Madrid), four Asian countries (Delhi, Tokyo, Baghdad and Beijing) and four African countries (Cairo, Addis Ababa, Johannesburg and Tunis). After finishing this class, the students will, for example, mention that Cairo is the capital of Egypt. They will also be able to mention that Egypt is an African country, and will be able to mention the same for the other countries.


Objectives (also known as learning outcomes or performance objectives) are statements of what learners can do as a result of learning (Bloom, 1956; Gallagher & Smith, 1989). A key characteristic in the definition of objectives is the pre- and post-learning comparison in terms of what learners can do by the end of a lesson or program (through, for example, a test) which they could not do before attending that lesson or program (Mager, 1962). Moreover, objectives concern learning outcomes resulting from teaching. For example, if a student could not answer a specific question, but after going through instruction this particular student is able to answer this particular question, we have evidence the objective has been realized (Mager, 1975). In summary, an aim is just the desired state of change you hope to achieve, whereas each objective is a step toward achieving the general change (aim). Moreover, pre- and post-lesson comparisons on each objective assist us to make sure that each specific change has been realized. Now that you have developed the skill of defining objectives, how can you write precise objectives?


4. SKILL 4: RECOGNIZING THE DEFINING CHARACTERISTICS OF OBJECTIVES

Each objective should reflect these defining characteristics:

- The scope of an objective is limited.
- It is easy to measure each objective as it is.
- It meets very specific criteria.
- A set of objectives is needed to achieve one aim.
- Each objective is a specific step toward achieving one aim.

In the above-mentioned social studies example, when the students mention a capital city of a country, they realize one objective. When they answer that a particular country is in a certain continent, they realize another objective. When they do the same for each specific country, they achieve all the objectives (Richards, 2001; Shawer, 2006; Tyler, 1949). Guilbert (1984) and Westberg and Jason (1993) stress that precise objectives should be:

- relevant to the learner's developmental stage and the purposes of learning;
- feasible, in the sense that they can be achieved;
- transferable into explicit and measurable (observable) behavior;
- substantial, in terms of targeting worthwhile outcomes.

Mager (1962) introduced a model for writing behavioral objectives stipulating that any objective should have three elements: (a) a measurable or action verb; (b) a specification of the target behavior that the learners should be able to do; and (c) the criteria of competency. Activities or process objectives are not objectives (outcome objectives); activities are the means used to achieve objectives. A sketch of Mager's three elements follows the list below. Let's give examples in the coming paragraphs to help you better develop the following specific micro skills:

1. define an aim
2. write an aim
3. state the defining characteristics of an aim
4. recognize an aim from a group of aims and objectives
5. translate an aim into a set of objectives that cover all the aim's dimensions
6. define an objective
7. write an objective
8. state the defining characteristics of an objective
9. recognize an objective from a group of aims and objectives
10. write a set of objectives that cover all the dimensions of one aim.
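As a rough illustration of Mager's three elements, here is a hypothetical sketch in Python; the class and field names are my own, not the book's, and the example objective is taken from the reading lesson discussed in this chapter.

from dataclasses import dataclass

@dataclass
class BehavioralObjective:
    action_verb: str   # (a) measurable or action verb, e.g. "read", "list"
    behavior: str      # (b) the target behavior the learners should be able to do
    criterion: str     # (c) the criterion of competency (the standard to meet)

    def render(self) -> str:
        # Assemble the three elements into one objective statement.
        return (f"The students will be able to {self.action_verb} "
                f"{self.behavior} {self.criterion}.")

objective = BehavioralObjective(
    action_verb="read",
    behavior="the first paragraph aloud",
    criterion="with a maximum of three pronunciation mistakes",
)
print(objective.render())
# The students will be able to read the first paragraph aloud
# with a maximum of three pronunciation mistakes.

An objective missing any of the three fields would read like the imprecise examples criticized later in this chapter, which is exactly what Mager's model guards against.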

5. SKILL 5: WRITING PRECISE AIMS AND OBJECTIVES

Suppose a teacher has set out this as the lesson aim: 'foreign language students will successfully understand a reading text in English'. You may notice that this is an 'aim', not an 'objective', by going through the characterizing features of both. It is an aim because we do not know exactly how we can measure students' successful reading and understanding of the text. Its scope is broad because it does not tell us that the students will be able to do this, this and this. It just guides us to a main destination (the ability to read and understand texts) but does not tell us how we can reach or achieve it. It is like someone who wants to travel from Paris to London. All we know is that the person will go to London, not to Madrid, for example. It does not tell whether the person will take a train, car, ship or plane. Nor does it tell the time the person takes to reach London. It does not tell about the day of travel, either. These details are the specific things that an aim does not tell, but a set of objectives does.

As we have pointed out, learning objectives, according to Pollard and Triggs (1997, p. 255), are 'statements of what you want pupils to learn.' Objectives are simply what you want your students to learn as a result of teaching. What the teacher will do and what the students will do must be a reflection of the planned objectives. In other words, all teaching and learning activities, as well as assessment, have to be directed toward achieving the set of objectives. Precisely, the teacher's and students' roles revolve around achieving the lesson objectives. An objective can tell the specific things that help achieve the general aim. For a teacher to achieve the aim 'foreign language students will successfully understand a reading text in English', he/she needs to translate this aim into objectives. For example, the teacher states that by the end of the lesson, the students will be able to:

- specify the central or general idea (topic) in the text
- specify the main ideas in the text
- locate the supporting ideas of each main idea in the text
- answer five questions about the text (the teacher has to mention them)
- give synonyms of five difficult vocabulary items (the teacher has to mention them)
- read the text with a maximum of three pronunciation mistakes
- read five sentences in the text in less than two minutes.

You can see now that the teacher could measure whether students have successfully read and understood the text in English by getting the students to achieve the above specific things (objectives). You may have observed that an aim just specifies the general or final learning behavior we want students to achieve (the destination). In contrast, objectives show how this could happen. For teaching purposes, teachers should set out the lesson aim clearly and specify a set of objectives that can help the students to achieve the lesson aim. Similarly, for testing purposes, teachers should write the test aim clearly and specify a set of objectives to help them make sure their students have achieved the test aim. Formulating clear aims is therefore a vital prerequisite for teaching and testing. Imagine the above aim was formulated as 'the students will successfully understand English.' This is not an aim that could be achieved in a typical lesson, course or test. Indeed, it is not an aim at all, as it does not direct us to a specific destination. It does not direct us to the teaching of reading, writing, vocabulary or listening. Nor does it help us focus on something to achieve. It is like someone whose destination is 'Europe'! We do not know where in Europe. Where will the plane be heading? How can we achieve something we do not know? Remember, an aim is to direct us to one thing rather than others. By the same token, developing skill at formulating clear objectives is important. Imagine a teacher attempts to realize the above aim 'foreign language students will successfully read and understand the text in English' through these objectives:


1. the students will be able to specify the ideas in the text
2. the students will be able to answer any questions about the text
3. the students will be able to read the text
4. the students will be able to give synonyms of vocabulary.

None of the above is a precise objective. Remember, an objective must tell what exactly the students will be able to do and the criteria for doing it. The first is not an objective because it does not specify what kinds of ideas the students will be able to recognize. It does not tell the students to specify, for example, the central idea, the main ideas or the supporting ideas of each main idea. To turn the first one into a correct objective, you must break it down into specific things students could do by the end of the lesson or achieve on a test. Each specific thing stands for an objective, as follows:

- the students will be able to specify the central idea in the text
- the students will be able to specify the main ideas in the text
- the students will be able to specify the supporting ideas of each main idea in the text.

The second one, 'the students will be able to answer any questions about the text', is not a precise objective either. It does not specify the type of questions to be answered. A precise objective would be 'the students will be able to answer three questions that cover the three supporting ideas of the fourth main idea in the text'. The third, 'the students will be able to read the text', is not a precise objective because it does not spell out the specific things which, if done, let us say the students achieved that objective. A precise objective can be 'the students will be able to read the first paragraph with a maximum of three pronunciation mistakes' or 'the students will be able to read the first paragraph at a minimum speed of five sentences a minute'. Can you now reformulate the fourth one, 'the students will be able to give synonyms of vocabulary', to make it a precise objective? Before discussing the processes of writing cognitive, affective and psychomotor objectives in chapters 3, 4 and 5 respectively, it is useful to first highlight some tips about objectives and how vital they are to teaching and learning situations.


6. TIPS TO AVOID PROBLEMS OF WRITING OBJECTIVES

- Avoid writing a broad objective, as it will be confused with an aim.
- Do not write two objectives in one. Select a single aspect of behavior to develop or test.
- Do not write an objective that has no behavior to develop or evaluate. Just make clear the specific thing the students have to do. Words like 'understand' do not involve observable behaviors.
- Make sure you specify the type of objective. If cognitive, specify whether the students have to remember, understand, apply, analyze, evaluate or create (chapter 3).
- Specify the criteria to which an objective must conform, so that it can be measured.

7. IMPORTANCE OF WRITING A CLEAR AIM AND OBJECTIVES

- Aim and objectives determine the content to be taught in class and tested.
- Aim and objectives determine the use of a specific teaching method rather than others in class.
- Aim and objectives determine the activities to be carried out in class.
- Aim and objectives determine the learning outcomes to be assessed.
- Aim and objectives determine the methods of learning assessment.
- Aim and objectives guide the planning and design of instruction and the assessment of the learning that results from this instruction.
- Aim and objectives show the focus and value of the teaching program, lesson or test.
- Aim and objectives focus attention on what is to be achieved and help teaching to be organized.
- Aim and objectives define the roles of teacher and students.
- Aim and objectives could be used to appraise teachers and assess students.
- Aim and objectives spell out exactly the content of a course and exams.


8. CLASSIFICATION OF OBJECTIVES

It is not enough that a teacher decides the lesson aim and translates that aim into a set of objectives that covers its facets. Teachers also need to decide on the kind of objectives they want to achieve, whether cognitive, affective or psychomotor. If you plan to provide students with new information, write cognitive objectives. If you plan to change students' attitude toward something or to increase their motivation to do something, use affective objectives; or use psychomotor objectives to train students to acquire certain skills, like using a microscope or drawing a map. Depending on the lesson aim, it is possible to plan for realizing all these different categories of objectives in one lesson or test. Alternatively, you could focus on developing one category of objectives rather than the others (Shawer et al., 2008). In summary, you have to develop the following macro skills:

1. Write cognitive objectives to develop mind/brain-related outcomes (chapter 3).
2. Write affective objectives to develop heart-related outcomes (chapter 4).
3. Write psychomotor objectives to develop body-related outcomes (chapter 5).


Chapter 3

STAGE 1 (CONTINUED): WRITING COGNITIVE OBJECTIVES


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

- What is meant by cognitive objectives?
- How many levels could cognitive objectives be written at?
- Draw a distinction between cognitive, affective and psychomotor aims.
- What processes does developing student cognition at the remembering level involve?
- Classifying and organizing are processes at which level of cognitive objectives?
- Using existing things in new ways is a process involved at which level of cognitive objectives?
- Judging something in the light of a set of criteria is a process involved at which level of cognitive objectives?
- What are the main criticisms of cognitive objectives?
- Which level of cognitive objectives do the following processes reflect?
  - Storing written and oral information and concepts.
  - Retrieving and recognizing written and oral information and concepts.
- Which level of cognitive objectives do the following processes reflect?
  - Translating and interpreting information and drawing comparisons.
  - Identifying the main characteristics of something.
  - Explaining the reasons behind something.
- Which level of cognitive objectives do the following processes reflect?
  - Breaking down a whole into main and supporting elements.
  - Finding out the relationships between the supporting elements and main elements of a whole.
  - Linking the sub-parts to the main part.
  - Identifying the causes/motives behind a whole.
  - Breaking down the organizational structure of a whole.
- Which level of cognitive objectives do the following processes reflect?
  - Being clear about the rules of doing something.
  - Using the rules of doing something in practice situations.
- Which level of cognitive objectives do the following processes reflect?
  - Making judgments about the validity and quality of ideas through some criteria.
  - Making a judgment through providing a set of objective and rational criteria.
  - Making judgments based on personal evidence.
  - Making judgments based on internal and external evidence.
  - Comparing personal, internal-based and external-based evidence criteria.
  - Making judgments of methods, materials, ideas and opinions against a set of stated criteria.
  - Assessing the strengths and weaknesses of something.
  - Assessing the anticipated impact of something.
  - Taking decisions on the basis of assessing the time and effort to be spent on something against the expected outcomes.
  - Assessing a solution to a problem.
  - Assessing an aesthetic work and stating the judgment criteria.
- Which level of cognitive objectives do the following processes reflect?
  - Producing new and original things.
  - Proposing a plan to solve a problem.
  - Proposing new ways of improving an existing thing.
  - Thinking of ways of remedying the weaknesses of something.
  - Suggesting alternative methods of doing or using something.
  - Viewing something in ways different from those already known.
  - Using something existing in ways different from those in use.
  - Replacing something already existing with a new other.
  - Producing something new and original.
- What is the difference between creativity and synthesis?

AIM(S) OF THE CHAPTER


To develop micro skills of writing cognitive objectives.

OBJECTIVES OF THE CHAPTER

By the end of this chapter, you will be able to write cognitive objectives at the level of:

- Remembering
- Understanding
- Applying
- Analyzing
- Evaluating
- Creating



INTRODUCTION
Unless teaching and assessment are in line with one another, effective learning and teaching are unlikely to occur. This means two things: (a) teachers should help learners develop their thinking at different levels by assisting them to remember, understand, apply, analyze, evaluate and create; (b) teachers must reflect in their exams exactly the thinking levels at which they taught. A mismatch between the two (teaching and assessment) definitely results in teaching, learning and testing difficulties and even failure. For example, a teacher who teaches and assesses at the remembering and understanding levels does one good thing but also makes a grave mistake. The good thing is that she assessed learners at exactly what she taught them (i.e., the thinking level, not specific information). She is like a teacher who, in each of 30 classes, gave a boy a ball to keep until the end of the term; at the end of the term (the exam), the teacher asked the boy to give back the 30 balls. The mistake this teacher made is depriving her learners of developing their minds to the fullest: her students cannot be expected to apply, analyze, evaluate or create knowledge. In contrast, a teacher who teaches at the remembering level and assesses at the creating level makes two grave mistakes. The first is that she, too, deprived her learners of developing other levels of thinking. The second is that she taught them at a low thinking level (remembering) and assessed them at a higher level (creating) at which they had not been taught. This teacher failed to match her teaching to her assessment. She is like a trainer who taught a person to drive a car but, at the time of testing driving ability, asked the trainee to demonstrate swimming ability. Of course, it is better to train the person to do both. This chapter therefore seeks to help readers develop the micro skills of writing cognitive objectives at these levels: remember (knowledge), understand (comprehension), apply (application), analyze (analysis), evaluate (evaluation) and create (synthesis). It also seeks to help readers reflect these different levels of thinking in their test items. The chapter proceeds in this order:
1. Levels and Dimensions of Cognitive Objectives
1.1. Factual, Conceptual and Procedural Knowledge
1.2. Meta-Cognitive Knowledge (Meaning and Importance)
1.2.1. Components of Meta-Cognition
1.2.2. Meta-Cognitive Strategies

2. Lower-Order Thinking Level of Objectives
2.1. Write Objectives at the Knowledge/Remembering Level
3. Medium-Order Thinking Level of Objectives
3.1. Write Objectives at the Comprehension Level
3.2. Write Objectives at the Application Level
4. Higher-Order Thinking Level of Objectives
4.1. Write Objectives at the Analysis Level
4.2. Write Objectives at the Evaluation Level
4.3. Write Objectives at the Synthesis (Creating) Level
5. The Cognitive Objectives Debate

1. LEVELS AND DIMENSIONS OF COGNITIVE OBJECTIVES
The behaviorist psychologist Bloom (1956) provided a taxonomy suggesting that student cognition develops at six hierarchical levels: knowledge, comprehension, application, analysis, synthesis, and evaluation. Anderson et al. (2001) revised the original taxonomy by using verbs instead of nouns (e.g., 'remember' instead of 'knowledge'). Anderson et al. also changed the order of synthesis and evaluation, placing 'synthesis' (create) at the top of the hierarchy instead of 'evaluation'. This new classification, which has become known as Anderson et al.'s taxonomy, involves 'remember', 'understand', 'apply', 'analyze', 'evaluate' and 'create'. Moreover, a fourth dimension (meta-cognitive) has been added to Bloom's three dimensions of knowledge: 'factual', 'conceptual' and 'procedural'.

1.1. Factual, Conceptual and Procedural Knowledge
Factual knowledge (also 'what' knowledge) is basic to any discipline because it includes the facts, concepts, and terms that all learners need to acquire or know as a foundation for subsequent learning. Conceptual knowledge refers to classified knowledge that involves the principles, generalizations, theories, and models pertinent to a particular domain. Procedural knowledge (also 'how' knowledge) concerns the kind of knowledge learners can use to perform or do something, in addition to the skills or methods of inquiry. Procedural knowledge also concerns how things happen and relate (Shawer, 2001).

1.2. Meta-Cognitive Knowledge (Meaning and Importance)
Meta-cognitive knowledge is the "knowledge or beliefs about what factors or variables act and interact in what ways to affect the course and outcome of cognitive enterprises" (Flavell, 1979, p. 907). Such knowledge is the "cognition that reflects on, monitors, or regulates first-order cognition" (Kuhn, 2000, p. 178) and can be "reflected in either effective use or overt description of the knowledge in question" (Brown, 1987, p. 65). Meta-cognitive knowledge can therefore play a significant role in determining the purposes, route and content of learning (Wenden, 1998). It can, for example, lead learners "to select, evaluate, revise, and abandon cognitive tasks, goals, and strategies in light of their relationships with one another and with… [their] own abilities and interests with respect to that enterprise" (Flavell, 1979, p. 908). Through meta-cognition, learners discover their learning styles to maximize learning (Flavell, 1985; Wenden, 1986). Thus, meta-cognitive knowledge influences and facilitates effective strategy use and instruction, enabling learners to "think about and deal with cognitively complex tasks… pay attention, represent, remember, or transform information" (Moely, Santulli & Obach, 1995, p. 301). Such "properly organized learning results in mental development and sets in motion a variety of developmental processes that would be impossible apart from learning" (Vygotsky, 1978, p. 90). Meta-cognitive strategies enable students to "explain how and why cognitive development both occurs and fails to occur" (Kuhn, 2000, p. 179). Meta-cognition determines strategy type via multi-directional processes between performance and meta-level thinking, until the most effective strategies are matched with the task; it is task difficulty, however, that calls for superior strategies.

1.2.1. Components of Meta-Cognition
Meta-cognition comprises person, task, and strategy knowledge (Flavell, 1979). Person knowledge is the "general knowledge learners have acquired about human factors that facilitate or inhibit learning" (Wenden, 1998, p. 518), which concerns intra- and inter-individual differences. Intra-individual differences are learners' beliefs about their ability to achieve a task, while inter-individual differences concern learners' awareness of the differences between their own and others' abilities (Flavell, 1979). Task knowledge concerns task analysis: thinking about task purpose, classification and demands. Learners think about task purpose to determine a task's compatibility with their needs and goals, while through task classification they identify differences between current and previous tasks. Task demands require learners to determine the task requirements. If learners possess enough abilities to meet the task requirements, they become confident of completing it; if not, they either give it up or persevere to acquire new skills to complete it. Task knowledge (part of task demands) is the information learners already have about their subject (Wenden, 1998). Task knowledge can be "abundant or meager, familiar or unfamiliar, redundant or densely packed, well or poorly organized, delivered in this manner or at that pace, interesting or dull" (Flavell, 1979, p. 907). Individuals' knowledge about task demands, in terms of task ease or difficulty, influences their motivation to meet the task requirements. Strategic knowledge is the "general knowledge about what strategies, why they are useful, and specific knowledge about when and how to use them" (Wenden, 1998, p. 519).

1.2.2. Meta-Cognitive Strategies
Though overlap exists between cognitive and meta-cognitive strategies, they can be differentiated through their use. For example, if self-questioning is used to check mastery of learning, it is a meta-cognitive strategy; if used for study, self-questioning is a cognitive strategy (Lemcool, 2007). Meta-cognitive strategies are "general skills through which learners manage, direct, regulate and guide their learning, i.e. planning, monitoring and evaluating" (Wenden, 1998, p. 519). Learners use them to overview, pay attention, set goals, plan, organize and self-monitor learning (Hedge, 2000). Learners also employ meta-cognitive strategies to assess achievement, determine task demands, use appropriate cognitive strategies and regulate learning in line with outcomes. Corrective action forms part of meta-cognition and self-regulated learning (SRL), through modifying task procedures as appropriate to achieve the goals, and through making use of human resources (e.g., capable peers and teachers) and information resources to sort out unexpected problems (Lemcool, 2007). Empirical research has shown positive correlations between learners' meta-cognitive knowledge and regulation of learning and academic achievement (Goh, 1997; Horwitz, 1987; Mori, 1999; Murray, 2007; Thornbury, 1996; Wenden, 1998).

Since meta-cognitive knowledge deals primarily with the planning, processing and monitoring of cognitive tasks, synthesis (create) rather than evaluation is the highest-order level of thinking, because synthesis involves the processes involved in evaluation and the other lower-level objectives. Therefore, this book presents the cognitive hierarchy of objectives according to Anderson et al.'s (2001) classification: 'remember' (knowledge), 'understand' (comprehension), 'apply' (application), 'analyze' (analysis), 'evaluate' (evaluation), and 'create' (synthesis). The hierarchy moves from the simple to the complex, from the concrete to the abstract, and from the specific to the general.
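Since the rest of this chapter pairs each level with sample action verbs, it can help to see the hierarchy written out as an ordered structure. The following minimal sketch (in Python; our illustration, not part of the original text) encodes the six levels with a few of the verbs listed in the skill sections below; the helper guess_level and its keyword-matching rule are hypothetical conveniences, not a substitute for the human judgment the chapter calls for.

```python
# Anderson et al.'s (2001) revision of Bloom's taxonomy, ordered from the
# lowest level (remember) to the highest (create). Each level carries a few
# of the sample action verbs given in the skill sections of this chapter.
TAXONOMY = [
    ("remember",   ["state", "define", "list", "label", "identify", "match"]),
    ("understand", ["compare", "classify", "explain", "paraphrase", "infer"]),
    ("apply",      ["apply", "implement", "solve", "compute", "calculate"]),
    ("analyze",    ["analyze", "diagnose", "examine", "break down"]),
    ("evaluate",   ["evaluate", "judge", "critique", "appraise", "assess"]),
    ("create",     ["create", "design", "compose", "propose", "invent"]),
]

def guess_level(objective):
    """Return the first taxonomy level whose verbs appear in the objective.

    A crude keyword heuristic for screening draft objectives; real
    classification needs human judgment, as the chapter stresses.
    """
    text = objective.lower()
    for level, verbs in TAXONOMY:
        if any(verb in text for verb in verbs):
            return level
    return "unclassified"

print(guess_level("Critique this article."))            # evaluate
print(guess_level("List the five personal pronouns."))  # remember
```

Run top-down, the first match wins, which loosely mirrors the book's advice to pitch each item at one target level even though lower-level processes are implicitly involved.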

2. LOWER-ORDER THINKING LEVEL OF OBJECTIVES

2.1. Skill 1: Writing Objectives at the Knowledge/Remembering Level
In many cases a teacher needs simply to provide students with bare facts and concepts that students need to remember (keep in memory). This usually suits pre-intermediate students, who need to acquire the bare facts and information required to perform subsequent higher-order tasks. On other occasions, teachers need to supply prior concepts and facts to advanced students so that a specific lesson can be taught. In these cases, teachers need to write objectives at the knowledge/remembering level to provide the basic facts, information and concepts upon which subsequent learning depends. A student cannot achieve a comparison or an analysis of something without first having this basic (factual) knowledge. Similarly, assessors or examiners need to write cognitive objectives at the remembering level to check the extent to which students have acquired the facts and information delivered to them. Whether teaching or testing students, teachers and test developers can develop their skills at this level by:
1. Asking learners to recall or retrieve information in a written or oral form. Use action verbs similar to these: state, define, label, list, mention, tell and spell. You may also ask questions starting with what, who, when, where and how many. Examples of these objectives include:
State/mention/list the five personal subject pronouns.
What are the personal object pronouns?
How many demonstrative pronouns are there in English?
Who is the current United Nations Secretary General?
Write/label/name a person who manages a school.
Define democracy, statement sentence, and imperative sentence.
Spell these words: 'rudimentary', 'category', 'creative' and 'clerk'.
Pronounce these words: 'rudimentary', 'category', 'creative' and 'clerk'.
Pronounce the second letter in the English alphabet.
2. Asking students to recognize in a written/visual or oral form. Use action verbs similar to these: identify, recognize, locate, underline and match. Examples of these objectives include:
Identify the subject pronouns from 'me', 'myself', 'we', and 'I' (written).
Locate adjectives in this sentence: 'a wise person said it' (written).
Recognize the word 'argumentative' on the tape (oral).
Recognize the correct pronunciation of the letter 'F' in my speech (oral).
Underline the subject in 'the lion hunted the man' (written).
Match the capital and small letters in the columns (written).
All the above examples of objectives require only very simple, lower-order thinking: the students need solely to mention, match, underline or recognize, in written or oral form.

3. MEDIUM-ORDER THINKING LEVEL OF OBJECTIVES
3.1. Skill 2: Writing Objectives at the Comprehension/'Understand' Level
Teachers certainly aim to develop (and examine) student cognition further so that students can understand the knowledge they have acquired or stored. This entails objectives at the comprehension/'understand' level. You could use three techniques: translation, interpretation and extrapolation. When students transform/translate something from one state to another, they show evidence of understanding. When they interpret some basic information by showing its meaning, they display further evidence of understanding. When students put together different pieces of information to make an idea or proposition, they likewise show understanding. Students demonstrate understanding when they are able to grasp meaning and organize and arrange material mentally. You can develop your skill at this level by:
1. Asking students to draw comparisons and relationships. Use action verbs similar to these: compare, show the similarities, contrast, and show the differences. Examples of these objectives include:
Compare statement and imperative sentences.
Show the differences between statement and imperative sentences.
Show the similarities between statement and interrogative sentences.
Contrast these simple present and simple past sentences.
2. Asking students to organize and classify. Use action verbs similar to these: organize, classify, and put in order. Examples of these objectives include:
Put these events in chronological order.
Organize these sentences from simple to complex.
Classify these sentences into statement, interrogative and imperative categories.
3. Asking students to identify main characteristics. Use action verbs similar to these: describe, summarize, and trace. Examples of these objectives include:
Describe the events in this picture.
Identify the central, main and supporting ideas in the text.
Trace all words with the same sound in the text.
4. Asking students to translate, interpret and extrapolate. Use action verbs similar to these: explain, translate, interpret, restate, exemplify, infer, justify, convert and paraphrase. Examples of these objectives include:
Explain why the verb in the sentence ends in 's'.
Translate/convert this sentence into a formula.
Interpret the meaning these two sentences intend to convey.
Restate/paraphrase this sentence.
Exemplify/give an example of an interrogative sentence.
Infer the principle upon which these two sentences rely.

3.2. Skill 3: Writing Objectives at the Application/'Apply' Level

Storing, retrieving and understanding knowledge have little value if students cannot apply or generalize their abstract learning to concrete situations. This means teachers have to develop their skills of writing objectives at the application/'apply' level. Do you think that getting and understanding information is enough? If students cannot make use of what they learn, then their learning is useless. Imagine you have learnt at school how to make a kitchen table: you should then be able to make a table at home when you have the same equipment. Again, imagine students learnt a mathematical formula that was applied to a particular example. Have the students who learnt it actually learnt it if they cannot apply it to new examples? Teachers therefore have to train students to use or implement the information they have learnt to answer new questions or solve new problems in new situations. Teachers and test developers could develop the skill of writing objectives at the application (apply) level through:
1. Making clear the rules that govern a situation or product to be created (by teachers at the time of teaching).
2. Giving students situations in which to experiment with the rules on new examples (by teachers at the time of teaching).
3. Asking learners to apply (use) their understanding in new, similar situations (by examiners at the time of testing).
Use action verbs similar to these: implement, apply, compute, calculate, determine, solve, use, and find out:
According to the simple sentence definition, which of these sentences is simple?
According to our definition of democracy, which of these countries is democratic?
Turn the following sentences into compound sentences.
Add these numbers.
Determine which of the following people have the task-oriented leadership style, based on our definition of task-oriented leadership.
Find out the passive interrogatives of these sentences.
Use the sentence formation rule to form a sentence out of these words.
Solve this problem: 2 x 15 ÷ 2 =.
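As a side note on that last item (our addition, not the author's): since multiplication and division have equal precedence and are read left to right, the expression evaluates to 15, as a quick check confirms.

```python
# Multiplication and division share precedence and associate left to right,
# so 2 x 15 / 2 is read as (2 * 15) / 2 = 30 / 2.
print((2 * 15) / 2)  # 15.0
```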

4. HIGHER-ORDER THINKING (PROBLEM-SOLVING) LEVEL OF OBJECTIVES

4.1. Skill 4: Writing Objectives at the Analysis/'Analyze' Level
A higher-order level of thinking suggests that students need to know how to analyze learning tasks into their small component parts by recognizing the underpinning elements, relationships, and principles. They need to break down tasks and information into parts to identify the motives, causes, elements, relationships and organizational principles underpinning a whole. This places pressure on you to develop the skill of writing and measuring objectives at the analysis/'analyze' level. For example, when a teacher gives students an essay to read, they should be able to analyze it: they should identify the central idea, the main ideas, and the supporting ideas; they should know whether the tone of the essay is positive or negative; moreover, they should identify the tense and type of sentences used, and so on. Imagine they read and cannot perform these tasks! Teachers and test developers could develop the skill of writing objectives at the analysis level by asking students to:
1. Break down a whole (or a learning task) into its main parts.
2. Break down each main part into its constituent sub-parts.
3. Find out the relationships between the main parts and the whole.
4. Link sub-parts to the main part.
5. Identify the causes/motives behind the whole.
6. Break down the organizational structure of the whole.
Use action verbs similar to these: identify the motives, identify the causes, work out the main elements, work out the minor elements, analyze, diagnose, examine, and determine the factors.
Analyze the text into its main and smaller component parts.
Work out the tense, tone and style of the text.
Work out the motives behind the murder.
Determine the main suspects linked to the murder scene.
Diagnose the structural errors in the text.

4.2. Skill 5: Writing Objectives at the Evaluation/'Evaluate' Level

Students also need to develop evaluation abilities as prerequisites for creation or creativity (synthesis) tasks and thinking. They need to know how to use internal and external evidence and criteria to assess something, and to assess which processes, elements, techniques and methods are suitable for the creation of a new and creative whole. Teachers and examiners need to develop students' judgmental abilities. This relates directly to critical thinking skills and the ability to make objective judgments: the ability to use a set of criteria according to which students can reach sound decisions, take action, and invest effort and time in something. Imagine you have a great deal of money and are thinking of developing a way of keeping planes flying when they have severe technical problems. If you have not assessed the venture in detail and calculated the time, money and effort to be spent, you might end up with a disaster, or at least a huge loss of resources. To help students achieve this higher-order level of thinking, you need to develop the skill of writing objectives at the evaluation/'evaluate' level. You could develop the skill of writing objectives at this level by:
1. Asking students to make judgments about the validity and quality of ideas in the light of a set of criteria.
2. Asking students to provide a set of objective and rational criteria through which to make their judgments.
3. Asking students to make judgments based on personal evidence.
4. Asking students to make judgments based on internal evidence.
5. Asking students to make judgments based on external evidence.
6. Asking students to compare their personal, internal-based and external-based evidence criteria in order to revise their initial judgments.
7. Asking students to make judgments of methods, material, ideas and opinions against a set of stated criteria.
8. Asking students to assess the strengths and weaknesses of something.
9. Asking students to assess the anticipated impact of something.
10. Asking students to take decisions on the basis of assessing the time and effort to be spent on something against the expected outcomes.
11. Asking students to assess a solution to a problem.
12. Asking students to assess an aesthetic work and to state the judgment criteria.
Use action verbs similar to these: appraise, defend, prioritize, value, evaluate, rank, rate, judge, decide, review, critique, and assess.
Appraise a teacher's method of teaching in the light of these criteria.
Assess your own weaknesses and strengths.
Evaluate this essay.
Defend the actions taken by the hero of this short story.
Critique/review this article.

4.3. Skill 6: Writing Objectives at the Synthesis/'Create' Level
We now come to the skill of teaching students how to synthesize separate parts into a new whole. The sum of all objectives is synthesis (create), where students learn to compile different kinds of information into a whole and to propose new and original ways of using things. This means they can produce a new plan and propose a new set of operations. Moreover, they learn how to develop a model or a set of abstract relations that would improve the way something is done. Students can learn how to generate new things and to view existing things in different ways. Suppose law students were given a murder case and asked to figure out its mystery in order to achieve justice. If provided with some motives, suspects, and witnesses, could they put these pieces of information together to catch the killer? You cannot help them achieve this higher order of thinking unless you are able to write objectives at the synthesis/'create' level. You could develop the skill of writing objectives at this level through:
1. Asking students to produce new and original things.
2. Asking students to propose a plan to solve a problem.
3. Asking students to propose new ways of improving an existing thing.
4. Asking students to think of ways of remedying the weaknesses of something.
5. Asking students to suggest alternative methods of doing or using something.
6. Asking students to view something in ways different from those already known.
7. Asking students to use something existing in ways different from those in use.
8. Asking students to replace something existing with something new.
9. Asking students to produce something new and original.

Use two categories of action verbs to guide you in writing objectives at the creating level. The first category involves verbs referring to the production of something new and original (non-existing), like produce, generate, develop, propose, create, make, devise, design, innovate, originate, predict, plan, compose, formulate, construct, and invent. The second category refers to manipulating existing things in new ways; these verbs include improve, change, combine, synthesize, and use in different ways.
Develop a plan for this new project.
Compose/write an essay about alternative energy sources.
Improve this essay.
Compile these paragraphs into an essay, making the necessary adjustments.
Improve this essay's writing style and tone so that it becomes acceptable to opponents.
Write a conclusion to this essay.
Predict a conclusion to this story.

5. THE COGNITIVE OBJECTIVES DEBATE
The debate about cognitive objectives concerns two main criticisms: (a) the complex relationship between the six cognitive objectives, and (b) the order of the cognitive objectives. The complexity of the relationship, which Bloom himself acknowledged, involves the difficulty of separating the domain of each level from the others. In other words, when we perform tasks at the application (apply) or analysis (analyze) level, does this mean that processes of comprehension (understand) are not involved? No one can claim that the processes involved at the different levels are mutually exclusive. This makes it difficult to write objectives at one level without involving the others. The same problem also applies to the relationship between cognitive, affective and psychomotor objectives: we cannot be sure that the processes involved in cognitive objectives exclude affective or psychomotor processes. It should be noted, however, that although writing objectives at one level involves processes of some other objectives, the main focus on the target level is sufficient to distinguish it from the others. The second criticism is aimed at the order of the objectives in the hierarchy, and concerns the order of synthesis (create) and evaluation (evaluate) in particular: creating is believed to be the highest-order level of thinking because it involves evaluation in addition to the other levels. Taking creating as the highest level was introduced by Anderson et al. (2001) and, as pointed out earlier, is the order embraced in this book. Having developed the skills of writing objectives relating to mind/brain development (cognition), let us help the reader, in the following chapter, to develop the skills of writing objectives that relate to the heart (motivation, attitudes, interests, feelings, likes, and dislikes).

Chapter 4

STAGE 1 (CONTINUED): WRITING AFFECTIVE OBJECTIVES

PRE-READING REFLECTIONS
Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter; just write down your initial thoughts on a sheet of paper. There is no right or wrong answer at this stage:
What are affective objectives?
Which part of the body is most concerned with affective objectives?
Which of these processes are involved in achieving objectives at the responding level?
Providing content that meets students' needs;
Providing substantial rather than trivial content;
Providing content consonant with students' interests;
Providing methods and activities that engage students in learning;
Providing tasks of appropriate difficulty levels (neither too easy nor too difficult);
Using positive reinforcement of student learning.
Which level of objectives do the following processes reflect?
Making students appreciate a way of doing things rather than others.
Showing students' commitment to a desirable value (democracy, justice, etc.).
Making students replace a negative belief or attitude with desirable ones.
Making students think about the worth of the learning provided.
Which level of objectives do the following processes reflect?
Showing the importance of the topic to students;
Showing how relevant the topic is to students;
Linking task achievement to positive or negative reinforcement;
Asking students to make preparations to teach specific lesson elements;
Asking students about ways of making the lesson interesting to them;
Asking students questions about the lesson before teaching it;
Using various advance organizers (e.g., students tell a personal experience);
Making group competitions and praising winners.
Which level of objectives do the following processes reflect?
Giving priority to doing some things rather than others.
Putting learning tasks in a hierarchy according to worth.
Justifying why some learning tasks are placed higher than others.
Justifying why some learning tasks are preferred to others.
Classifying values together and resolving conflicts among them.
Stating an internal value system.
Developing a philosophy of life out of constructed values.
Balancing rights values (e.g., freedom) with responsibility values.
Which level of objectives do the following processes reflect?
Asking students to articulate their value system.
Asking students to justify the value system they have adopted.
Asking them to present their life philosophy and learning philosophy.
What are compulsory, voluntary and enjoyable responses?

AIM(S) OF THE CHAPTER
To develop micro skills of writing affective objectives.

OBJECTIVES OF THE CHAPTER
By the end of this chapter, you will be able to write affective objectives at the level of:
Receiving
Responding
Valuing
Organization
Characterizing of a value complex

INTRODUCTION
This chapter focuses on developing the skills of writing affective objectives. Krathwohl, Bloom, and Masia (1964) suggested a parallel taxonomy for writing objectives in the affective domain. The affective taxonomy involves five levels: receiving, responding, valuing, organization and characterizing of a value complex. Teachers and examiners need to think seriously about developing student motivation, attitudes, and interests, because cognitive objectives are hard to achieve unless students are emotionally and psychologically ready. Like Bloom's hierarchy, this taxonomy moves from the simple to the complex, from the concrete to the abstract, and from the specific to the general. This chapter aims to help you acquire the following skills:
1. Skill 1: Write Objectives at the Receiving Level
2. Skill 2: Write Objectives at the Responding Level
3. Skill 3: Write Objectives at the Valuing Level
4. Skill 4: Write Objectives at the Organization Level
5. Skill 5: Write Objectives at the Characterizing of a Value Complex Level
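By analogy with the cognitive sketch in the previous chapter, the five affective levels can be encoded to check that a lesson plan touches each one. The snippet below is our illustration only: the level names come from the taxonomy above, while the lesson objectives and their tags are hypothetical.

```python
# Krathwohl, Bloom, and Masia's (1964) affective levels, lowest to highest.
AFFECTIVE_LEVELS = [
    "receiving", "responding", "valuing",
    "organization", "characterizing of a value complex",
]

# A hypothetical lesson plan, tagging each draft objective with its level.
lesson_objectives = {
    "The students will listen attentively to the story.":   "receiving",
    "The students will contribute to the discussion.":      "responding",
    "The students will appreciate the value of tolerance.": "valuing",
}

# Report which affective levels the plan has not yet addressed.
covered = set(lesson_objectives.values())
missing = [level for level in AFFECTIVE_LEVELS if level not in covered]
print("Levels still to plan for:", "; ".join(missing))
# Levels still to plan for: organization; characterizing of a value complex
```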

1. SKILL 1: WRITE OBJECTIVES AT THE RECEIVING LEVEL
Students' thinking needs to be stimulated so that they are willing to receive and pay attention to learning tasks. They should be prepared to consciously and willingly tackle learning tasks and problems through well-written objectives at the receiving level. Imagine the students are not willing to listen to a lesson; do you think they will learn? Receiving, or willingness to learn, is as essential as the learning tasks and activities themselves. Moreover, willingness to learn is the door to learning: if the door is closed, how could you enter a place? If the mind is closed, how could you pass learning experiences into students' minds? Having opened the door (the mind), the students are ready to receive the target pedagogical input. You could develop the skill of writing objectives at this level through:
1. Showing the importance of the topic to students;
2. Showing how relevant the topic is to students;
3. Linking task achievement to positive or negative reinforcement;
4. Asking students to make preparations to teach specific lesson elements;
5. Asking students about ways of making the lesson interesting to them;
6. Asking students questions about the lesson before teaching it;
7. Using various advance organizers (e.g., students tell a personal experience);
8. Making group competitions and praising winners.

Use affective verbs similar to these: feel, sense, pursue, attend, perceive, look forward to, get anxious to, capture interest, listen attentively, watch attentively, and motivate. Here are some examples:
The students will look forward to knowing what happened to Julius Caesar.
The students will feel happy to play Macbeth's role.
The students will perceive the injustice that Hamlet went through.
The students will be motivated to write a conclusion to the essay mentioning who the murderer was.
The students will attend to the personal experiences told by classmates.
The students will pursue reading to know the end of the short story.
The students will have happy experiences through interviewing people in English, French, etc.
The students will listen attentively to the story played on the tape.
The students will watch the video-clip story attentively.

2. SKILL 2: WRITE OBJECTIVES AT THE RESPONDING LEVEL
Students also need to be prepared to actively participate in learning tasks with willingness and commitment, through objectives written at the responding level. There are three kinds of response:

(a) Compulsory response
(b) Voluntary response
(c) Enjoyable response
You could make students learn by force, by punishing, rewarding or failing them, but this is more like rote transmission of information than learning. Students will be voluntarily willing to learn when teaching and learning activities are interesting. Learning, as Bruner (1978) pointed out, must be relevant to learners, neither too difficult nor too easy. This requires substantial rather than trivial content. To make students respond with enjoyment, learning must reflect their needs and interests, and it should be presented to them in the most appropriate ways. You could develop the skill of writing objectives at this level through:
1. Providing content that meets students' needs;
2. Providing substantial rather than trivial content;
3. Providing content consonant with students' interests;
4. Providing methods and activities that engage students in learning;
5. Providing pedagogical tasks of appropriate difficulty levels (neither too easy nor too difficult);
6. Using positive reinforcement of student learning;
7. Using affective and action verbs of two categories. The first category concerns the observation of student satisfaction, like enjoy, obey rules, conform, comply, and show interest.

The second category relates to student action, like contribute, complete homework, answer, discuss, participate, read, and write. Here are some examples:
The students will enjoy writing the concluding paragraph.
The students will show interest in the tasks.
The students will comply with classroom protocols.
The students will contribute to the discussion with interest.
The students will answer with interest.
The students will complete tasks with enjoyment.
The students will engage in discussing learning tasks.
Before moving on to learn how to write the third affective objective, it is worth showing the close relationship between receiving and responding, and how they relate to achieving the next objective (valuing). Imagine a friend watched a film and described it to you as very interesting and exciting, and as matching the type of films you prefer. This means your friend made you eager to see that film, to the extent that you went to the cinema (receiving). It also means your friend managed to make you willing to watch it. You started to watch the film, but you found it boring, silly and a total waste of time. Do you think you will continue watching? If you were forced to keep watching, would you watch with interest? Of course not (responding)! We therefore conclude that succeeding in making your students willing to learn (receiving) does not guarantee that you will get them to participate (responding), or even keep them interested in learning (Shawer, 2003; 2006; Shawer et al., 2008).

3. SKILL 3: WRITE OBJECTIVES AT THE VALUING LEVEL
Students also need to affectively assess tasks by achieving objectives at the valuing level. Assessment of what students receive and respond to involves higher-order thinking. When students think about their learning and attach a value to learning tasks, they show awareness of the logic behind what they learn; this means they draw distinctions between the useful and useless aspects of learning. If students do not assess their learning, then their learning is mechanical and of no practical benefit. You could develop the skill of writing objectives at this level through providing learning experiences that:

1. Make students appreciate a way of doing things rather than others.
2. Develop students' commitment to a certain desirable value (democracy, justice, cleanliness, rationing, objectivity, etc.).
3. Make students replace a negative belief or attitude with desirable ones.
4. Make students think of the worth of the learning provided.

Use verbs similar to these: respect, appreciate, persuade, change attitude, change belief, embrace, endorse classroom learning, show commitment toward, believe in, and show concern for. Here are some examples:
The students will show commitment toward rationing the use of water.
The students will demonstrate commitment toward democratic values.
The students will appreciate the values of cleanliness, justice and tolerance.
The students will believe in hard work.
The students will respect scientific discovery.
The students will respect cultural differences.
The students will appreciate differentiation of classroom content and activities.
The students will be persuaded to invest effort in the welfare of the community.
The students will show concern for poor people.
The students will embrace the value of discipline in their lives.

4. SKILL 4: WRITE OBJECTIVES AT THE ORGANIZATION LEVEL
You further need to develop the skill of writing objectives at the organization level. You need to write objectives that help students organize learning tasks, specifying which tasks are of higher priority to learn and which come second. They need to know how to accommodate different values and information into their schema, in meaningful categories and order, so that these are ready for use where necessary. For example, if a student has learnt to appreciate the values of democracy, tolerance, and hard work, he/she has to think about which one is of highest priority in their context. If democracy is already in action, then hard work and tolerance should receive more attention. As a result, the learner's attention and awareness are translated into action, making learning meaningful and of practical use. You could develop the skill of writing objectives at this level through asking students to:
1. Give priority to some things over others.
2. Put learning tasks in a hierarchy according to worth.
3. Justify why some learning tasks are placed higher than others.
4. Justify why some learning tasks are preferred to others.
5. Classify values together and resolve conflicts among them.
6. State an internal value system.
7. Develop a philosophy of life out of their constructed values.
8. Balance rights values (e.g., freedom) with responsibility values (e.g., respect for others).

Use verbs similar to these: organize, construct, prioritize, give preference, arrange, classify and put in order. Here are some examples:
Put these learning tasks in priority order.
Organize these sources of material according to relevance.
Arrange the tasks according to complexity and relevance.
Why did you put these tasks in that order?
What are the values that you believe in?
Which values are more important to you?
Which parts of the lesson are successful and which parts are not? Why?

5. SKILL 5: WRITE OBJECTIVES AT THE CHARACTERIZING OF A VALUE COMPLEX LEVEL
When students value or assess learning tasks and organize or set priorities among them, they are very close to achieving objectives at the characterizing of a value complex level. This means they consolidate the desirable value system they have developed so that this system is reflected in their behavior and becomes a feature of their personality. In other words, a student's behavior becomes controlled and directed by a particular set of values. Students internalize a value system that represents their philosophy of life, and their behavior becomes consistent and predictable (Shawer, 2003). You could develop the skill of writing objectives at this level through:
1. Continually asking students to articulate their value system.
2. Continually asking students to justify the value system they have adopted.
3. Asking them to present their life philosophy and learning philosophy.

Because the characterizing level is the outcome of the previous levels, writing objectives here builds on those written at the valuing and organization levels. Use verbs similar to these: give your views, work in (a group) and defend. Here are some examples:
Give your views about multicultural integration.
Work in a group to prepare this report.
What is your philosophy of learning?
How would you approach a difficult task?
What is your conceptualization of good health habits?
How much do you practice sports? What kinds of sports do you practice?
What kinds of groups do you socialize with? Would you socialize with people from different cultures?
So far, you have developed the skills of writing both cognitive objectives (mind development) and affective objectives (heart development); why not develop motor (body) skills as well? The following chapter shows how you could achieve that.

Chapter 5

STAGE 1 (CONTINUED): WRITING PSYCHOMOTOR OBJECTIVES

PRE-READING REFLECTIONS
Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter; just write down your initial thoughts on a sheet of paper. There is no right or wrong answer at this stage:
What are psychomotor objectives?
Which part of the body is most concerned with these objectives?
Which level of objectives do the following processes reflect?
Asking students to read about the target skill.
Asking students to listen to a description of the target skill.
Observing or watching someone perform the target skill.
Paying attention to the rudiments of the skill.
Paying attention to the sequence of operations involved in performing the target skill.
Attending to the relationships between the operations involved in performing the skill.
Paying attention to the finished product or outcome of the target skill.
Asking learners to perform the skill by following directions.
Asking learners to take enough time in performing the target skill.

Asking learners not to worry about a high level of coordination in performing the target skill.
Which level of objectives do the following processes reflect?
Asking students to follow written directions.
Asking students to follow oral directions.
Involving no observation.
Attempting a specific activity over and over (practice).
Which level of objectives do the following processes reflect?
Asking students to perform a skill as accurately as possible.
Performing the acts in sequence.
Focusing on coordination of the acts performed.
Exercising control over their acts.
Which level of objectives do the following processes reflect?
Adapting acts to the situation.
Simultaneously combining more than one skill.
Performing different acts in coordination.
Performing different acts in sequence.
Performing acts in time.
Performing acts at a certain speed level.
Which level of objectives do the following processes reflect?
Asking students to complete one or more skills with total ease.
Asking students to perform one or more skills with limited mental effort.
Asking students to perform one or more skills with limited physical effort.
Which processes are involved in familiarization with a skill?
Which processes are involved in observation of a skill?
Which processes are involved in performance of a skill?
What are the main differences between imitation, manipulation and precision?

AIM(S) OF THE CHAPTER
To develop micro skills of writing psychomotor objectives.

OBJECTIVES OF THE CHAPTER
By the end of this chapter, you will be able to write psychomotor objectives at the level of:
Imitation
Manipulation
Precision
Articulation
Naturalization/habit formation

INTRODUCTION
This chapter focuses on developing the skills of writing psychomotor objectives, which chiefly concern the learner's ability to physically manipulate materials and instruments. Psychomotor objectives focus on the change and development of behavior, emphasizing the basic motor skills learners can perform and the coordination and precision with which the work could be done. There are many classifications of objectives in the psychomotor domain. Dave's (1970) classification is adopted here as it lends itself directly to application in education. A sequential and progressive process is needed in order for motor skills to be developed over time. Planning for these objectives to be realized in our teaching is necessary in many fields. In languages and social studies, learners need to learn how to write, use computers and draw maps. In science, students need to learn how to use lab equipment, like the microscope. Psychomotor objectives are also needed in medicine, engineering, physical education, and other fields. This chapter discusses Dave's classification as follows:
1. Skill 1: Writing Objectives at the Imitation Level
2. Skill 2: Writing Objectives at the Manipulation Level
3. Skill 3: Writing Objectives at the Precision Level
4. Skill 4: Writing Objectives at the Articulation Level
5. Skill 5: Writing Objectives at the Habit-Formation/Naturalization Level
6. The Curriculum Test Aim (Model/Reference Test)
7. The Curriculum Test Objectives (Model/Reference Test)

1. SKILL 1: WRITING OBJECTIVES AT THE IMITATION LEVEL
If you ponder the cognitive objectives, you soon find that psychomotor objectives follow a similar structure: both move from simple to more complex operations. When you plan to train students to acquire a specific skill, you need to begin with imitation (the basic level), through copying or emulating a physical behavior. For example, you may wish to teach someone how to drive a car. The first thing is to give the trainee a description and explanation of the component parts involved in the driving process, for example, what each element (the clutch, gear, accelerator and the like) is used for. Then you need to show the learner how to sit in the car, put the gear in neutral, turn the engine on, et cetera. The learner first looks at and listens to a trainer with the purpose of imitating the observed behaviors. You could develop the skill of writing objectives at this level through three main processes:
Theoretical familiarization
Observation
Performance

1.1. Theoretical Familiarization
This involves familiarizing the novice with some information about the target skill before starting to perform it. It involves two processes:
Asking students to read about the target skill.
Asking students to listen to a description of the target skill.

1.2. Observation
The novice comes into contact with the target skill by matching the theoretical information with an actual performance of the skill before starting to perform it himself/herself. The performance at this stage could be live or on video. This involves asking learners to:

Observe or watch someone perform the target skill;
Pay attention to the rudiments of the skill;
Pay attention to the sequence of operations involved in performing the target skill;
Attend to the relationships between the operations involved in performing the skill;
Pay attention to the finished product or outcome of the target skill.

1.3. Performance
For the first time, the novice performs the target skill in action under close supervision. This involves the following processes:

Asking learners to perform the skill by following directions.
Asking learners to take enough time in performing the target skill.
Asking learners not to worry about high levels of coordination in performing the skill.
Use verbs that involve theoretical familiarization, observation and performance of a skill, similar to these: read, listen, look, attempt, align, copy, imitate, place, balance, duplicate, repeat, follow, rest on, step here, mimic, grasp, and hold. Here are some examples:
Mention the rudimentary processes involved in drawing a map.
What did you observe in my driving?
Duplicate this picture.
Repeat formatting this document.
Perform the basic steps in sequence.
Press the clutch.

2. SKILL 2: WRITING OBJECTIVES AT THE MANIPULATION LEVEL
Learners further need to manipulate the target skill by following instructions. For example, you ask the learner to sit in the driver's seat and try to switch the car on following verbal or written instructions. This way, the learner's skill develops somewhat, from just watching and imitating to trying things out themselves. The difference from the imitation stage is that the student manipulates the skill without a visual model or direct observation. The learner should have written directions to help him/her manipulate and develop the skill further. You could develop your ability to write objectives at this level through:

1. Asking students to follow written directions.
2. Asking students to follow oral directions.
3. Involving no observation.
4. Attempting a specific activity over and over (practice).
Use verbs similar to these: produce, follow instructions, complete, perform, and play. Here are some examples:
Perform the first three movements according to the instructions.
Follow the instructions to format this page.
Complete this country's map.
Produce a PowerPoint slide according to these directions.
Write this sentence according to this calligraphic format.

3. SKILL 3: WRITING OBJECTIVES AT THE PRECISION LEVEL
You need to help learners perform the skill at the precision level through an independent action that involves neither written directions nor a visual model (observation). You could develop the skill of writing objectives at this level through:
1. Asking students to perform a skill as accurately as possible.
2. Performing the acts in sequence.
3. Focusing on coordination of the acts performed.
4. Exercising control over the acts.
Use verbs similar to these: achieve with speed, achieve without errors, perform masterfully, perform in the right sequence, produce accurately, and do proficiently. Here are some examples:
Draw an accurate map of the Red Sea.
Produce a well-developed slide.
Drive this car masterfully.
Fix this pipe perfectly.

4. SKILL 4: WRITING OBJECTIVES AT THE ARTICULATION LEVEL


You then need to develop your ability to write objectives at the articulation level in order to train your students to do separate operations at the same time, modify or adapt the product to the situation, and combine more than one skill in harmony. For example, the learner could reverse the car and explain the processes involved. The learner could also use the screen wipers as well as the light indicators and explain when to use them. You could develop the skill of writing objectives at this level by:

1. Adapting acts to the situation.
2. Simultaneously combining more than one skill.
3. Performing different acts in coordination.
4. Performing different acts in sequence.
5. Performing acts in time.
6. Performing acts at a certain speed level.

Use verbs similar to these: adapt, alter, customize, perform simultaneously and in harmony, and perform at a certain speed. Here are some examples:

While driving the car, use the screen wipers and headlights and change the gear.
Adapt this map to include water resources.
Customize this house template to include three more rooms and two balconies.

5. SKILL 5: WRITING OBJECTIVES AT THE HABIT-FORMATION/NATURALIZATION LEVEL

You and your students need to achieve objectives at the habit-formation level. This means that the learner performs a skill with minimum energy and thinking. Imagine what a person did when they first started to drive. Watch them after some months or years: they do things with very little thinking and effort. You could develop your ability to write objectives at this level through:


1. Asking students to complete one or more skills with total ease.
2. Asking students to perform one or more skills with limited mental effort.
3. Asking students to perform one or more skills with limited physical effort.

Use verbs similar to these: perform in two minutes' time, and perform with no mistakes. Here are some examples:

Draw a map of the Red Sea in 10 minutes.
Analyze this sample using the microscope in three minutes.
Perform this eye operation.
Kick the ball toward this angle.
Ride the bike without holding the handles.
Swim on your back to reach the other side in two minutes.

In short, when teaching a specific skill, instruction could start with listening to an explanation from a teacher or trainer, then imitating what they do. The next step is to manipulate or practice the steps involved until one reaches precision. Finally, all these steps lead the learner to perform the skill with a degree of automaticity. We should note here that the objectives of a test specify precisely each element of the content in which we seek to test students. Having explained the process of writing the test aim and objectives, we now write the aim and objectives of our model test (curriculum test).

6. THE CURRICULUM TEST AIM (MODEL/REFERENCE TEST)

The curriculum test that worked as our reference or model example throughout this book sought to measure the extent to which student-teachers mastered the curriculum course content and developed course design skills.


7. THE CURRICULUM TEST OBJECTIVES (MODEL/REFERENCE TEST)


The test medium-level objectives sought to examine the extent to which student-teachers had been able to:

Master key curriculum terms (e.g., curriculum, program, syllabus, content and instructional blocks).
Use main curriculum models (e.g., objectives) and strategies (e.g., centralized).
Acquire curriculum foundations knowledge (e.g., philosophical, psychological, and sociological).
Conduct needs assessment (e.g., general and specific needs assessment).
Write precise curriculum aims and objectives.
Determine scope, continuity and sequence of single and multi-subject curricula.
Identify main curriculum designs (e.g., subject-matter and learner centered).
Undertake subject-matter designs for their subjects (e.g., fused and correlated designs).
Undertake learner-centered designs for their subjects (e.g., project design).
Evaluate curriculum, students, and teachers.


Chapter 6

STAGE 2: WRITE THE TEST TYPE

PRE-READING REFLECTIONS


Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What is the difference between performance and personality measures?
How can you determine the type of your test?
Draw a comparison among the following types of tests and state which type each is:
Intelligence tests
Aptitude tests
Achievement tests
Diagnostic tests
Proficiency tests
Placement tests
Performance assessments
Personality inventories
Projective techniques
Self-concept
Attitude scales
Vocational interest measures


AIM(S) OF THE CHAPTER

To decide on the most appropriate type of test for your classroom or study.


OBJECTIVES OF THE CHAPTER

To show the differences and similarities between performance and personality measures.
To define, draw differences between, and use each of the following types of tests:
Intelligence tests
Aptitude tests
Achievement tests
Diagnostic tests
Proficiency tests
Placement tests
Performance assessments
Personality inventories
Projective techniques
Self-concept
Attitude scales
Vocational interest measures

INTRODUCTION

This chapter explains the second stage of test construction. It highlights different types of tests and how test type influences the process of test construction. A test is a tool for gathering information about individuals' ability (e.g., language ability). Testing is one of many tools which provide information for assessment. Gall et al. (1996) drew a comparison between testing and self-reporting measures. 'A test is any structured performance situation that can be analyzed to yield numerical scores, from which inferences can be made about how individuals differ in the construct measured by the test.' In contrast, a self-reporting measure refers to 'instruments in which individuals respond to items that reveal aspects of personality' (1996, p. 246).

Although both testing and self-reporting measures are similar in construction and administration, they differ in certain aspects. Self-reporting measures do not require subjects to perform while tests do. Moreover, self-reporting measures require subjects to reveal whether they possess a trait or a feeling while tests require them to demonstrate ability. Gall et al. (1996) also indicated that a standardized test involves a set of procedures to maintain consistency throughout the process of test administration and scoring. Although there are various types of measures, they could be grouped into two main categories: performance and personality. This chapter proceeds in this order:

1. Performance Measures (Tests)
2. Personality Measures
3. The Curriculum (Reference) Test Type


1. PERFORMANCE MEASURES (TESTS)

Performance tests require students to perform in order to demonstrate that they possess a particular ability. They include these types of tests:

1.1. Intelligence Tests
1.2. Aptitude Tests
1.3. Achievement Tests
1.4. Diagnostic Tests
1.5. Proficiency Tests
1.6. Placement Tests
1.7. Performance Assessments

1.1. Intelligence Tests

Intelligence tests give an estimate of a student's general ability by sampling performance on different tasks so that a composite score in various domains (reading, arithmetic, problem-solving, etc.) can be obtained. The composite score is known as the intelligence quotient (IQ). It is possible to obtain sub-scores for a specific domain, for example, quantitative ability and verbal ability.

1.2. Aptitude Tests

Aptitude tests seek to predict individuals' performance in a particular domain, whether it is a profession or achievement in a certain subject. They should therefore show predictive validity, as this type of test is used in deciding on college admission and in hiring.


1.3. Achievement Tests

Achievement tests examine students on specific information (involving comprehension, application, analysis, evaluation and/or synthesis). Regardless of test content, all achievement tests should show content validity to make sure they represent the content they purport to measure. Achievement tests measure the extent to which students or individuals have mastered the content of a specific course or achieved the objectives of that course. There are two types of achievement tests: progress and final. Progress achievement tests are administered during the teaching of a course to check on student progress. As such, they are useful in formative assessments. In contrast, final achievement tests are administered at the end of a course to check students' overall achievement. These are usually used for promoting students between grades and are therefore useful in summative assessments.

1.4. Diagnostic Tests

Diagnostic tests assess student strengths, difficulties and weaknesses to determine what learning still needs to take place. A test in language arts may tell that a student is good at reading and writing but poor at speaking and having difficulties in listening. Diagnostic tests are variants of achievement tests, but they seek to assess student learning difficulties. They purport to assess student weaknesses and strengths in order to make recommendations about the remedial instruction necessary to put learning on course. A diagnostic test is

a form of achievement test to identify a student's strengths and weaknesses in a particular school subject. Diagnostic tests usually focus on the low end of the achievement spectrum and provide a detailed picture of the student's level of performance in the various skills that the subject involves. (Gall et al., 1996, p. 266)

Diagnostic tests are best constructed and administered as criterion-referenced tests to provide a picture of a student's overall performance in a specific subject.


1.5. Proficiency Tests

Proficiency tests measure ability on the basis of specifications of what candidates have to be able to do, rather than on the basis of mastering content or achieving the objectives of a course. As such, proficiency tests focus on measuring proficiency more than competency. Competency means that individuals possess a sufficient quantity of the knowledge, skills and ability to achieve a particular task or job. In contrast, proficiency means that individuals possess an advanced level of the knowledge and skills which will enable them to achieve a particular task or job with greater ability and a higher standard of performance. For example, a test seeking to determine a student's ability to follow a postgraduate course of study at a university is a proficiency test. Proficiency tests therefore must set out the criteria on which a student is deemed proficient in terms of level, type of content, skills and so on. These criteria should be reflected in the specifications of the test.

1.6. Placement Tests

Placement tests determine the stage or level at which a student should be placed in order to achieve a match between a course and student ability. They are used to assign students to classes at different levels. In particular, placement tests are useful in grouping students in differentiated classrooms.


1.7. Performance Assessments (Authentic/Alternative Assessments)

Performance assessment measures assess performance through direct observation of individuals' performance on actual tasks. Tasks are usually designed to represent and reflect real-life tasks. Examples of performance assessments include practical driving tests, actual teaching, and portfolio assessments. Assessment can take place while individuals are completing tasks, or it can focus on the final product after task completion. A driving test is an example of performance assessment while individuals complete a task, whereas portfolio assessment is an example of measuring performance after completing a task.

2. PERSONALITY MEASURES


Personality measures do not require subjects to perform. Instead, they require individuals to reveal whether they possess a trait or a feeling. More precisely, they seek to show differences between individuals in certain personality aspects. These involve:

2.1. Personality Inventories
2.2. Projective Techniques
2.3. Self-Concept
2.4. Attitude Scales
2.5. Vocational Interest Measures

2.1. Personality Inventories

Personality inventories measure several personality variables in a single self-reporting measure, usually in an objective format. Their biggest weakness is that people provide false information, because the information that personality inventories require is normally sensitive. People do not usually reveal aspects of their personalities.


2.2. Projective Techniques

Projective techniques involve open-ended or free responses through which individuals can express their fantasies. Projective techniques are therefore less prone to false responses. The most famous projective technique is the thematic apperception test (TAT), which involves drawings of several human situations. Respondents are asked to build a story out of the drawings with the help of some questions written under each picture.

2.3. Self-Concept

Self-concept measures assess individuals' conceptualization of their own selves. Respondents are asked about, for example, self-esteem to reveal the extent to which they feel positive about themselves. Self-concept refers to the knowledge, beliefs and feelings individuals have about their own selves.


2.4. Attitude Scales

Attitude scales assess individuals' attitudes toward people, attributes, values or objects. An attitude is the individual's pre-disposition to respond to a stimulus in a particular way and a tendency to produce the same response whenever encountering the same stimulus. According to Gross (1996), an 'attitude' is a human's tendency to produce the same response to a particular stimulus. Whenever the situation is repeated, the same response will also be produced.

Gross (1996) draws our attention to a multi-dimension theory of attitude development. This theory suggests that we could develop attitudes through three components: affective, cognitive, and behavioral. The affective element is what a person feels toward the object of attitude. If students feel good toward scientific discovery when they see young inventors honored, it will encourage them to form positive attitudes toward scientific discovery. The cognitive component is the knowledge that a person has about the object of attitude. Having shown good examples to our students about scientific discovery, teachers need to provide them with knowledge and facts about scientific discovery to take their attitudes a step forward. The behavioral component is the outcome of the affective and cognitive components. This means that what a person actually does is the result of what they feel and know.

There are slight differences, says Gross (1996), between attitudes, beliefs and values, despite it being true that an attitude includes both. 'Beliefs' are what we know about the world or the object of attitude. A belief, then, is the correlation between something and a particular attribute. For example, Ancient Egypt is known for ancient monuments, whereas America is known for capitalism and modernity. People believe so because they know that Egypt has the largest and most significant monuments left from the ancient world. It is clear then that beliefs are the cognitive component of attitudes. Teachers are then asked to help their learners have correct and socially accepted information about issues of relevance to society to help them form positive beliefs.

As for 'value', Gross (1996, p. 435) refers to it as 'an enduring belief that a specific mode of conduct or end-state of existence is personally or socially preferable to an opposite or converse mode of conduct or end-state of existence.' A value, then, is a belief and another cognitive part of attitude. However, it is more deep-rooted in one's personality. Values are often dichotomous: if you like a certain value, you automatically dislike its opposite. For example, if you like justice, you must dislike injustice. Teachers may plan to help their students learn values like:

Theoretical values that implant in students the curiosity about how things work, the logic or laws behind a particular thing.
Aesthetic values that help students understand how to value and taste music, art, and the like.
Political values that draw attention to the structure of society and the system on which the society functions. This is to promote values like democracy and to help students perceive its importance in society for social stability and justice.
Economic values that familiarize students with the fiscal system in society.
Social values that relate to the welfare of community and society at large.
Religious values that concern the relationship with God, people, moral issues, and life after death.

Attitudes could be assessed through Thurstone or Likert scales, by asking individuals either to agree or disagree with a set of statements about the object of attitude. Individuals scale their responses in different weights through responding to a set of statements about the object of attitude. Scaled responses involve, for example, strongly disagree, disagree, undecided, agree and strongly agree.
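As an illustration only (not from the book), the short Python sketch below shows how Likert-type responses might be converted to numerical weights and summed into an attitude score; the weighting and the sample responses are hypothetical:

```python
# Minimal sketch (not from the book): scoring a Likert-type attitude scale.
# The five response options are mapped to weights 1-5; an individual's
# attitude score is the sum (or mean) of the weights of their responses.

WEIGHTS = {
    "strongly disagree": 1,
    "disagree": 2,
    "undecided": 3,
    "agree": 4,
    "strongly agree": 5,
}

def attitude_score(responses):
    """Sum the weights of a list of Likert responses (hypothetical scoring)."""
    return sum(WEIGHTS[r] for r in responses)

# Example: one individual's responses to a four-statement scale.
responses = ["agree", "strongly agree", "undecided", "agree"]
print(attitude_score(responses))  # 16 out of a possible 20
```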

2.5. Vocational Interest Measures

Vocational interest measures assess the degree of individuals' preference for or interest in certain professions, activities, sports and hobbies. They can show the differences among people's preferences for professions, for example between someone who prefers banking and someone who prefers teaching.


3. THE CURRICULUM (REFERENCE) TEST TYPE

Having explained various types of tests, it is clear that this book is interested in achievement tests, because these seek to measure the extent to which student-teachers mastered the content and achieved the objectives of a course (in our example, the curriculum course). Moreover, the test was criterion-referenced (see stage 8 of test construction). So far, the test aim, objectives and type have been determined. Chapters 7 and 8 explain how content can be sampled and represented through creating a test table of specifications.


PART III


TEST CONSTRUCTION: TEST TABLE OF SPECIFICATIONS AND TYPE OF ITEMS

Stage 3: Test Table of Specifications
Chapter 7: Create the Table of Specifications
Chapter 8: Create the Table of Specifications (Continued)
Stage 4: Test Type of Items
Chapter 9: Determine Type of Items


Chapter 7

STAGE 3: CREATE A TABLE OF SPECIFICATIONS


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What are the steps involved in determining a test table of specifications?
How can you determine the content of a test?
How many ways can be used to determine the relative weight of content?
Determine the relative weight of objectives according to the list of objectives method.
Determine the relative weight of objectives according to the test developer's judgment method.
What is a table of specifications used for?
Mention the formulas used for determining the relative weight of content and objectives.


AIM(S) OF THE CHAPTER

To create a table of specifications of achievement tests.

OBJECTIVES OF THE CHAPTER

To state the five steps of creating a table of specifications.
To analyze the content of a test.
To determine themes of content through major, main, and minor themes of content.
To determine the relative weight of content through pages.
To determine the relative weight of content through lessons.
To determine the relative weight of objectives using the list of objectives method.
To determine the relative weight of objectives using the test developer's judgment method.
To define a table of specifications.


INTRODUCTION

A test table of specifications enables teachers, researchers, and other social workers to determine the content, the weight of content, the weight of objectives, and the number of items, to allocate items to each theme of content, and to distribute items among the six levels of cognitive objectives. In other words, a test table of specifications involves these five steps:

Step 1: determine test content;
Step 2: determine relative weight of content;
Step 3: determine relative weight of objectives;
Step 4: distribute items among themes of content; and
Step 5: distribute items among levels of objectives.

Based on the actual test construction processes involved in developing the model (curriculum) test referred to earlier on, this chapter sheds light on the first three steps while the next chapter discusses the remaining two:


Table 7.1. Major themes of a curriculum course

1. Curriculum conceptualization
2. Curriculum philosophies
3. Curriculum models and strategies
4. Needs assessment
5. Writing curriculum aims and objectives
6. Selection of curriculum content
7. Organization of curriculum content
8. Curriculum evaluation

1. Step 1: Determine Test Content
2. Step 2: Determine Relative Weight of Content
3. Step 3: Determine Relative Weight of Objectives


1. STEP 1: DETERMINE TEST CONTENT

The author determined the (curriculum) test content by means of content analysis through determining: (a) major themes (units), (b) main themes (lessons), and (c) minor themes of each main theme (lesson points). For example, he first determined the major themes of the curriculum course, as shown in Table 7.1. Second, the author determined the main themes that each major theme consists of. For example, Table 7.2 shows the first major theme (curriculum conceptualization) and its main themes. Third, he determined the minor themes that each main theme comprises. These are shown in Table 7.3. The author shortened these steps into a single step by creating a matrix that comprised the minor and main themes of each major theme (Table 7.4). This was repeated for each major theme. Table 7.5 shows the content analysis process of the curriculum course that resulted in a number of major, main and minor themes (also see appendix A for the complete test). An illustrative programmatic sketch of this three-level analysis follows Table 7.5.

Table 7.2. Main themes of the course first major theme

Major theme: 1. Curriculum conceptualization
Main themes: A. Curriculum terms; B. Curriculum supervision

Table 7.3. Minor themes of two main curriculum themes

Main theme A (Curriculum terms), minor themes:
1. Definition of curriculum, program, syllabus, content & instructional blocks
2. Curriculum development & school-based curriculum development and curriculum innovation, change and design
3. Curriculum foundations, philosophies and domains

Main theme B (Curriculum supervision), minor themes:
1. The principal as a curriculum supervisor
2. Curriculum specialist

Table 7.4. Main and minor themes of a major theme (Curriculum conceptualization)

Major theme 1: Curriculum conceptualization
Main theme A: Curriculum terms
Minor themes:
1. Definition of curriculum, program, syllabus, content & instructional blocks
2. Curriculum development & school-based curriculum development and curriculum innovation, change and design
3. Curriculum foundations, philosophies and domains
Main theme B: Curriculum supervision
Minor themes:
1. The principal as a curriculum supervisor
2. Curriculum specialist

Table 7.5. Major, main, and minor themes of a curriculum course

No. | Major theme                            | Main themes | Minor themes
1   | Curriculum conceptualization           | 2           | 5
2   | Curriculum philosophies                | 3           | 6
3   | Curriculum models and strategies       | 2           | 6
4   | Needs assessment                       | 2           | 3
5   | Writing curriculum aims and objectives | 3           | 10
6   | Selection of curriculum content        | 8           | 12
7   | Organization of curriculum content     | 10          | 25
8   | Curriculum evaluation                  | 3           | 6
Total: 8 major themes                       | 33          | 73
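For readers who prefer a programmatic view, the three-level content analysis of Table 7.4 maps naturally onto a nested data structure. The sketch below is an illustration, not code from the book:

```python
# Minimal sketch (not from the book): the three-level content analysis of
# Table 7.4 as a nested structure (major theme -> main themes -> minor themes).

curriculum_conceptualization = {
    "Curriculum terms": [
        "Definition of curriculum, program, syllabus, content & instructional blocks",
        "Curriculum development & school-based curriculum development and "
        "curriculum innovation, change and design",
        "Curriculum foundations, philosophies and domains",
    ],
    "Curriculum supervision": [
        "The principal as a curriculum supervisor",
        "Curriculum specialist",
    ],
}

# Counting main and minor themes reproduces the first row of Table 7.5.
print(len(curriculum_conceptualization))                           # 2 main themes
print(sum(len(v) for v in curriculum_conceptualization.values()))  # 5 minor themes
```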

2. STEP 2: DETERMINE RELATIVE WEIGHT OF CONTENT


Relative weight of content concerns how much of each unit/major theme is to be reflected in the test in terms of a specific number of items/questions (Adkins, 1974). We can decide on the relative weight of content in two ways:

2.1. Determine Relative Weight of Content by Lessons
2.2. Determine Relative Weight of Content by Pages

2.1. Determine Relative Weight of Content by Lessons

First, calculate the percentage: divide each unit's number of lessons by the subject's total number of lessons and multiply by 100, via this formula:

unit weight (%) = (each unit's number of lessons ÷ subject's total number of lessons) × 100

The author determined content weight in terms of the percentage of items that was allocated to each unit from the test's overall items. For example, unit one's percentage was: 2 (unit 1 lessons) ÷ 32 (course total lessons) × 100 = 6%. These weights are shown in the table of specifications (Table 7.6), column 2. A table of specifications consists of a vertical axis on which contents are inserted and a horizontal axis on which objectives' weights are written.

Second, write the weight of each unit in the table of specifications. The relative weights of content themes are always written in the vertical column, as also shown in the table of specifications of Table 7.6. The remaining blank cells will be completed with the number of items/questions and objectives percentages as appropriate when we do the remaining steps. Though you will see many tables of specifications throughout this and the next chapter, they are only offshoots of one table of specifications. The reason we present many versions of the same table is that each of the five steps is responsible for completing specific cells of that one table. Every time we discuss one of the five steps, we reproduce the table to complete the remaining blank cells. When we complete all the blank cells, the table of specifications is also complete. An illustrative calculation sketch follows Table 7.6.

Copyright © 2011. Nova Science Publishers, Incorporated. All rights reserved.

Table 7.6. Table of specifications: Relative weight of content based on lesson calculations

Content | Weight %
Unit 1  | 6%
Unit 2  | 9%
Unit 3  | 6%
Unit 4  | 6%
Unit 5  | 9%
Unit 6  | 24%
Unit 7  | 31%
Unit 8  | 9%
Total   | 100%

(The columns for the six levels of objectives, remember, understand, apply, analyze, evaluate and create, and the Total column are left blank at this step.)
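To make the arithmetic concrete, here is a minimal Python sketch (not from the book) of the lesson-based weighting. Only unit one's lesson count (2 of the course's 32 lessons) is quoted in the text, so the function is shown with that figure alone:

```python
# Minimal sketch (not from the book) of Step 2.1: the relative weight of a
# content unit based on its share of the course's lessons.

def unit_weight(unit_lessons: int, total_lessons: int) -> float:
    """Weight (%) = (unit's lessons / subject's total lessons) x 100."""
    return unit_lessons / total_lessons * 100

# Figures quoted in the text: unit one has 2 of the course's 32 lessons.
print(round(unit_weight(2, 32)))  # -> 6, i.e., unit one gets 6% of the items
```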

2.2. Determine Relative Weight of Content by Pages

It is also possible to determine test content weights through the number of pages. You could do this by dividing the number of pages of each unit by the total number of pages of the subject and multiplying by 100.


First, calculate the percentage: divide each unit's number of pages by the subject's total number of pages and multiply by 100. Use this formula:

unit weight (%) = (each unit's number of pages ÷ subject's total number of pages) × 100

The curriculum course was assigned a textbook that consisted of 280 pages. Of these 280 pages, the eight units covered these pages:


35 pages (unit 1)
40 pages (unit 2)
25 pages (unit 3)
20 pages (unit 4)
47 pages (unit 5)
30 pages (unit 6)
43 pages (unit 7)
40 pages (unit 8)

Table 7.7. Table of specifications: Relative weight of content based on page calculations

Content | Weight %
Unit 1  | 12.5%
Unit 2  | 14%
Unit 3  | 9%
Unit 4  | 7%
Unit 5  | 17%
Unit 6  | 11%
Unit 7  | 15.5%
Unit 8  | 14%
Total   | 100%

(The columns for the six levels of objectives and the Total column are left blank at this step.)

For example, unit one content was determined as 35 (unit one pages) ÷ 280 (subject total number of pages) × 100 = 12.5%.


Second, write the weight of each major theme in the table of specifications. As pointed out above, the relative weights of themes of content are always written in the vertical column. The eight units' calculations are written in the vertical column in Table 7.7. However, this method does not enable the test developer to achieve the appropriate balance of content as precisely.
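The page-based weights in Table 7.7 can be reproduced with a short Python sketch. This is an illustration of the formula above, not code from the book; the page counts are the ones quoted in the text:

```python
# Minimal sketch (not from the book) of Step 2.2: relative weights of the
# eight units based on their page counts in the 280-page textbook.

pages = {  # page counts quoted in the text
    "Unit 1": 35, "Unit 2": 40, "Unit 3": 25, "Unit 4": 20,
    "Unit 5": 47, "Unit 6": 30, "Unit 7": 43, "Unit 8": 40,
}

total = sum(pages.values())  # 280
for unit, p in pages.items():
    weight = p / total * 100
    print(f"{unit}: {weight:.1f}%")  # e.g., Unit 1: 12.5%
```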

3. STEP 3: DETERMINE RELATIVE WEIGHT OF OBJECTIVES


Weight of objectives concerns the number of items each level of objectives will be allocated from each unit's items and from the test's overall items. Though the usual way of determining the relative weight of objectives depends on the test developer's personal judgment, specifying for each level of objectives a percentage that applies to all units of content, the author developed a more systematic and efficient method (the list of objectives method) for this purpose. The following paragraphs explain these two methods:

3.1. The List of Objectives Method
3.2. The Test Developer's Judgment Method

Table 7.8. Cognitive objectives classification across eight units

Units  | remember | understand | apply | analyze | evaluate | create | Total
Unit 1 | 5        | 3          | 0     | 0       | 0        | 0      | 8
Unit 2 | 4        | 3          | 1     | 2       | 2        | 0      | 12
Unit 3 | 3        | 2          | 1     | 1       | 1        | 0      | 8
Unit 4 | 3        | 2          | 3     | 0       | 0        | 0      | 8
Unit 5 | 2        | 2          | 2     | 2       | 2        | 2      | 12
Unit 6 | 7        | 7          | 7     | 4       | 3        | 4      | 32
Unit 7 | 6        | 4          | 5     | 7       | 10       | 8      | 40
Unit 8 | 4        | 2          | 2     | 2       | 2        | 0      | 12
Total  | 34       | 25         | 21    | 18      | 20       | 14     | 132


3.1. The List of Objectives Method

First: prepare a list of objectives

It is not unusual for teachers to prepare a set of objectives for each lesson they teach on a daily basis. It is not unusual either for every teacher to classify each lesson's objectives into cognitive, affective and/or psychomotor objectives. Once more, it is not unusual for each teacher to write cognitive (e.g., remember), affective (e.g., receive), and psychomotor (e.g., imitate) objectives at different levels to achieve the lesson aim. This means every teacher has a ready-made list of objectives as a result of everyday teaching. With reference to the model curriculum test, the author created a list of 132 objectives which resulted from his teaching during a semester. He wrote about five objectives to realize in each lesson on a daily basis. These 132 objectives were distributed among the course's eight units as follows: 8 objectives (unit 1), 12 objectives (unit 2), 8 objectives (unit 3), 8 objectives (unit 4), 12 objectives (unit 5), 32 objectives (unit 6), 40 objectives (unit 7) and 12 objectives (unit 8) (8 + 12 + 8 + 8 + 12 + 32 + 40 + 12 = 132 objectives). See Table 7.8, final column.

Second: distribute the list of objectives among the levels of objectives

The author distributed each unit's number of objectives among the six levels of cognitive objectives, also shown in Table 7.8. It should be noted that these objectives were already distributed, as the author wrote objectives at different levels for each lesson. Table 7.8 shows the number of objectives allocated to each unit (step 1/final column) and the distribution of these objectives among the six levels of objectives (step 2). This classified list of objectives (a result of daily teaching) provided a ready-made list of objectives.

Third: calculate percentage of objectives

Divide the number of objectives at each level by the unit's total number of objectives and multiply by 100. Use this formula:

level weight (%) = (number of objectives at each level ÷ total number of objectives of the unit) × 100

Table 7.9 shows objectives' weights (for each unit) in percentages. The author calculated the weights of objectives through calculations of objectives' percentages. In regard to unit one, for example, he calculated the remembering level's weight as: 5 (number of objectives at the remembering level) ÷ 8 (total number of unit one objectives) × 100 = 62.5%. This means that 62.5% of unit one's items will be allocated to the remembering level. This step was repeated for each level of objectives in each unit. All units' weights of objectives are written in the table of specifications of Table 7.9 (the rows under the six levels of objectives for each unit).

Fourth: insert weights of objectives in the table of specifications (horizontal axis)

Table 7.9. Table of specifications: Relative weight of cognitive objectives across eight units

Content | Weight % | remember | understand | apply | analyze | evaluate | create | Total
Unit 1  | 6%       | 62.5%    | 37.5%      | 0%    | 0%      | 0%       | 0%     | 100%
Unit 2  | 9%       | 33%      | 25%        | 8%    | 17%     | 17%      | 0%     | 100%
Unit 3  | 6%       | 37.5%    | 25%        | 12.5% | 12.5%   | 12.5%    | 0%     | 100%
Unit 4  | 6%       | 37.5%    | 25%        | 37.5% | 0%      | 0%       | 0%     | 100%
Unit 5  | 9%       | 16.7%    | 16.7%      | 16.7% | 16.7%   | 16.7%    | 16.7%  | 100%
Unit 6  | 24%      | 22%      | 22%        | 22%   | 12.5%   | 9%       | 12.5%  | 100%
Unit 7  | 31%      | 15%      | 10%        | 12.5% | 17.5%   | 25%      | 20%    | 100%
Unit 8  | 9%       | 33%      | 16.7%      | 16.7% | 16.7%   | 16.7%    | 0%     | 100%
Total   | 100%     |          |            |       |         |          |        | 100%
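A short Python sketch (an illustration, not code from the book) reproduces the Table 7.9 percentages from the Table 7.8 objective counts; two units are shown:

```python
# Minimal sketch (not from the book): derive each unit's objective weights
# (Table 7.9) from its classified objective counts (Table 7.8).

LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]

# Objective counts per level, taken from Table 7.8 (unit 1 and unit 7 shown).
counts = {
    "Unit 1": [5, 3, 0, 0, 0, 0],
    "Unit 7": [6, 4, 5, 7, 10, 8],
}

for unit, per_level in counts.items():
    total = sum(per_level)
    weights = [n / total * 100 for n in per_level]
    print(unit, [f"{w:.1f}%" for w in weights])
# Unit 1 ['62.5%', '37.5%', '0.0%', '0.0%', '0.0%', '0.0%']
# Unit 7 ['15.0%', '10.0%', '12.5%', '17.5%', '25.0%', '20.0%']
```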

These weights are shown in Table 7.9 above. The blank cells under the percentage cells of objectives will be completed with the number of items/questions which reflect each percentage when we do step 5 (distribute items among levels of objectives). Before we do this, we first need to explain the second method of determining the relative weight of objectives, on the basis of the test developer's judgment.

3.2. The Test Developer's Judgment Method


This is the second method for determining the relative weight of test objectives. Indeed, it is an easy method that takes a few minutes to complete. It depends on the personal judgment of the test developer, who can from the start assign a single set of weights to the levels of objectives that will apply to all themes of content (units), as shown in Table 7.10. With the list of objectives method, the test developer needed to calculate the weights of objectives through calculations of objectives' percentages. The case is different here, where the test developer does not need to make such calculations since each level of objectives receives the same weight across all units, as explained below.

First: assign percentages to levels of objectives

For example, we assigned these percentages to the six levels of objectives as follows:

remember = 25%
understand = 25%
apply = 15%
analyze = 10%
evaluate = 15%
create/synthesize = 10%

Second: insert weights in the table of specifications

This is shown in Table 7.10. You will notice that the eight units/themes of content have been allocated the same percentages. For example, all eight units have been allocated the same 25% at the remembering level of objectives (column three). This was not the case with the list of objectives method, since each unit had a different percentage at the same level of objectives. A comparison between Table 7.10 and Table 7.9 at the remembering level across the eight units will make this difference clear. Again, the blank cells under the percentages allocated to each of the objectives will be completed with the number of items/questions which reflect each percentage. We do this when we discuss step 5 (distribute items among levels of objectives).
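Under this method the per-unit calculation disappears entirely; the sketch below (an illustration, not from the book) simply attaches the same fixed weights to every unit:

```python
# Minimal sketch (not from the book): the test developer's judgment method
# assigns one fixed set of level weights that applies to every content unit.

FIXED_WEIGHTS = {  # percentages quoted in the text
    "remember": 25, "understand": 25, "apply": 15,
    "analyze": 10, "evaluate": 15, "create": 10,
}

units = ["Unit 1", "Unit 2", "Unit 3", "Unit 4",
         "Unit 5", "Unit 6", "Unit 7", "Unit 8"]

# Every row of Table 7.10 carries the same weights: no per-unit arithmetic.
table = {unit: dict(FIXED_WEIGHTS) for unit in units}
print(table["Unit 3"]["remember"])  # -> 25 (the same for every unit)
```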


Table 7.10. Table of specifications: Weight of objectives (the test developer's method)

Content | Weight %
Unit 1  | 6%
Unit 2  | 9%
Unit 3  | 6%
Unit 4  | 6%
Unit 5  | 9%
Unit 6  | 24%
Unit 7  | 31%
Unit 8  | 9%
Total   | 100%

Level weights (applying to every unit): remember 25%, understand 25%, apply 15%, analyze 10%, evaluate 15%, create 10% (Total 100%). The item-count cells are completed in step 5.

We now turn to the remaining two steps for creating the table of specifications in the next chapter, which continues the discussion of the test table of specifications.


Chapter 8

STAGE 3: (CONTINUED) CREATE A TABLE OF SPECIFICATIONS


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What are the steps involved in distributing items between themes of content?
What are the steps involved in distributing items among levels of objectives?
What is the difference between distributing items between themes of content and distributing items between levels of objectives?
Which of the following is the correct formula for distributing items among levels of objectives?
Weight of each unit × total test items ÷ 100.
Total items per unit × weight of each level of objectives ÷ 100.

AIM(S) OF THE CHAPTER

To create a table of specifications of achievement tests (continued).


OBJECTIVES OF THE CHAPTER

To state the five steps of creating a table of specifications (remembering level of objectives).
To mention the steps involved in allocating items to themes of content (remember).
To mention the steps involved in distributing items among levels of objectives (remember).
To use the steps involved in allocating items to themes of content to complete the relevant parts of a table of specifications (apply).
To use the steps involved in distributing items among levels of objectives to complete the relevant parts of a table of specifications (apply).
To draw a comparison between 'allocating items to themes of content' and 'distributing items among levels of objectives' (understand).
To make a new table of specifications from a given set of information (create).
To break down a given table of specifications into its main component parts (analyze).
To act as a jury for a test's content in your subject in terms of the test content specifications (evaluate).

INTRODUCTION

The previous chapter explained three of the five steps involved in creating a table of specifications. These involved: Step 1: determine test content; Step 2: determine relative weight of content; and Step 3: determine relative weight of objectives. The reader should remember that the three steps in chapter 7, as well as the remaining two steps in this chapter, were based on actual data taken from the curriculum test. This chapter discusses these two remaining steps:

1. Step 4: Distribute Items among Themes of Content
2. Step 5: Distribute Items among Levels of Objectives


1. STEP 4: DISTRIBUTE ITEMS AMONG THEMES OF CONTENT

This is step 4 in the process of creating the table of specifications. Here, we allocate a number of test items/questions to each theme of content, as follows:

1.1. Determine the Test Overall Number of Items
1.2. Determine Each Unit's Items from the Test Overall Items
1.3. Insert Each Unit's Items in the Table of Specifications


1.1. Determine the Test Overall Number of Items

To determine the overall number of test items/questions, test developers need to exercise their personal judgment. However, a test should usually be around 50 items. Of course, you can make the total number of test items fewer than 50, bearing in mind the need to cover the main elements of content (Hughes, 2003; McNamara, 2000). Remember, it is essential for test developers to decide on the number of questions they want to include in their test in order that they can make subsequent calculations. This step does not require test developers to take any action beyond deciding on the number of items their test will involve. The author decided to include 60 items in the model curriculum test.

1.2. Determine Each Unit's Items from the Test Overall Items

Decide on the number of items that each unit will take out of the test's overall items (60). Use this formula:

unit items = weight of each unit × total test items ÷ 100

For example, unit one was assigned items this way: 6% (unit 1 content weight) × 60 (test sum of items) ÷ 100 = 3.6, rounded up to 4 items. This meant that unit one took 4 questions out of the test's overall 60 questions.
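The calculation for all eight units can be sketched in a few lines of Python (an illustration, not code from the book). The weights and the 60-item total come from the chapter, and rounding to the nearest whole item reproduces the final column of Table 8.1:

```python
# Minimal sketch (not from the book) of Step 4: allocate the 60 test items
# among the eight units in proportion to their content weights.

weights = {  # lesson-based content weights from chapter 7
    "Unit 1": 6, "Unit 2": 9, "Unit 3": 6, "Unit 4": 6,
    "Unit 5": 9, "Unit 6": 24, "Unit 7": 31, "Unit 8": 9,
}
TOTAL_ITEMS = 60

allocation = {u: round(w * TOTAL_ITEMS / 100) for u, w in weights.items()}
print(allocation)                # {'Unit 1': 4, 'Unit 2': 5, ..., 'Unit 7': 19, 'Unit 8': 5}
print(sum(allocation.values()))  # 60
```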


1.3. Insert Each Unit's Items in the Table of Specifications

The calculations of the eight units were then inserted in Table 8.1 (final column, right). We can subsequently move on to distribute the number of questions assigned to each unit of the curriculum course among the six levels of objectives in step 5 (distribute items among levels of objectives).


Table 8.1. Table of specifications: Allocating items to eight themes of content

Content | Weight % | remember | understand | apply | analyze | evaluate | create | Total items
Unit 1  | 6%       | 62.5%    | 37.5%      | 0%    | 0%      | 0%       | 0%     | 4
Unit 2  | 9%       | 33%      | 25%        | 8%    | 17%     | 17%      | 0%     | 5
Unit 3  | 6%       | 37.5%    | 25%        | 12.5% | 12.5%   | 12.5%    | 0%     | 4
Unit 4  | 6%       | 37.5%    | 25%        | 37.5% | 0%      | 0%       | 0%     | 4
Unit 5  | 9%       | 16.7%    | 16.7%      | 16.7% | 16.7%   | 16.7%    | 16.7%  | 5
Unit 6  | 24%      | 22%      | 22%        | 22%   | 12.5%   | 9%       | 12.5%  | 14
Unit 7  | 31%      | 15%      | 10%        | 12.5% | 17.5%   | 25%      | 20%    | 19
Unit 8  | 9%       | 33%      | 16.7%      | 16.7% | 16.7%   | 16.7%    | 0%     | 5
Total   | 100%     |          |            |       |         |          |        | 60


Table 8.2. Complete table of specifications: Distributing overall test items among themes of content (8 units)

Content | Weight % | remember | understand | apply | analyze | evaluate | create | Total
Unit 1  | 6%   | 62.5%: 3        | 37.5%: 1        | 0%: 0           | 0%: 0           | 0%: 0           | 0%: 0           | 100%: 4
Unit 2  | 9%   | 33%: 2          | 25%: 1          | 8%: 0           | 17%: 1          | 17%: 1          | 0%: 0           | 100%: 5
Unit 3  | 6%   | 37.5%: 2        | 25%: 1          | 12.5%: (0.5) 1  | 12.5%: (0.5) 0  | 12.5%: (0.5) 0  | 0%: 0           | 100%: 4
Unit 4  | 6%   | 37.5%: (1.5) 2  | 25%: 1          | 37.5%: (1.5) 1  | 0%: 0           | 0%: 0           | 0%: 0           | 100%: 4
Unit 5  | 9%   | 16.7%: (0.83) 1 | 16.7%: (0.83) 1 | 16.7%: (0.83) 1 | 16.7%: (0.83) 1 | 16.7%: (0.83) 1 | 16.7%: (0.83) 0 | 100%: 5
Unit 6  | 24%  | 22%: 3          | 22%: 3          | 22%: 3          | 12.5%: 2        | 9%: 1           | 12.5%: 2        | 100%: 14
Unit 7  | 31%  | 15%: 3          | 10%: 2          | 12.5%: 2        | 17.5%: 3        | 25%: 5          | 20%: 4          | 100%: 19
Unit 8  | 9%   | 33%: (1.7) 2    | 16.7%: (0.83) 1 | 16.7%: (0.83) 1 | 16.7%: (0.83) 0 | 16.7%: (0.83) 1 | 0%: 0           | 100%: 5
Total   | 100% | 18              | 11              | 9               | 7               | 9               | 6               | 60

(Values in parentheses are the raw results before rounding.)

2. STEP 5: DISTRIBUTE ITEMS AMONG LEVELS OF OBJECTIVES

This final step of creating the table of specifications involved distributing the number of items assigned to each of the eight units among the six levels of objectives using this formula:

items per level = total items per unit × weight of each level of objectives ÷ 100


For example, the calculation for unit one was: 4 (unit 1 items) × 62.5% (remembering level percentage) ÷ 100 = 2.5, rounded up to 3 items. Table 8.2 shows the calculations for the eight units. This meant that the remembering level took 3 questions of the 4 questions assigned to unit one. The calculations of the eight units are shown in Table 8.2 (cells under the six objectives), which presents the complete table of specifications involving all five steps. Some calculations had to be rounded down to zero, although they were close to an integer, to match the number of test items for each unit. See appendix B for each unit's questions and their level of objectives.

A look at the complete table of specifications in Table 8.2 shows a number of things:

The vertical axis shows themes of content (units).
The horizontal axis shows levels of objectives.
The second column (left) shows the relative weight of each theme of content.
The percentage in each cell under a level of objectives shows the relative weight of that level.
Each row shows the number of items allocated to each level of objectives.
The final column (right) shows the total number of items assigned to each unit (end of each row) as well as the total number of questions for the whole test (end of column).

The tables of specifications shown and discussed throughout chapters seven and eight were developed according to the list of objectives method. If we wanted to create a table of specifications according to the test developer's judgment method, we would need to follow the same procedures. The only difference between the two methods, as shown in Table 8.2 and Table 8.3, is that in Table 8.2 each unit had different weights of objectives. By contrast, all objectives across all themes of content received the same weight in Table 8.3. For example, in Table 8.2 (column 3) each theme of content (unit) received a different weight at the remembering level of objectives (62.5%, 33%, 37.5%, 37.5%, 16.7%, 22%, 15%, and 33% respectively). In Table 8.3 all eight units received the same weight at the remembering level of objectives (25%). So was the case with the remaining levels of objectives, since all eight units received the same weight at each level (understand 25%, apply 15%, analyze 10%, evaluate 15%, and create 10%).
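Rounding while keeping each row's total fixed is the fiddly part of this step. The sketch below (an illustration, not the book's procedure) rounds each level's share and then nudges the largest fractional remainders so the row still sums to the unit's item count; this largest-remainder adjustment is one plausible way to reproduce the adjustments visible in Table 8.2, where the book adjusts by judgment:

```python
# Minimal sketch (not from the book) of Step 5: distribute a unit's items
# among the six levels and force the rounded counts to sum to the unit total
# (a largest-remainder adjustment; the book adjusts by judgment).

def distribute(unit_items: int, level_weights: list[float]) -> list[int]:
    raw = [unit_items * w / 100 for w in level_weights]  # e.g., 4 * 62.5 / 100 = 2.5
    counts = [int(r) for r in raw]                       # provisional floor
    # Hand out the remaining items to the largest fractional remainders.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[: unit_items - sum(counts)]:
        counts[i] += 1
    return counts

# Unit 1: 4 items, with the level weights from Table 7.9.
print(distribute(4, [62.5, 37.5, 0, 0, 0, 0]))  # -> [3, 1, 0, 0, 0, 0]
```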


Table 8.3. A complete table of specifications based on the test developer's judgment

Content | Weight % | remember 25% | understand 25% | apply 15% | analyze 10% | evaluate 15% | create 10% | Total 100%
Unit 1  | 6%       | 1            | 1              | 1         | 0           | 1            | 0          | 4
Unit 2  | 9%       | 1            | 1              | 1         | 1           | 1            | 0          | 5
Unit 3  | 6%       | 1            | 1              | 1         | 0           | 1            | 0          | 4
Unit 4  | 6%       | 1            | 1              | 1         | 0           | 1            | 0          | 4
Unit 5  | 9%       | 1            | 1              | 1         | 1           | 1            | 0          | 5
Unit 6  | 24%      | 4            | 3              | 2         | 2           | 2            | 1          | 14
Unit 7  | 31%      | 5            | 5              | 3         | 2           | 3            | 1          | 19
Unit 8  | 9%       | 1            | 1              | 1         | 1           | 1            | 0          | 5
Total   | 100%     | 15           | 14             | 11        | 7           | 11           | 2          | 60

We now turn to chapter 9 to discuss stage four of test construction, which concerns how test developers can determine the type of items/questions for their test.


Chapter 9

STAGE 4: DETERMINE TYPE OF ITEMS

PRE-READING REFLECTIONS


Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What do we mean by type of items when we construct a test?
What is the difference between objective and essay questions?
Define objective questions/techniques.
Define each of the following techniques, say which type of technique each is, and state the advantages and disadvantages of each:
Multiple-choice questions (MCQ)
Yes/No questions
True/False questions
Short-answer questions
Gap filling questions
Completion questions
Matching questions


AIM(S) OF THE CHAPTER

To use different types of techniques.

OBJECTIVES OF THE CHAPTER


To define items/techniques/questions.
To draw differences between essay and objective techniques.
To know and use each of the following techniques:
Multiple-choice questions (MCQ)
Yes/No questions
True/False questions
Short-answer questions
Gap filling questions
Completion questions
Matching questions

INTRODUCTION

This chapter discusses the types of items according to which test developers can write the test questions. Types of items/questions are the techniques which examiners use to elicit the target behaviors from examinees/test-takers. In other words, items/questions are the techniques that examiners use to check the extent to which examinees have acquired the target information and skills. According to Hughes (2003) and Haladyna (2004), types of items/techniques involve (a) essay questions and (b) objective questions. The chapter proceeds in this order:

1. Essay Questions
2. Objective Questions
3. The Curriculum (Reference) Test Type of Items


1. ESSAY QUESTIONS


This book is not concerned with essay questions, because these do not require specific rules apart from being relevant to the content they seek to check and being unequivocal. Essay questions are techniques that require respondents to create a response that usually ranges between one and several paragraphs. The respondents are free to organize their answers to address the question at hand. Therefore, essay questions are difficult to score, since each subject creates answers which usually differ from those of others answering the same question. Moreover, essay questions require two or more scorers in order to provide a valid score that actually represents the examinee's ability. Valid scores could be reached when scorer agreement is achieved. In most cases, scorers are provided with a rubric against which they assess essay questions. It should be noted, however, that no statistical calculations are made. Scorer agreement on essay questions is achieved when two scorers give a similar (but not necessarily identical) aggregate score to the same respondent (Fulcher & Davidson, 2007). We now turn to objective questions, our chief concern in this book.

2. OBJECTIVE QUESTIONS

Objective questions are the techniques that examiners use to check the extent to which examinees have acquired the target information and skills in one of the following formats:

2.1. Multiple-Choice Questions (MCQ)
2.2. Yes/No Questions
2.3. True/False Questions
2.4. Short-Answer Questions
2.5. Gap Filling Questions
2.6. Completion Questions
2.7. Matching Questions


2.1. Multiple-Choice Questions (MCQ)


Multiple-choice questions are techniques that examiners use to check the extent to which examinees have acquired the target information and skills, usually in the form of a stem followed by three or more options. Most often a single option is the correct answer, but a combination of correct options is possible. The other options act as distracters (Haladyna, 2004). Look at these two examples:

Multiple-choice questions: tick/circle one or two options as indicated:

1. Assessment is: (tick/circle one answer)
   A. A process of gathering information about individuals' ability using several data collection instruments.
   B. A tool for gathering information about individuals' ability by means of tests.
   C. A process of gathering information about individuals' ability by means of one instrument.

2. The scope of national assessments involves: (tick/circle two answers)
   A. A set of standards developed by national professional organizations.
   B. Measuring how schools meet governorate/state standards.
   C. A set of standards developed by professional organizations in a specific governorate/state.
   D. Various assessment tools to check the extent to which the national standards are met.
   E. Promoting students from grade to grade.

Advantages of multiple-choice questions:
- Scoring is rapid, economical, and precise.
- A test can cover a wide range of the content by involving many questions.
- They increase test reliability.


Disadvantages of multiple-choice questions:
- Students can guess the correct answer.
- They do not allow students to develop organizational skills.
- They are difficult to prepare and take much time to write.
- Cheating can easily occur.

2.2. Yes/No Questions

Yes/No questions require examinees to choose between two options by writing (Yes) or ticking (✓) in front of the correct option, or by writing (No) or ticking (×) in front of the wrong option. Look at the following example.


1. Assessments are used to measure how schools meet national and governorate standards.
   a) Yes ( Yes / ✓ )   b) No ( )
2. Informal tests are usually used in summative assessment.
   a) Yes ( )   b) No ( No / × )

2.3. True/False Questions

True/false questions also require students to choose between two options by ticking (✓) against the true option or ticking (×) against the false option. Look at the following example.

Tick (✓) to indicate a true statement or (×) to indicate a false one:

( × ) Formative assessment is conducted at the end of a learning course to measure what the students have achieved.
( × ) Summative assessment is conducted during a course to check on the progress achieved and to use the resulting information to modify future learning/teaching plans.
( ✓ ) Assessment is used to compare students' performance across schools in a specific governorate.


Advantages of Yes/No and True/False questions:
- Scoring is rapid, economical, and precise.
- A test can cover a range of the content by involving many questions.
- They increase test reliability.

Disadvantages of Yes/No and True/False questions:
- Students have a 50 percent chance of guessing the correct answer.
- They do not allow students to develop organizational skills.
- Cheating is very easy.

2.4. Short-Answer Questions

Short-answer questions require students to provide a short and specific answer. Look at the following example.


Write short notes on:
A) The differences between norm- and criterion-referenced tests
…………………………………………………..………………………
…………………………………………………..………………………
…………………………………………………..………………………

Advantages of short-answer questions:
- Guessing will not be an issue.
- Cheating is more difficult.
- They require the production of language.

Disadvantages of short-answer questions:
- Scoring might be difficult, as it may require some judgment.
- Scoring takes a longer time.
- Judgment may affect test reliability.


2.5. Gap-Filling Questions

Gap-filling questions require students to fill a gap with a word or group of words. Look at this example:

Fill in the spaces. Stages of test construction involve: (1) writing the test aim, (2) ………………………, (3) writing the type of test, (4) writing the table of specifications, (5) validating the test, (6) ………………………, (7) test scoring, and (8) ……………………… (1 point each).

Advantages of gap-filling questions:
- Guessing will not be an issue.
- They require the production of language.


Disadvantages of gap-filling questions:
- Cheating is possible.
- The words to be filled in can be confusing.
- Scoring might be difficult, as it may require some judgment.

2.6. Completion Questions

Completion questions provide students with part of a sentence that they are required to complete with the appropriate words. Look at this example:

Complete the following:
Testing is a process of gathering information about individuals' …………………………………………………………………………


2.7. Matching Questions

Matching questions provide students with two columns. The first column (A) is assigned to terms or concepts, whereas the second (B) is assigned to descriptions of the first column's terms. Students are required to match the terms in column A with the descriptions in column B. Look at this example:

Match the descriptions in column (B) with the items in column (A) by inserting the correct letter from column (B) in the correct box of column (A): (1 point each)

Column (A):
1. Testing ( )
2. Assessment ( )
3. Informal evaluation ( )
4. Evaluation research ( )
5. Program objectives ( )
6. Program outputs ( )
7. Summative assessment ( )

Column (B):
A. A process of gathering information about individuals' ability by means of several data collection instruments.
B. A process of gathering information during a course to check on the progress achieved and using the resulting information to modify future learning.
C. A process of gathering information at the end of a learning program or course to measure what has been achieved.
D. The subjective assessments made by ordinary people to judge the value, merit or worth of something.
E. The use of scientific procedures to collect, analyze and interpret data about a program (design (content and structure), implementation and outcomes) and using the resulting information to make decisions about program improvement or change.
F. The services provided by the program.
G. The formally stated goals to which program resources are directed.

3. THE CURRICULUM (REFERENCE) TEST TYPE OF ITEMS


Having discussed the main techniques used in writing test items, we can now note that the items of the model test were of the multiple-choice type. As explained in this chapter, types of items are the techniques test developers, teachers, or researchers use to elicit the target behaviors from examinees (Hughes, 2003). In other words, the test questions or items were the techniques used to check the extent to which examinees acquired the target information and skills (curriculum content and course design skills). Our test consisted of objective rather than essay questions in order to maintain answer and scoring reliability and to cover as much content of the curriculum course as possible. In particular, multiple-choice questions were used in the form of a stem followed by several options. We now turn to chapter 10 to explain various types and methods of test validation.


PART IV
TEST CONSTRUCTION: TEST VALIDITY AND TRIAL


Stage 5: Test validation
  Chapter 10: Validate content
Stage 6: Test trial
  Chapter 11: Test trial


Chapter 10

STAGE 5: VALIDATE CONTENT

PRE-READING REFLECTIONS


Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

- Define validity.
- Show the differences between internal and external validity.
- What are the steps involved in establishing content validity?
- Define content, construct, predictive, concurrent, convergent, and discriminant validity.
- How many types does criterion validity involve?
- "Though face validity is not a reliable method to ensure instrument validation, it is vital that instruments show face validity." Explain.
- How can you establish construct validity for your test?
- For which of the following do we need a correlation coefficient to achieve instrument validation?
  a) Criterion validity
  b) Predictive validity
  c) Concurrent validity
  d) None of them
  e) All of them


AIM(S) OF THE CHAPTER

To understand various validity terms.
To validate instruments using the most appropriate validation methods.


OBJECTIVES OF THE CHAPTER

To define the term instrument validity in general.
To define content, construct, predictive, concurrent, convergent, discriminant, and face validity.
To draw differences between internal and external validity.
To use the steps involved in establishing content validity.
To use the steps involved in establishing construct validity.
To use the steps involved in establishing criterion validity.
To use the steps involved in establishing concurrent validity.
To establish convergent validity.
To conduct divergent validity.
To undertake predictive validity.
To state the four types of criterion validity.

INTRODUCTION

Validity is an essential requirement for any measurement procedure. We cannot accept the scores yielded by an invalid test. Validity means that an instrument measures exactly what it was designed to measure (Cohen, Manion, & Morrison, 2000; Gall et al., 1996). Cohen et al. (2000) identify a number of methods of assessing validity:

1. Internal Validity
2. External Validity
3. Content Validity
   3.1. Methods of Determining Content Validity
      3.1.1. Content Experts/Jury Members
      3.1.2. Table of Specifications


4. Construct Validity
5. Criterion Validity
   5.1. Predictive
   5.2. Concurrent
   5.3. Convergent
   5.4. Discriminant/Divergent
6. Face Validity
7. The Curriculum (Reference) Test Validation Process

1. INTERNAL VALIDITY


Internal validity is where the changes/effects resulting from introducing a particular variable are caused by that variable, not by any other variables. Internal validity can be achieved by controlling extraneous and contaminating variables that introduce undesirable and unintended effects, such as maturation and history in experiments. This type of validity is particularly important in experimental studies where researchers and teachers are interested in assessing the impact of one variable on another variable or other variables, such as the impact of a teaching method or a new course on student achievement (Bloom, Fischer & Orme, 1995).

2. EXTERNAL VALIDITY

External validity refers to the degree to which results can be generalized beyond the sample that went through the research experiences. We can achieve external validity through randomization and the control of extraneous variables. Internal validity is therefore a prerequisite for external validity. This type of validity is particularly important in quantitative studies that seek to generalize results from samples to the population (Robson, 1993).

3. CONTENT VALIDITY

Content validity refers to the degree to which test items adequately represent the content the test was designed to measure. This does not mean that a single test should reflect every single element of the content. It, however,


means that a test should include representative samples of that content (Cohen et al., 2000; Gall et al., 1996). In other words, a test must represent the whole range, not the whole details, of the content. For example, if the content you taught aimed at developing the four skills (reading, writing, speaking, and listening), your test will only show content validity if it represents the four skills. In other words, the test needs to include items about reading, writing, speaking, and listening. If it covers reading and writing but not speaking and listening, then the test content is not valid. Similarly, if a math teacher taught a course aiming to develop students' ability to add, subtract, multiply, and divide, then the test must reflect the four processes. Test developers should always remember that it is the test purpose that determines what content is to be included in the test. For example, if a test aimed to assess only reading and writing skills and reflected both in its content, then the test is valid. However, if this same test involved speaking skills, which are not part of the test aim, the test would not be valid either.


3.1. Methods of Determining Content Validity

Test developers who seek to check the extent to which students mastered the taught content or achieved the objectives of a taught course (content validity) can use one of two methods:

3.1.1. Content Experts/Jury Members
3.1.2. Table of Specifications

3.1.1. Content Experts/Jury Members

The author proposed systematic procedures which test developers, social researchers, and teachers can use to validate test content. Based on the test purpose, content experts/jury members can check instrument validity by first making sure that the test developer has:

- Stated the test purpose/aim precisely and clearly.
- Broken down the test purpose/aim into precise objectives.
- Defined precisely the domain the test seeks to measure (making use of the test objectives).


Based on the test purpose/aim and objectives, the jury member then has to check that the test developer:

- Has spelled out the themes of content in the light of the test objectives.
- Has allocated each theme of content a specific weight.
- Has represented each theme of content according to the allocated weight.
- Has determined levels of thinking (objectives) (e.g., understand, create, etcetera).
- Has determined weights for the levels of thinking (objectives percentages).
- Has distributed the items (of each theme of content) according to the levels-of-thinking weights.

3.1.2. Table of Specifications

The table of specifications method is already included in the above-mentioned method (content experts/jury members) and should be part of it. A test table of specifications helps jury members determine, in a scientific and systematic way, the specifications of test content in terms of the percentages of content to be represented on the test and the objective levels each part of the content addresses. Developing a table of specifications involves five steps. Though these steps have been explained in detail in chapters seven and eight, they are summarized here:

Step 1: Determine test content
Step 2: Determine relative weight of content
Step 3: Determine relative weight of objectives
Step 4: Distribute items among themes of content
Step 5: Distribute items among levels of objectives

Content validity is "particularly important in selecting tests to use in experiments involving the effect of instructional methods on achievement" (Gall et al., 1996, p. 251). Let's explain how the author validated the content of the (model) curriculum test. The curriculum test was content validated to ensure it measured curricular content as conceptualized in chapter 7, section 1 (Cohen et al., 2000; Bloom et al., 1995). The author used content validity in particular because it best suits achievement tests. The test was content validated by


checking the degree to which the test items adequately represented the content it was designed to assess (as conceptualized in chapter 7, section 1) (Gall et al., 1996). The jury and table of specifications methods showed the test had valid content. Three curriculum experts used systematic procedures (proposed by the author in chapters 7 and 8) to make sure that the test developer had:

- stated the test purpose precisely,
- translated the test purpose into precise objectives,
- defined the test domain by spelling out the themes of content in the light of the test objectives,
- allocated each theme of content a specific weight and represented it on the test according to that weight,
- allocated each theme of content levels (e.g., create) and weights (percentages) of objectives,
- distributed the items of each unit according to the objectives weights, and
- created a specifications table showing content and objectives weights.


The jury members agreed the test addressed the above issues but asked for repositioning and rewording of several questions.

4. CONSTRUCT VALIDITY

According to Gall et al. (1996, p. 249), construct validity is "the extent to which a particular test can be shown to assess the construct that it purports to measure." We then need to understand what a construct is. Gall et al. refer to a construct as "a theoretical construction about the nature of human behavior" (p. 249), whereas Cohen et al. (2000) define a construct as an abstract human attribute. In all cases, a construct is a construct because we cannot observe it directly; we can only infer a construct from its effects on human behavior. The author refers to construct validity as the extent to which an instrument clearly and adequately shows that it measures the operational aspects of the construct it sought to measure. To be able to assess a construct, test developers need to provide an operational definition of the concept. If they seek to assess creativity (a construct), they first need to spell out precisely what is meant by creativity. Creativity could be operationally defined as the ability to:


- produce new ideas
- produce original ideas
- look at existing things in different ways
- use existing things in different ways

We now move to discuss methods of determining construct validity below.


4.1. Methods of Determining Construct Validity

According to Cohen et al. (2000), test developers can determine construct validity by reviewing the literature to reach agreement about an operational definition of the target construct. They also need to calculate correlation coefficients between a new instrument and other valid instruments that measure the same construct. This involves convergent and divergent validity (discussed below). Gall et al. (1996) advise test developers to conduct a study about the target construct, for example about creativity. They first develop hypotheses about students who are expected to show aspects of the construct (creativity). If test developers claim their test measures creativity, then they need to write a hypothesis about that claim. For example, the hypothesis could be: students who use and look at existing things in new and different ways are more creative than those who look at and use existing things in their traditional ways. Test developers then administer the test and analyze the data. If the test shows that one group reveals the creativity aspects (as defined in the test) while the other group does not, then there is evidence that the test shows construct validity. Gall et al., however, stress that a single piece of evidence is insufficient to prove construct validity. Other methods are therefore needed to provide evidence about the target construct, including concurrent and predictive validity (discussed below). Both kinds of validity require high correlations between the data gathered by the new instrument and the data gathered by another valid instrument to establish construct validity.


5. CRITERION VALIDITY

Criterion validity comprises several validity types, all of which involve the administration of two instruments. The first instrument (known as the criterion) has to be an established valid instrument. The second is the new instrument whose validity test developers seek to establish. Correlation coefficients between the valid/criterion instrument and the new instrument are calculated so that the validity of the new instrument can be determined. If the two instruments correlate highly, this provides evidence that the new instrument is valid. In contrast, if the two instruments correlate poorly, this indicates that the new instrument is not valid. These are the known types of criterion validity:

5.1. Predictive
5.2. Concurrent
5.3. Convergent
5.4. Discriminant/Divergent


5.1. Predictive

Predictive validity involves the administration of a new instrument that makes predictions about a certain phenomenon at a specific point in time and the running of another valid instrument (the criterion) at a later point in time. After running the new and criterion instruments, test developers calculate a correlation coefficient between the data gathered from the new instrument and from the valid instrument. If the coefficient is, for example, 0.70, 0.80, or even 1.00, this indicates a high correlation/relationship; such a result gives evidence that the new instrument has predictive validity. If the coefficient is, for example, 0.40, 0.30, 0.20, 0.10, or even 0.00, this indicates a weak or no correlation/relationship; such a result gives evidence that the new instrument has no predictive validity. The valid instrument is the criterion/standard against which the new instrument is validated. Cohen et al. (2000) give a hypothetical example of predictive validity: a group of students took a test about a particular domain at the age of 16, and the same students took another test at the age of 18. If the results gathered from the new and criterion tests correlate highly, this indicates predictive validity of the new test.


Table 10.1. Predictive and concurrent validity

Points | Predictive validity | Concurrent validity
Measures | Two or more | Two or more
Timing | Two points of time | One or two close points of time
Administration (time span) | Long time | Same or very short
Order of administration | Order must be established: the new instrument is run first, the criterion instrument second | Order has no importance: both instruments are run at the same time, or either is run shortly before or after the other

In other words, if test developers seek to predict the success of candidate students at the college of medicine using a new test, they can ask the students to take the test when they first join the college. Those students who score high on the new test obtain admission to the college. When these admitted students are about to graduate, they take the valid (criterion) test. If the two tests correlate highly, this indicates predictive validity of the new test. This means that students who score high on the new test can be expected to succeed at the college of medicine, while those scoring low have little chance to manage study at this college.
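To make the calculation concrete, here is a minimal Python sketch of computing such a correlation coefficient, assuming hypothetical score lists for eight students and the widely used scipy library; it is an illustration, not the author's own procedure:

```python
from scipy.stats import pearsonr  # Pearson product-moment correlation

# Hypothetical scores; each position is one student.
new_test_scores = [55, 62, 70, 48, 81, 90, 66, 73]   # new (predictor) test
criterion_scores = [58, 60, 75, 45, 85, 88, 70, 78]  # criterion test taken later

r, p_value = pearsonr(new_test_scores, criterion_scores)
print(f"correlation coefficient: {r:.2f}")  # values near 0.70-1.00 would support predictive validity
```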

5.2. Concurrent

Gall et al. (1996, p. 252) indicate that concurrent validity is the degree to which "individuals' scores on a new test correspond to their scores on an established test of the same construct that is administered shortly before or after the new test." For example, a group of students took a new test that measured creativity. A week or a month before or after, the same students took another valid test (the criterion) that also measured creativity. Both tests defined creativity in similar operational terms. If the results gained from the two tests correlate highly, this provides evidence that the new test shows concurrent validity. Table 10.1 draws a comparison between predictive and concurrent validity.


Table 10.2. Convergent and concurrent validity

Points | Convergent validity | Concurrent validity
Measures | Two or more; all instruments are valid and stand as a criterion for one another | Two or more; a new instrument (under validation) and a criterion instrument (established/valid)
Timing | One or two close points of time | One or two close points of time
Administration (time span) | Same or very short time | Same or very short time
Order of administration | Order has no importance: all instruments can be run at the same time, or shortly before or after one another | Order has no importance: the established and new instruments can be run at the same time, or either shortly before or after the other

5.3. Convergent

Convergent validity might be confused with concurrent validity or may be considered synonymous with it, but there are slight differences. Convergent validity involves the simultaneous collection of data about a certain phenomenon using two or more instruments. Each instrument is supposed to be valid, and each instrument stands as a criterion for the others. If the instruments yield similar data, this indicates convergent validity; the data from the instruments have converged. In contrast, concurrent validity involves an established or valid instrument (a criterion) and a new instrument that is under validation. Both concurrent and convergent validity involve the administration of the instruments shortly before or after one another (Cohen et al., 2000). An example of convergent validity involves gathering data about student creativity through observations, interviews, and tests. If the data converge through high correlations among the three instruments, this provides evidence of convergent validity. Convergent validity is more or less like methodological


triangulation where researchers gather data about the same phenomenon through two or more methods. Table 10.2 draws a comparison between convergent and concurrent validity.
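As a sketch of how such convergence might be checked numerically (the three score lists are invented; numpy's corrcoef returns all pairwise Pearson correlations at once):

```python
import numpy as np

# Hypothetical creativity ratings of the same ten students from three instruments.
observations = [3, 5, 4, 2, 5, 4, 3, 4, 5, 2]
interviews   = [4, 5, 4, 2, 5, 5, 3, 4, 4, 2]
tests        = [3, 4, 4, 3, 5, 4, 3, 5, 5, 2]

# Rows are treated as variables; the result is a 3x3 correlation matrix.
r_matrix = np.corrcoef([observations, interviews, tests])
print(np.round(r_matrix, 2))  # uniformly high off-diagonal values suggest convergence
```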

5.4. Discriminant/Divergent

In contrast to convergent validity, Cohen et al. (2000) indicate that discriminant validity involves administering two or more similar instruments to measure different constructs, where the instruments yield low correlations. For example, one instrument is developed to measure anxiety, whereas the other is designed to measure introversion. If the two instruments yield a low correlation, this indicates discriminant validity. In other words, each instrument measured what it was designed to measure, not something else. In contrast, if they yielded a high or average correlation, this indicates a lack of validity, because each measured something it was not designed to measure.


6. FACE VALIDITY

Face validity involves a subjective look at the instrument to check whether its items seem to reflect the content the instrument purports to measure. It does not involve any systematic or objective study of the instrument's content. Though face validity is the weakest method of ensuring instrument validity, it is important that instruments show face validity. This is because instruments that are valid but lack face validity fail to enlist the subjects' cooperation in providing the required information, simply because the subjects might think the instrument is irrelevant or trivial (Gall et al., 1996).

7. THE CURRICULUM (REFERENCE) TEST VALIDATION PROCESS

Having shed light on the various validity types that test developers can use to validate their instruments, we explain here how our actual test (the curriculum test) was validated. Our test was content validated to ensure it measured the curricular content conceptualized in chapter 7, section 1 (Cohen et al., 2000). Content validity was used in particular because it best suits


achievement tests. The content was validated by checking the degree to which the test items adequately represented the content the test was designed to assess (Bloom et al., 1995; Gall et al., 1996). The jury and table of specifications methods showed the test had valid content. Three experts checked the test content and made sure the curriculum test included a precise purpose that was translated into precise objectives, and that it had themes of content determined in the light of the set objectives. Each jury member was also satisfied that each theme of content was allocated a specific weight that was represented on the test in actual test questions according to that weight. They also made sure that each objective was assigned a level of thinking (e.g., create) and a weight (percentage), and that the items of each unit were distributed according to the objectives weights. The jury members agreed the test addressed the above issues but asked for the repositioning and rewording of several questions. We now turn to chapter 11 to explain in detail the various reliability types and methods and how tests, as well as other instruments, can be checked for reliability. Chapter 11 also shows how the curriculum test was checked for reliability.


Chapter 11

STAGE 6: TEST TRIAL

PRE-READING REFLECTIONS


Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

- Define reliability.
- What do we mean by measurement error?
- What are 'error variance', 'true variance', and 'total variance'?
- How many types of reliability are there?
- How many methods for determining reliability are there?
- What do the following reliability methods have in common?
  Split-half reliability
  Kuder-Richardson
  Alpha coefficient
- Reliability = (choose one)
  a) True variance – error variance
  b) Error variance – true variance
  c) Error variance + true variance
- Total variance = (choose one)
  a) True variance – error variance
  b) Error variance – true variance
  c) Error variance + true variance


- 17 out of 20 students answered item number eight of a test correctly. Calculate item facility and item difficulty for item eight.
- Define item discrimination index, item analysis, item facility, and item difficulty.
- How can item analysis contribute to test reliability?
- A teacher determined item difficulty as 0.38. Explain whether 0.38 is an appropriate item difficulty.
- Five students scored 1, 1, 0, 0, and 1 on item number four of a test. These five students' overall scores on the test are 3, 5, 6, 8, and 4. Calculate the item discrimination index.
- Which is the most and which is the least discriminating item index of the following? Why?
  a) 0.601
  b) 0.386
  c) 0.734
  d) 0.188
- How can you determine the overall time of your test?
- Which of the following methods should be used to determine reliability as internal consistency? (Tick the right answer or answers):
  Split-half reliability
  Kuder-Richardson
  Alpha coefficient
  Test re-test
  Spearman-Brown
  Inter-rater
  Alternate forms

AIM(S) OF THE CHAPTER

To understand various reliability types and methods.
To determine the reliability of instruments using the most appropriate reliability methods.
To determine the overall time of a test.


OBJECTIVES OF THE CHAPTER

To define reliability.
To explain the meaning of measurement error.
To define 'error', 'true', and 'total' variance.
To state the types of reliability.
To mention the methods that can be used for determining reliability.
To show the similarities and differences between:
- the three methods of determining reliability
- the internal consistency methods
To calculate item facility and difficulty.
To calculate the item discrimination index.
To interpret item discrimination indices.
To explain how item analysis contributes to test reliability.
To determine the overall time of a test using the average of the first and last test-taker.
To determine the overall time of a test using the average of all test-takers.

INTRODUCTION

This chapter highlights the process of test development in terms of test reliability and item analysis through trying out the test. It proceeds in the following order:

1. Determine the Test Reliability
2. Methods of Determining Reliability
3. Calculate Item Analysis (Coefficients of Facility, Difficulty, and Discrimination)
4. Determine Test Length (Time)
5. The Curriculum (Reference) Test Reliability Process
6. The Curriculum (Reference) Test Time


1. DETERMINE THE TEST RELIABILITY

Reliability refers to "how much measurement error is present in the scores yielded by the test." A reliability correlation coefficient of 1.00 indicates perfect reliability (all true score, no measurement error), whereas a coefficient of 0.00 indicates no reliability. Tests yielding 0.80 or higher are considered reliable (Gall et al., 1996, p. 254). Reliability, then, is the extent to which test/instrument scores are free of measurement error (false/error variance).

Reliability therefore means that an instrument provides consistent results when completed by the same or similar samples on different occasions. For example, when an instrument is administered to a group and is administered again to a similar group on one or more different occasions, it should give similar results. Reliability is subsequently a tool for checking data accuracy by separating error and true variances from the total variance: total variance = error/false variance + true variance. Precisely, reliability involves total variance – error/false variance = true variance. This means the differences obtained from administering an instrument are not totally the result of real differences; there is a false or error proportion of those differences. In other words, error differences result from something other than the target issues a tool intends to measure. The more true variance there is, the more reliable an instrument is. The reverse is also true: the more error or false variance there is, the less reliable an instrument will be.

There are various types of error variance. Each requires a particular method of reliability to single out that error variance. This means a single reliability method cannot assess all reliability types (Abuhattab, Othman & Sadiq, 1987). We need to address some key questions here: What type of error variance is expected to reduce instrument reliability? Which reliability method is most suitable to single out that type of error?
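To make the variance decomposition concrete, here is a tiny Python sketch with invented variance figures; the final ratio uses the standard classical test theory identity (reliability = true variance ÷ total variance) implied by the subtraction above:

```python
# Hypothetical variance decomposition of a set of test scores.
total_variance = 40.0
error_variance = 8.0

true_variance = total_variance - error_variance  # total - error = true
reliability = true_variance / total_variance     # proportion of variance that is true

print(true_variance)  # 32.0
print(reliability)    # 0.8, at the conventional threshold for a reliable test
```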

2. METHODS OF DETERMINING RELIABILITY

Gall et al. (1996) and Cohen et al. (2000) identify several methods of determining reliability. Each method addresses a particular type of reliability. This section discusses three reliability types and several methods of determining reliability in the following order:

2.1. Reliability as Stability/Test Re-Test/Pre-Posttest


2.2. Reliability as Equivalence/Alternate/Equivalent Forms
2.3. Reliability as Internal Consistency/Split-Half/Alpha/Kuder-Richardson

2.1. Reliability as Stability (Test-Retest/Pre-Posttest)


The test-retest method measures reliability in terms of stability by assessing error variance that results from time differences between instrument administrations (Abuhattab et al., 1987). Reliability as stability seeks to measure the stability of scores over time. When one instrument is administered to a group on one occasion and is administered again to the same or a similar group on another occasion, it should give similar results. This method is usually referred to as test-retest reliability. When we run a reliable test and re-run it after an appropriate period of time, it should yield similar results. A correlation coefficient (known as the coefficient of stability) between the pre- and post-administration is calculated to determine reliability in terms of stability. Reliability as stability usually occurs through the test-retest (pre/posttest) method, where an instrument is administered twice, 'over time' or 'over sample' (Cohen et al., 2000; Shawer, 2010b). These are discussed as follows:

2.1.1. Over Time/Diachronic Reliability as Stability
2.1.2. Over Sample/Synchronic Reliability as Stability

2.1.1. Over Time/Diachronic Reliability as Stability

'Over time' (diachronic) reliability is where the same instrument is administered to one sample as a pre-test on one occasion and is administered again to the same sample as a post-test on a different occasion after an appropriate time span. Only one particular group (sample) is selected for piloting. The time span should not be very long, so that extraneous variables, such as maturation, do not intervene. On the other hand, it must not be very short, lest students remember their pre-test responses. This type of reliability is beset by some problems: extraneous effects might take place between the first and second instrument administrations (Bloom et al., 1995; Cohen et al., 2000; Gall et al., 1996).


Table 11.1. Over time and over sample reliability as stability

Points | Over time reliability (diachronic) | Over sample reliability (synchronic)
Instrument | Same for pre/post test | Same for pre/post test
Time span | Two different occasions (first session, second session) | One/same occasion
Sample | One/same sample | Two different samples (group 1, group 2)

2.1.2. Over Sample/Synchronic Reliability as Stability

'Over sample' (synchronic) reliability involves the simultaneous administration of the same instrument to two different samples on one occasion. The instrument is administered to one group for pre-testing, and at the same time to a different, equivalent group for post-testing. This means that two different samples take the test: one sample takes the test for pre-testing, while a different sample takes the same test for post-testing on the same occasion. Over-sample reliability helps avoid pre- and post-testing effects. However, the two samples have to possess similar characteristics, which can be achieved through the random assignment of respondents to the groups. Cohen et al. (2000, p. 118) confirm that "this form of reliability over a sample is particularly useful in piloting tests and questionnaires." Table 11.1 highlights the similarities and differences between 'over time' and 'over sample' reliability.

For reliability as stability (both over time and over sample), the coefficient of stability between pre- and post-testing should not be less than 0.50. According to Abuhattab et al. (1987), the test-retest method assesses error variance that results from time differences between instrument administrations. It is most suitable for assessing the reliability of tests that are not affected by re-testing. Though a correlation coefficient is calculated to assess the relationship between the first and second administrations of a tool, the situation may require several administrations of the instrument at different times so that error variance can be assessed precisely. After calculating the correlation coefficient for each pair of administrations, we need to calculate the average of the correlations from the various instrument administrations. It is not unusual for students' scores to change between the first and second instrument administrations. This is not a problem so long as the relative positions of the individuals do not change. The test-retest method has received these criticisms:


- Student performance may improve in the second administration as a result of experiencing the same situation in the first administration.
- At very short time spans between the first and second administrations, respondents may remember their first responses.

Consequently, this method is most suitable for assessing the reliability of tests that are not affected by re-testing.

2.2. Reliability as Equivalence/Alternate Forms/Inter-Rater

The alternate forms method measures reliability in terms of equivalence by assessing error variance that results from content differences between instrument administrations (Abuhattab et al., 1987). Cohen et al. (2000) indicate that reliability as equivalence involves two equivalent versions of one instrument geared to measure the same target issues. There are two methods of reliability as equivalence:

2.2.1. Equivalent/Alternate Forms Reliability
2.2.2. Inter-Rater Reliability

2.2.1. Equivalent/Alternate Forms Reliability

Equivalent or alternate forms reliability involves developing two versions of one instrument to measure the same content. Alternate forms reliability can also be over time (diachronic) or over sample (synchronic). Test developers can apply the alternate forms method diachronically by administering the first form to a particular sample for pre-testing and the second, equivalent form to the same sample for post-testing; the two alternate forms are administered on two different occasions to one sample. On the other hand, test developers can apply alternate forms reliability synchronically by administering the first form to one sample as a pre-test and the second, equivalent form to a different, similar sample as a post-test; the two alternate forms are administered on one occasion to the two different samples.

The correlation coefficient (known as the coefficient of equivalence) between the two forms should not be less than 0.50. A t-test between the two forms can also be calculated to reveal differences in means and standard deviations. Abuhattab et al. (1987) point out that error variance here results from the difference


in question format, not from the respondent's inability to answer the question. For example, the respondent answers a question correctly on the first version of the test but fails to answer the counterpart question on the second form.
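A minimal Python sketch of the two checks described above (the coefficient of equivalence and the t-test), assuming hypothetical scores for the same ten students on two forms; scipy provides both routines:

```python
from scipy.stats import pearsonr, ttest_rel

# Hypothetical scores of the same ten students on two equivalent forms.
form_a = [12, 15, 9, 18, 14, 11, 16, 13, 10, 17]
form_b = [13, 14, 10, 17, 15, 11, 15, 12, 11, 16]

r, _ = pearsonr(form_a, form_b)              # coefficient of equivalence
t_stat, p_value = ttest_rel(form_a, form_b)  # paired t-test on the mean difference

print(f"coefficient of equivalence: {r:.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # a non-significant t supports equivalence
```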

2.2.2. Inter-Rater Reliability/Objectivity

The alternate forms method can also be used to assess error variance that results from a scorer's subjectivity or from differences between scorers (Abuhattab et al., 1987). Inter-rater reliability involves reaching agreement between two or more raters with regard to the data collected through observation grids or structured interviews. Inter-rater reliability means that all raters have to observe the same event or collect the same data, and that they all enter them into the same categories. Researchers and teachers can determine inter-rater reliability by calculating a correlation coefficient (known as the coefficient of objectivity) between the scores given by two or more raters (researchers) to the same respondents on the same test. This means each respondent receives two or more independent scores on the same test from two different raters (Cohen et al., 2000).


2.3. Reliability as Internal Consistency/Split-Half/Alpha/Kuder-Richardson

Internal consistency methods measure reliability in terms of an instrument's internal consistency by assessing error variance that results from content sampling and content homogeneity differences within the same instrument. Reliability as internal consistency means that the respondent's performance remains consistent across all test items; that is, the respondent's performance does not improve on some sections rather than others (Abuhattab et al., 1987). Reliability as internal consistency is appropriate for homogeneous tests, which do not contain different sets of content. For example, a multiplication test is homogeneous, whereas a test assessing addition, subtraction, multiplication, and division is not. Lack of homogeneity is therefore a source of error variance. Moreover, though a test may be homogeneous, it might have progressively more difficult items. This can also be a source of error variance, since a respondent may perform well on the first set of questions while performing poorly on subsequent, more difficult sets. Reliability as internal consistency can be assessed in several ways, which serve different purposes. In other words, they are not


interchangeable. Each method is suitable in certain cases but not in others. These methods are:


2.3.1. Split-Half
2.3.2. Kuder-Richardson
2.3.3. Alpha Coefficient

2.3.1. Split-Half

According to Cohen et al. (2000), split-half reliability involves dividing a test into two halves, but the dividing process takes place at the scoring stage, not at the administration stage. The difficulty level of each item needs to be determined beforehand to obtain two halves matched in difficulty as well as in content and number of questions. Test developers then distribute items according to difficulty levels between the two halves. The first half involves the odd-numbered items, whereas the second includes the even-numbered items. Split-half reliability differs from the test-retest and alternate forms methods in that both of those involve administering an instrument twice, whilst split-half involves running the instrument once. Split-half involves dividing an instrument into two halves, matched in item number, difficulty level, and content. The instrument is run once, with each half marked independently of the other. Each individual student's scores on the first half are then correlated with the same student's scores on the second half.

Gall et al. (1996) indicate that reliability is calculated through a correlation coefficient between the two halves (known as the coefficient of internal consistency). It should be noted, however, that this correlation covers just one half of the instrument. A formula is therefore needed to correct the reliability coefficient so that it estimates the reliability of the whole test. The Spearman-Brown or Guttman formulas can be used for this purpose. Indeed, the Guttman formula can be used directly to calculate reliability without the need to calculate the correlation coefficient first. Cohen et al. (2000, p. 118) used the Spearman-Brown formula to calculate whole-test reliability from the correlation between the two halves of an instrument in the following example:

Reliability = 2r / (1 + r)


The actual correlation between the two halves is referred to as r. Cohen et al. point out that the Spearman or Pearson coefficient should be calculated as appropriate. They calculated the Spearman correlation coefficient as 0.85 and worked out reliability this way:


Reliability = 2(0.85) / (1 + 0.85) = 1.70 / 1.85 = 0.919 (rounded up to 0.92)

A quick look at the calculated reliability (0.92) indicates high reliability, since the maximum value of a correlation coefficient is 1.00. Like the alternate forms method, split-half can mean preparing two matched halves, where each question in the first half measures the same issue that a counterpart question in the second half measures. In this case, two sets of questions are prepared, where the first set (say 20 items, supposing the test comprises 40 items) is assigned to the first half and the second set (the other 20 items) to the second half. By contrast, the first half of the questions could measure different issues from those the counterpart second half measures. Though paired questions in the two halves then measure different issues, each item must be matched to its counterpart in difficulty level. In this case, the test questions are written in their logical order, but the first of each two matched questions is assigned an odd number and the other an even number.
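A short Python sketch of the whole split-half procedure over a hypothetical 0/1 score matrix (rows are students, columns are items; odd and even item positions form the two halves):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical dichotomous scores: 6 students x 8 items.
scores = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
])

odd_half = scores[:, 0::2].sum(axis=1)   # totals on items 1, 3, 5, 7
even_half = scores[:, 1::2].sum(axis=1)  # totals on items 2, 4, 6, 8

r, _ = pearsonr(odd_half, even_half)     # half-test correlation
reliability = 2 * r / (1 + r)            # Spearman-Brown correction to full length
print(round(reliability, 2))
```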

2.3.2. Kuder-Richardson (Yes/No Types of Questions)

According to Abuhattab et al. (1987) and Gall et al. (1996), split-half is suitable only for assessing internal consistency between the two halves of a test (two groups of items), not inter-item consistency (individual items). Like split-half, Kuder-Richardson requires administering the test once. Unlike split-half, it does not require splitting the test into sections; the respondent's responses are assessed on all items. Kuder-Richardson checks the consistency of the respondent's responses across all the test items. This method is suitable only for dichotomously scored items, such as yes/no and true/false item types. It is therefore used with instruments that require the respondent to choose between two options only.
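The chapter does not reproduce the formula itself; the standard KR-20 form is r = (k / (k - 1)) × (1 − Σpq / total variance), where p and q are the proportions answering each item correctly and incorrectly. A hedged sketch over invented 0/1 data:

```python
import numpy as np

# Hypothetical dichotomous scores: 6 students x 8 items.
scores = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
])

k = scores.shape[1]                    # number of items
p = scores.mean(axis=0)                # proportion answering each item correctly
q = 1 - p                              # proportion answering each item incorrectly
total_var = scores.sum(axis=1).var()   # variance of the total scores

kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total_var)
print(round(kr20, 2))
```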


Table 11.2. A comparison between different reliability methods

Type | Reliability as stability | Reliability as equivalence | Reliability as internal consistency
Function | Check measurement error resulting from time error | Check measurement error resulting from content error | Check measurement error resulting from content and homogeneity error
Method | Test/re-test (pre-posttest) | Alternate forms; inter-rater | Split-half; Kuder-Richardson; alpha coefficient
Analysis | Correlation coefficient (stability coefficient) | Correlation coefficient (equivalence coefficient); correlation coefficient (objectivity coefficient); t-test | Split-half: correlation coefficient (internal consistency) plus a correction formula; Kuder-Richardson: correlation coefficient (internal consistency coefficient); alpha: correlation coefficient (internal consistency coefficient)
Versatility | Most tests | Most tests | Split-half: most types of items; Kuder-Richardson: dichotomous items only; alpha: multiple-choice and scaled responses

2.3.3. Alpha Coefficient

According to Abuhattab et al. (1987) and Gall et al. (1996), this method was developed by Cronbach and modified by Michelle and Kaiser. Like split-half and Kuder-Richardson, alpha checks the internal consistency of instruments and also requires administering the instrument once. Unlike split-half, but like Kuder-Richardson, the alpha coefficient does not require splitting the instrument into sections. Unlike Kuder-Richardson, which is suitable only for dichotomous item types, the alpha coefficient is suitable for multiple-choice items. The alpha coefficient is also suitable for scaled item types, which provide several possible options. Each option carries a different weight (like


strongly agree, agree, undecided, disagree, strongly disagree). It checks the variances of all items from first to last.
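Cronbach's alpha generalizes this idea to scaled items; a minimal sketch of the standard formula, alpha = (k / (k - 1)) × (1 − Σ item variances / total variance), over invented 1-5 Likert responses:

```python
import numpy as np

# Hypothetical Likert responses (1-5): 6 respondents x 4 scaled items.
scores = np.array([
    [4, 5, 4, 4],
    [2, 3, 3, 2],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
])

k = scores.shape[1]
item_vars = scores.var(axis=0, ddof=1)      # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)  # variance of the total scores

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(round(alpha, 2))
```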


3. CALCULATE ITEM ANALYSIS (COEFFICIENTS OF DIFFICULTY/FACILITY AND DISCRIMINATION)

Item analysis is "a set of procedures for determining the difficulty, validity and reliability of each item in the test" (Gall et al., 1996, p. 278). Item analysis therefore shows the contribution of each question/item to the overall reliability of the test/instrument. This means that test developers should remove or revise faulty items to increase the instrument's reliability (Hughes, 2003).

Test developers calculate item facility by dividing the number of individuals answering an item correctly by the total number of test takers (sum of correct answers on one item ÷ total number of test takers). By contrast, they calculate item difficulty by dividing the number of individuals answering an item incorrectly by the total number of test takers (sum of incorrect answers on one item ÷ total number of test takers). The following step is to develop a difficulty/facility index for the whole test. For example, suppose 13 out of 20 students answered item number four of a test correctly, while seven students gave wrong answers. Item facility is calculated as 13 ÷ 20 = 0.65, whereas item difficulty is calculated as 7 ÷ 20 = 0.35. You then continue doing this for each item. The facility index means that 65% answered item four correctly, while the difficulty index means 35% answered it wrong. This also means that 0.35 is a suitable difficulty level.

However, judging the appropriateness of a facility/difficulty index depends on the purpose of the test. If you need to identify the top 10% of students, an item with a 0.35 difficulty will not meet the test purpose. You will need to make it more difficult in order to single out, or discriminate, all students who do not fall within the top 10%. In this case, each item in the test must yield a 0.10 value to contribute to a proficiency test capable of identifying the top 10% of students. On the other hand, if the purpose of a test is to assign students to placements, then we need difficulty/facility levels of a wide range to cover mixed-ability students. The most important thing in this case is that the differences between items should not be too big. For example, one item can have a facility/difficulty level of 0.50, another of 0.55, a third of 0.60, and so on (Hughes, 2003).
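The arithmetic reduces to two divisions; a minimal Python sketch of the worked example above (13 correct answers out of 20 test takers):

```python
correct = 13       # test takers answering the item correctly
total_takers = 20  # total number of test takers

item_facility = correct / total_takers                     # 13 / 20 = 0.65
item_difficulty = (total_takers - correct) / total_takers  # 7 / 20 = 0.35

print(item_facility, item_difficulty)
```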


Item discrimination (the discrimination index) compares all the test takers' performance on each item with their performance on the test as a whole; it indicates, using correlation coefficients, the extent to which an item can discriminate between weak and strong students. Test developers should remove an item that does not discriminate between weak and strong test takers. They calculate item discrimination by correlating the test takers' scores on one item with their overall score on the test minus that item. If an item correlates well with the overall test scores, this indicates good item and test discrimination. The maximum discrimination index is 1.00 (strong discrimination), whereas the minimum is 0.00 (no discrimination). Because it compares performance on each item with performance on the rest of the test, the discrimination index contributes to the test's reliability. Table 11.3 shows part of a discrimination table adapted from Hughes (2003, p. 227).


Table 11.3. A discrimination index table of four items

Item number   Discrimination index (correlation coefficient)
Item 1        0.386
Item 2        0.601
Item 5        0.734
Item 6        0.188

The table shows that the most discriminating item is item 5 (0.734), whereas the least discriminating is item 6 (0.188). There is no single satisfactory cutoff for the discrimination index. We usually conduct item discrimination analysis when we think it affects reliability: when reliability is negatively affected, we conduct the analysis to single out or revise particular items so that we can increase the test's reliability. When item analysis (facility/difficulty coefficients) shows an item to be too easy or too difficult, the discrimination index becomes low. Sometimes easy items are kept and placed at the beginning of a test to give test takers confidence; by contrast, difficult items are sometimes kept (though they do not discriminate well among all the test takers) to discriminate among the strong ones.
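A minimal Python sketch of the corrected item-total correlation described above follows: each item is correlated with the total score minus that item. The response matrix is hypothetical, and any correlation routine (e.g. scipy.stats.pearsonr) could replace the hand-rolled one here:

import math

def pearson(x, y):
    # Plain Pearson correlation coefficient.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def discrimination_index(responses, item):
    """Correlate scores on one item with the rest-of-test total."""
    item_scores = [row[item] for row in responses]
    rest_totals = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_totals)

# Hypothetical 0/1 response matrix: 6 takers x 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
for i in range(4):
    print(f"Item {i + 1}: {discrimination_index(responses, i):.3f}")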


4. DETERMINE TEST LENGTH (TIME)

Test developers can determine the overall time a test should take by one of two methods:
A) Calculate the average of the first and last test-taker
B) Calculate the average of all test takers

A) Calculate the Average of the First and Last Test-Taker


One way the author used to determine the maximum time a test should take was to average the times of the student who completed the test first and the student who finished last during piloting. For example, while you are trying out a test, the first test taker completes the test after 50 minutes while the last finishes after 90 minutes. Calculate the average time by adding 50 and 90 and dividing by 2: (50 + 90) ÷ 2 = 70. The test time should therefore be 70 minutes. If you are in doubt because the first student finished too early, you can base the calculation on the second or third student who finished.

B) Calculate the Average of All Test Takers

You could also determine the test length by calculating the average time of all the test takers, not just the first and last. Table 11.4 shows the time each of 15 test-takers spent to finish a reading test, and the total time they spent. We calculate the average time as follows: 981 (minutes) ÷ 15 (total number of test-takers) = 65.4, rounded to 65 minutes. This means the time of the actual reading test was 65 minutes. The coming section explains how we checked our curriculum test for reliability in addition to determining the test length.


Table 11.4. The time each test taker spent on a reading test


Test taker   Time spent on the reading test (minutes)
1            40
2            43
3            51
4            55
5            77
6            63
7            65
8            59
9            60
10           60
11           67
12           80
13           82
14           89
15           90
Total        981 minutes
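Both timing methods reduce to simple averaging. A short Python sketch below reproduces the two worked examples, using the figures from the text and Table 11.4:

# Method A: average of the first and last finisher (50 and 90 minutes
# in the earlier example).
print((50 + 90) / 2)  # 70.0 minutes

# Method B: average of all 15 test takers from Table 11.4.
times = [40, 43, 51, 55, 77, 63, 65, 59, 60, 60, 67, 80, 82, 89, 90]
print(round(sum(times) / len(times)))  # 65 minutes (981 / 15 = 65.4)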

5. THE CURRICULUM (REFERENCE) TEST RELIABILITY PROCESS

Having prepared the test, the author tried it out on 30 student-teachers from a similar population in order to calculate reliability, to conduct item analysis by singling out weak items affecting reliability, and to determine the test time. As explained earlier in this chapter, reliability refers to "how much measurement error is present in the scores yielded by the test" (Gall et al., 1996, p. 254). The Cronbach's Alpha method checked the extent to which test scores were free of measurement error (error variance) so as to ensure consistent results. The test was checked for internal consistency to ensure that the subjects' performance was consistent across all the test items, without performance improving on some sections more than others. Though split-half, Kuder-Richardson and Cronbach's Alpha all check internal consistency and require running instruments once, Kuder-Richardson and Cronbach's Alpha differ from split-half in not splitting the test into two halves. Moreover, Kuder-Richardson suits dichotomous item types only (e.g., yes/no questions). As a result, Cronbach's Alpha was used because it suited the curriculum test's multiple-choice items and checked the variances of all items from first to last. The author calculated the test reliability using SPSS, version 14. Cronbach's Alpha for the test was 0.83; a coefficient of .80 or higher is considered reliable (Gall et al., 1996).
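The author used SPSS, but the same coefficient can be checked in a few lines of code. Below is a minimal Python sketch of the standard Cronbach's Alpha formula; the response matrix is hypothetical, and libraries such as pingouin offer a comparable cronbach_alpha function:

# Cronbach's Alpha: k / (k - 1) * (1 - sum(item variances) / total variance).
# The response matrix below is hypothetical illustration data.

def variance(xs):
    # Sample variance (divides by n - 1).
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(responses):
    k = len(responses[0])  # number of items
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores of 5 takers on 4 scaled items.
responses = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]
print(round(cronbach_alpha(responses), 2))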

6. THE CURRICULUM (REFERENCE) TEST TIME


The author determined the test length (time) by calculating the average time spent by all 30 students who took the test during the trying-out process. Table 11.5 shows the time spent by each of the 30 student-teachers who took the curriculum test; together they spent 2070 minutes. The average time was calculated as 2070 (minutes) ÷ 30 (number of test-takers) = 69, rounded up to 70 minutes. Therefore, the actual curriculum test took 70 minutes. We now turn to chapter 12 to discuss the test administration process and how the administration of instruments affects the overall reliability and the accuracy of scores.

Table 11.5. The time test takers spent to complete the curriculum test in minutes

Test taker   Min.    Test taker   Min.    Test taker   Min.
1            35      11           60      21           77
2            41      12           60      22           77
3            41      13           60      23           77
4            45      14           60      24           77
5            50      15           67      25           81
6            50      16           73      26           92
7            54      17           74      27           100
8            59      18           74      28           105
9            59      19           76      29           105
10           59      20           77      30           105

Min. = time spent in minutes. Total time spent by the 30 students = 2070; average time = 69.00 (70).


PART V


TEST CONSTRUCTION: TEST ADMINISTRATION, SCORING, ANALYSIS AND INTERPRETATION

Stage 7: Test Administration
Chapter 12: Administer the Test

Stage 8: Test Scoring, Analysis & Interpretation
Chapter 13: Score the Test
Chapter 14: Analyze the Test
Chapter 15: Interpret the Test
Chapter 16: A Curriculum Test Scoring, Analysis & Interpretation


Chapter 12

STAGE 7: ADMINISTER THE TEST

PRE-READING REFLECTIONS


Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

Who are the examiners, examinees and proctors?
What should examiners know about the test?
What should examinees be clear about with regard to the test?
What are the responsibilities of invigilators during a test?
How can proctors influence test reliability?
How can testing rooms influence test reliability?

AIM(S) OF THE CHAPTER

To minimize measurement errors during test taking/administration.


OBJECTIVES OF THE CHAPTER

To provide clear instructions to proctors.
To provide clear instructions to examiners.
To provide clear instructions to examinees.

INTRODUCTION


The process of final test administration has crucial importance for students, teachers, schools and researchers, because test scores have serious implications for all of them. Specifically, test administration can influence the test's reliability and, as a result, produce false scores. Therefore, the trying-out procedures of tests should provide clear indications about the optimal conditions of final or actual test administration (Hughes, 2003). In the light of test piloting, careful procedures should be considered with regard to:
1. Examiners
2. Invigilators or Proctors
3. Test Takers
4. Rooms
5. The Curriculum (Reference) Test Administration Process

1. EXAMINERS

Examiners are people, like teachers, who use tests to examine students in content related to the test; the examiner could, of course, be the test developer. It should go without question that examiners are familiar with the test instructions, know how to read them, and are capable of using any equipment related to test taking. Oral examiners in particular should be well trained.

2. INVIGILATORS OR PROCTORS

Invigilators or proctors are the people who watch the test takers or examinees while they take a test. They should be familiar with what the students can do, what they are not allowed to do, and what materials the test takers are authorized to use, like dictionaries, calculators, and so on. Proctors can easily render a test unreliable if they do not enforce exam rules.

3. TEST TAKERS

Test takers are the individuals who are asked to answer the items of the test. They need to receive clear instructions about the venue of the test, the allowed materials and equipment, and where exactly to sit. Test takers or examinees should know what is appropriate for them to do during the test and what is prohibited, and they should arrive before the beginning of the test.

4. ROOMS


Rooms should be spacious enough to allow test takers to sit far from one another. Furthermore, test rooms should be well equipped with the necessary materials and equipment, like cassette players and speakers where necessary, in addition to adequate lighting, a clock, and ventilation.

5. THE CURRICULUM (REFERENCE) TEST ADMINISTRATION PROCESS

We briefly show how the curriculum test was administered. Because all students studied the curriculum course and all took this as the final exam, the author (examiner) and two other proctors or invigilators supervised the test when it was administered, as part of the college exam policy. Clear written instructions on the test paper and answer sheet made oral instructions unnecessary (see the instructions in appendices A, B, C and D). The students were allowed to use calculators to work out content balance. The other invigilators were familiar with the test procedures. Test takers received college instructions about the test venue, materials, allowed equipment, and where exactly to sit. Running the piloting version was similar to the actual test administration (Hughes, 2003). We now turn to the final stage of test construction: scoring, analyzing and interpreting the test (chapter 13).


Chapter 13

STAGE 8: SCORE, ANALYZE AND INTERPRET THE TEST


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What is scoring?
What are the types of test scoring?
What are the types of essay test scoring?
What do impressionistic and analytic test scoring have in common?
Draw a comparison between essay and objective test scoring in terms of their advantages and disadvantages.
What is objective test scoring?
Define essay test scoring.
How can you score an essay test according to the holistic scoring system?
How can you score an essay test according to the analytic scoring system?


AIM(S) OF THE CHAPTER

To score different types of tests.

OBJECTIVES OF THE CHAPTER

To define scoring.
To state types of test scoring.
To mention the two types of essay test scoring.
To compare impressionistic and analytic test scoring.
To draw a comparison between essay and objective test scoring.
To define objective test scoring.
To define essay test scoring.
To score an essay test according to the holistic scoring system.
To score an essay test according to the analytic scoring system.


INTRODUCTION

This and the next three chapters discuss the final stage of test construction (score, analyze and interpret test scores). This chapter solely discusses the procedures of scoring tests, while chapters 14 and 15 shed light on the process of test analysis. The final chapter (16) discusses the interpretation of test scores. This chapter (13) proceeds in this order:
1. Essay Test Scoring
2. Objective Test Scoring
3. The Curriculum (Reference) Test Scoring

1. ESSAY TEST SCORING

Scoring is the process of assigning points to each of the test items and aggregating the individual points into an overall test score. Scoring can take place in an essay or objective scoring format. Since the two scoring systems involve different procedures, this section highlights the procedures involved in each. Hughes (2003) describes two approaches to essay test scoring: holistic and analytic scoring. The common theme in both is the development of a scoring scale; the process involves developing the elements of a scale and training scorers in how to use it. The following paragraphs discuss holistic and analytic scoring in this order:
1.1. Holistic/Impressionistic Test Scoring
1.2. Analytic Test Scoring


1.1. Holistic/Impressionistic Test Scoring

Holistic or impressionistic scoring is where a scorer reads the whole essay (test) and gives it a score based on his/her overall impression. Most often, two scorers score an essay test independently, so two independent scores are given and matched. A third scorer or more may be needed when there is a big disparity between the two independent scores. In all cases, a scoring system or rubric is needed to guide the scorers. Table 13.1 shows a holistic scoring system for a writing test inspired by Hughes (2003, pp. 96-97).

1.2. Analytic Test Scoring

In analytic scoring a task is divided into a number of elements, each allocated a separate score. For example, a writing task will first be divided into ideas, grammar, vocabulary, organization, fluency and mechanics (punctuation/spelling errors etc.). Second, each of these elements is assigned a separate score. Moreover, each element can be subdivided into smaller elements, each allocated its own points. Table 13.2 shows a score of 20 allocated to one aspect (ideas) of a writing piece, and the sub-scores assigned to each of its sub-elements.


Table 13.1. A holistic scoring system of a writing test

[6] Examinee demonstrates proficiency at the rhetorical and structural levels, despite making occasional errors. He/she:
competently addresses the writing task.
competently develops the main ideas the task should usually cover.
competently develops each main idea by writing enough supporting ideas.
competently organizes the writing task by addressing each main idea in a separate paragraph.
avoids repetition of ideas.
demonstrates consistent facility in the use of language.
displays competence in sentence construction.
displays competence in word choice.
makes no punctuation mistakes.
shows smooth transition between ideas and paragraphs.
organizes and develops the task well into introduction, message and conclusion.
uses a balanced objective tone in presenting ideas.

[5] Examinee demonstrates proficiency at the rhetorical and structural levels, despite making few errors. He/she:
addresses some aspects of the writing task more competently than others.
develops most of the main ideas the task should usually cover.
develops each main idea by writing many but not all the necessary supporting ideas.
organizes the writing task by addressing each main idea in a separate paragraph.
repeats few details.
demonstrates facility in the use of language.
displays acceptable sentence construction.
displays acceptable word choice.
makes few punctuation mistakes.
shows transition between ideas and paragraphs.
organizes and develops the task into introduction, message and conclusion.
uses an objective tone in presenting ideas.

[4] Examinee demonstrates average competence at the rhetorical and structural levels. He/she:
addresses most of the writing task aspects but misses some aspects.


develops most of the main ideas the task should usually cover.
develops the main ideas but slights a few necessary supporting ideas.
addresses each main idea in a separate paragraph, though not completely.
repeats a few details.
demonstrates adequate but inconsistent facility in the use of language.
displays acceptable sentence construction but makes a few structural mistakes.
displays acceptable word choice but uses a few ambiguous words.
makes a few punctuation mistakes.
shows transition between ideas and paragraphs, but not adequately.
organizes and develops the task into introduction, message and conclusion to some extent.
uses an objective tone in presenting most but not all ideas.

[3] Examinee demonstrates inadequate competence at the rhetorical and structural levels. He/she:
addresses a few aspects of the writing task and misses a great part.
develops few of the main ideas the task should usually cover.
slights most of the necessary supporting details.
discusses two main ideas in a single paragraph.
repeats a great deal.
demonstrates inability in the use of language.
displays serious sentence construction mistakes.
displays inappropriate word choice.
makes many punctuation mistakes.
shows inadequate transition between ideas and paragraphs.
organizes and develops the task into introduction, message and conclusion unclearly.
uses an inconsistent tone in presenting ideas.

[2] Examinee demonstrates deficiencies at the rhetorical and structural levels. He/she:
reveals confusion in idea development.
shows disorganization.
uses few and irrelevant details.
reveals frequent structural and usage mistakes.
shows little awareness of transition in writing.
is unable to link the writing elements.


[1] Examinee demonstrates incompetence in writing. He/she:
reveals serious incompetence in idea development.
shows serious disorganization.
uses very few and irrelevant details.
reveals serious and frequent structural and usage mistakes.
shows no awareness of transition in writing.
fails to appropriately link the writing elements.

Table 13.2. Scores allocated to a writing piece's main and sub-elements (Ideas score = 20)

Main ideas (10 points):
Addresses 5 ideas = 10; addresses 4 ideas = 8; addresses 3 ideas = 6; addresses 2 ideas = 4; addresses 1 idea = 2; addresses no ideas = 0.

Supporting ideas (10 points):
Addressing all aspects of each main idea without repetition = 10; addressing most aspects of each main idea with few repetitions = 8; missing some aspects of each main idea with little repetition = 6; missing many aspects of each main idea with many repetitions = 4; addressing few aspects of each main idea with many repetitions = 2; addressing no relevant aspects of each main idea = 0.

Total (main + supporting): 20, 16, 12, 8, 4, 0.

2. OBJECTIVE TEST SCORING

Scoring objective tests, like multiple-choice questions, is an easy task and can be 100% reliable, because each question is assigned a specific point value and no judgment is required of the scorers. An answer key is developed and provided to scorers, like the one in appendix C; all the scorers need to do is match the answers on the test with the answers on the key. There are cases, however, where scoring is faulty because scorers are careless in matching items to the answer key. Therefore, there should be two scorers.


3. THE CURRICULUM (REFERENCE) TEST SCORING


Although we provide the whole curriculum test in appendix A and the answer key in appendix C, we give a mini example below.

A) Multiple-choice questions: Tick (✓) the correct option in each of the following: (1 point each)

1. Assessment is:
a) a process of gathering information about people's ability via several data collection instruments.
b) a tool for gathering information about individuals' ability by means of tests.
c) a process of gathering information about individuals' ability by means of one instrument.

2. Formative assessment is a process of gathering information:
a) at the end of a learning program or course to measure what the students have achieved.
b) during and at the end of a learning program or course to measure what the students have achieved.
c) during a program or course to check on the progress achieved and using the resulting information to modify future learning/teaching plans.

3. Which of the following instruments are usually used in summative assessment:
a) Informal tests and quizzes.
b) Formal tests.
c) All of them.
d) None of them.

B) Matching questions

1. Write terms in column (A) that match the descriptions in column (B): (1 point each)


N   Column (A)   Column (B)
1   ........     Individuals possess an advanced level of the knowledge and skills which enable them to achieve a particular task with greater ability and higher performance.
2   ........     A tool for gathering information about individuals' ability in a structured situation to maintain consistency of the test administration and scoring.
3   ........     A tool for gathering information about individuals' aspects of personality to reveal a trait or feeling.
4   ........     A tool for gathering information about individuals' difficulties, weaknesses and strengths to determine what learning still needs to take place.

Here is the answer key for the above two sections of the test (also see appendix C).

A) Multiple-choice questions

Question:  1   2   3
Answer:    A   C   B

B) Matching questions

N   Column (A)
1   Proficiency
2   A standardized test
3   A self-reporting measure
4   A diagnostic test

Having obtained precise scores from the test, we turn to chapter 14 to shed light on the process of test analysis.


Chapter 14

STAGE 8: SCORE, ANALYZE AND INTERPRET THE TEST (CONTINUED)


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What are the steps of creating a simple frequency distribution table?
What are the steps of creating a frequency distribution table of grouped scores?
What is descriptive statistics used for?
What is inferential statistics used for?
Calculate the average, median, standard deviation, range and mode from these scores: 1, 3, 4, 5, 5, 5, 6, 10, 15, 23, 27, and 30.
Ten out of 80 students obtained the score of 18 out of 30. Calculate the percentage of these students who obtained 18.
33 percent of the 24 students who took a vocabulary test scored 19. If the teacher wants to know the actual number of students, what should he/she do?
A teacher wrote a grammar test of 13 questions and assigned each question a point. The teacher wanted the total test score to be 10. A student who answers all 13 questions will score 13 out of 10, which is not logical. Calculate a factor score for a student who obtained 11 out of 13: what should this student obtain out of 10?
Use Excel or SPSS to make a bar, line and pie chart of the students who obtained these grades: F = 5 students, D = 3 students, D+ = 4 students, C = 0 students, C+ = 9 students, B = 1 student, B+ = 2 students, A = 3 students, and A+ = 1 student.

AIM(S) OF THE CHAPTER

To analyze the scores obtained from tests and other instruments in the form of findings.


OBJECTIVES OF THE CHAPTER

To create a simple frequency distribution table of ungrouped scores.
To create a frequency distribution table of grouped scores.
To mention what descriptive statistics is used for.
To mention what inferential statistics is used for.
To calculate the average, median, standard deviation, range and mode from these scores: 1, 3, 4, 5, 5, 5, 6, 10, 15, 23, 27, and 30.
To calculate a percentage, as in this example: ten out of 80 students obtained the score of 18 out of 30; calculate the percentage of students who obtained 18.
To calculate raw scores and raw numbers from percentages, as in this example: 33 percent of the 24 students who took a vocabulary test scored 19; how can the teacher find the actual number of students?
To calculate a factor score, as in this example: what should a student who obtained 11 out of 13 obtain out of 10?
To use Excel or SPSS to make a bar, line and pie chart of the students who obtained these grades: F = 5 students, D = 3 students, D+ = 4 students, C = 0 students, C+ = 9 students, B = 1 student, B+ = 2 students, A = 3 students, and A+ = 1 student.


INTRODUCTION

This chapter continues the discussion of stage 8 of test construction. Chapter 13 focused on test scoring, whereas this chapter discusses the procedures of test analysis. To analyze test scores, teachers and researchers can make use of all or some of the following:
1. Tabulate Scores/Enter Data into Computer
2. Use Descriptive Statistics
3. Use Inferential Statistics
4. Graphical or Visual Representation of Scores


1. TABULATE SCORES/ENTER DATA INTO COMPUTER

Create an ungrouped frequency distribution table to manage the process of test analysis and interpretation. An ungrouped frequency distribution table organizes a small number of scores and usually consists of three columns: (a) raw score, (b) tally, and (c) frequency (see Table 14.1).

The raw score column is for the raw scores themselves. Arrange the raw scores in descending order on a draft paper, then write the first raw score in the first cell of the raw score column. The tally column holds the tallies that represent the frequency of each score: after writing a score in the raw score column, inspect the remaining scores on the draft paper, and each time that score recurs, insert a tally opposite it in the tally column, crossing the score off the draft paper. Repeat this step until all scores appear in the raw score column and all their tallies in the tally column. The frequency column turns the tallies into numbers: count the tallies in each cell of the tally column and write the count in the corresponding cell of the frequency column.

Table 14.1 shows the scores, tallies and frequencies of 20 students who took a reading test. Out of these 20 students, two scored 4 out of 30, two obtained 7, one obtained 9, another obtained 10, three scored 15, one obtained 17, three obtained 19, two obtained 22, two obtained 25, two obtained 27 and one scored 29 out of 30.


Table 14.1. A frequency distribution table of ungrouped scores


Scores   Tally   Frequency
4        //      2
7        //      2
9        /       1
10       /       1
15       ///     3
17       /       1
19       ///     3
22       //      2
25       //      2
27       //      2
29       /       1
Total test score = 30   20 (students' number)
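Because computers do this job more efficiently (as noted below), here is a minimal Python sketch that reproduces Table 14.1 with collections.Counter; the tally strings are generated from the counts:

from collections import Counter

# The 20 reading-test scores from Table 14.1.
scores = [4, 4, 7, 7, 9, 10, 15, 15, 15, 17,
          19, 19, 19, 22, 22, 25, 25, 27, 27, 29]

freq = Counter(scores)
for score in sorted(freq):
    # Print score, tally marks, and frequency, as in the table.
    print(f"{score:>2}  {'/' * freq[score]:<4} {freq[score]}")
print("Total:", sum(freq.values()))  # 20 students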

Teachers will need to create a frequency distribution table of grouped scores to manage the scores of a large number of students, because a simple frequency table will not be practical. Teachers can follow these steps to create a grouped frequency distribution table:

Calculate the range by subtracting the smallest score from the highest and adding one.
Divide the range by the number of proposed classes, using this simple formula:

Class interval = Range ÷ Proposed classes

To make sure your calculation is correct, subtract the smallest score of the first class from its highest score and add one; if the result matches the class interval you calculated, the calculation is correct. Round fractions up to the nearest integer.

Table 14.2 shows the scores of 70 students, which range between 17 and 53. The range was first calculated: 53 − 17 + 1 = 37. We then divided the range (37) by the number of proposed classes; we proposed 10 classes between which all scores would fall. Here is the calculation:


Table 14.2. A frequency distribution table of grouped scores

Class     Tally                    Frequency
17 − 20   //                       2
21 − 24   ////                     4
25 − 28   ///// /                  6
29 − 32   ///// //                 7
33 − 36   ///// ///// /////        15
37 − 40   ///// ///// ///// ///    18
41 − 44   ///// //                 7
45 − 48   /////                    5
49 − 52   ////                     4
53 − 56   //                       2
Total                              70


37 ÷ 10 = 3.7, rounded up to 4

The class interval is therefore 4. Start from the smallest score (17) and form the first class by adding the interval: the first class is 17 − 20, the second 21 − 24, and so on, adding 4 each time until the highest score (53) is included. To make sure the calculation of the class interval is correct, we subtracted the smallest score of the first class from its highest score and added one (20 − 17 + 1 = 4). Indeed, there is little need these days to create frequency tables manually, because computers do the job more efficiently: Excel, SPSS or another spreadsheet will tabulate whatever scores you have.
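The same binning procedure takes only a few lines of Python; the score list below is hypothetical illustration data, while the range and interval match the worked example above:

import math
from collections import Counter

def grouped_frequency(scores, proposed_classes=10):
    lo, hi = min(scores), max(scores)
    score_range = hi - lo + 1                             # 53 - 17 + 1 = 37
    interval = math.ceil(score_range / proposed_classes)  # 3.7 -> 4
    # Assign each score to the class that starts at lo, lo+interval, ...
    bins = Counter((s - lo) // interval for s in scores)
    for b in sorted(bins):
        start = lo + b * interval
        print(f"{start} - {start + interval - 1}: {bins[b]}")

# Hypothetical scores ranging from 17 to 53.
grouped_frequency([17, 23, 25, 31, 34, 35, 38, 40, 41, 47, 53])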

2. USE DESCRIPTIVE STATISTICS

Descriptive statistics is used to describe the data a test and other data collection methods yield. This usually involves statistical techniques like the mean, median, mode, range, percentages, and standard deviation. It gives estimates of the status quo without making inferences about the data. Most often, teachers and researchers need to know the typical score of students so that they can compare students' scores, as is the case in norm-referenced tests. Moreover, teachers need to know the extent to which scores disperse or


deviate from the average. Therefore, teachers need to use descriptive statistics like those discussed below to analyze and interpret the test scores of their classes.

Calculate the Average/Mean

Though all calculations could and should be done using computers, there are times when we can do simple but important calculations manually. The average can be calculated from grouped and ungrouped scores. If scores are ungrouped, add them up and divide by the number of students:

Average = Sum of scores ÷ Number of students

For example, 17 students who took a writing test obtained the scores shown in the box below.


Scores: 59, 61, 71, 60, 62, 74, 86, 59, 73, 72, 85, 75, 67, 70, 62, 78, 81 (Total = 1195)

The average can be calculated as 1195 ÷ 17 = 70.29, rounded to 70. The average score is, therefore, 70. If scores are grouped you need to: (a) multiply each score by its frequency; (b) add up the (score × frequency) values; and (c) divide the total by the number of students:

Average = Sum of (score × frequency) ÷ Number of students

Table 14.3 shows grouped scores of 17 students who took a writing test. We divided 1195 ÷ 17 = 70.29, rounded to 70.


Table 14.3. Grouped scores for calculating the average

Score            Frequency   Score × frequency
60               5           300
62               1           62
67               1           67
70               1           70
71               1           71
72               2           144
75               2           150
80               1           80
81               1           81
85               2           170
Test score 100   17          1195

Average: 1195 ÷ 17 = 70.29, rounded to 70


Calculate the Average from Grade Point Average (GPA)

Teachers also need to calculate the average of a particular grade point average (GPA). They first need each student's score and the letter grade that represents it, as shown in Table 14.4.

Table 14.4. Seventeen students' grades

Student:  1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17
Score:    60  60  71  60  62  75  85  60  72  72  85  75  67  70  60  80  81
Grade:    D   D   C   D   D   C+  B+  D   C   C   B+  C+  D+  C   D   B   B

Second, they calculate the grade point average. Table 14.5 shows the grade point average calculations for the 17 students. The table is organized as follows:
write the letter grades (column 1)


write the numeric value (GPA) of each letter grade (column 2)
write the frequency of students who obtained the same grade
multiply the frequency of students who got the same grade by the GPA value
add up the (GPA × frequency) products to get the total
divide that total by the number of students to get the average


Table 14.5. Grade point average calculations

Grade   GPA   Frequency (F)   GPA × F
A       4     0               0 × 4 = 0
B+      3.5   2               2 × 3.5 = 7
B       3     2               2 × 3 = 6
C+      2.5   2               2 × 2.5 = 5
C       2     4               4 × 2 = 8
D+      1.5   1               1 × 1.5 = 1.5
D       1     6               6 × 1 = 6
F       0     0               0 × 0 = 0
Total         17              33.5

Sum of (GPA × F) ÷ sum of F: 33.5 ÷ 17 = 1.97 (≈ 2). Average = 1.97.

Table 14.5 shows that the average is 1.97 (rounded to 2). Since the minimum GPA is 1 and the maximum is 4, we can say that the majority of students performed around the average, since their grade point average of 1.97 (about 2) matched the average GPA.
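The same calculation is easy to script. A minimal Python sketch follows, with the point values and frequencies taken from Table 14.5:

# GPA point values follow Table 14.5.
GPA_VALUES = {"A": 4, "B+": 3.5, "B": 3, "C+": 2.5,
              "C": 2, "D+": 1.5, "D": 1, "F": 0}

def grade_point_average(grade_counts):
    # Sum of (GPA x frequency) divided by the number of students.
    total_points = sum(GPA_VALUES[g] * n for g, n in grade_counts.items())
    total_students = sum(grade_counts.values())
    return total_points / total_students

counts = {"B+": 2, "B": 2, "C+": 2, "C": 4, "D+": 1, "D": 6}
print(round(grade_point_average(counts), 2))  # 1.97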

Calculate the Mode

Since the mode is the most frequent score (Charles, 1988), the mode of the 17 scores listed in Table 14.4 above is 60, because it is the score repeated most often.

Calculate the Median

The median is the middle score. To calculate the median of ungrouped scores, teachers need to arrange all scores in ascending or descending order, as shown below.

60  60  60  60  60  62  67  70  71  72  72  72  75  80  81  85  85


If the number of scores is odd, choose the middle score; if it is even, calculate the average of the two middle scores. The median of the above 17 students' scores is 71, which is close to the calculated average of 70 (see Table 14.3 above).
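Python's statistics module implements exactly these mode and median rules; the sketch below applies them to the 17 scores from Table 14.4:

import statistics

scores = [60, 60, 60, 60, 60, 62, 67, 70, 71,
          72, 72, 72, 75, 80, 81, 85, 85]

print(statistics.mode(scores))    # 60 (most frequent score)
print(statistics.median(scores))  # 71 (middle of 17 sorted scores)

# With an even count the median is the mean of the two middle scores:
print(statistics.median([60, 62, 70, 72]))  # 66.0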

Calculate the Range

Central tendency measures (average, median, and mode) do not show the whole picture of the scores a test yields. They show the typical score of a group, but not the dispersion of scores. Variability measures such as the range and standard deviation show the extent to which scores disperse. According to Charles (1988, p. 155), the range is a simple measure of variability: the difference between the highest and lowest score. The formula is R = highest score − lowest score + 1. We can calculate the range from the scores shown in the following box.


A: 60  62  67  70  71  72  75  80  81  85    R = 85 − 60 + 1 = 26
B: 50  52  53  56  57  58  59  60  61  62    R = 62 − 50 + 1 = 13

The calculation for row A is R = 85 − 60 + 1 = 26, a range that shows a high spread or variability of scores. By contrast, the calculation for row B is R = 62 − 50 + 1 = 13, a range that shows low variability, or little dispersion of scores from the average. However, the range alone is not enough to show variability or score dispersion; the standard deviation does a better job in this respect.

Table 14.6. Five students' scores for the standard deviation calculation

Score    Mean   Mean − score (deviation)   Squared deviation
3        5      5 − 3 = 2                  4
4        5      5 − 4 = 1                  1
5        5      5 − 5 = 0                  0
5        5      5 − 5 = 0                  0
8        5      5 − 8 = −3                 9
Sum 25                                     14


Calculate the Standard Deviation (S.D.)

The standard deviation shows the extent to which scores deviate from the mean. Hughes (2003) gives an example of two groups of students. The first group scored 48, 49, 50, 51 and 52, whereas the second scored 10, 20, 40, 80 and 100. Though both groups had the same average of 50, the dispersion of scores was different: scores of the first group are close to the mean, whereas those of the second group are far from it. Therefore, the average alone is misleading. Hughes (2003, p. 221) explains: "just as the mean can be seen as a typical score on a test, the standard deviation can be seen as a typical distance from the mean." Table 14.6 shows the calculations of the standard deviation of five students' scores on a test scored out of 10. The standard deviation can be calculated through this formula:

S.D. = √( Σ(mean − score)² ÷ N )

First, we calculated the average: 25 ÷ 5 = 5. Second, we subtracted each score from the mean to get each deviation, for example 5 − 3 = 2 (column 3). Third, we squared each deviation; for example, 2 became 4 (column 4). Fourth, we added up these squared deviations: 4 + 1 + 0 + 0 + 9 = 14 (column 4). Fifth, we divided the total of squared deviations by the sample size minus one (14 ÷ 4 = 3.5); for a sample smaller than 15, the sum of squared deviations is divided by N − 1, while for samples over 15 it is divided by N. Sixth, we took the square root: √3.5 ≈ 1.87. This standard deviation shows that the scores cluster fairly close round the mean (5).
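The six steps above compress into a few lines of Python, using the five scores from Table 14.6 and dividing by N − 1 as the text recommends for small samples (statistics.stdev does the same):

import math

scores = [3, 4, 5, 5, 8]
mean = sum(scores) / len(scores)                       # 25 / 5 = 5
squared_devs = [(mean - s) ** 2 for s in scores]       # 4, 1, 0, 0, 9
sd = math.sqrt(sum(squared_devs) / (len(scores) - 1))  # sqrt(14 / 4)
print(round(sd, 2))                                    # 1.87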

Calculate the Percentage

A percentage is calculated by dividing a score (or frequency) by the total and multiplying the result by 100, through this formula:

Percentage = (Score ÷ Sum of scores) × 100


Table 14.7. The percentages of 24 students' scores

Scores           Frequency (students)   Percentage
4                1                      1 ÷ 24 × 100 = 4.17%
10               4                      4 ÷ 24 × 100 = 16.67%
15               6                      6 ÷ 24 × 100 = 25%
19               8                      8 ÷ 24 × 100 = 33.33%
27               2                      2 ÷ 24 × 100 = 8.33%
29               3                      3 ÷ 24 × 100 = 12.5%
Test score 30    24                     100%

Table 14.7 shows the percentages of the scores that 24 students obtained on a vocabulary test scored out of 30. For example, the single student who scored 4 out of 30 represents about 4 percent of the 24 students, whereas the eight students who scored 19 represent about 33 percent of them.


Calculate the Raw Score or Number of Students from a Given Percentage

Sometimes teachers need to know the actual number of students, or the actual score, behind a given percentage. In this case, they multiply the percentage by the total number of students or scores and divide by 100, using this formula:

Number = (Percentage × total number of students/scores) ÷ 100

For example, Table 14.7 above shows that 33 percent of the 24 students scored 19 on the vocabulary test. If the teacher wants to know the actual number of students, he or she uses the formula as follows:

(33 × 24) ÷ 100 = 7.92 (rounded to 8)
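Both directions of the calculation fit in a couple of helper functions; the figures below come from Table 14.7:

def percentage(count, total):
    # Proportion of the group, expressed as a percentage.
    return count / total * 100

def count_from_percentage(pct, total):
    # Recover the raw count behind a percentage.
    return round(pct * total / 100)

print(round(percentage(8, 24), 2))    # 33.33 (8 of 24 students)
print(count_from_percentage(33, 24))  # 8     (7.92 rounded)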


It is clear now that the real number behind 33% of the students is 8, which matches the count in Table 14.7 above.

Table 14.8. Factor scores of students on a grammar test

Student score   Maximum score on the test   Assigned new score   Factor score
6               13                          10                   6 ÷ 13 × 10 ≈ 4.6
8               13                          10                   8 ÷ 13 × 10 ≈ 6.2
13              13                          10                   13 ÷ 13 × 10 = 10

Calculate a Factor/ Converted Score


At other times, teachers need to calculate a factor score. For example, a teacher wrote a grammar test of 13 questions and assigned each question one point, but wanted the total test score to be 10. A student who answers all 13 questions would score 13 out of 10, which is not logical. You need to calculate a factor score as follows:

Factor score = (Student's score ÷ Total test score) × assigned new score

Table 14.9. Twenty students' grades

Scores   Students   Grade
4        2          F
7        2          F
9        1          F
10       1          F
15       3          F
17       1          F
19       3          D
22       2          C
25       2          B
27       2          A
29       1          A+


Table 14.8 shows the factor scores of students on the grammar test. The student's score of 13 has been converted to 10 out of 10, as if the student had scored 10 out of 10. Similarly, the student's score of 8 out of 13 converts to about 6 out of 10.
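The conversion formula above is a one-liner; the sketch below reproduces Table 14.8:

def factor_score(raw, total, new_total):
    # Rescale a raw score onto the assigned new total.
    return raw / total * new_total

for raw in (6, 8, 13):
    print(raw, "->", round(factor_score(raw, 13, 10), 1))
# 6 -> 4.6, 8 -> 6.2, 13 -> 10.0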

3. USE INFERENTIAL STATISTICS

In many cases, researchers and teachers also need inferential statistics to compare the means of groups of students, in order to generalize the findings, or characteristics, found in a sample to the population from which that sample was drawn. Depending on their purposes, teachers and researchers can use various statistical techniques, including chi-square, t-tests, analysis of variance and others.


4. GRAPHICAL OR VISUAL REPRESENTATION OF SCORES

We represent the scores from Table 14.9 graphically in the following three ways to interpret them more clearly:

Representation of data through a bar chart
Representation of data through a pie chart
Representation of data through a curve chart

Representation of Data through a Bar Chart

Representing data visually through graphs like histograms and bar, pie and line charts makes the results or figures clear. Figure 14.1 shows the grades of the 20 students on a criterion-referenced test as a bar chart. The vertical axis shows the number of students scoring within a specific range of scores; the horizontal axis shows the grades.


[Bar chart omitted: vertical axis shows grade frequency (0–10); horizontal axis shows grades A+ through F.]

Figure 14.1. A bar chart of 20 students' grades on a criterion-referenced test.


Column F shows that the largest number of students (10) failed the test, whereas column D shows that the largest number of students who passed the test obtained grade D. Columns D+, C+, and B+ show no student obtained grades D+, C+, or B+. Almost an equal number of students (2) obtained grades C, B, and A as columns C, B, and A show. Finally, only a single student obtained the A+ grade as column A+ indicates.

[Pie chart omitted: segments represent grades F, A+, A, B+, B, C+, C, D+ and D.]

Figure 14.2. A pie chart of 20 students' grades on a criterion-referenced test.

Representation of Data through a Pie Chart

Figure 14.2 shows the grades of the same 20 students listed in Table 14.9 as a pie chart. The pie chart is consistent with the bar chart: the largest (dark blue) segment of the pie shows that the largest number of students (10) obtained grade F, and the other segments reflect the same grade distribution.

Representation of Data through a Curve Chart


Figure 14.3 shows the grades of the same 20 students, who took a criterion-referenced test, on a line/curve chart. The line chart matches the results represented by both the bar and pie charts: it also shows that the largest number of students (10) obtained grade F, and its other points match those of the bar and pie charts. This distribution is not normal, as the majority of the scores are not clustered round the center. Rather, the majority of scores are clustered at the low end, with the tail extending toward the high scores (a positively skewed distribution). This indicates a problem with the test itself: it suggests the test was too difficult for most students. It could also mean the test was fine but instruction failed. We now move to chapter 15 to interpret, or make sense of, the scores and findings we get from tests and other data collection methods.

[Line chart omitted: vertical axis shows grade frequency (0–12); horizontal axis shows grades A+ through F.]

Figure 14.3. A line chart of 20 students' grades on a criterion-referenced test.
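As the text notes, Excel or SPSS can produce these graphs; the following minimal matplotlib sketch draws equivalent bar, pie and line charts from the Table 14.9 grade frequencies:

import matplotlib.pyplot as plt

# Grade frequencies from Table 14.9.
grades = ["A+", "A", "B+", "B", "C+", "C", "D+", "D", "F"]
counts = [1, 2, 0, 2, 0, 2, 0, 3, 10]

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))

ax1.bar(grades, counts)  # bar chart, as in Figure 14.1
ax1.set_xlabel("Grades")
ax1.set_ylabel("Grade frequency")

# Pie chart, as in Figure 14.2 (zero-count grades omitted from the pie).
shown = [(g, c) for g, c in zip(grades, counts) if c > 0]
ax2.pie([c for _, c in shown], labels=[g for g, _ in shown])

ax3.plot(grades, counts, marker="o")  # line/curve chart, as in Figure 14.3
ax3.set_xlabel("Grades")
ax3.set_ylabel("Grade frequency")

plt.tight_layout()
plt.show()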


Chapter 15

STAGE 8: SCORE, ANALYZE AND INTERPRET THE TEST (CONTINUED)


PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

What steps should be followed to interpret a norm-referenced test?
What steps should be followed to interpret a criterion-referenced test?
Can we interpret a single test as both a norm- and a criterion-referenced test?
What are the differences between norm- and criterion-referenced test interpretations?
Interpret the scores listed in the following table according to norm- and criterion-referenced test interpretations.

Raw scores:                   4  7  9  10  15  17  19  22  25  27  29
Score frequency (students):   2  2  1  1   3   1   3   2   2   2   1

AIM(S) OF THE CHAPTER

To interpret test scores according to norm- and criterion-referenced test interpretations.


OBJECTIVES OF THE CHAPTER

To mention the steps used for interpreting a norm-referenced test.
To mention the steps used for interpreting a criterion-referenced test.
To interpret a single test first as a norm-referenced test and second as a criterion-referenced test.
To compare norm- and criterion-referenced test interpretations.
To interpret the scores listed in the following table according to norm- and criterion-referenced test interpretations.

Raw scores:                   4  7  9  10  15  17  19  22  25  27  29
Score frequency (students):   2  2  1  1   3   1   3   2   2   2   1


INTRODUCTION

Like chapters 13 and 14, this chapter continues the discussion of stage 8 of test construction. Chapter 13 focused on test scoring and chapter 14 discussed the procedures of test analysis. This chapter discusses the process of test interpretation in the following order:
1. Interpret Norm-referenced Test Scores
2. Interpret Criterion-referenced Test Scores


1. INTERPRET NORM-REFERENCED TEST SCORES

Test developers interpret their tests as norm-referenced in order to compare a student's performance with that of the other students who took the same test. For example, one of 20 students who took a grammar test obtained 25 out of 30. This score (25) is compared with the other students' scores, and may place the student in the top or bottom 10 percent group or in the average group of students. A student's score or performance is thus compared to other students' scores rather than to what the student can actually do. This also means that a student might obtain a low score that still compares well with the others' scores if the highest score obtained is itself low (e.g., 15 out of 30). To interpret the scores obtained by students on a norm-referenced test, teachers should follow these steps:
(A) Create an Upward Frequency Table
(B) Calculate the Average
(C) Calculate the Percentage

(A) Create an Upward Frequency Table

An upward frequency table consists of the same three columns that grouped and ungrouped frequency tables comprise, in addition to an upward cumulative frequency column and a percentage cumulative frequency column. Test developers or teachers build the upward cumulative frequency column through a progressive or cumulative process in which they add each


frequency to the next. Moreover, test developers or teachers make a percentage cumulative frequency column by calculating percentages from the upward cumulative frequency column. Table 15.1 shows the scores of the 20 students who took the reading test (out of 30) in an upward frequency table. Before making sense of this table, we need to calculate the average first as shown below.

(B) Calculate the Average

After creating the upward cumulative frequency table, teachers need to calculate the average score of the group who took the test, to establish a comparison norm against which scorers can be compared. We calculated the average from grouped scores in simple steps (see chapter 14, section 2 above for average calculations). Table 15.2 shows the steps used in calculating the average: teachers first multiply each raw score by its frequency, then add up the products to get the total of (score × frequency), and divide this sum by the number of students. Teachers and test developers can use this formula to calculate the average:


Table 15.1. An upward frequency table

Raw score   Tally   Score frequency (students)   Upward cumulative frequency   Percentage cumulative frequency   Raw score percentage
4           //      2                            2                             10                                13.33
7           //      2                            4                             20                                23.33
9           /       1                            5                             25                                30
10          /       1                            6                             30                                33.33
15          ///     3                            9                             45                                50
17          /       1                            10                            50                                56.66
19          ///     3                            13                            65                                63.33
22          //      2                            15                            75                                73.33
25          //      2                            17                            85                                83.33
27          //      2                            19                            95                                90
29          /       1                            20                            100                               96.66


Table 15.2. Average calculations of 20 students

Raw score       Score frequency (students)   Score × frequency
4               2                            8
7               2                            14
9               1                            9
10              1                            10
15              3                            45
17              1                            17
19              3                            57
22              2                            44
25              2                            50
27              2                            54
29              1                            29
Test score 30   20                           337

Average = 337 ÷ 20 = 16.85 (rounded up to 17)


Average = Sum of (score × frequency) ÷ Total number of students

The average was calculated as 337 ÷ 20 = 16.85 (rounded up to 17). This average (17) formed the standard against which we compared students' performance on the test: students' scores were compared as above or below it. As pointed out above, we interpret norm-referenced test scores by comparing a student's score with the scores of the students who took the same test. This means no cut-off pass score is set in advance in norm-referenced tests; the passing score is determined after test scoring and average calculation, by comparing students' scores against the calculated average. Table 15.1 above shows a comparison between the scores of the 20 students. For example, the two students who scored 4 failed the reading test because their score fell far below the group average (17): they fell in the bottom 10 percent category of students, since 90 percent (100% − 10% = 90%) of the students' scores exceeded their score (4). By contrast, the two students who scored 27 fell in the top 5 percent of the group who took the reading test, since 95 percent (100% − 5% = 95%) of the students scored below them. Table 15.3 below shows the interpretation of all students' scores on the reading test.


Table 15.3 shows that the students scoring 4, 7, 9, 10, and 15 on the reading test failed it because their scores fell below the average (17).

Table 15.3. The interpretation of 20 students' scores on a reading test according to a norm-referenced test

Raw score | Score frequency (students) | Upward cumulative frequency | Percentage cumulative frequency | Result | Interpretation
4  | 2 | 2  | 10  | Fail | Bottom 10%; 90% scored above them
7  | 2 | 4  | 20  | Fail | Bottom 20%; 80% scored above them
9  | 1 | 5  | 25  | Fail | Bottom 25%; 75% scored above him/her
10 | 1 | 6  | 30  | Fail | Bottom 30%; 70% scored above him/her
15 | 3 | 9  | 45  | Fail | Bottom 45%; 55% scored above them
17 | 1 | 10 | 50  | Pass | Middle of the group; 50% scored at or below him/her
19 | 3 | 13 | 65  | Pass | Upper 35%; 65% scored at or below them
22 | 2 | 15 | 75  | Pass | Upper 25%; 75% scored at or below them
25 | 2 | 17 | 85  | Pass | Top 15%; 85% scored at or below them
27 | 2 | 19 | 95  | Pass | Top 5%; 95% scored at or below them
29 | 1 | 20 | 100 | Pass | Top 5%; all other students scored below him/her

2. INTERPRET CRITERION-REFERENCED TEST SCORES

Examiners, including teachers and researchers, use criterion-referenced tests to determine whether examinees can reach a cutoff score on a test. For example, if students take a test for which the cutoff pass score is 80 percent, then all students who obtain less than 80% fail, whereas those obtaining 80% or higher pass; it does not matter whether all students fail or all succeed. Unlike norm-referenced testing, comparing a student's performance with that of the others taking the same test has no place in the assessment process here. As Hughes (2003, p. 21) points out, criterion-referenced tests have two positive virtues:


1. they set meaningful standards in terms of what people can do, which do not change with different groups of candidates, and
2. they motivate students to attain those standards.

Interpreting the scores of the same 20 students who took the reading test according to criterion-referenced testing would yield different results; how different depends on the grading system (criterion) adopted. Table 15.4 shows a grading system for a 30-point test and its equivalence on a 100-point scale. If this system were adopted, the scores shown in Tables 15.1, 15.2, or 15.3 could now be interpreted differently. For example, Table 15.3 shows that the two students who scored 4 (13.33%) failed the reading test because they could not reach the cutoff pass score of 18 (60%) (see Table 15.4 below): their score reached only 13.33%, whereas the passing percentage is 60% of the total test score (30). These two students therefore obtained grade F (a score below 18, or below 60%). By contrast, the single student who scored 29 (96.66%) passed, obtaining grade A+ (a score between 28.5 and 30, or 95% and 100%). Row F indicates that 10 students failed the test by obtaining grade F; that is, 10 out of 20 students did not reach the test criterion (60 on a 100-point scale, or 18 out of 30). Row D indicates that three students obtained grade D, whereas row D+ shows that no student obtained grade D+. Row C indicates that two students obtained grade C, but row C+ shows that none obtained C+. Row B shows that two students obtained grade B, while row B+ shows that none obtained B+. Row A shows that two students obtained grade A, whereas row A+ indicates that only one student obtained A+.

Table 15.4. A grading system

Grade | Out of test overall score (30) | Out of 100 | Number
A+ | 28.5–30   | 95–100 | 1
A  | 27–28.2   | 90–94  | 2
B+ | 25.5–26.7 | 85–89  | 0
B  | 24–25.2   | 80–84  | 2
C+ | 22.5–23.7 | 75–79  | 0
C  | 21–22.2   | 70–74  | 2
D+ | 19.5–20.7 | 65–69  | 0
D  | 18–19.2   | 60–64  | 3
F  | 0–17.7    | 0–59   | 10


Table 15.5 below shows the grades of the 20 students according to a criterion-referenced analysis and interpretation. A comparison between Table 15.1 above and Table 15.5 below makes the difference between interpreting scores on norm- and criterion-referenced tests clear. Under a norm-referenced interpretation, Table 15.1 shows that the students who scored 9, 10, and 15 failed the reading test because their scores were below the average (17), while the student who scored 17 passed since his score matched the average. Table 15.5, in contrast, shows that under a criterion-referenced interpretation the very student who passed now fails, because his score of 17 did not reach the cutoff pass criterion of 18. We now turn to the final chapter to provide an actual analysis and interpretation of the curriculum test that has acted as our reference test throughout this book.


Table 15.5. Students' grades

Scores | Students | Grade
4  | 2 | F
7  | 2 | F
9  | 1 | F
10 | 1 | F
15 | 3 | F
17 | 1 | F
19 | 3 | D
22 | 2 | C
25 | 2 | B
27 | 2 | A
29 | 1 | A+


Chapter 16

STAGE 8: SCORE, ANALYZE AND INTERPRET THE CURRICULUM TEST

PRE-READING REFLECTIONS

Before starting to read this chapter, attempt to jot down your thoughts about the issues in the box. Please try not to look at the content of the chapter. Just write down your initial thoughts on a sheet of paper. There is no wrong or right answer at this stage:

1. Score the curriculum test in appendix A using the answer key in appendix C.
2. Using a computer SPSS data entry spreadsheet, key in the test scores.
3. Calculate the average test score, the sum of scores, and the top 10 percent and bottom 10 percent of students' scores.
4. Using the classification of students into three groups of low, average, and high Self-Regulated Learning (SRL) students, calculate the ANOVA to compare the differences between the groups.
5. Interpret the scores of the students according to norm- and criterion-referenced testing, telling the difference between the two interpretations.
6. Represent student grades graphically in the form of a histogram, a pie chart, and a line chart.


AIM(S) OF THE CHAPTER

To score, analyze and interpret the scores of the curriculum test.

OBJECTIVES OF THE CHAPTER


To score the curriculum test in appendix A, using the answer key in appendix C.
To key the test scores into a computer SPSS data entry spreadsheet.
To calculate the average test score, the sum of scores, and the top 10 percent and bottom 10 percent of students' scores.
To use the classification of students into three groups of low, average and high SRL students to calculate the ANOVA (compare the differences between the groups).
To interpret the scores of the students according to norm- and criterion-referenced test interpretations, telling the difference between the two interpretations.
To represent the grades of the students graphically in the form of a histogram, a pie chart, and a line chart.

INTRODUCTION

Chapters 13, 14, and 15 discussed the procedures of scoring, analyzing and interpreting tests (stage 8 of test construction): chapter 13 focused on scoring tests, chapter 14 on test analysis, and chapter 15 on test interpretation. Based on Shawer (2010c), this chapter provides actual examples of how stage 8 was put into action through the curriculum test (our example). The curriculum test scoring, analysis, and interpretation proceed as follows:
1. Curriculum Test Scoring
2. Curriculum Test Analysis
3. Curriculum Test Interpretation


1. CURRICULUM TEST SCORING

Scoring the curriculum test was straightforward and highly reliable because the test was of the objective type: each question carried a specific point value, which required no scorer judgment. An answer key was developed and piloted; two curriculum instructors answered the test in addition to the author (see appendix C). The author and the two instructors answered the 60 questions in line with the key, apart from one instructor who chose a wrong option on a single question; after discussing the answer with him, he agreed his answer was incorrect. All students answered on a standardized answer sheet (see appendix D).
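Because scoring is purely key-matching, it can be expressed in a few lines. A minimal sketch (a hypothetical helper, not the book's own procedure) of the weighting used here, where questions 1 to 40 carry 1 point and questions 41 to 60 carry 2 points:

    def score_sheet(answers, key):
        # answers and key both map question number (1-60) to the chosen letter
        return sum((1 if q <= 40 else 2)
                   for q, a in answers.items() if key.get(q) == a)

    # A perfect sheet scores 40 * 1 + 20 * 2 = 80 points.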


2. CURRICULUM TEST ANALYSIS

The author analyzed the test by first entering the data into the computer; there was no need to create frequency tables. Second, he calculated descriptive statistics, including percentages, the mean and the standard deviation. Further, the author employed inferential statistics, including planned and post-hoc analysis of variance (ANOVA), to test group differences for significance by comparing the group means. Throughout the analysis, the author used SPSS, version 14. The author first classified the student-teachers (the research sample) into three groups of low, average and high self-regulated learning (SRL), to find out whether they differed in their SRL. He asked the students to complete the Motivated Strategies for Learning Questionnaire (MSLQ) developed by Pintrich, Smith, Garcia, and McKeachie (1991), and then used descriptive statistics in the form of percentages to classify students into the three groups. Table 16.1 shows each student's score percentage: the total score the student obtained on the MSLQ divided by the maximum possible score, multiplied by 100. The minimum possible score a student could achieve on the MSLQ was 81 (81 items × 1, the minimum item score); the maximum possible score was 405 (81 items × 5, the maximum item score). Data analysis of the MSLQ classified students into a low SRL group (40 students scoring between 30 and 54%), an average SRL group (40 students scoring between 55 and 64%) and a high SRL group (40 students scoring between 65 and 84%). This way, the students were categorized into three groups, as the sketch below illustrates.
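A minimal sketch of this classification step in Python (the cut points 55% and 65% follow the chapter):

    def srl_group(mslq_total, items=81, max_item=5):
        pct = 100 * mslq_total / (items * max_item)   # percentage of maximum (405)
        if pct < 55:
            return "low"
        elif pct < 65:
            return "average"
        return "high"

    print(srl_group(122), srl_group(244), srl_group(331))   # low average high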


Table 16.1. A breakdown of 120 students' percentages on the MSLQ

Group 1: Low SRL (40 students; % of maximum MSLQ score):
30, 35, 38, 39, 40, 42, 43, 43, 43, 43, 46, 46, 47, 47, 48, 48, 48, 49, 49, 50,
50, 50, 50, 51, 51, 52, 52, 52, 52, 53, 53, 53, 53, 53, 54, 54, 54, 54, 54, 54

Group 2: Average SRL (40 students):
57, 58, 58, 58, 58, 58, 58, 58, 59, 59, 59, 59, 59, 59, 59, 60, 60, 60, 60, 60,
60, 61, 61, 61, 61, 62, 62, 62, 62, 62, 62, 62, 63, 63, 63, 63, 63, 63, 63, 63

Group 3: High SRL (40 students):
65, 65, 65, 65, 65, 66, 66, 66, 66, 67, 67, 67, 67, 68, 68, 68, 68, 68, 69, 69,
70, 70, 70, 71, 72, 72, 72, 73, 73, 75, 76, 77, 80, 80, 80, 81, 81, 81, 81, 82

Table 16.2 shows the number of students, the actual minimum and maximum scores on the MSLQ (122 and 331 respectively), the minimum and maximum possible scores a student could achieve on the MSLQ (81 and 405), the mean for the whole sample, and the standard deviation.

Table 16.2. Descriptive statistics of 120 students

Variable       | No. | Minimum | Maximum | Sum   | Mean     | Std. Deviation
SRL (actual)   | 120 | 122     | 331     | 29080 | 242.3333 | 43.13908
SRL (possible) |     | 81      | 405     |       |          |


Table 16.3. Individual groups' descriptive statistics (SRL)

Group | Mean     | N   | Std. Deviation | Sum   | Minimum | Maximum
1     | 194.2000 | 40  | 23.62875       | 7768  | 122     | 220
2     | 244.6500 | 40  | 8.19803        | 9786  | 232     | 257
3     | 288.1500 | 40  | 22.76249       | 11526 | 262     | 331
Total | 242.3333 | 120 | 43.13908       | 29080 |         |

Moreover, Table 16.3 shows the individual group means, numbers, and standard deviations. Although the descriptive statistics showed differences between the group means, these differences were further tested for significance (inferential statistics) using an independent between-groups one-way ANOVA with post-hoc comparisons, to ensure the differences were real. However, the ANOVA assumptions had to be checked first to confirm that ANOVA was the appropriate test of the significance of the differences between the groups. The ANOVA assumptions of population normality and homogeneity of variance were maintained. Table 16.4 shows that the homogeneity assumption was not violated, since Levene's F-ratio was not significant (p > .05); the null hypothesis that the group variances were equal was therefore accepted. Moreover, Table 16.5 shows that the three groups were drawn from a normally-distributed population (another required assumption), since the Kolmogorov-Smirnov statistic with a Lilliefors significance level was greater than .05 (p > .05), which indicated normality (Coakes & Steed, 2007; Shawer, 2010d, 2008e).

Table 16.4. Test of Homogeneity of Variances (SRL)

Levene Statistic | df1 | df2 | Sig.
.812             | 2   | 117 | .446

Table 16.5. Tests of Normality

    | Kolmogorov-Smirnov(a): Statistic, df, Sig. | Shapiro-Wilk: Statistic, df, Sig.
SRL | .073, 120, .180                            | .989, 120, .428


[Figure 16.1. Histogram of the 120 SRL scores (x-axis: learning self-regulation, 150.00 to 300.00; y-axis: frequency, 0 to 20; Mean = 242.3333, Std. Dev. = 43.13908, N = 120). Samples drawn from a normally-distributed population.]
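A minimal matplotlib sketch of the same check (stand-in data; the book's figure came from SPSS):

    import numpy as np
    import matplotlib.pyplot as plt

    srl = np.random.default_rng(1).normal(242.33, 43.14, 120)  # stand-in sample

    # Each bar covers a range of 50, as in Figure 16.1
    plt.hist(srl, bins=np.arange(150, 351, 50), edgecolor="black")
    plt.xlabel("Learning self-regulation")
    plt.ylabel("Frequency")
    plt.title("Histogram")
    plt.show()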


The histogram in Figure 16.1 also shows graphically that the sample was drawn from an almost normally-distributed population. Each bar covers a range of 50: the first bar value is 150, the second 200, and so on. Table 16.6 shows a significant ANOVA F-ratio (p < .01), which meant the three groups differed in SRL. Having obtained a significant ANOVA value, the Scheffe post-hoc test of multiple comparisons was used to determine where the differences lay. Table 16.7 shows significant comparisons (p < .01). The post-hoc comparisons clearly indicated that group three self-regulated their learning more than groups one and two, and that group two outperformed group one in SRL. We can now conclude that the student-teachers differed in their self-regulation of learning. This finding gave strong grounds for examining the relationship between SRL and students' curriculum achievement and course design skills. The author also wanted to examine whether student-teachers' low, average and high self-regulated learning levels resulted in differences in their curriculum achievement and course design skills. He used a one-way between-groups ANOVA with planned comparisons (inferential statistics), expecting that high SRL students would outperform their average and low SRL counterparts in curricular-content knowledge and course design skills. Table 16.8 shows that the ANOVA F-ratio was not significant (p > .05). The null hypothesis was therefore accepted that the three groups (low, average and high SRL) did not differ in curricular-content knowledge and course design skills.

Table 16.6. ANOVA (SRL)

               | Sum of Squares | df  | Mean Square | F       | Sig.
Between Groups | 176854.067     | 2   | 88427.033   | 231.959 | .000
Within Groups  | 44602.600      | 117 | 381.219     |         |
Total          | 221456.667     | 119 |             |         |


Table 16.7. Multiple Comparisons, Scheffe (Dependent Variable: SRL)

(I) Group | (J) Group | Mean Difference (I−J) | Std. Error | Sig. | 95% CI Lower | 95% CI Upper
1 | 2 | −50.45000(*) | 4.36588 | .000 | −61.2749  | −39.6251
1 | 3 | −93.95000(*) | 4.36588 | .000 | −104.7749 | −83.1251
2 | 1 |  50.45000(*) | 4.36588 | .000 |  39.6251  |  61.2749
2 | 3 | −43.50000(*) | 4.36588 | .000 | −54.3249  | −32.6751
3 | 1 |  93.95000(*) | 4.36588 | .000 |  83.1251  | 104.7749
3 | 2 |  43.50000(*) | 4.36588 | .000 |  32.6751  |  54.3249

Table 16.8. One-way between-groups ANOVA with planned comparisons (test scores)

               | Sum of Squares | df  | Mean Square | F     | Sig.
Between Groups | 791.150        | 2   | 395.575     | 1.784 | .172
Within Groups  | 25942.150      | 117 | 221.728     |       |
Total          | 26733.300      | 119 |             |       |

The alternative hypothesis assuming group differences was therefore rejected. This surprising result suggested that high SRL did not improve students' academic achievement, because high SRL students performed on a par with average and low SRL students in curricular-content knowledge and course design skills. The author could now conclude that student-teachers' self-regulated learning levels did not produce differences in their academic achievement (curricular-content knowledge and course design skills). The author further wanted to examine whether low-achieving student-teachers could be high SRL students, and whether high-achieving student-teachers could be low SRL students. As shown in Table 16.9, the 120 students were re-grouped into low, average and high achievers (not SRL groups) to check whether each of these groups contained low, average and high SRL students.


Table 16.9. 120 students' scores grouped into low, average and high achievers

Groups            | Score range | No. | Low SRL N (%) | Average SRL N (%) | High SRL N (%)
Low achievers     | 0–39        | 44  | 17 (39%)      | 13 (29%)          | 14 (32%)
Average achievers | 40–59       | 57  | 19 (33%)      | 18 (32%)          | 20 (35%)
High achievers    | 60–80       | 19  | 4 (21%)       | 9 (47%)           | 6 (32%)
Total             |             | 120 | 40            | 40                | 40

Table 16.9 shows the 120 students regrouped on the basis of their test scores into low-achieving (44 students), average-achieving (57) and high-achieving (19) groups. The low-achieving group comprised students scoring between zero and 39 on the test, the average-achieving group those scoring between 40 and 59, and the high-achieving group those scoring between 60 and 80. The table shows that low-achieving students were similar to high- and average-achieving students in SRL: of the 44 low achievers, 17 (39%) were low SRL, 13 (29%) average SRL, and 14 (32%) high SRL students. This meant that 32% of the low achievers were high SRL students, while 29% of them were average SRL students! Table 16.9 also shows that average-achieving students were similar to high- and low-achieving students in SRL: of the 57 average achievers, 19 (33%) were low SRL, 18 (32%) average SRL and 20 (35%) high SRL students. This meant that 35% of the average achievers were high SRL learners, while 33% of them were low SRL students! Moreover, high-achieving students were similar to low- and average-achieving students in SRL: of the 19 high achievers, 4 (21%) were low SRL, 9 (47%) average SRL and 6 (32%) high SRL students. This meant that only 32% of the high achievers were high SRL students, 21% were low SRL students and 47% were average SRL learners! The author therefore concluded that the three achieving groups were almost similar in SRL. A sketch of this re-grouping follows.
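A minimal sketch of the re-grouping as a cross-tabulation (pandas; the df columns 'test_score' out of 80 and 'srl' labels are hypothetical, and the rows are stand-in values):

    import pandas as pd

    df = pd.DataFrame({"test_score": [30, 52, 70, 45, 38, 61],   # stand-in rows
                       "srl": ["low", "high", "average",
                               "low", "high", "average"]})

    # Achievement bands 0-39, 40-59, 60-80 as in Table 16.9
    bands = pd.cut(df["test_score"], bins=[0, 39, 59, 80],
                   labels=["low", "average", "high"], include_lowest=True)
    print(pd.crosstab(bands, df["srl"]))   # achievement bands x SRL groups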


3. CURRICULUM TEST INTERPRETATION

This curriculum test was interpreted as a criterion-referenced test by comparing students' performance against cut-off scores (e.g., 0–39 = F and 75–80 = A+) rather than against the other students who took the same test (Hughes, 2003). Table 16.10 shows actual scores on the test out of 80. The two students' score of 70 was far above the cutoff pass criterion (40); it placed them in category A, since it fell within the 70–74 band. The six students who scored 52 also exceeded the cutoff score, placing them in category C. In contrast, the three students' score of 30 was below the cutoff (40), placing them in category F. This meant that the two students who scored 70 and the six who scored 52 passed the test, whereas the three students who scored 30 did not.

Table 16.10. Interpretation of scores based on a criterion-referenced test

Student score | Frequency | Grade (band out of 80) | % of students
70 | 2 | A (70–74) | 1.7
52 | 6 | C (50–54) | 5
30 | 3 | F (0–39)  | 2.5

Grade bands: A+ 75–80, A 70–74, B+ 65–69, B 60–64, C+ 55–59, C 50–54, D+ 45–49, D 40–44, F 0–39.
Mean = 44.65; N = 120; cutoff = 40; test overall score = 80; sum of scores = 5358.

Of course, the same test scores could also be interpreted as norm-referenced. As Table 16.10 shows, the two students' score of 70 was far above the average (44.65); it placed them in the top 2 percent of students who took the test. The six students' score of 52 was somewhat above the average, placing them in an above-average 5 percent group. The three students' score of 30 was far below the average, placing them in a below-average 2.5 percent group. Therefore, those scoring 70 and 52 passed the test, whereas those scoring 30 did not. Had the average score been 30, those three students who failed could have passed.
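The contrast between the two readings of the same score reduces to two different comparisons. A minimal sketch (cutoff and mean taken from Table 16.10):

    def criterion_referenced(score, cutoff=40):
        return "pass" if score >= cutoff else "fail"

    def norm_referenced(score, mean=44.65):
        return "above average" if score > mean else "below average"

    for s in (70, 52, 30):
        print(s, criterion_referenced(s), norm_referenced(s))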


APPENDIX A

THE CURRICULUM ACHIEVEMENT TEST

College of Education
End-of-term exam (80 points)
Curricula Planning & Development
Time: 90 min.


Name: ………………………… Student ID: ……………..………………

Choose the correct answer by inserting A, B through P in the correct box of each question in the answer sheet. Each question has only one correct answer. Submit the answer sheet and keep the question sheet for yourself.

A) Multiple-choice questions: (Choose one answer only) (1 point each) (Total: 40)

1. Curriculum could be referred to as:
a) The evolution of curriculum in the processes of planning, implementation and evaluation.
b) The creation of a totally new curriculum.
c) The arrangement of the curriculum major components (e.g., content).
d) A written plan for action which includes strategies for achieving desired goals or ends.
e) The selection and gradation (sequencing) of instructional content.


f) A school's responsibility over the planning, design, implementation and evaluation of a study program for its students.

2. An instructional program determines the:
a) Learning that students should acquire (objectives).
b) Learning activities which result from curriculum implementation.
c) Criteria of student admission (entry) and graduation (exit).
d) Criteria of passing a course and promoting students from one level to another.
e) Materials and equipment to be used and teacher qualities.
f) Objectives, content, instructional activities and methods of assessment and evaluation.
g) All of them
h) None of them

3. The curriculum specialist's responsibilities differ from those of the school principal in:
a) Setting taskforces/committees to achieve specific curriculum jobs.
b) Setting timetables and organizing human and material resources.
c) Supervising curriculum planning, implementation and evaluation phases.
d) Setting policies and curriculum change and innovation directions.
e) Focusing on policy setting rather than methods, content or materials.
f) Developing the school educational goals, assessing student needs and coordinating efforts.
g) Designing study programs, selecting and assessing course books and preparing curriculum and teacher guides.
h) Training others (e.g., how to write objectives) and developing professional development activities.
i) Supervising curriculum implementation and acting as a resource agent in the school.
j) A & C
k) A & B
l) F through I
m) A through E
n) C through G
o) None of them

4. Which of the following DOES NOT apply to an instructional block?
a) A self-contained learning sequence with its own aim, objectives, materials and progressive difficulty levels.


b) It could be too specific (e.g., a lesson) or broad like a unit/chapter that comprises several lessons.
c) A module is a self-contained and independent learning sequence with its own objectives.
d) A course consisting of 120 hours can be divided into four modules, each taking 30 teaching hours.
e) A module involves a number of units (schemes of work)/chapters.
f) The arrangement of the curriculum major components.
g) The selection and gradation of instructional content.
h) What is to be studied as elements of curriculum (e.g., planning, research and evaluation).
i) How curriculum evolves in terms of planning, implementation and evaluation.
j) The creation of a totally new curriculum.
k) A through E
l) F through J
m) C through G
n) All of them
o) None of them

5. Which of the following principles underpins essentialist-based curricula?
a) Teaching the basic subjects, like reading, writing, counting, computing, math, English and science.
b) Activities and interests are not part of the curriculum as they do not prepare students for life.
c) Standardized testing is an integral part of learning assessment.
d) Teaching essential knowledge and skills to help students function well in society regardless of their needs.
e) Curriculum knowledge changes as its worth to society changes.
f) Students learn to meet education standards through mastery learning.
g) Vocational learning is downgraded whereas student interests are a waste of time.
h) B & G only.
i) A & D only.
j) All of them
k) None of them


6. Which of the following underlies progressivist curricula?
a) Broadening and developing the intellect by means of problem-solving and critical thinking skills.
b) Instruction hinges on learner needs, interests, whole development, and developmental stage.
c) Teaching focuses on acquiring concepts and skills necessary for future learning.
d) Curriculum connects to the workplace and real life, and learning occurs through experimentation and experience.
e) Using the scientific method to help students conduct scientific inquiry at first hand.
f) Reflection that stems from experiences and contributes to problem solving.
g) Looking backward and ahead as part of reflection.
h) Open-mindedness by considering any ideas before accepting or rejecting them.
i) Responsibility by thinking of the long-term consequences of actions.
j) Wholeheartedness by looking at ideals with an eye on materializing them into practice and real actions.
k) Encouraging project, self-directed and discovery learning.
l) Making judgments after careful consideration.
m) A & C
n) B & C
o) All of them
p) None of them

7. With which of the following principles does perennialism agree with essentialism?
a) Teaching topics of everlasting importance to humans everywhere.
b) Curriculum is focused on teaching the basics, like reading, writing and counting.
c) Teaching liberal arts before vocational topics because people are first humans.
d) Teaching is focused on meaningful conceptual and rational thinking rather than memorization.
e) Teaching constant and changeless principles that are useful to everyone.


8. Pestalozzi and Froebel's curriculum ideas agreed with regard to which of the following?
a) Learning through play, activities and manipulation of real objects.
b) Learning through the senses or experience rather than theory or words.
c) Emphasizing the affective support of learners by those who educate children.
d) Curriculum is focused on teaching the basics and ignoring learner interests.
e) Teaching is focused on abstract thinking.
f) A through C
g) D & E
h) All of them
i) None of them

9. Learning through affective support, play, activities and manipulation of real objects formed the basis for:
a) Early childhood education
b) Secondary education
c) Higher education
d) All of them
e) None of them

10. The objectives model differs from the situational analysis and process models in that:
a) It does not predetermine what should be taught beforehand.
b) The teacher and students decide on the curriculum aims and content.
c) Aims and objectives are determined in the light of student needs and interests.
d) All of them
e) None of them

11. After constructive discussions, a group of students agreed with their teacher to explore some issues in their setting. They decided on the content that reflected their aims and needs. Which curriculum model does this reflect?
a) The process model
b) The objectives model
c) Both of them


12. Which of the following reflects the centralized/top-down curriculum implementation strategy?
a) Developing curriculum by a school for a particular group of learners.
b) Diffusion of curriculum via textbooks, teacher guides and curriculum guidelines from the centre to the periphery.
c) Both of them.

13. Which of the following is a weakness of grass-roots curriculum implementation strategies?
a) Planning on a massive scale.
b) Teacher resistance to grass-roots curricula.
c) Difficulty in maintaining equal opportunity.

14. Needs assessment is:
a) A serious malfunction that occurs to something or someone for missing part or whole of something.
b) A procedure used to identify, validate and establish priorities among needs.
c) Those things without which the individual's state is significantly less than satisfactory.
d) A discrepancy between what is and what should be.

15. A need could be:
a) A want or an interest
b) Discrepancy or basic
c) A & B

16. A teacher noticed that a large number of students in her class can read, but at a poor level. She designed a remedial module to improve their reading from the poor to a good level. This means:
a) The students had a discrepancy need.
b) The students had a basic need.
c) The students had both discrepancy and basic needs.

17. In your classroom, you noticed some students making recurrent serious writing problems. Which of the following would you follow to assess the needs of your students?
a) Assess the writing needs of the whole school.
b) Assess the reading needs of all students in your district.
c) Assess the writing needs of only those students in your classroom.
d) Assess the reading needs of only those students in your classroom.


18. Which of the following IS NOT an objective (learning outcome/performance objective)?
a) The general cognitive, affective and psychomotor changes students experience by the end of a course.
b) What the students will be precisely able to do by the end of a course or each lesson of a course.
c) Statements of what learners can do by the end of a course or each lesson of a course.
d) The specific cognitive, affective and psychomotor changes students experience by the end of a course.
e) A comparison of the specific changes in learning before and after taking a course or each lesson of a course.
f) A & E
g) B through D
h) B through E

19. 'Work out major, main and minor themes of a single-subject curriculum scope.' This objective is at the level of:
a) Remember b) Understand c) Apply d) Analyze e) Evaluate f) Create

20. 'Students will be able to compare between curriculum sequence and continuity.' This objective is at the level of:
a) Remember b) Understand c) Apply d) Analyze e) Evaluate f) Create

21. 'Students will calculate a single-subject curriculum balance by equivalence.' This objective is at the level of:
a) Remember b) Understand c) Apply d) Analyze e) Evaluate f) Create


22. 'Students will be able to assess curriculum weaknesses and strengths.' This objective is at the level of:
a) Remember b) Understand c) Apply d) Analyze e) Evaluate f) Create

23. Which of the following IS NOT a phase of selecting curriculum content?
a) Deciding on the course/curriculum rationales.
b) Selecting topics in the light of curriculum/course aims.
c) Determining the curriculum/course scope.
d) Calculating curriculum balance.
e) Determining curriculum/course sequence and continuity.

24. Scope is a horizontal process through which we decide on:
a) The number of topics or subjects a curriculum is expected to comprise.
b) The number of topics a single-subject curriculum is expected to comprise.
c) The number of subjects a multi-subject curriculum is expected to comprise.

25. Curriculum continuity means:
a) Equal emphasis is placed on each subject or major theme in a curriculum.
b) Equal emphasis is placed on each major theme in a single-subject curriculum.
c) Teaching the same topics or subjects in different grades and education stages.
d) Teaching the same topics or subjects in different grades and education stages at different difficulty levels.

26. A multi-subject curriculum of 6 subjects was allocated 270 teaching hours. How many hours would each take?
a) 35 b) 55 c) 45 d) 60


27. A multi-subject curriculum of 225 teaching hours comprised 5 subjects, which received these weights: science 24%, math 22%, history 15%, English 27% and geography 12%. Which is the correct balance?
a) Science 50, math 41.5, history 33.75, English 60.75, geography 27 teaching hours and minutes.
b) Science 54, math 20.5, history 33.75, English 60.75, geography 27 teaching hours and minutes.
c) Science 54, math 49.5, history 33.75, English 60.75, geography 27 teaching hours and minutes.
d) Science 54, math 49.5, history 33.75, English 50, geography 33 teaching hours and minutes.

28. An English language curriculum of 50 teaching hours comprised 5 major themes, which received these weights: reading 27%, writing 19%, listening 23%, speaking 20% and grammar 11%. The reading theme comprised 3 main themes. The correct balance in minutes for each of the reading theme's 3 main themes is:
a) 200 minutes b) 270 minutes c) 210 minutes

29. Balance of a multi-subject curriculum differs from balance of a single-subject curriculum in that the former:
a) Involves allocating each subject a number of hours.
b) Involves allocating each major theme a number of hours.
c) All of them
d) None of them

30. Balance by equivalence of both single-subject and multi-subject curricula are similar in that:
a) Developers allocate each subject in the curriculum different weights (hours).
b) Developers allocate each major theme or subject the same weights (hours).
c) Developers allocate each topic in the curriculum different weights (hours).

31. Balance by equivalence is calculated by dividing the sum of hours allocated to a curriculum by the number of major themes or subjects. What can be inferred from this?
a) The percentage allocated to each subject equals the curriculum's sum of hours ÷ 100.


b) Calculating the average of hours allocated to each major theme or subject.
c) Calculating the percentage of hours allocated to each major theme or subject.

32. The major themes that form part of determining curriculum scope ARE NOT worked out through:
a) Translating the curriculum aim/aims into a number of major topics/themes.
b) Making each theme address one or more dimensions of the curriculum aims.
c) Making all the major themes together address the dimensions of the curriculum aims.
d) Using behavioral objectives.

33. The main themes that form part of determining curriculum scope are worked out through:
a) Using a medium-level type of objectives to determine the main topics.
b) Writing minor topics from objectives.
c) Writing minor topics based on personal judgment (without objectives).
d) Selecting printed, audio, or visual materials that address each minor topic (lesson point).
e) Making each set of medium-level objectives together address the dimensions of one of the major topics.
f) Creating materials from the start.
g) A & E
h) A through B

34. Which instructional content of the following WOULD you NOT choose for your classroom?
a) A content addressing curriculum aims and objectives.
b) A substantive, balanced, deep and comprehensive content.
c) A content claiming suitability for teaching all age groups.
d) An updated, relevant, and overlap- and redundancy-free content.

35. Which of the following WOULD NOT be part of a Writing Skills Course you seek to design?
a) Determine scope and balance of content, including major, main and minor themes, and select or create materials.
b) Select a textbook on writing skills based on your judgment and teach it all.


c) Derive course aims from needs assessment and other relevant sources.
d) Select content in the light of the course aims.
e) Assess students' needs.
f) Determine continuity of course themes and sequence of content.
g) Decide on the course rationales, including reasons for course development, target group, and entry and exit levels.

36. Paying attention to the order of steps, design a course by choosing a combination of the following:
a) Step 1: Write the course aims. Step 2: Decide on the course rationales.
b) Step 3: Select content based on personal judgment. Step 4: Assess students' needs.
c) Step 5: Determine scope and balance of content. Step 6: Determine content continuity and sequence.
d) Step 1: Select content. Step 2: Write course aims. Step 3: Assess students' needs.
e) Step 4: Decide on course rationales. Step 5: Decide on course scope. Step 6: Decide on content continuity and sequence.
f) A, B & C
g) D & E
h) None of them

37. Which of the following do methods of sequencing/organizing content involve?
a) Decide if the curriculum is at an entry or medium level.
b) Decide if the curriculum is at an entry, medium or exit level.
c) Decide if the curriculum is at a medium level only.
d) Decide if the curriculum is at an exit level only.

38. Advantages of the single-subject design involve which of the following?
a) It overemphasizes cognitive development and ignores other areas of development.
b) It overemphasizes content and ignores learner differences, needs, interests and experiences.
c) Pedagogy hinges on teacher authority and learner passivity via lecturing and recitations.
d) Knowledge is divorced from experience and real life; it is gained through verbal channels.


e) Focus is placed on lower thinking levels via remembering, and it leads to fragmentation of knowledge.
f) Learners gain second-hand experiences while first-hand or direct experiences are nowhere possible.
g) Learning has no value or link to real life and community.
h) It does not promote teacher development.
i) All of them
j) None of them

39. The project design rationales involve which of the following?
a) Curriculum constructed round learner needs and interests.
b) Curriculum implementation relies on learners' activities.
c) Activities are organized around projects or problems.
d) Curriculum promotes integration of knowledge and drops barriers among subjects.
e) The project extends to the learner's community.
f) Curriculum is not planned beforehand.
g) All of them
h) None of them

40. An English language curriculum needs to be developed for first-grade elementary school pupils who did not join pre-school education. Which of the following should the sequence of the curriculum be?
a) Take the preceding curriculum as a prerequisite, and itself form a prerequisite for the subsequent curriculum.
b) Take the preceding curriculum as a prerequisite, and itself not form a prerequisite for subsequent curricula.
c) Form a prerequisite for itself and a prerequisite for the subsequent curriculum.

B) Multiple-choice questions: (Choose one answer only) (2 points each) (Total: 40)

41. Compulsory or general studies in the core curriculum are characterized by which of the following?
a) They take almost one-third of the school timetable.
b) All students have to pursue compulsory courses regardless of their interests.

c) They are organized in the form of teaching units which revolve round the problems and needs of students.
d) The students choose from compulsory units a unit that addresses their needs.
e) The students plan the unit, implement it, do the activities, collect data, analyze the data and write their reports.
f) All of them
g) None of them

42. Which of the following would you use to determine content continuity?
a) Deciding if the same topic needs to be taught in subsequent grades and stages.
b) Deciding if the same subject needs to be taught in subsequent grades and stages.
c) All of them.
d) None of them.

43. Which of the following WOULD you NOT use to implement the project curriculum design?
a) Select a project.
b) Draw an action plan.
c) Implement the project.
d) Assign and teach a textbook.
e) Evaluate the project.

44. Which of the following is the correct analysis of the core curriculum structure?
a) Compulsory courses which are allocated about one-third of the school timetable.
b) Electives which involve academic, practical hobbies and physical education.
c) Academic advising and grouping students who choose the same courses to pursue their study.
d) Only A & C is the correct and logical analysis.
e) Only B & C is the correct and logical analysis.
f) A through C is the correct and logical analysis.
g) None of them is a correct or logical analysis.

45. Which of the following is the correct analysis of the electives part of the core curriculum?
a) Based on the major, academic courses involve science, mathematics or economics.


b) The Practical Hobbies Part involves choosing music, art, painting, carpentry, or printing.
c) The Compulsory Part of Physical Education involves physical fitness, including running, jumping and climbing.
d) The Electives Part of Physical Education involves choosing a game of preference, like football and wrestling.
e) Students choosing the same courses are grouped to study together.
f) Only A through C is the correct and logical structure.
g) Only D is the correct and logical structure.
h) A through E is the correct and logical analysis.
i) None of them is a correct or logical analysis.

46. Work out student roles in implementing the instructional unit by selecting the correct cluster of options.
a) Planning unit implementation and deciding on its objectives, content and teaching and learning activities.
b) Guiding and advising rather than delivering information.
c) Setting a time framework for unit implementation and determining implementation roles.
d) Showing how to implement and evaluate the unit.
e) Collecting, analyzing, and discussing the data, writing reports and evaluating unit implementation.
f) Coordinating collaborative work and assisting in overcoming the implementation problems.
g) A, B & C is the correct cluster.
h) D, E & F is the correct cluster.
i) A, C & E is the correct cluster.

47. Which of the following is your assessment of the correlated-subject design strengths?
a) Knowledge integration between single subjects.
b) Coordination between teachers is not a problem.
c) Drawing relationships between taught subjects.
d) Demanding little time and effort.
e) A & C
f) B & D
g) All of them
h) None of them

48. Which of the following could be your criteria for defending the instructional unit design?
a) Selecting and teaching a single course book.


b) Removing barriers among subjects; no single subject is particularly studied.
c) Knowledge integration by using knowledge from various subjects as a means to achieve the unit aim.
d) Relevance to community problems and student needs, since a unit is designed to solve a real or perceived problem.
e) Addressing several objectives, including developing planning ability, thinking, teamwork and constructive discourse.
f) Providing teacher and students with a general framework of work and leaving the details to them.
g) A & D.
h) B through F.
i) All of them.
j) None of them.

49. A critique defending essentialism and perennialism as underpinning single-subject designs should comprise:
a) Teaching the basic subjects, like science and arithmetic, and teaching the basics of each subject.
b) Basic knowledge and skills are essential for students in order to function well in society.
c) Activities and interests are considered a waste of time, as they do not prepare students for life.
d) Standardized testing is used to assess cognitive development against predetermined outcomes.
e) Meeting education standards via mastery learning.
f) Focus on personal growth and interests.
g) A through E.
h) F only.
i) None of them.

50. Which of the following do you think are advantages of a single-subject design?
a) Providing students with essential knowledge and skills, and being easy to design.
b) Being easy to develop through adding new themes and substituting, adapting and deleting existing ones.
c) It helps students to perceive the unity of knowledge and encourages teachers' professional development.
d) Being easy to implement through textbooks and accompanying instructions.


e) Compatibility with current teacher training systems, school structure and admission procedures.
f) It is easy to train teachers to teach separate subjects and to evaluate both teachers and students.
g) A, B & C.
h) D, E & F.
i) A, B, D, E & F.

51. The broad-fields design is most suitable for which education stages and courses?
a) The elementary stage, for not requiring deep knowledge in the subject.
b) College introductory courses, to provide students with background knowledge in a field of study.
c) Secondary education, to help students specialize in a specific field.
d) A & B.
e) A & C.
f) B & C.

52. Compose a correlated-subject curriculum by combining the relevant parts of the following.
a) Subject experts write curriculum aims.
b) Experts determine scope, balance, continuity and sequence of their subject independently from other subjects.
c) Experts select or create materials for their subject independently from other subjects.
d) Experts determine scope, balance, continuity, and sequence round themes cutting across two or more independent subjects.
e) Experts select or create materials round particular themes that cut across two or more subjects.
f) A, B & C.
g) A, D & E.

53. Design a fused-subject curriculum by combining the relevant parts of the following.
a) Subject experts write curriculum aims.
b) Experts determine scope, balance, continuity and sequence of merged themes cutting across two or more subjects.
c) Experts select or create materials that merge particular themes from two or more subjects in a single design.
d) A & C.
e) A, B & C.


54. Design a project curriculum by combining the relevant parts of the following.
a) Designing a module for students to study and be tested in.
b) Selecting a project and drawing an action plan for it.
c) Merging two related subjects in a single curriculum.
d) Implementation of the project plan and evaluation of the project elements.
e) C & D.
f) A & B.
g) B & C
h) B & D

55. Draw an action plan to implement a project curriculum by combining the relevant parts of the following.
a) Set a clear aim for the project.
b) Set objectives to identify activities and resources.
c) Determine the project requirements.
d) Divide the project into clear steps.
e) Set out what is required to be achieved in each step and decide on the role of each group.
f) Decide on each member's role in the group.
g) Determine the scope and sequence of content.
h) A through F.
i) B & C only.
j) B & E only.

56. Curriculum evaluation means the collection, analysis and interpretation of data and information:
a) About student progress, using the resulting information to make decisions about students' future learning.
b) About program implementation, using the resulting information to make decisions about program improvement or change.
c) To decide how good a curriculum is, using the resulting information to decide on its improvement or change.
d) About teacher performance, using the resulting information to make decisions about teachers' careers.

57. Which of the following are techniques/types that teachers use to assess student learning from a course?
a) Multiple-choice, true/false, yes/no, short-answer, gap-filling, completion, and matching questions only.
b) Essay and objective techniques.


c) None of them.
d) All of them.

58. After comparing essay and objective questions, which of the following are strengths of essay questions?
a) Scoring is rapid, economical and precise.
b) Covering a wide range of the content.
c) Students express critical thinking and organization skills.
d) Difficult and less reliable scoring.

59. Thirty students took a reading test for which there was no cut-off pass score. The total test score was 40. Out of the 30 students, 5 scored 21, 15 scored 19, and 10 scored 14. Interpret their scores in the light of a norm-referenced test.
a) Students scoring 21 fell in the top 17% group.
b) Students scoring 19 fell in the 50% group.
c) Students scoring 14 fell in the bottom 33% group.
d) Students scoring 21 fell in the 50% group.
e) Students scoring 19 fell in the bottom 33% group.
f) Students scoring 14 fell in the 50% group.
g) A, B & C are the correct interpretations.
h) D, E & F are the correct interpretations.
i) None of them is a correct interpretation.

60. Which of the following tools would a principal NOT use to evaluate the school's teachers?
a) Asking teachers to sit for a test.
b) Using student scores on tests.
c) Observation sheets.
d) Student questionnaire feedback about teacher performance.

GOOD LUCK


APPENDIX B

ALIGNMENT MATRIX OF TEST ITEMS WITH THE SIX LEVELS OF COGNITIVE OBJECTIVES

[Matrix residue: the original table maps each of the 60 questions to one of the six cognitive levels (Remember, Understand, Apply, Analyze, Evaluate, Create); the per-question check marks did not survive extraction. The recoverable information follows.]

Units (themes of content) | Items
Unit 1: Curriculum conceptualization | 4
Unit 2: Curriculum philosophies | 5
Unit 3: Curriculum models and strategies | 4
Unit 4: Needs assessment | 4
Unit 5: Writing curriculum aims and objectives | 5
Unit 6: Selection of curriculum content | 14
Unit 7: Organization of curriculum content | 19
Unit 8: Curriculum evaluation | 5
Total | 60

Totals by cognitive level (as printed): 18, 11, 9, 8, 8 and 6 items (sum = 60).


APPENDIX C

ANSWER KEY

College of Education
Answer Key (80 points)
Curricula Planning & Development

Multiple-choice questions: (questions 1 to 40, 1 point each; questions 41 to 60, 2 points each) (Total: 80)

Question | Answer | Question | Answer | Question | Answer | Question | Answer
1  | D | 16 | A | 31 | B | 46 | I
2  | G | 17 | C | 32 | D | 47 | E
3  | L | 18 | A | 33 | G | 48 | H
4  | L | 19 | D | 34 | C | 49 | G
5  | J | 20 | B | 35 | B | 50 | I
6  | O | 21 | C | 36 | H | 51 | D
7  | B | 22 | E | 37 | B | 52 | G
8  | F | 23 | E | 38 | J | 53 | E
9  | A | 24 | A | 39 | G | 54 | H
10 | E | 25 | C | 40 | C | 55 | H
11 | A | 26 | C | 41 | F | 56 | C
12 | B | 27 | C | 42 | C | 57 | B
13 | C | 28 | B | 43 | D | 58 | C
14 | B | 29 | A | 44 | F | 59 | G
15 | B | 30 | B | 45 | H | 60 | A


APPENDIX D

ANSWER SHEET

College of Education
Answer Sheet (80 points)
Curricula Planning & Development
Time: 90 min.

Name: ………………………… Student ID: ……………..………………

Answer all the questions on this answer sheet by inserting A, B through P in the correct box for each question. Each question has only one correct answer. Submit this answer sheet and keep the question sheet for yourself.

Multiple-choice questions: (questions 1 to 40, 1 point each; questions 41 to 60, 2 points each) (Total: 80)

[Answer grid: two columns of numbered boxes, Questions 1-30 each with a blank Answer box on the left, Questions 31-60 each with a blank Answer box on the right, and a Total box at the foot of each column.]

Score: ………
Scorer's name: ………



REFERENCES

Abuhattab, F., Othman, S., & Sadiq, A. (1987). Psychological measurement. Cairo: Anglo Bookshop.
Adkins, D. C. (1974). Test construction: Development and interpretation of achievement tests (2nd ed.). Columbus, OH: Charles E. Merrill Publishing.
Alderson, C. J., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Anderson, L., Krathwohl, D., Airasian, P., Cruikshank, K., Mayer, R., Pintrich, P., Raths, J., & Wittrock, M. (Eds.) (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom's taxonomy of educational objectives. Boston, MA: Allyn & Bacon.
Assessment, Evaluation & Curriculum Re-design Workshop (2009). What is the range of assessments and evaluations that students and teachers face? http://www.thirteen.org/edonline/concept2class/assessment/index_sub2.html, accessed 08/10/2009.
Bloom, B. S. (Ed.) (1956). Taxonomy of educational objectives. Handbook I: Cognitive domain. The classification of educational goals. London: Longman.
Bloom, M., Fischer, J., & Orme, J. (1995). Evaluating practice: Guidelines for the accountable professional (2nd ed.). Boston: Allyn and Bacon.
Brown, A. (1987). Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In F. Weinert & R. Kluwe (Eds.), Metacognition, motivation, and understanding. Hillsdale, NJ: Lawrence Erlbaum.
Bruner, J. (1978). Towards a theory of instruction. Cambridge: Harvard University Press.
Charles, C. (1988). Introduction to educational research. London: Longman.
Clarke, A. (1999). Evaluation research: An introduction to principles, methods and practice. London: Sage.
Coakes, S. J., & Steed, L. (2007). SPSS version 14.0 for Windows: Analysis without anguish. Milton, Australia: John Wiley & Sons.
Cohen, L., Manion, L., & Morrison, K. (2000). Research methods in education (5th ed.). London: Routledge.
Dave, R. (1970). Psychomotor levels. In R. J. Armstrong (Ed.), Developing and writing behavioral objectives. Tucson, AZ: Educational Innovators Press.
Flavell, J. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34(10), 906-911.
Flavell, J. (1985). Cognitive development (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Frederiksen, N., Mislevy, R. J., & Bejar, I. I. (1993). Test theory for a new generation of tests. Hillsdale, NJ: Lawrence Erlbaum Associates.
Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced resource book. New York: Routledge.
Gall, M., Borg, W., & Gall, J. (1996). Educational research: An introduction (6th ed.). New York: Longman.
Gallagher, R. E., & Smith, D. U. (1989). Formulation of teaching/learning objectives useful for the development and assessment of lessons, courses, and programs. Journal of Cancer Education, 4(4), 231-234.
Goh, C. (1997). Metacognitive awareness and second language listeners. ELT Journal, 51(4), 361-369.
Gross, R. (1996). Psychology: The science of mind and behavior (3rd ed.). London: Hodder and Stoughton.
Guilbert, J. J. (1984). How to devise educational objectives. Medical Education, 18(3), 134-141.
Haladyna, T. M. (2004). Developing and validating multiple-choice test items (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Hedge, T. (2000). Teaching and learning in the language classroom. Oxford: Oxford University Press.
Horwitz, E. (1987). Surveying student beliefs about language learning. In A. Wenden & J. Rubin (Eds.), Learner strategies in language learning (pp. 119-129). Englewood Cliffs, NJ: Prentice-Hall.
Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
Krathwohl, D. R., Bloom, B. S., & Masia, B. B. (1964). Taxonomy of educational objectives. Handbook II: Affective domain. The classification of educational goals. New York: David McKay.
Kuhn, D. (2000). Metacognitive development. Current Directions in Psychological Science, 9(5), 179-181.
Lemcool, K. E. (2007). Effects of coaching on self-regulated learning strategy use and achievement in an entry-level nursing class. Unpublished doctoral dissertation, University of South Alabama.
Mager, R. (1962). Preparing instructional objectives. Palo Alto, CA: Fearon Publishers.
Mager, R. (1975). Preparing instructional objectives (2nd ed.). Belmont, CA: Fearon-Pitman Publishers.
McNamara, T. (2000). Language testing. Oxford: Oxford University Press.
Moely, B., Santulli, & Obach, M. (1995). Strategy instruction, metacognition, and motivation in the elementary school classroom. In F. Weinert & W. Schneider (Eds.), Memory performance and competencies: Issues in growth and development (pp. 301-321). Mahwah, NJ: Lawrence Erlbaum.
Murray, B. (2007). Prior knowledge, two teaching approaches for metacognition: Main idea and summarization strategies in reading. Unpublished doctoral dissertation, Fordham University.
Patton, M. (1990). Qualitative evaluation and research methods (2nd ed.). Newbury Park, CA: Sage.
Pintrich, P., Smith, D., Garcia, T., & McKeachie, W. (1991). A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). Ann Arbor: University of Michigan, National Center for Research to Improve Postsecondary Teaching and Learning.
Pollard, A., & Triggs, P. (1997). Reflective teaching in secondary education: A handbook for schools and colleges. London: Cassell.
Richards, J. (2001). Curriculum development in language teaching. Cambridge: Cambridge University Press.
Robson, C. (1993). Real world research: A resource for social scientists and practitioner-researchers. Oxford: Blackwell.
Rossi, P., & Freeman, H. (1982). Evaluation: A systematic approach (2nd ed.). Beverly Hills, CA: Sage.
Rutman, L. (1984). Evaluation research methods: A basic guide (2nd ed.). Beverly Hills, CA: Sage.
Shawer, S. F. (2000). A small-scale qualitative evaluation study of the National Professional Qualifications for Headship (NPQH) in Wales. Unpublished MA thesis, Faculty of Education, University of Wales, Cardiff.
Shawer, S. F. (2001). Teachers' curriculum development: A framework for examining the effects of teachers' conceptualization of the communicative approach on the development and use of curriculum materials. Unpublished MPhil thesis, Faculty of Education, University of Cambridge.
Shawer, S. F. (2003). Bringing curriculum-in-action to the classroom: A study of teachers' curriculum development approaches and their implications for student and teacher development. Unpublished PhD dissertation, University of Manchester.
Shawer, S. F. (2006). Effective teaching and learning in generic education and foreign language teaching methodology: Learners' cognitive styles, foreign language skills instruction and teachers' professional development. Cairo, Egypt: Dar El-Fikr El-Arabi.
Shawer, S. F. (2010a). Classroom-level curriculum development: EFL teachers as curriculum-developers, curriculum-makers and curriculum-transmitters. Teaching and Teacher Education: An International Journal of Research and Studies, 26(2), 173-184.
Shawer, S. F. (2010b). The influence of assertive classroom management strategy use on student-teacher pedagogical skills. Journal of Academic Leadership, 8(2), 1-12.
Shawer, S. F. (2010c). The influence of student-teacher self-regulation of learning on their curricular content-knowledge and course design skills. The Curriculum Journal, 21(2), 201-232.
Shawer, S. F. (2010d). The relationship between EFL student-teacher self-efficacy and their language teaching skills. Journal of Academic Leadership, 8(3), 1-29.
Shawer, S. F. (2010e). The relationships between EFL student cognitive functioning, curriculum diversification, and ethnic culture differences. Non-Partisan Education Review, 6(2), 1-19.
Shawer, S. F., Gilmore, D., & Banks-Joseph, S. (2008). Student cognitive and affective development in the context of classroom-level curriculum development. Journal of the Scholarship of Teaching and Learning, 8(1), 1-28.
Shawer, S. F., Gilmore, D., & Banks-Joseph, S. (2009). Learner-driven EFL curriculum developments at the classroom level. International Journal of Teaching and Learning in Higher Education, 20(2), 125-143.
Struening, E., & Guttentag, M. (1975). Handbook of evaluation research (Vol. 1). Beverly Hills, CA: Sage.
Thornbury, S. (1996). Teachers research teacher talk. ELT Journal, 50(4), 279-289.
Tyler, R. W. (1949). Basic principles of curriculum and instruction. Chicago: The University of Chicago Press.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Wenden, A. (1986). Helping language learners think about learning. ELT Journal, 40(1), 3-12.
Wenden, A. (1998). Metacognitive knowledge and language learning. Applied Linguistics, 19(4), 515-537.
Westberg, J., & Jason, H. (1993). Collaborative clinical education. New York: Springer Publishing Company.



INDEX

A
achievement test(s), 3, 4, 10, 11, 64, 65, 69, 74, 85, 109, 116, 207
affective, vii, 16, 17, 18, 23, 24, 25, 40, 41, 43, 44, 45, 46, 49, 53, 67, 81, 185, 187, 210
affective objectives, vii, 16, 24, 41, 43, 49, 53
alternative hypothesis, 178
ancient world, 68
ANOVA, 171, 172, 173, 175, 177, 178
aptitude, 5, 10
arithmetic, 63, 195
articulation, 57
assessment, iv, vii, 3, 4, 5, 6, 9, 21, 23, 28, 59, 62, 66, 75, 77, 96, 97, 100, 145, 168, 182, 183, 186, 191, 194, 199, 207, 208
assessment tools, 6, 96
Attitude Scales, 66, 67
automaticity, 58

C
candidates, 65, 169
classes, 28, 65, 150
classification, 16, 29, 31, 32, 53, 80, 171, 172, 207, 209
classroom, 5, 7, 17, 46, 47, 62, 186, 190, 208, 209, 210
classroom management, 210
cognition, 25, 29, 30, 31, 33, 40
cognitive development, 30, 191, 195
cognitive function, 210
cognitive objectives, vii, 24, 25, 26, 27, 28, 32, 39, 40, 43, 49, 54, 74, 81, 82
cognitive style, 210
cognitive tasks, 30, 32
college students, 9
colleges, 209
complexity, 39, 48
comprehension, 28, 29, 32, 33, 39, 64
computer, 151, 171, 172, 173
computing, 183
conceptualization, 7, 49, 67, 75, 76, 77, 199, 210
concurrent, 105, 106, 111, 113, 114, 115
Construct Validity, 105, 106, 107, 110, 111
construction, vii, viii, 3, 4, 10, 62, 63, 69, 74, 91, 99, 110, 137, 140, 142, 149, 165, 172, 207
content analysis, 75
Content Experts/Jury Members, 106, 108
Content Validity, 106, 107
convergent, 105, 106, 111, 114, 115
coordination, 52, 53, 55, 56, 57
correlation, 68, 105, 112, 115, 120, 121, 122, 123, 124, 125, 126, 127, 129
correlation coefficient, 105, 112, 120, 129
correlations, 111, 112, 114, 115, 122
course content, 58
creativity, 27, 37, 110, 111, 113
Criterion Validity, 107, 112
critical thinking, 6, 37, 184, 198
cultural differences, 47
cumulative frequency, 165, 166, 168
curriculum development, 76, 210

D
data collection, 4, 9, 17, 18, 96, 100, 145, 151, 161
developmental process, 30
deviation, 155, 156
direct observation, 56, 66
discrimination, 11, 118, 119, 129
dispersion, 155, 156
disposition, 67
distracters, 96
distribution, 81, 149, 150, 156, 177
diversification, 210
draft, 149
drawing, 24, 26, 55, 197

E
economics, 193
education, 9, 53, 183, 185, 188, 192, 195, 196, 208, 210, 211
educational objective, 207, 208, 209
educational research, viii, 207
educators, 9
elementary school, 192, 209
elementary students, 9
Enjoyable response, 42
equal opportunity, 186
essay question, 93, 94, 95, 101, 198
evidence, 5, 6, 9, 19, 26, 33, 37, 111, 112, 113, 114
External Validity, 105, 106, 107
extraneous variable, 107, 121

F
Face Validity, 105, 106, 107, 115
foreign language, 20, 21, 22, 210
formation, 35, 53, 57
formative assessment, 5, 64
formula, 34, 35, 77, 79, 81, 85, 87, 89, 125, 127, 150, 155, 156, 157, 166
foundations, 59, 76
freedom, 48
frequency distribution, 147, 148, 149, 150, 151

G
Gap filling questions, 99
general knowledge, 30, 31
geography, 189
GPA, 153, 154
grades, 7, 64, 148, 153, 154, 158, 159, 160, 161, 170, 171, 172, 188, 193
grading, 169
graduate students, vii, viii
group variance, 175
grouping, 65, 193
growth, 195, 209
guidelines, 186

H
harmony, 57
histogram, 159, 171, 172, 177
history, 107, 189
homework, 46
homogeneity, 124, 127, 175
human, 9, 30, 31, 67, 110, 182
human behavior, 110
hypothesis, 111

I
imitation, 52, 54, 56
individual differences, 30
individuals, 4, 5, 7, 62, 64, 65, 66, 67, 68, 69, 96, 99, 100, 113, 122, 128, 137, 145, 146
inferences, 6, 62, 151
instructional activities, 182
instructional methods, 9, 109
integration, 49, 192, 194, 195
intellect, 184
intelligence, 63
intelligence quotient, 63
internal consistency, 118, 124, 125, 126, 127, 131
Internal Validity, 106, 107
intervention, 7, 8
introversion, 115
inventors, 67

K
kindergartens, 9

L
language ability, 5, 62
language skills, 210
languages, 53
leadership style, 35
learning behavior, 21
learning difficulties, 64
learning outcomes, 5, 19, 23
learning process, 29
learning styles, 30
learning task, 36, 42, 44, 45, 46, 47, 48
lesson plan, 6

M
Matching questions, 198
material resources, 182
materials, 7, 9, 53, 137, 182, 190, 196, 197, 210
mathematics, 193
matrix, 75
measurement, vii, 3, 4, 9, 106, 117, 119, 120, 127, 131, 135, 207
median, 147, 148, 151, 154, 155
memory, 32
mental development, 30
metacognition, 30, 31, 209
methodology, 210
Methods of Determining Content Validity, 106, 108
microscope, 24, 53, 58
models, 29, 59, 75, 77, 185, 199
modernity, 68
modules, 183
motivation, 24, 31, 40, 207, 209
motor skills, 53
Multiple-choice questions (MCQ), 93, 94, 95, 96, 101, 144, 145, 146, 181, 192, 203, 205
multiplication, 124
music, 68, 194

N
national assessments, 96
negative reinforcement, 42, 44
null hypothesis, 175, 177

O
objective tests, 144
objectivity, 47, 124, 127
observable behavior, 23
observed behavior, 54
operations, 38, 51, 54, 55, 57
organize, 7, 31, 34, 47, 48, 95, 149
overlap, 31, 190

P
pedagogy, 5
Performance assessments, 66
personality, 48, 61, 62, 63, 66, 68, 146
personality inventories, 66
personality measures, 61, 62
physical education, 53, 193
physical fitness, 194
Placement tests, 65
portfolio assessment, 66
positive attitudes, 67
positive correlation, 31
positive reinforcement, 41, 45
post-hoc analysis, 173
predictive validity, 106, 111, 112, 113
principles, 29, 36, 183, 184, 208, 211
problem-solving, 63, 184
professional development, 182, 196, 210
professional qualifications, 209
Proficiency tests, 65
project, 39, 59, 184, 192, 193, 197
Projective Techniques, 66, 67
pronunciation, 21, 22, 33
psychological processes, 211
psychologist, 29
psychomotor, vii, 16, 17, 18, 23, 24, 25, 40, 51, 52, 53, 54, 81, 187
psychomotor objectives, vii, 16, 23, 24, 40, 51, 52, 53, 54, 81

Q
questionnaire, 198, 209
quizzes, 5, 145

R
random assignment, 122
reading, vii, 3, 15, 20, 21, 25, 41, 45, 51, 61, 63, 64, 73, 85, 93, 105, 108, 117, 130, 131, 135, 139, 147, 149, 163, 166, 167, 168, 169, 170, 171, 183, 184, 186, 198, 209
recall, 17, 32
redundancy, 190
relevance, 48, 68
reliability, viii, 10, 96, 98, 101, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 135, 136
repetitions, 144
requirements, 9, 31, 197
researchers, vii, 74, 101, 107, 108, 115, 124, 136, 149, 151, 159, 168, 209
resistance, 186
resources, 7, 8, 31, 37, 100, 197
response, 17, 42, 45, 67, 95
rubrics, 6
rules, 26, 35, 45, 95, 137

S
SAT (scholastic aptitude test), 5
schema, 47
school, 4, 5, 6, 7, 9, 33, 35, 65, 76, 96, 97, 136, 182, 186, 192, 193, 196, 198, 209
schooling, 6
science, 18, 53, 183, 193, 195, 208
scientific method, 184
scope, 16, 18, 19, 20, 59, 96, 107, 187, 188, 190, 191, 196, 197
second language, 208
secondary education, 209
Self-Concept, 66, 67
self-efficacy, 210
self-esteem, 67
self-regulated learning (SRL), 31, 171, 172, 173, 174, 175, 177, 178, 179, 209
self-regulation, 177, 207, 210
senses, 185
sequencing, 181, 191
services, iv, 8, 100
showing, 34, 110
significance level, 175
single test, 107, 163, 164
skills base, 190
social workers, 74
society, 68, 183, 195, 211
software, 7
solution, 27, 37
Spearman-Brown, 118, 125
specific knowledge, 31
specifications, vii, 10, 65, 69, 73, 74, 78, 79, 80, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 99, 109, 110, 116
speech, 33
spelling, 110, 141
split-half reliability, 125
stability, 68, 121, 122, 127
standard deviation, 123, 147, 148, 151, 155, 156, 173, 174, 175
standardized testing, 5
state, 4, 5, 6, 7, 19, 20, 32, 33, 38, 61, 68, 74, 86, 93, 96, 106, 119, 140, 186
states, 21, 141
statistics, 151, 159, 173, 174, 175, 177
stimulus, 67
strategy use, 30, 209, 210
structure, 7, 26, 36, 54, 68, 100, 193, 194, 196
student achievement, 6, 107
student creativity, 114
student motivation, 43
subjectivity, 124
subtraction, 124
summative assessment, 5, 64, 97, 145
synthesis, 27, 28, 29, 32, 37, 38, 40, 64

T
Table of Specifications, v, vii, viii, 71, 73, 85, 88, 106, 108, 109
target, 5, 6, 17, 18, 20, 40, 44, 51, 52, 54, 55, 94, 95, 96, 101, 111, 120, 123, 191
target behavior, 5, 20, 94, 101
task demands, 31
taxonomy, 29, 43, 207
teacher performance, 7, 8, 197, 198
teacher training, 196
techniques, 5, 10, 33, 37, 61, 62, 67, 93, 94, 95, 96, 101, 151, 159, 198
test construction, iv, vii, viii, 3, 4, 10, 62, 69, 74, 91, 99, 137, 140, 149, 165, 172, 207
test items, vii, 28, 85, 87, 89, 90, 101, 107, 110, 116, 124, 126, 131, 140, 208
test procedure, 137
test scores, 129, 131, 136, 140, 149, 152, 164, 167, 171, 172, 178, 179, 180
testing, vii, 4, 5, 9, 11, 17, 18, 21, 28, 32, 35, 62, 63, 122, 123, 135, 169, 171, 183, 195, 208, 209
textbook(s), 9, 79, 186, 190, 193, 196
The Curriculum (Reference) Test Validation Process, 107
training, 141
translation, 33
trial, viii, 10, 103, 117
triangulation, 115
TRUE/FALSE questions, 93, 94

V
validation, viii, 101, 103, 105, 106, 114
variables, 30, 66, 107
ventilation, 137
venue, 137
vocabulary, 22, 141, 147, 148, 157
Vocational Interest Measures, 66, 69
Vygotsky, 30, 211

W
water resources, 57
weakness, 66, 186
worry, 52, 55
wrestling, 194

Y
YES/NO, 93, 94
yield, 62, 114, 115, 121, 151, 169
