Current Studies in Educational Measurement and Evaluation

Editors: Prof. Dr. Salih ÇEPNİ, Assoc. Prof. Dr. Yılmaz KARA

Paradigma Akademi – August 2019

ISBN: 978-605-7691-06-4
Certificate Number: 32427
Printing House Certificate Number: 43370
The responsibility of each chapter belongs to its author(s).
Paradigma Akademi Basın Yayın Dağıtım, Fetvane Sokak No: 29/A, ÇANAKKALE. Tel: 0531 988 97 66
Layout: Fahri GÖKER
Typesetting: Gürkan ULU
Cover design: Gürkan ULU
Printing House Address: Ofis2005 Fotokopi ve Büro Makineleri San. Tic. Ltd. Şti., Davutpaşa Merkez Mah., YTÜ Kampüsiçi, Güngören / Esenler, İSTANBUL

This book is sold with the banderole and ISBN obtained from the Ministry of Culture. Do not buy books without a banderole.


Content

PREFACE ... V

Part I: Problem Solving in Education

Chapter 1: Problem Solving Procedure in terms of Cognitive Theories (Salih ÇEPNİ & Yılmaz KARA)
Introduction ... 1
Cognitive Theories for Learning and Problem Solving ... 1
Human Cognitive Architecture ... 3
Cognitive Load Theory ... 11
Cognitive Processes in Problem Solving ... 16
Conclusion ... 19

Chapter 2: Improving Item Validity through Modification in Terms of Test Accessibility (Yılmaz KARA)
Introduction ... 25
Conceptual Understanding of Test Accessibility ... 26
Test Accessibility Model ... 27
Item Modifications for Accessible Test Items: Theory to Practice ... 31
Conclusion ... 36

Chapter 3: Traditional Measurement and Evaluation Tools in Mathematics Education (Cemalettin YILDIZ)
Introduction ... 41
Verbal Exams ... 43
Long-answer Written Exams ... 46
Short-Answer Written Examinations ... 50
True-False Tests ... 53


Chapter 2
Improving Item Validity through Modification in Terms of Test Accessibility
Yılmaz KARA

Introduction

High expectations about the scores obtained in central examinations, a competitive atmosphere driven largely by national ambitions to perform better in international examinations, and the adoption of performance-based, activity-oriented formative classroom assessment approaches have brought increasing attention to item and test development efforts (Wößmann, 2005). In recent years, the growing and diversified demands that society places on assessment and measurement in education have focused mostly on writing more reliable and valid questions. Concerns about the complexity of the learning objects embedded in questions have stimulated the application of the principles of cognitive load theory to question development techniques (Paas et al., 2003). At the same time, the concept of accessibility needs to be understood in order to grasp the approach to performance assessment suggested by cognitive load theory. Test accessibility indicates the degree to which all test takers are given the opportunity to display their proficiency on the target framework (Beddow et al., 2008). Being successful in an examination requires answering the test questions correctly by using physical and cognitive resources to some extent. If there is a mismatch between students' cognitive capacities and the cognitive level of the questions, low test accessibility will most likely affect the test results negatively because students misunderstand what the questions actually ask. Thus, increasing test accessibility essentially means reducing the question elements that are likely to impose excessive cognitive load and thereby obtaining more valid test scores (Kettler et al., 2009).


The cognitive demand of the structures used in questions needs to be kept at a level that all test takers can comprehend. Beddow et al. (2008) worked on increasing test accessibility by focusing on the elements of test questions: keeping only the components necessary for answering and removing unnecessary text, visuals, graphs, or tables. This chapter first focuses on the term accessibility. Then the adaptation of accessibility theory to test items for educational measurement and assessment is introduced. Finally, item accommodation for test accessibility is illustrated under the assumption that all test takers, with their highly diverse characteristics, can be evaluated directly.

Conceptual Understanding of Test Accessibility

Access is more than participation in general education with learning program standards and standards-based assessments; it is one of the basic principles of education. Access is the term used to underline the overcoming of impeding factors that limit learners, given their characteristics and skills, in achieving the intended and tested objectives of a teaching program. In terms of teaching, access represents the whole range of opportunities for a learner to attain the gains defined in the intended teaching program. In the current educational system, this means that all learners have a proper opportunity to achieve the objectives of the relevant learning program and to demonstrate their performance in objective-related achievement tests. In this process, teachers are encouraged to teach the learning program objectives by designing instruction that increases the likelihood of learning for each student, rather than merely testing. Unfortunately, there are many obstacles to student learning (Akbulut & Çepni, 2013; Elliott et al., 2018). Accessibility, expressed as a measure of how well a system clears impeding factors and allows fair use of its elements and services by different individuals, is essential for effective teaching and fair testing. Learning, learning materials, and tests should be accessible to all students participating in the learning process. Otherwise, it is highly probable that inferences from observations and test results will be wrong, in addition to learning remaining incomplete. Therefore, educators have important responsibilities in achieving the best possible accessibility (Solano-Flores et al., 2014). In terms of educational evaluation, accessibility is considered a measure of the student's ability to demonstrate their attainment of the tested standard.


If the student is enabled to demonstrate their attainment in the educational evaluation process, the evaluation is considered accessible. Test accessibility is a measure of how much a test event allows the participant to show information about the target construct (Carney et al., 2016). Therefore, an accessible test or test item does not include any structure that prevents the test participant from showing how much of the qualification measured by the test the student possesses. The balance between the physical, material, or cognitive facilities required by a test and the construct it is designed to measure determines the trustworthiness of the inferences drawn from test scores, and this balance reflects test accessibility. The implications of such test accessibility concerns are particularly salient for test takers for whom extraneous test or item demands preclude them from demonstrating what they know. Indeed, extraneous demand reduces a test's accuracy and precision as a measuring tool for students for whom that demand poses a hindrance, while test accessibility is not reflected in the inferences made from test scores for students for whom the extraneous demand does not reduce the accessibility of the test. Test accessibility is at its highest level when all students can exhibit their attainment of the tested construct without any obstacles (Cawthon et al., 2013). Thus, item access minimizes bias and increases fairness in the learning and evaluation process.

Test Accessibility Model

A test item that permits access for a student is free of features that reduce the student's ability to represent their qualification on the target construct, which is mostly described in terms of the standards of the related learning program. Test accessibility should therefore be considered a relation between the student and the test item; more precisely, it is an interplay between the properties of the test item and the qualifications of the student. Figure 2.1 presents a model that situates access to assessment within educational measurement and assessment. Viewing the figure from left to right, a student in school receives access to learning through the instruction he or she receives in the classroom, in the context of the standards defined in the learning program. The purpose of this instruction is to provide the student (or, in our terms, the test taker) with the knowledge, skills, and abilities needed to participate successfully in the test event, which involves a set of interactions between the student and the test. The outcome of this test is a score, from which an inference is made about the student's knowledge, skills, and abilities as they pertain to the tested content. Based on these inferences, decisions are made that may influence the subsequent learning program and/or instruction the student receives (Elliott et al., 2018).

Figure 2.1 Unified model of educational access (Beddow, 2011)

The central part of this model is the notion of the test event (Figure 2.2). A test event consists of the student's engagement with the test materials for the purpose of generating a result that accurately reflects their qualification on the tested content. An optimal test event therefore produces a score that represents only the interaction between the student's qualification on the tested content and the test itself. If a student's improper access to the test event influences their score, then the test event consisted not only of the targeted interaction (i.e., the interaction intended for measurement) but also of one or more ancillary interactions. The accessibility of a test for an individual test taker is based on the impact of these interactions on the test score. Therefore, the accessibility of a test necessarily differs from one test taker to another, based on the individual differences between those test takers (Laughton, 2014).


Figure 2.2 Test event

In the scope of assessment development and evaluation, test error refers to the discrepancy between a student's "true score" (i.e., the score he or she would obtain if the test or test item represented a perfect measurement of the student's qualification on the tested content, yielding a score free of construct-irrelevant variance) and their actual score. In the theoretical model of test error resulting from accessibility, the test-taker characteristics (i.e., potential interactors with features of the test) are considered in five groups: perceptive, physical, receptive, emotive, and cognitive. Each of these sources of error can be linked with one or more categories of test or test item features, including the mode or means of response, the mode of delivery, the setting, the consequences, and the demand for cognitive resources. Although physical access is an important dimension of accessible testing, most physical access needs can be addressed through typical accessibility or universal design methods (Kavanaugh, 2017).
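This notion of test error can be summarized with a simple score decomposition in the spirit of classical test theory. The formulation below is an illustrative sketch rather than a model stated in the accessibility literature cited here, and the symbols are introduced only for this example:

\[
X_{ij} = T_{ij} + E^{\text{access}}_{ij} + E^{\text{random}}_{ij}
\]

Here \(X_{ij}\) is the observed score of student \(i\) on item \(j\), \(T_{ij}\) is the component attributable to the student's qualification on the target construct, \(E^{\text{access}}_{ij}\) is the construct-irrelevant component that arises when the item's perceptive, physical, receptive, emotive, or cognitive demands exceed the student's access skills, and \(E^{\text{random}}_{ij}\) is ordinary random error. Item modification for accessibility aims to drive \(E^{\text{access}}_{ij}\) toward zero so that inferences rest on \(T_{ij}\).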


For example, the intended construct of a university entrance assessment may be to solve problems involving measurement and estimation. Such items may contain substantial printed text that students must read. To understand the problem presented and subsequently respond, a student must be able to detect and decode this printed text. A test taker with a reading disability or visual impairment may be unable to do so (Harayama, 2013); such students are prevented from demonstrating their qualifications by an inability to access the item content. According to Beddow (2011), "the test-taker characteristics that interact with test or test item features and either promote or inhibit one's access to the test event are referred to as access skills" (pp. 381-382). As illustrated, an assumption often implicit in the design of many state tests is that test takers will possess certain access skills (e.g., the ability to decode printed text, see a graph, hold a pencil and legibly handwrite their responses, and maintain attention and motivation throughout the test) that are necessary for meaningful participation. However, the extent to which individual students possess these skills can vary substantially. The influence of these categories on subsequent test scores is therefore often discussed in terms of access skills, defined as the specific qualifications required to engage a test for the purpose of accurate measurement. The measurement purpose of most educational tests is to examine the degree of a test taker's mastery of certain qualifications, referred to as the target construct. Similarly, each item on the test is either implicitly or explicitly designed to measure part of this target construct, with the assumption that the set of test items as a whole represents a sufficient sample to measure the standards of the learning program. Although the target construct of a complete test is often relatively clear (e.g., Year 3 Grammar, Advanced University Biology), the specific construct targeted for measurement by an individual test item within the test may be less so. It is therefore critically important to determine the target construct not only for the test but also for each of the test items. Ideally, these definitions are generated by the test developers. When the target construct is sufficiently specified (i.e., defined in terms of the level of knowledge tested, cognitive demand, reading level, and context), it is easier to discern the various access skills necessary to engage the construct. However, many achievement tests specify the target construct only at the level of clusters or strands of knowledge, and item writers are given great latitude in how the items measure them. As a result, items may measure ancillary constructs that are not explicitly defined in the target construct specified by the developer (Perlman et al., 2016).


Item Modifications for Accessible Test Items: Theory to Practice

The students in any group differ in their access skills, which makes it hard to write test items that are accessible to all of them. Still, an item writer needs to consider accessibility to ensure that students can interact sufficiently with the item elements through their access skills; otherwise, students are not given an evaluation process in which they can demonstrate their knowledge of the target construct. To enable optimal accessibility, the item writer needs to identify the accessibility level of the item and modify its elements so that the item becomes accessible to more students (Vanchu-Orosco, 2012). The elements of an item are presented in Figure 2.3.

Figure 2.3 Anatomy of an item


As seen in Figure 2.3, an item needs to be considered in five categories: paragraph, visuals, item stem, answer choices, and layout (Beddow, 2011). To modify the item, each category needs to be examined in terms of item accessibility. First, the item should be reviewed to identify accessibility barriers; a simple review sketch is given after Table 2.1. Then the item needs to be modified by following the principles of accessibility theory and considering the characteristics of accessible test items (Elliott et al., 2018). The characteristics of accessible items are presented in Table 2.1.

Table 2.1 Elements and characteristics of accessible test items

Passage: Includes only required words. Clearly written in as few words as possible. Has a sentence structure appropriate to the test takers' grade. Contains clear instructions or pre-reading.

Stem: Clearly written in as few words as possible. Addresses the target construct or the related teaching program standard. Includes a distinct target construct. Set up with a positively worded verb in active voice.

Visuals: Includes only required visuals. Consists of clearly drawn, simple figures. Supported with the necessary text. Free of any components that could distract students.

Answer choices: Clearly written in as few words as possible. Includes a key and distractors that are almost the same in length, order, and content. Contains equally reasonable distractors. Has exactly one correct choice.

Layout: Arranged to present the item with all of its elements. Keeps the visuals engaged with the other elements of the item. Designed to facilitate answering. Provides enough empty area to comprehend the item elements. Set up with large and readable item elements.
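To make the review step concrete, the following Python sketch encodes the checks from Table 2.1 and reports which elements of an item fail them. It is a hypothetical illustration only: the element keys, the wording of the checks, and the example ratings are conveniences invented for this sketch and do not come from the TAMI or any other published instrument.

# A minimal sketch of an element-by-element accessibility review based on Table 2.1.
# Check wording and the example ratings are illustrative, not an official instrument.

CHECKLIST = {
    "passage": [
        "includes only required words",
        "sentence structure fits the test takers' grade",
        "contains clear instructions or pre-reading",
    ],
    "stem": [
        "states a distinct target construct",
        "uses a positively worded verb in active voice",
    ],
    "visuals": [
        "includes only required visuals",
        "free of distracting components",
    ],
    "answer_choices": [
        "key and distractors similar in length, order, and content",
        "distractors equally reasonable",
        "exactly one correct choice",
    ],
    "layout": [
        "all elements presented together",
        "large, readable elements with enough white space",
    ],
}

def review_item(ratings):
    """Return the checks that an item fails.

    `ratings` maps (element, check) to True/False as judged by a reviewer;
    checks missing from `ratings` are treated as not yet reviewed.
    """
    barriers = []
    for element, checks in CHECKLIST.items():
        for check in checks:
            if ratings.get((element, check)) is False:
                barriers.append(f"{element}: {check}")
    return barriers

# Example: a reviewer flags a wordy passage in a hypothetical biology item.
ratings = {
    ("passage", "includes only required words"): False,
    ("visuals", "includes only required visuals"): True,
    ("answer_choices", "exactly one correct choice"): True,
}
print(review_item(ratings))  # -> ['passage: includes only required words']

Such a script only structures the reviewers' judgments; deciding whether a flagged feature really blocks access, and how to modify it, remains the work of the modification team.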

Applying the item accessibility principles to a sample item is useful for understanding how an item can be modified to be accessible to more students. Figure 2.4 displays an item included in a 9th grade biology test. The target construct of the item of interest was defined in terms of biology curriculum standards: the item was aligned with a target standard defined by the education council as "Students explain the cellular structures and their functions." The item begins with brief information about the discovery of the cell. The information then underlines that the cell has specialized parts with different functions. The item first requires the student to imagine a cell and its parts; then the function of the nucleus is asked directly. The student must recall the nucleus and its function to find the correct answer choice.

Figure 2.4 An item in a 9th grade biology test

The same item was then modified in terms of accessibility theory to ensure accessibility for students with varying qualifications on the related standard of the learning program. The modified item is presented in Figure 2.5. Such modifications are best made by a team whose members, after being trained to modify test items according to accessibility theory, are preferably experienced in educational measurement and assessment in addition to having expertise in educational research and practice. The team members are expected to follow the elements and characteristics of accessible test items (Table 2.1).

Figure 2.5 Revised item in a 9th grade biology test

While modifying the item stimulus, the team discussed whether each intended modification would support measurement of the target construct and improve accessibility. The modified item has a shorter stimulus than the original form of the item. The team decided to remove the history of the discovery of the cell, since it is not directly related to the target construct. As defined in the standard, the target construct covers knowledge of cell structures and their functions; thus the stimulus should include only the information that highlights the cell structures and their functions. All other information not directly related to the target construct was removed (Figure 2.5).

In the initial form of the item, the item stem was integrated with the item stimulus. Considering the length of the text, the modification team decided to separate the item stem, using the visual as a separator. The item stem was shortened and presented in a larger font, and the main focus, "function of cell part" in this case, was highlighted in bold. Thus, the item was arranged to guide the student's processing from the visual to the item stem and on to the answer choices.

Before the modification, the item had no visual to support the measured target construct: students were expected to imagine a cell and its parts and then to know their functions. The modification team decided to add a cell figure to increase item accessibility. The figure shows the structures of a cell and helps students picture the cell and its parts. In addition, it provides contextual support for students who know the functions but are unfamiliar with the parts of the cell. The figure also separates the item stimulus from the item stem, making the item construct easier to identify.

The answer choices were also reviewed by the modification team. All answer choices were minimized to reduce cognitive load and increase item accessibility. First, the cognitive load was distributed between the visual and the item stem: a mark was added to the visual to indicate the targeted cell structure. Second, all answer choices were shortened by removing cell structures other than the targeted one. In the modified form, students are required to recognize the targeted structure in the visual and find its function among the answer choices, instead of trying to imagine the cell and comparing each given cell structure with a function. The modification team also decided to eliminate the most implausible or least selected distractor, drawing on the meta-analysis of answer choices by Rodriguez (2005), which showed that three answer choices are optimal for reliability and item discrimination without reducing item difficulty. In this case, the distractor that did not name a cell part was eliminated from the answer choices.

The modification team also rearranged the page layout of the test. Instead of using a separate answer sheet, students could answer directly on the same page, which freed them from misaligning questions and answers. In the original test format, an average of eight questions was placed on a single page; in the modified format, the item was presented on a page of its own. Text and item elements were also made larger and more readable, and the white space was increased to improve accessibility.

By following the principles of accessibility theory, the modification team thus produced a more accessible test item in terms of cognitive demand, item difficulty, and item discrimination.
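The distractor elimination described above can also be informed by a simple analysis of pilot responses. The Python sketch below is a hypothetical illustration, not the procedure used by the modification team: it counts how often each option is chosen and the mean total score of the students who chose it, so that the least selected, worst-functioning distractor can be identified as the candidate for removal, in line with Rodriguez's (2005) finding that three options are generally optimal.

# Minimal distractor analysis on hypothetical pilot data (not the team's actual data).
# Each response is a pair: (option chosen, student's total test score).

from collections import defaultdict

def distractor_stats(responses, key):
    """Proportion choosing each option and mean total score of its choosers."""
    counts = defaultdict(int)
    score_sums = defaultdict(float)
    for option, total_score in responses:
        counts[option] += 1
        score_sums[option] += total_score
    n = len(responses)
    return {
        option: {
            "proportion": counts[option] / n,
            "mean_total_score": score_sums[option] / counts[option],
            "is_key": option == key,
        }
        for option in counts
    }

def weakest_distractor(stats):
    """Pick the distractor chosen least often; break ties in favour of the one
    whose choosers have the highest mean total score, since it separates low-
    and high-scoring students least well and so functions worst as a distractor."""
    distractors = {o: s for o, s in stats.items() if not s["is_key"]}
    return min(
        distractors,
        key=lambda o: (distractors[o]["proportion"], -distractors[o]["mean_total_score"]),
    )

# Hypothetical pilot responses to the cell item (options A-D, key assumed to be "C").
pilot = [("C", 38), ("C", 35), ("A", 22), ("B", 25), ("C", 40),
         ("D", 30), ("A", 18), ("C", 33), ("B", 21), ("D", 36)]
stats = distractor_stats(pilot, key="C")
print(weakest_distractor(stats))  # prints: D (in this made-up data set)

In practice the content-based judgment reported above (removing the choice that did not name a cell part) and this kind of data-based check would be used together.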


When the arrangements described above are considered, it can be concluded that the item writer and the modification team focused on the target construct mainly in terms of cognitive objectives. In recent years, however, most curricula have been constructed not only around the cognitive dimension and content but also around additional dimensions such as attitudes, values, science process skills, science-technology-society, and 21st century, engineering, and thinking skills. If the target construct includes such additional dimensions, the item must also be modified so that the attitudes, values, and various skills are measured; in this case, the modification team needs to find ways to measure the additional dimensions without reducing item accessibility. Item modification is thus an ongoing effort to improve the quality of an item: modification is always necessary, there are various points to consider, and even high-quality items may require further modification as the target construct changes. Test items can be contextualized in real-world or real-life settings, visualized through pictures or figures, and made easier to access by considering the characteristics of accessible test items.

Conclusion

Examinations containing complex structures that are difficult for students to interact with are likely to be reflected negatively in test scores. In that case, students cannot exhibit the target learning outcomes even if they have already attained them during instruction. However, it is possible to write more cognitively accessible test questions simply by avoiding unnecessary explanations that create excessive cognitive load, writing answer choices that are balanced against one another, and aligning all the question components with the target learning outcome(s) (Beckmann, 2010). Ultimately, the questions asked in large-scale central examinations, for which hundreds of thousands of test takers apply, also need to be developed with cognitive load theory in mind. The test accessibility opportunities provided to test takers need to be evaluated in all their dimensions from the very beginning of the question writing procedure. In this way, the assumption that the test is accessible to all test takers can be fulfilled in large-scale, nationwide, or central assessments. Placing questions that enable maximum test accessibility for all test takers in an examination would positively affect test takers' results as well as the validity and reliability of the test. High test accessibility would also support comparisons among different kinds of large-scale national and international examinations.


Thus, satisfaction with education could increase in parallel with the positive opinions of teachers, parents, administrators, and the public, in addition to those of the students who are directly affected by the examination atmosphere. Consequently, education researchers are expected to reveal whether questions with low test accessibility are truly understood or not. The number of studies on student achievement addressing the effectiveness of test questions arranged according to cognitive load theory and accessible to all test takers needs to be increased. Existing research can be regarded as a forerunner of studies on the effects of such arrangements on different kinds of student groups, on paradigm interactions, and on the outcomes of various arrangement strategies for student achievement and depth of understanding.


References

Akbulut, H. İ., & Çepni, S. (2013). Bir üniteye yönelik başarı testi nasıl geliştirilir: İlköğretim 7. sınıf kuvvet ve hareket ünitesine yönelik bir çalışma [How to develop an achievement test for a unit: A study on the 7th grade force and motion unit]. Amasya Üniversitesi Eğitim Fakültesi Dergisi, 2(1), 18-44.

Beckmann, J. F. (2010). Taming a beast of burden: On some issues with the conceptualisation and operationalisation of cognitive load. Learning and Instruction, 20, 250-264.

Beddow, P. A., Kettler, R. J., & Elliott, S. N. (2008). Test Accessibility and Modification Inventory. Nashville, TN: Vanderbilt University.

Beddow, P. A. (2011). Effects of testing accommodations and item modifications on students' performance: An experimental investigation of test accessibility strategies. ProQuest Dissertations and Theses.

Carney, M. B., Smith, E., Hughes, G. R., Brendefur, J. L., & Crawford, A. (2016). Influence of proportional number relationships on item accessibility and students' strategies. Mathematics Education Research Journal, 28(4), 503-522.

Cawthon, S., Leppo, R., Carr, T., & Kopriva, R. (2013). Toward accessible assessments: The promises and limitations of test item adaptations for students with disabilities and English language learners. Educational Assessment, 18(2), 73-98.

Elliott, S. N., Kettler, R. J., Beddow, P. A., & Kurz, A. (2018). Handbook of accessible instruction and testing practices: Issues, innovations, and applications (2nd ed.). Cham: Springer International Publishing.

Harayama, N. (2013). An analysis of the performance and accommodations for students who are non-verbal taking Pennsylvania's statewide alternate assessment (PASA). ProQuest Dissertations and Theses.

Kavanaugh, M. (2017). Examining the impact of accommodations and universal design on test accessibility and validity. ProQuest Dissertations and Theses.

Kettler, R. J., Elliott, S. N., & Beddow, P. A. (2009). Modifying achievement test items: A theory-guided and data-based approach for better measurement of what students with disabilities know. Peabody Journal of Education, 84(4), 529-551.


Laughton, S. (2014). Accessibility of tests in higher education online learning environments: Perspectives and practices of U.K. expert practitioners. ProQuest Dissertations and Theses.

Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38, 1-4.

Perlman, A., Hoffman, Y., Tzelgov, J., Pothos, E., & Edwards, D. (2016). The notion of contextual locking: Previously learnt items are not accessible as such when appearing in a less common context. Quarterly Journal of Experimental Psychology, 69(3), 410-431.

Rodriguez, M. C. (2005). Three options are optimal for multiple-choice items: A meta-analysis of 80 years of research. Educational Measurement: Issues and Practice, 24(2), 3-13.

Solano-Flores, G., Wang, C., Kachchaf, R., Soltero-Gonzalez, L., & Nguyen-Le, K. (2014). Developing testing accommodations for English language learners: Illustrations as visual supports for item accessibility. Educational Assessment, 19(4), 267-283.

Vanchu-Orosco, M. (2012). A meta-analysis of testing accommodations for students with disabilities: Implications for high-stakes testing. ProQuest Dissertations and Theses.

Wößmann, L. (2005). The effect heterogeneity of central examinations: Evidence from TIMSS, TIMSS-Repeat and PISA. Education Economics, 13(2), 143-169.
