VERTICAL SCALING H. Jane Rogers Neag School of Education University of Connecticut Presentation to the TNE Assessment Committee, October 30, 2006
Scaling
Definition: Scaling is a process in which raw scores on a test are transformed to a new scale with desired attributes (e.g., a specified mean and SD).
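A transformation of this kind can be sketched in a few lines of Python; the target mean of 500 and SD of 100 below are illustrative choices, not values from the slides:

```python
import statistics

def rescale(raw_scores, new_mean, new_sd):
    """Linearly transform raw scores so that the scaled scores
    have a desired mean and standard deviation."""
    old_mean = statistics.mean(raw_scores)
    old_sd = statistics.pstdev(raw_scores)  # population SD
    return [new_mean + new_sd * (x - old_mean) / old_sd for x in raw_scores]

raw = [12, 15, 18, 21, 24]
scaled = rescale(raw, new_mean=500, new_sd=100)
# The scaled scores now have mean 500 and SD 100.
```

This linear rescaling preserves the rank order and relative spacing of the raw scores; only the metric changes.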
Scaling
Purposes:
1. Reporting scores on a convenient metric
2. Providing a common scale on which scores from different forms of a test can be reported (after equating or linking)
Scaling There are two distinct testing situations where scaling is needed
Scaling
SITUATION 1: Examinees take different forms of a test for security reasons or at different times of year. Forms are designed to the same specifications but may differ slightly in difficulty due to chance factors. Examinee groups taking the different forms are not expected to differ greatly in proficiency.
Scaling
SITUATION 2: Test forms are intentionally designed to differ in difficulty, and examinee groups are expected to be of differing proficiency. EXAMPLE: test forms designed for different grade levels.
EQUATING
For SITUATION 1, we often refer to the scaling process as EQUATING. Equating is the process of mapping the scores on Test Y onto the scale of Test X, so that we can say what the score of an examinee who took Test Y would have been had the examinee taken Test X (the scores are exchangeable).
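One simple version of this mapping is linear equating, which matches the means and SDs of the two forms. A minimal sketch, with hypothetical form statistics:

```python
def linear_equate(y_score, mean_x, sd_x, mean_y, sd_y):
    """Map a Test Y score onto the Test X scale by matching the
    means and SDs of the two forms (linear equating)."""
    return mean_x + (sd_x / sd_y) * (y_score - mean_y)

# Hypothetical form statistics: Test X (M=50, SD=10), Test Y (M=55, SD=5).
# A Y score one SD above the Y mean maps to one SD above the X mean.
equated = linear_equate(60, mean_x=50, sd_x=10, mean_y=55, sd_y=5)  # 60.0
```

Equipercentile methods are the common nonlinear alternative; this linear form is shown only because it fits in a few lines.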
EQUATING This procedure is often called HORIZONTAL EQUATING
LINKING
For SITUATION 2, we refer to the scaling process as LINKING, or scaling to achieve comparability. This process is sometimes called VERTICAL EQUATING, although equating is not strictly possible in this case.
REQUIREMENTS FOR SCALING
In order to place the scores on two tests on a common scale, the tests must measure the same attribute; e.g., the scores on a reading test cannot be converted to the scale of a mathematics test.
EQUATING DESIGNS FOR VERTICAL SCALING
1. COMMON PERSON DESIGN: Tests to be equated are given to different groups of examinees, with a common group taking both tests.
2. COMMON ITEM (ANCHOR TEST) DESIGN: Tests to be equated are given to different groups of examinees, with all examinees taking a common subset of items (anchor items).
EQUATING DESIGNS FOR VERTICAL SCALING
3. EXTERNAL ANCHOR OR SCALING TEST DESIGN: Different groups of examinees take different tests, but all take a common test in addition.
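With an anchor test V, scores can be linked by chaining two linear relations through the anchor: the Y score is first expressed on the anchor scale using the group that took Y and V, and that anchor-scale score is then expressed on the X scale using the group that took X and V. A sketch of this chained linear linking; all statistics here are hypothetical:

```python
def chained_linear_link(y_score, g2_y, g2_v, g1_x, g1_v):
    """Chain Y -> anchor V -> X. Each argument after y_score is a
    (mean, sd) pair: g2_* from the group that took Y and V,
    g1_* from the group that took X and V."""
    # Step 1: express the Y score on the anchor scale (group 2 took Y and V).
    v_equiv = g2_v[0] + (g2_v[1] / g2_y[1]) * (y_score - g2_y[0])
    # Step 2: express that anchor-scale score on the X scale (group 1 took X and V).
    return g1_x[0] + (g1_x[1] / g1_v[1]) * (v_equiv - g1_v[0])

# Hypothetical statistics: group 2 on Y (M=30, SD=6) and V (M=15, SD=3);
# group 1 on X (M=40, SD=8) and V (M=18, SD=4).
linked = chained_linear_link(36, g2_y=(30, 6), g2_v=(15, 3), g1_x=(40, 8), g1_v=(18, 4))
```

The anchor carries the information about how the two groups differ, which is what permits linking when no examinee takes both full tests.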
Example of Vertical Scaling Design (Common Persons)

Group (November testing)      Test Level 2      Test Level 3      Test Level 4
Grade 2 (students 1..N2)      M=26.6, SD=4.1
Grade 3 (students 1..N3)      M=34.7, SD=4.3    M=26.1, SD=4.7
Grade 4 (students 1..N4)                        M=35.3, SD=5.1    M=25.9, SD=4.8
Grade 4 (students N4+1..)                                         M=26.0, SD=5.0

Each group takes its own test level; Grades 3 and 4 also take the level below, providing the common persons for linking adjacent levels.
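The common persons in a design like this allow an upper-level score to be placed on the lower level's scale by matching the common group's means and SDs on the two levels. A sketch using the Grade 3 figures from the slide, under the assumption that the common group scored M=34.7 (SD=4.3) on the easier Level 2 test and M=26.1 (SD=4.7) on Level 3 (the slide does not label which mean belongs to which level), and using a simple linear link for illustration:

```python
def link_down(upper_score, lower_mean, lower_sd, upper_mean, upper_sd):
    """Place a score from the upper test level onto the lower level's
    scale, using the common group's statistics on both levels."""
    return lower_mean + (lower_sd / upper_sd) * (upper_score - upper_mean)

# Assumed Grade 3 common-group statistics:
# Level 2 (M=34.7, SD=4.3), Level 3 (M=26.1, SD=4.7).
# A Level 3 score of 30.8 (one SD above its mean) maps to roughly
# one SD above the Level 2 mean, i.e. about 39.0.
linked = link_down(30.8, lower_mean=34.7, lower_sd=4.3, upper_mean=26.1, upper_sd=4.7)
```

In practice, vertical scales are usually built with IRT models rather than this simple linear link; the sketch only shows how the common group ties the two levels together.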
Example of Vertical Scaling Design (Common Items)

[Matrix from slide: item blocks 1-4 administered across Years 1-3, with successive years' forms overlapping in common item blocks]
Problems with Vertical Scaling
If the construct or dimension being measured changes across grades/years/forms, scores on different forms mean different things, and we cannot reasonably place scores on a common scale. Vertical scaling may be appropriate for a construct like reading; it is less appropriate for mathematics, science, social studies, etc.
Problems with Vertical Scaling
Both common person and common item designs face the practical problem that items may be too easy for one group and too hard for the other. We must also ensure that examinees have had exposure to the content of the common items or off-level test (in the common persons design, we can only scale down, not up).
Problems with Vertical Scaling
Scaled scores are not interpretable in terms of what a student knows or can do. Comparison of scores on scales that extend across several years is particularly risky.