Published by Tyler Washington. Modified over 9 years ago.
Assessment Training: Nebo School District
Assessment Literacy
Test Acronyms
- CRT – Criterion-Referenced Test (grades 1–11)
- IOWA – Iowa Test of Basic Skills and Iowa Test of Educational Development (grades 3, 5, 8, and 11)
- UBSCT – Utah Basic Skills Competency Test (grades 10–12)
- DWA – Direct Writing Assessment (grades 6 and 9)
- UAA – Utah Alternate Assessment (grades 1–12, students with severe cognitive disabilities)
- UALPA – Utah Academic Language Proficiency Assessment (grades 1–12, English language learners)
Norm-Referenced Tests
- Standardized tests whose scores are interpreted by comparison to a specific norm group
- Percentile scores are the most common measure of achievement
- Percentile scores range from the 1st to the 99th, with the 50th percentile representing the national average
- The ITBS and ITED (IOWA) tests are the state-adopted norm-referenced assessments
Criterion-Referenced Tests
- Standardized tests in which every question/item is aligned to an explicitly stated educational objective
- Used to identify which standards and objectives the examinee has mastered
- Examples: the CRT or End-of-Level tests in Language Arts, Math, and Science
Summative Assessment
- Used to determine the students' final understanding of material
- The state CRT tests are an example
Formative Assessment
- Used to identify the students' current understanding of material, providing feedback for teachers and learning experiences for students
- Benchmarks, UTIPS, Running Records, and student interviews all fall into this category
Raw Score
- The number of correct responses on a test
- Example: a student answered 48 questions correctly, so the raw score is 48
Percent Correct Score
- The number of correct responses divided by the total number of items
- Example: 49 out of 70 = 70%
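The percent-correct arithmetic above can be sketched in a few lines of Python (the function name is illustrative, not part of any testing system):

```python
def percent_correct(num_correct, num_items):
    # Raw score divided by the total number of items, as a percentage.
    return 100 * num_correct / num_items

print(round(percent_correct(49, 70)))  # 70
```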
Percentile Score
- The percent of students who performed worse on the test
- Example: 75th percentile means 75% of examinees scored lower than this examinee
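The percentile-rank idea can be sketched in Python. This is a simplified illustration using the strictly-lower convention from the definition above; some reporting systems also count half of the tied scores, and the function name is made up:

```python
def percentile_rank(score, all_scores):
    # Percent of examinees who scored strictly lower than `score`.
    lower = sum(1 for s in all_scores if s < score)
    return 100 * lower / len(all_scores)

cohort = list(range(1, 101))        # 100 examinees with scores 1..100
print(percentile_rank(76, cohort))  # 75.0 -> the 75th percentile
```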
Scaled Score
- The student's performance is reported on an arbitrary numerical scale (the scale can even be alphabetical)
- A properly constructed scaled score provides comparable information on student performance across different years and different tests
ACT
- What is a 36? What is a 28? What is a 12?
- These numbers have meaning only because we assign value to points on the scale
- Often others, such as colleges, help set that value: Utah State University and the University of Utah, for example, require a score of at least 18
Scaled Scores
- ACT scores range from 1–36; 18–28 is considered proficient, depending on the school
- Advanced Placement tests are scored from 1–5; 3 is proficient
- UBSCT and CRT scores range from 100–200; 160 is proficient
Scaled Scores
- Scaled scores simplify the reporting of results
- Score reporting can be common across grade levels and tests
- No more subject-specific percentage cut scores
- Far greater comparability between tests and across years
Scaled Scores
- CRTs and the UBSCT use a cut score of 160 for proficiency
- Each proficiency level has its own cut score
- Proficiency levels range from 1–4 under NCLB and 1a–4 under UPASS (we will discuss this in the next session)
Example
- If John has a raw score of 65 in 2004 and a raw score of 58 in 2005, does this show a decrease in performance?
- If John has a scaled score of 165 in 2004 and a scaled score of 155 in 2005, does this show a decrease in performance?
Why Not Raw Scores?
- Most states do not release raw scores
- Looking at raw scores can lead to incorrect assumptions
- It is incorrect to compare raw scores from one year to those of the next
- It is incorrect to compare the raw scores of one test to those of another
Equating: Career Home Runs
Who Is The Greatest?
- Individual factors: ability, strength, skill, technique, knowledge
- Difficulty of the game: tightly wound baseballs, improved bats, a higher pitcher's mound, changes in season length, steroids
Comparisons
- It is impossible to compare Barry Bonds with Babe Ruth
- It is impossible to compare a game in 1914 to a game in 2006
Comparisons
- It is possible to compare John's ability on the 2005 Language Arts CRT with John's ability on the 2006 Language Arts CRT (scaling)
- It is possible to compare the difficulty of the 2005 Language Arts CRT to that of the 2006 CRT (equating)
Equating
- A statistical process that takes different tests and makes them equal in difficulty
- It disentangles differences in test difficulty from differences in student ability
Equating
- Common (anchor) items are shared between test forms
- A statistical comparison of performance on the common items establishes an equivalent difficulty level
- This process ensures that results are accurately comparable from test to test and not subject to fluctuations caused by unintentional changes in item difficulty
Equating
[Diagram: Form X and Form Y, each containing the same shared set of anchor items]
Anchor Items
- The performance on the two sets of anchor items across years is what allows us to make interpretations about the relative difficulty of the non-anchor items
- If student performance on the anchor items is the same, we conclude that student achievement is the same
- If student performance on the anchor items increases, we interpret that student achievement increased
- If student performance on the anchor items decreases, we interpret that student achievement decreased
- We use this information to judge the difficulty of the non-anchor items
Why Equate?
- One test may be more difficult than another
- One group of examinees may be more intelligent than another
- Or both