Published by Tyler Washington. Modified over 9 years ago.
Assessment Training: Nebo School District
Assessment Literacy
Test Acronyms
- CRT – Criterion-Referenced Test (grades 1–11)
- IOWA – Iowa Test of Basic Skills and Iowa Test of Educational Development (grades 3, 5, 8, and 11)
- UBSCT – Utah Basic Skills Competency Test (grades 10–12)
- DWA – Direct Writing Assessment (grades 6 and 9)
- UAA – Utah Alternate Assessment (grades 1–12, students with severe cognitive disabilities)
- UALPA – Utah Academic Language Proficiency Assessment (grades 1–12, English language learners)
Norm-Referenced Tests
- Standardized tests whose scores are interpreted by comparison to a specific norm group
- Percentile scores are the most common measure of achievement
- Percentile scores range from the 1st to the 99th, with the 50th percentile representing the national average
- The ITBS and ITED (IOWA) tests are the state-adopted norm-referenced assessments
Criterion-Referenced Tests
- Standardized tests in which every question/item is aligned to an explicitly stated educational objective
- Used to identify which standards and objectives the examinee has mastered
- Examples: the CRT or End-of-Level tests in Language Arts, Math, and Science
Summative Assessment
- Used to determine the students' final understanding of material
- The state CRT tests are an example
Formative Assessment
- Used to identify the students' current understanding of material, providing feedback for teachers and learning experiences for students
- Benchmarks, UTIPS, Running Records, and student interviews all fall into this category
Raw Score
- The number of correct responses on a test
- Example: a student answered 48 questions correctly, so the raw score is 48
Percent Correct Score
- The number of correct responses divided by the total number of items
- Example: 49 out of 70 = 70%
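The percent-correct arithmetic above can be sketched in a few lines of Python (the function name is illustrative, not part of any testing system):

```python
def percent_correct(num_correct, num_items):
    # Raw score divided by the total number of items, as a percentage.
    return 100 * num_correct / num_items

print(round(percent_correct(49, 70)))  # 70
```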
Percentile Score
- The percent of students who performed worse on the test
- Example: 75th percentile means 75% of examinees scored lower than this examinee
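The percentile-rank idea can be sketched in Python. This is a simplified illustration using the strictly-lower convention from the definition above; some reporting systems also count half of the tied scores, and the function name is made up:

```python
def percentile_rank(score, all_scores):
    # Percent of examinees who scored strictly lower than `score`.
    lower = sum(1 for s in all_scores if s < score)
    return 100 * lower / len(all_scores)

cohort = list(range(1, 101))        # 100 examinees with scores 1..100
print(percentile_rank(76, cohort))  # 75.0 -> the 75th percentile
```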
Scaled Score
- The student's performance is reported on an arbitrary numerical scale (the scale can even be alphabetical)
- A properly constructed scaled score provides comparable information on student performance across different years and different tests
ACT
- What is a 36? What is a 28? What is a 12?
- These numbers have meaning only because we assign value to points on the scale
- Often others, such as colleges, help set that value: Utah State University and the University of Utah, for example, require a score of at least 18
Scaled Scores
- ACT scores range from 1–36; 18–28 is considered proficient, depending on the school
- Advanced Placement tests are scored from 1–5; 3 is proficient
- UBSCT and CRT scores range from 100–200; 160 is proficient
Scaled Scores
- Scaled scores simplify the reporting of results
- Score reporting can be common across grade levels and tests
- No more subject-specific percentage cut scores
- Far greater comparability between tests and across years
Scaled Scores
- CRTs and the UBSCT use a cut score of 160 for proficiency
- Each proficiency level has its own cut score
- Proficiency levels range from 1–4 under NCLB and 1a–4 under UPASS (we will discuss this in the next session)
Example
- If John has a raw score of 65 in 2004 and a raw score of 58 in 2005, does this show a decrease in performance?
- If John has a scaled score of 165 in 2004 and a scaled score of 155 in 2005, does this show a decrease in performance?
Why Not Raw Scores?
- Most states do not release raw scores
- Looking at raw scores can lead to incorrect assumptions
- It is incorrect to compare raw scores from one year to those of the next
- It is incorrect to compare the raw scores of one test to those of another
Equating: Career Home Runs
Who Is The Greatest?
- Individual factors: ability, strength, skill, technique, knowledge
- Difficulty of the game: tightly wound baseballs, improved bats, a higher pitcher's mound, changes in season length, steroids
Comparisons
- It is impossible to compare Barry Bonds with Babe Ruth
- It is impossible to compare a game in 1914 to a game in 2006
Comparisons
- It is possible to compare John's ability on the 2005 Language Arts CRT with John's ability on the 2006 Language Arts CRT (scaling)
- It is possible to compare the difficulty of the 2005 Language Arts CRT to that of the 2006 CRT (equating)
Equating
- A statistical process that takes different tests and makes them equal in difficulty
- It disentangles differences in test difficulty from differences in student ability
Equating
- Common (anchor) items are shared between test forms
- A statistical comparison of performance on the common items establishes an equivalent difficulty level
- This process ensures that results are accurately comparable from test to test and not subject to fluctuations caused by unintentional changes in item difficulty
Equating
[Diagram: Form X and Form Y, each containing the same shared set of anchor items]
Anchor Items
- The performance on the two sets of anchor items across years is what allows us to make interpretations about the relative difficulty of the non-anchor items
- If student performance on the anchor items is the same, we conclude that student achievement is the same
- If student performance on the anchor items increases, we interpret that student achievement increased
- If student performance on the anchor items decreases, we interpret that student achievement decreased
- We use this information to judge the difficulty of the non-anchor items
Why Equate?
- One test may be more difficult than another
- One group of examinees may be more intelligent than another
- Or both