1
CHAPTER 4: Using and Reporting Standardized Test Results
Assessment in Early Childhood Education, Fifth Edition
Sue C. Wortham
Wortham. Assessment in Early Childhood Education, 5e. © 2008 by Pearson Education, Inc. All Rights Reserved.
2
Chapter Objectives
1. Explain the difference between norm-referenced and criterion-referenced tests.
2. List common characteristics of norm-referenced and criterion-referenced tests.
3. Explain the advantages and disadvantages of using standardized tests.
4. Understand how test scores are interpreted and reported.
5. Describe how individual and group test results are used to report student progress and program effectiveness.
6. Discuss the advantages and disadvantages of using norm-referenced and criterion-referenced tests with young children.
7. Understand the difficulties in using standardized tests with young children.
3
Distinctions Between Norm-Referenced and Criterion-Referenced Tests
Norm-referenced tests provide information on how the performance of an individual compares with that of a known group. Criterion-referenced tests provide information on how the individual performed on some standard or objective (without considering the performance of others).
4
Common Characteristics of Norm- and Criterion-Referenced Tests
- Require a relevant and representative sample of test items
- Require specification of the achievement domain to be measured
- Use the same type of test items
- Use the same rules for item writing (except for item difficulty)
- Judged by the same qualities of goodness (validity and reliability)
- Useful in educational measurement
(Linn and Gronlund, 2000)
5
Aptitude vs. Achievement Tests
Aptitude Tests: Predict a student’s ability to learn a skill or accomplish a task. (Stanford-Binet, Wechsler, SAT when used to predict success)
Achievement Tests: Measure what the student has learned or mastered. (California Achievement Test, Iowa Tests of Basic Skills, SAT when used to determine what has been learned)
6
Uses of Norm-Referenced Tests with School-Age Children
Achievement tests are:
- given to measure and analyze individual and group performance resulting from the educational program
- analyzed for trends in achievement
- used to describe program effectiveness, including areas of weakness and strength, so that plans can be made to improve curriculum
7
How Criterion-Referenced Tests with Preschoolers Are Used
- Developmental screening
- Diagnostic evaluation
- Instructional planning
Developmental screenings determine whether further evaluation is needed to identify disabilities and strategies for remediation.
8
Reasons Criterion-Referenced Tests Are Used with School-Age Children
Achievement test scores describe individual performance and are used to plan instruction for groups and individual students. Diagnostic evaluation intelligence batteries in academic content areas are used with students who demonstrate learning difficulties.
9
Savant Syndrome
A condition in which a person otherwise limited in mental ability has an exceptional specific skill, for example:
- Calculation abilities
- Drawing
- Musical ability
10
11
12
The Psychometric Approach
Intelligence: a single attribute?
Spearman ( ): two-factor theory of intelligence
- “g” = general ability
- “s” = special abilities
13
Figure 9.3 According to Spearman (1904), all intelligent abilities have an area of overlap, which he called g (for “general”). Each ability also depends partly on an s (for “specific”) factor.
14
Figure 9.4a Measurements of sprinting, high jumping, and long jumping correlate with one another because they all depend on the same leg muscles. Similarly, the g factor that emerges in IQ testing could reflect a single ability that all tests tap.
15
Many attributes?
Thurstone: 7 primary mental abilities (spatial ability, perceptual speed, numeric reasoning, verbal meaning, word fluency, memory, inductive reasoning)
16
What is Intelligence? Fluid intelligence and crystallized intelligence
Cattell & Horn believed that the “g” factor has two components:
- Fluid intelligence is the power of reasoning, solving unfamiliar problems, seeing relationships, and gaining new knowledge.
- Crystallized intelligence is acquired knowledge and the application of that knowledge to experience.
17
Concept Check: A 16-year-old is learning to play chess and is becoming proficient enough to be accepted into the school’s chess club. Is this fluid or crystallized intelligence?
18
Concept Check: Ten years later, the chess player achieves grandmaster status. Is this a result of fluid or crystallized intelligence?
19
Gardner’s Theory of Multiple Intelligences
- Logical-Mathematical
- Linguistic
- Musical
- Spatial
- Bodily-Kinesthetic
- Interpersonal
- Intrapersonal
- Naturalistic
- Existential
Instructor’s Notes
Refer to Text/Discussion Topic: Refer students to Table 12.2, which highlights the components of Gardner’s multiple intelligences. Ask students in which of these areas they have relative strengths (and weaknesses). Discuss the implications of this model for classroom instruction.
Discussion Topic: When slides 17, 18, and 19 have been presented, ask students to discuss the implications of these models for all students, not just those who are gifted.
Copyright © Allyn & Bacon 2006
20
Gardner’s Multiple Intelligences
21
Sternberg’s Triarchic Theory
Contextual Component (“street smarts” or practical)
- Adapting to the environment
Experiential Component (creative)
- Response to novelty
- Automatization
Componential Component (“academic” or analytical)
- Information processing
- Efficiency of strategies
22
Theories and Tests of Intelligence
IQ tests
Intelligence quotient (IQ) tests attempt to measure an individual’s probable performance in school and similar settings.
Binet ( ) and Simon created the first IQ test in 1905.
23
Binet Intelligence Tests
Mental Age: an individual’s level of mental development relative to others
Intelligence Quotient (IQ)
24
Theories and Tests of Intelligence
The Stanford-Binet test
- Stanford-Binet V (2-85)
- The mean or average IQ score for all age groups is designated as 100, with a standard deviation of 15 (scores of 85-115 fall within one standard deviation of the mean).
- Given individually
25
Interpreting Test Scores
A child’s performance on a standardized test is meaningless until it can be compared with other scores. A raw score is translated into a standard score that reports how well the child’s performance compares with that of other children who took the same test. The bell-shaped normal curve is the graph on which the distribution of standard scores is arranged.
26
The Normal Curve
- Represents the ideal normal distribution of test scores
- The scores are distributed in a bell-shaped frequency polygon, with most scores clustered toward the center of the curve (see Figure 4-5 on p. 87 of the text)
- Standard deviations are used to calculate how an individual scored, compared with the scores of the norming group
27
Normal Distribution
28
Bell Curve
© 2006 The McGraw-Hill Companies, Inc. All rights reserved. Santrock, Educational Psychology, Second Edition, Classroom Update
29
Individual Intelligence Tests The Wechsler Scales
Overall IQ and also verbal and performance IQs.
- (WPPSI-III) Wechsler Preschool and Primary Scale of Intelligence, Third Edition. Ages 2 ½ to 7 years, 3 months
- (WISC-IV) Wechsler Intelligence Scale for Children, Fourth Edition. Ages 6 to 16 years, 11 months
- (WAIS-IV) Wechsler Adult Intelligence Scale, Fourth Edition. Ages 16 to 90 years, 11 months
30
WPPSI-III
34
WAIS-III
36
WISC-IV
- Word Reasoning: measures reasoning with verbal material; child identifies underlying concept given successive clues.
- Matrix Reasoning: measures fluid reasoning (a highly reliable subtest on WAIS-III and WPPSI-III); child is presented with a partially filled grid and asked to select the item that properly completes the matrix.
- Picture Concepts: measures fluid reasoning, perceptual organization, and categorization (requires categorical reasoning without a verbal response); from each of two or three rows of objects, child selects objects that go together based on an underlying concept.
- Letter-Number Sequencing: measures working memory (adapted from WAIS-III); child is presented a mixed series of numbers and letters and repeats them, numbers first (in numerical order), then letters (in alphabetical order).
- Cancellation: measures processing speed using random and structured animal target forms (foils are common non-animal objects).
37
WAIS - IV
41
Theories and Tests of Intelligence
Raven’s Progressive Matrices
Psychologists created “culture-reduced” tests without language. Raven’s Progressive Matrices tests abstract reasoning ability (non-verbal intelligence or performance IQ).
45
Counting the Data-Frequency
Look at the set of data that follows on the next slide. A tally mark was made to count each time a score occurred.
- Which number most likely represents the average score?
- Which number is the most frequently occurring score?
Descriptive statistics are the mathematical procedures that are used to describe and summarize data.
46
Frequency Distribution
Scores: 100 99 98 94 90 89 88 82 75 74 68 60
Tally: 1 11 1111 1111 1
Frequency: 1 2 5 7 10 6
Average score? 88
Most frequent score? 88
47
This frequency count represents data that closely represent a normal distribution.
(Tally chart of the frequency distribution)
48
Descriptive Statistics
49
Frequency Polygons
Data: 100, 99, 98, 94, 90, 90, 90, 89, 89, 89, 88, 75, 74, 68
(Frequency polygon: frequency, from 1 to 5, on the vertical axis; scores on the horizontal axis)
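The counting behind a frequency polygon can be sketched in a few lines of Python. This is only an illustration: it assumes the slide's numbers are two data columns read top to bottom, and the names `scores` and `frequency` are made up for the example.

```python
from collections import Counter

# Scores from the slide, assuming its two data columns are read top to bottom.
scores = [100, 99, 98, 94, 90, 90, 90, 89, 89, 89, 88, 75, 74, 68]

# Count how many times each score occurs (the tally).
frequency = Counter(scores)

# Print the distribution from the highest score to the lowest,
# drawing one tally mark per occurrence.
for score in sorted(frequency, reverse=True):
    count = frequency[score]
    print(f"{score:>3}  {'|' * count:<5} {count}")
```

Plotting each score against its count and joining the points with straight lines would give the polygon itself.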
50
Measures of Central Tendency
Measures of central tendency provide information about the average or typical score in a data set.
- Mean: The numerical average of a group of scores
- Median: The score that falls exactly in the middle of a data set
- Mode: The score that occurs most often
51
Mean
Central tendency = representative or typical value in a distribution
Mean: same thing as an average. Computed by
- Summing all the scores (sigma, Σ)
- Dividing by the number of scores (N)
To find the mean, simply add the scores and divide by the number of scores in the set of data. In the slide's example, the four scores sum to 355. Divide by the number of scores: 355/4 = 88.75
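As a rough sketch of the same arithmetic in Python: the four individual scores did not survive in the slide text, so the values below are hypothetical and were chosen only because they sum to 355.

```python
# Hypothetical scores, chosen only so that they sum to 355 as on the slide.
scores = [95, 90, 88, 82]

total = sum(scores)   # sum of all the scores (sigma)
n = len(scores)       # number of scores (N)
mean = total / n      # mean = sum of scores / number of scores

print(total, n, mean)
```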
52
Mean
53
Measures of Central Tendency
Steps to computing the median
1. Line up scores from highest to lowest
2. Count up to middle score
- If there is 1 middle score, that’s the median
- If there are 2 middle scores, median is their average
Aron, Aron, & Coups, Statistics for the Behavioral and Social Sciences: A Brief Course (3e), © 2005 Prentice Hall
54
Median-The Middlemost point in a set of data
Data Set 1: 100 99 98 97 96 90 88 85 80 79
Median: 93 (the average of the two middle scores, 96 and 90)
Data Set 2: 100 99 98 97 86 82 78 72 70 68
The median is 84 for this set. 84 represents the middlemost point in this set of data.
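The median steps can be sketched as a small Python function and checked against Data Set 2. The function name `median` is just an illustrative choice; the standard library's `statistics.median` does the same job.

```python
def median(scores):
    """Return the median by ordering the scores and taking the middle one(s)."""
    ordered = sorted(scores, reverse=True)  # line up scores from highest to lowest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # One middle score: that score is the median.
        return ordered[mid]
    # Two middle scores: the median is their average.
    return (ordered[mid - 1] + ordered[mid]) / 2


data_set_2 = [100, 99, 98, 97, 86, 82, 78, 72, 70, 68]
print(median(data_set_2))  # the two middle scores are 86 and 82
```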
55
Mode-The most frequently occurring score in a set of data.
Find the modes for the following sets of data:
Data Set 3: 99 89 75. Mode: 89
Data Set 4: 99 88 87 72 70. 88 and 87 are both modes for this set of data. This is called a bimodal distribution.
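A minimal Python sketch of finding modes follows. The repeated scores were not preserved in the slide text, so both lists are hypothetical, built only to give one mode of 89 and two modes of 88 and 87. `statistics.multimode` (Python 3.8+) returns every score tied for most frequent.

```python
from statistics import multimode

# Hypothetical data: the repeats below are assumptions for illustration only.
one_mode = [99, 89, 89, 89, 75]
two_modes = [99, 88, 88, 87, 87, 72, 70]

print(multimode(one_mode))    # a single most frequent score
print(multimode(two_modes))   # two scores tie, so the set is bimodal
```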
56
Measures of Variability (Dispersion)
Range: the distance between the highest and lowest scores in a set of data. In the slide's example, subtracting the lowest score from the highest gives 35, so 35 is the range in this set of scores.
57
Variance - Describes the total amount that a set of scores varies from the mean.
1. Subtract the mean from each score. When the mean for a set of data is 87, subtract 87 from each score.
59
2. Next, square each difference: multiply each difference by itself.
Differences from step 1: 13, 11, 8, 4, -2, -7, -27
13 x 13 = 169
11 x 11 = 121
8 x 8 = 64
4 x 4 = 16
-2 x -2 = 4
-7 x -7 = 49
-27 x -27 = 729
3. Sum these squared differences: 1,152 (sum of squares)
60
4. Divide the sum of squares by the number of scores.
1,152 divided by 7 = 164.57. This number represents the variance for this set of data.
61
Standard Deviation-Represents the typical amount that a score is expected to vary from the mean in a set of data.
5. To find the standard deviation, find the square root of the variance. For this set of data, find the square root of 164.57. The standard deviation for this set of data is 12.83, or about 13.
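The five steps can be traced in Python as a check on the arithmetic. The raw scores are an assumption: they are rebuilt as 87 plus each difference shown on the slides (13, 11, 8, 4, -2, -7, -27). Like the slides, this sketch divides by the number of scores, which is the population variance; a sample variance would divide by one less.

```python
import math

mean = 87
# Assumed raw scores, reconstructed as mean + difference from the slides.
scores = [100, 98, 95, 91, 85, 80, 60]

differences = [s - mean for s in scores]   # step 1: subtract the mean
squares = [d * d for d in differences]     # step 2: square each difference
sum_of_squares = sum(squares)              # step 3: sum the squares
variance = sum_of_squares / len(scores)    # step 4: divide by number of scores
std_dev = math.sqrt(variance)              # step 5: take the square root

print(sum_of_squares, round(variance, 2), round(std_dev, 2))
```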
66
Ceiling and Floor Effects
Ceiling effects
- Occur when scores can go no higher than an upper limit and “pile up” at the top
- e.g., scores on an easy exam, as shown on the right
- Cause negative skew
Floor effects
- Occur when scores can go no lower than a lower limit and pile up at the bottom
- e.g., household income
- Cause positive skew
67
Skewed Frequency Distributions
Normal distribution (a)
Skewed right (b)
- Fewer scores right of the peak
- Positively skewed
- Can be caused by a floor effect
Skewed left (c)
- Fewer scores left of the peak
- Negatively skewed
- Can be caused by a ceiling effect
68
Understanding Descriptive Statistics
The Normal Distribution: A “bell-shaped” curve in which most of the scores are clustered around the mean; the farther from the mean, the less frequently the score occurs.
69
Commonly Reported Test Scores Based on the Normal Curve
70
Z Scores
When values in a distribution are converted to Z scores, the distribution will have
- Mean of 0
- Standard deviation of 1
Useful
- Allows variables to be compared to one another even when they are measured on different scales, have very different distributions, etc.
- Provides a generalized standard of comparison
71
Z Scores
- To compute a Z score, subtract the mean from a raw score and divide by the SD
- To convert a Z score back to a raw score, multiply the Z score by the SD and then add the mean
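Both conversions can be sketched as two small Python functions. The names `to_z` and `to_raw` are illustrative, and the example assumes an IQ-style scale with a mean of 100 and an SD of 15.

```python
def to_z(raw, mean, sd):
    """Subtract the mean from the raw score and divide by the SD."""
    return (raw - mean) / sd


def to_raw(z, mean, sd):
    """Multiply the Z score by the SD and then add the mean."""
    return z * sd + mean


# Assumed scale for illustration: mean 100, SD 15.
print(to_z(115, 100, 15))     # one SD above the mean
print(to_raw(-2.0, 100, 15))  # two SDs below the mean
```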
72
The Normal Curve
- Derived scores are used to specify where the individual score falls on the curve and how far above or below the mean the score falls
- Raw scores are transformed into percentiles, stanines, or other standard scores
- All scoring scales are drawn parallel to the baseline of the normal curve and use the deviation from the mean as the reference to compare an individual score with the mean score of a group
73
Percentile Ranks
Percentiles represent the point on the normal curve below which a percentage of test scores is distributed. A student’s percentile rank on a test indicates the percentage of students who scored lower in the comparison group. For example, if a student is ranked in the 55th percentile, the student scored higher than 55% of the comparison group who took the test.
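A percentile rank, read as the percentage of the comparison group that scored lower, can be sketched as follows. The function name and the comparison group of 20 scores are hypothetical, and published tests may handle tied scores differently.

```python
def percentile_rank(score, comparison_scores):
    """Percentage of the comparison group that scored lower than `score`."""
    lower = sum(1 for s in comparison_scores if s < score)
    return 100 * lower / len(comparison_scores)


# Hypothetical comparison group of 20 scores.
group = [52, 55, 58, 60, 61, 63, 65, 66, 68, 70,
         71, 73, 75, 77, 80, 82, 85, 88, 91, 95]

# 11 of the 20 scores are lower than 73.
print(percentile_rank(73, group))
```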
74
Stanines
Parents find stanine results easiest to understand because their child’s standardized test scores are reported as:
9 Very superior
8 Superior
7 Considerably above average
6 Slightly above average
5 Average
4 Slightly below average
3 Considerably below average
2 Poor
1 Very poor
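How a score lands in one of the nine stanines can be sketched in Python. The slides do not give the conversion, so this assumes the usual convention that stanines have a mean of 5 and a standard deviation of 2, with each band half a standard deviation wide. The function name `stanine` is illustrative; the labels are the ones listed above.

```python
import math

LABELS = {
    9: "Very superior", 8: "Superior", 7: "Considerably above average",
    6: "Slightly above average", 5: "Average", 4: "Slightly below average",
    3: "Considerably below average", 2: "Poor", 1: "Very poor",
}


def stanine(z):
    """Map a z score to a stanine, assuming mean 5 and SD 2 (a convention)."""
    # Stanine 5 covers z from -0.25 up to 0.25; each band is 0.5 SD wide.
    band = math.floor(2 * z + 5.5)
    # Scores beyond the outer bands are folded into stanines 1 and 9.
    return max(1, min(9, band))


for z in (-3.0, -0.3, 0.0, 1.0, 2.5):
    s = stanine(z)
    print(z, s, LABELS[s])
```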
75
Z Scores and T Scores
- Called standard scores
- Report how many standard deviations a transformed raw score is located above or below the mean
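The link between the two standard scores can be sketched in Python. The slides do not define the T scale, so this assumes the common convention of a mean of 50 and a standard deviation of 10. The function names and the example test (mean 80, SD 8) are hypothetical.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations the raw score lies from the mean."""
    return (raw - mean) / sd


def t_score(z):
    """Rescale a z score to a T score, assuming mean 50 and SD 10."""
    return 50 + 10 * z


# Hypothetical test with mean 80 and SD 8; a raw score of 92 is 1.5 SDs above.
z = z_score(92, 80, 8)
print(z, t_score(z))
```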
76
Grade Equivalent Scores
Test publishers recommend that grade equivalents not be used to report to parents because they may not understand that the score does not mean the child should be placed in a higher or lower grade. Grade level results are compared with test results from grades above and below the grade, indicating whether the child performed above or below average. The grade equivalent score does not indicate grade level placement in school.
77
Reporting Standardized Test Results
- Both norm- and criterion-referenced information can be organized in a useful form.
- Scores can be reported for an individual, a class, a grade, a school, and a district.
- Strengths and weaknesses can be analyzed by content areas, by school, and by grade level.
- Achievement can be compared over several years to determine long-term improvement or decline.
78
Reporting Test Results to Parents
A parent–teacher conference may be used to report test results. The teacher should explain:
- both the value and the limitations of the test scores
- why the test was chosen
- how the results will be used, for example, to plan appropriate learning experiences for their child
79
Advantages of Standardized Tests
Norm-referenced and criterion-referenced achievement tests provide valuable information regarding the effectiveness of curriculum and instruction. Teachers can determine curriculum strengths and weaknesses. Individual students’ reports determine who would benefit from additional instruction and those who are ready to move to more advanced learning experiences.
80
Advantages of Standardized Tests
Standardized tests have unique qualities that are advantageous:
- Uniformity in test administration
- Quantifiable scores
- Norm referencing
- Validity and reliability
81
Disadvantages of Standardized Tests
Standardized tests are not necessarily the best method of evaluation of young children. A variety of strategies should be used in assessing children.
82
Concerns About the Use of Standardized Tests
- Use of tests with children from a different culture or whose first language is not English
- Use of standardized tests to deny children entrance to school or to retain them in grade
83
No Child Left Behind Act
Assessment of Students with Disabilities and/or Limited English Proficiency (LEP)
- NCLB requires that all students be assessed regardless of their special needs.
- Accommodations have been made for students with disabilities and for those who speak a language other than English or have limited English.
- Limitations of the tests designed for NCLB when used with these populations have become an issue.
84
Standardized Tests Have Effects On Curriculum And Instruction
- Standardized tests only sample a few of the curriculum objectives.
- Pressures for higher test scores result in limitations on the curriculum that is taught.
- Instruction becomes focused on what will be tested and limits the balance of the curriculum.
85
Misapplication of Test Results
Using standardized tests to decide school entry or placement into early childhood programs is inappropriate because:
- tests do not differentiate between limited intelligence and limited opportunities to learn
- decisions on enrollment, retention, and placement in special classes should never be based on a single test score
- other sources of information, including systematic observation and samples of children’s work, should be a part of the evaluation process