Published by Willa Moody. Modified over 8 years ago.
2
Assessing Student Performance
Characteristics of Good Assessment Instruments
(c) 2007 McGraw-Hill Higher Education. All rights reserved.
3
Tests and Measurements Definitions
• Test = Measurement = Assessment
• Evaluation - involves judgment
• Formative tests - to help form instruction
• Summative tests - to summarize learning
• Norm-referenced - compared with peers
• Criterion-referenced - compared with a standard
4
Characteristics of a Good Test
• Valid - Tests what it is supposed to test
• Reliable - Consistent
• Objective - Scorers or raters agree
• Easy to Administer
  – Class Time
  – Set-up
• Low Cost - Equipment
5
Validity
A test is valid for a given purpose and group, e.g. skilled high school male soccer players.
6
Two Types of Tests
Norm-Referenced | Criterion-Referenced
V R O (Validity, Reliability, Objectivity)
7
Norm-Referenced Assessment
• Your score is compared with the scores of your peers
• Used for standardized tests and summative tests
8
Norm-Referenced Validity, Reliability, Objectivity
Adapted from Morrow, Jackson, Disch, and Mood
9
To be valid, a test must be reliable and relevant.
• Reliable = consistent
• Relevant = tests what it is supposed to test
10
However, a test can be reliable without being valid. For example, a volleyball wall volley test is reliable but not very valid.
11
To be reliable, a test must be objective.
• Objective = reliability of scorers
12
Two Types of Norm-Referenced Reliability
13
Interclass Reliability
• Inter means between, as in interscholastic
• The statistic used is the Pearson product-moment correlation
• The symbol is r
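As a minimal sketch of the interclass approach, the Pearson product-moment correlation can be computed between two administrations of the same test. The function name `pearson_r` and the score lists are hypothetical and used only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five students on two administrations of one test.
day1 = [10, 12, 15, 18, 20]
day2 = [11, 13, 14, 19, 21]
r = pearson_r(day1, day2)
print(round(r, 2))
```

A value of r near 1.00 indicates that students kept nearly the same rank order on both days.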
14
Intraclass Reliability
• Intra means within, as in intramural
• The statistic used is ANalysis Of VAriance (ANOVA)
• The symbol is R
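One way to obtain R from ANOVA is sketched below. It assumes a one-way model in which the criterion score is the mean of all trials, so that R = (MS between students - MS within students) / MS between students. The slide does not say which ANOVA model is intended, so treat this choice, the function name `intraclass_r`, and the data as assumptions.

```python
def intraclass_r(scores):
    """Intraclass R from a one-way ANOVA.

    scores: one list per student, each holding that student's k trial scores.
    Assumes the criterion score is the mean of the k trials.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 students, each with the same k >= 2 trials")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    student_means = [sum(row) / k for row in scores]
    # Variation between students and variation within each student's trials.
    ss_between = k * sum((m - grand_mean) ** 2 for m in student_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(scores, student_means) for x in row
    )
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    if ms_between == 0:
        raise ValueError("all students have the same mean; R is undefined")
    return (ms_between - ms_within) / ms_between


# Hypothetical data: three students, two trials each.
trials = [[10, 12], [15, 15], [20, 18]]
print(round(intraclass_r(trials), 3))
```

When every student scores the same on each trial, the within-student mean square is zero and R equals 1.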
15
Establishing Norm-Referenced Reliability
Interclass Methods
• Test - Retest
• Equivalent forms
• Split Halves - Spearman-Brown prophecy formula
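The split-halves method correlates two halves of one test, which underestimates the reliability of the full-length test. The Spearman-Brown prophecy formula steps the half-test correlation up to full length. A minimal sketch follows; the function name and the `length_factor` parameter are illustrative.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimate reliability when a test is lengthened by length_factor.

    With the default factor of 2 this is the usual split-halves correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return (length_factor * r_half) / (1.0 + (length_factor - 1.0) * r_half)


# A hypothetical correlation of .60 between odd and even items.
print(round(spearman_brown(0.60), 2))
```

The same formula with a larger `length_factor` shows why longer tests tend to be more reliable.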
16
Establishing Norm-Referenced Reliability
Intraclass Methods
• Test - Retest
• Alpha
• Equivalent forms
• Split Halves - KR-20 and KR-21
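KR-20 and KR-21 apply to written tests whose items are scored right (1) or wrong (0). The sketch below assumes the population variance of total scores is used, which keeps it consistent with the item variances p(1 - p). The function names and the answer matrix are hypothetical.

```python
from statistics import mean, pvariance


def kr20(items):
    """Kuder-Richardson 20. items: one list of 0/1 item scores per student."""
    k = len(items[0])
    totals = [sum(row) for row in items]
    var_total = pvariance(totals)
    # Sum of item variances p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = mean(row[j] for row in items)
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def kr21(items):
    """Kuder-Richardson 21: needs only the test mean, variance, and length."""
    k = len(items[0])
    totals = [sum(row) for row in items]
    m = mean(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * pvariance(totals)))


# Hypothetical answers: four students, three items.
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(answers), 2), round(kr21(answers), 2))
```

KR-21 assumes all items are equally difficult, so it never exceeds KR-20 on the same data.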
17
Interpreting Reliability Coefficients
• .80 - 1.00 = Very high relationship
• .60 - .79 = High relationship
• .40 - .59 = Moderate relationship
• .20 - .39 = Low relationship
• Below .20 = Little or no relationship
• Desirable for most teacher-made tests: > .70
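The bands above can be written as a small lookup. This sketch treats each band's lower limit as a cutoff (so .795 falls in the "high" band) and uses the absolute size of the coefficient; both choices are assumptions, as is the function name.

```python
def interpret_reliability(coefficient):
    """Return the verbal label for a reliability coefficient."""
    size = abs(coefficient)
    if size > 1.0:
        raise ValueError("a correlation coefficient cannot exceed 1.00")
    if size >= 0.80:
        return "very high"
    if size >= 0.60:
        return "high"
    if size >= 0.40:
        return "moderate"
    if size >= 0.20:
        return "low"
    return "little or no"


def meets_teacher_made_standard(coefficient):
    """True when the coefficient is above the .70 suggested for teacher-made tests."""
    return coefficient > 0.70


print(interpret_reliability(0.75), meets_teacher_made_standard(0.75))
```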
18
Factors Influencing Reliability
• Type of Test
  – High reliability expected on tests of maximum effort
  – Reliability decreases as accuracy demands increase
• Range of ability
  – Best when reliability was established for your specific group
19
Factors Influencing Reliability
• Level of ability
  – Most reliable with high-skilled or low-skilled groups
  – Least reliable with middle-skilled group
• Test length
  – Most reliable with a longer test
  – Least reliable with a shorter test
20
Two Types of Norm-Referenced Objectivity
21
Objectivity
• Interrater: 2 or more raters, 1 occasion
• Discussion and agreement on scale increases objectivity
22
Objectivity
• Interrater: 2 or more raters, 1 occasion
• Intrarater: 1 rater, 2 or more occasions
• Use videotape
23
Types of Norm-Referenced Validity
24
Content and Logical Validity
• Use content validity for written tests.
• Use logical validity for skills tests, fitness tests, and affective tests.
25
Use a Table of Specifications to establish content validity.
26
To establish logical validity, choose tests that test the elements of the activity. For example, the elements of beginning volleyball are the serve, pass, set, and hit.
27
Criterion-Related Validity
28
Criterion-Related Validity
Concurrent Validity
• Compares the new test with a criterion test (a valid test of the attribute)
• Established with Pearson correlation or a cross-validation study
• Examples:
  – Skinfolds vs. hydrostatic weighing
  – 1.5 mile run vs. MaxVO2
29
Criterion-Related Validity
Predictive Validity
• Degree to which the score on a new test can predict the score on the criterion test
• Established with regression equation
• Examples:
  – Y = a + bX
  – MaxVO2 = .123 + .456 × (1.5 mile run score)
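The intercept a and slope b of Y = a + bX are normally estimated by least squares from students who took both the new test and the criterion test. The sketch below shows that calculation on made-up numbers. The function names and data are hypothetical and do not reproduce any published MaxVO2 equation.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line Y = a + bX."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x_new):
    """Predicted criterion score for a new-test score."""
    return a + b * x_new


# Hypothetical paired scores: new test (X) and criterion test (Y).
new_test = [1.0, 2.0, 3.0, 4.0]
criterion = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(new_test, criterion)
print(a, b, predict(a, b, 5.0))
```

Once a and b are known, only the new (usually cheaper) test has to be given to predict a criterion score.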
30
Construct Validity
31
Construct Validity
• Degree to which a test measures a trait that cannot be directly measured
• A construct must be measured by indicators. Examples:
  – Self-efficacy - engages in activity, loves the activity
  – Fitness - engages in fitness activities
32
Establishing Construct Validity
• Find two groups: one that has the trait and one that doesn’t
• Test both groups
• Various statistical methods can be used
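The slide leaves the statistical method open. One common choice for comparing two known groups is a t statistic on the group means, sketched here in its unequal-variance (Welch) form. The choice of method, the function name, and the scores are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_with_trait, group_without_trait):
    """t statistic for the difference between two group means (Welch form)."""
    se_a = variance(group_with_trait) / len(group_with_trait)
    se_b = variance(group_without_trait) / len(group_without_trait)
    return (mean(group_with_trait) - mean(group_without_trait)) / sqrt(se_a + se_b)


# Hypothetical test scores for a group that has the trait and one that does not.
has_trait = [8, 9, 10, 11, 12]
lacks_trait = [3, 4, 5, 6, 7]
print(welch_t(has_trait, lacks_trait))
```

A large t value means the test separates the two groups, which supports the claim that it measures the construct.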
33
Interpreting Validity Coefficients
• Above .90 desirable
• Above .80 acceptable
• With predictive validity, between .60 and .79 is okay in some instances (especially with affective tests)
34
Two Types of Tests
Norm-Referenced | Criterion-Referenced
V R O (Validity, Reliability, Objectivity)
35
Criterion-Referenced Assessment
• Evaluation using a predetermined standard of performance = what should be
• Validity: students classified correctly as masters or nonmasters
• Reliability: students classified consistently as masters or nonmasters
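Consistency of classification can be illustrated with a simple proportion of agreement: the share of students placed in the same category (master or nonmaster) on two occasions. The cutoff score, function names, and data below are hypothetical. The same function gives a validity estimate if the second list holds each student's true mastery status instead of a retest classification.

```python
def classify(scores, cutoff):
    """True for masters (score at or above the cutoff), False for nonmasters."""
    return [score >= cutoff for score in scores]


def proportion_of_agreement(first, second):
    """Share of students given the same classification in both lists."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


# Hypothetical scores for five students on two days, with a cutoff of 10.
day1 = classify([12, 8, 15, 9, 11], cutoff=10)
day2 = classify([13, 7, 14, 10, 9], cutoff=10)
print(proportion_of_agreement(day1, day2))
```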
36
Criterion-Referenced Validity
37
Assessing Student Performance
Characteristics of Good Assessment Instruments