MEASUREMENT AND EVALUATION
IMPORTANCE AND PURPOSE OF MEASUREMENT AND EVALUATION IN HUMAN PERFORMANCE
DEFINITIONS
• MEASUREMENT - COLLECTION OF INFORMATION ON WHICH A DECISION IS BASED
• EVALUATION - THE USE OF MEASUREMENT IN MAKING DECISIONS
• INTERDEPENDENT CONCEPTS: EVALUATION IS A PROCESS THAT USES MEASUREMENTS, AND THE PURPOSE OF MEASUREMENT IS TO ACCURATELY COLLECT INFORMATION USING TESTS FOR EVALUATION
• IMPROVED MEASUREMENT LEADS TO MORE ACCURATE EVALUATION: “GARBAGE IN, GARBAGE OUT”
OBJECTIVE VERSUS SUBJECTIVE TEST CONTINUUM
• OBJECTIVE TEST - 2 OR MORE PEOPLE SCORE THE SAME TEST AND ASSIGN A SIMILAR GRADE
• A DEFINED SCORING SYSTEM AND TRAINED TESTERS INCREASE OBJECTIVITY
• A HIGHLY SUBJECTIVE TEST LACKS A STANDARDIZED SCORING SYSTEM
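One way to put a number on "two or more people assign a similar grade" is to correlate the scores two testers give to the same performances. The sketch below is only an illustration under that assumption. The slide does not prescribe a statistic, and the function name `pearson_r` and both score lists are made up.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a scorer gives identical scores")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: two testers score the same five performances (0-10 scale).
scorer_a = [8, 6, 9, 5, 7]
scorer_b = [7, 6, 9, 4, 8]

r = pearson_r(scorer_a, scorer_b)
print(f"inter-scorer correlation: {r:.2f}")
```

A value near 1 suggests the two testers rank performers similarly. A low value points toward the subjective end of the continuum.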
EVALUATION
• COLLECT SUITABLE DATA (MEASUREMENT)
• JUDGE THE VALUE OF THE DATA ACCORDING TO SOME STANDARD (I.E., CRITERION-REFERENCED STANDARD OR NORM-REFERENCED STANDARD)
• MAKE DECISIONS BASED ON THE DATA
FUNCTIONS OF MEASUREMENT AND EVALUATION
• PLACEMENT in classes/programs or grouping based on ability
• DIAGNOSIS of weaknesses
• EVALUATION OF ACHIEVEMENT to determine if individuals have reached important objectives
• PREDICTION of an individual’s level of achievement in future activities, or of one measure from another measure
• PROGRAM EVALUATION
• MOTIVATION
FORMATIVE AND SUMMATIVE EVALUATION
FORMATIVE EVALUATION
• JUDGMENT OF ACHIEVEMENT DURING THE PROCESS OF LEARNING OR TRAINING
• PROVIDES FEEDBACK DURING THE PROCESS TO BOTH THE LEARNER/ATHLETE AND TEACHER/COACH
• “WHAT IS SUCCESSFUL AND WHAT NEEDS IMPROVEMENT”
SUMMATIVE EVALUATION
• JUDGMENT OF ACHIEVEMENT AT THE END OF AN INSTRUCTIONAL UNIT OR PROGRAM
• TYPICALLY INVOLVES TEST ADMINISTRATION AT THE END OF AN INSTRUCTIONAL UNIT OR TRAINING PERIOD
• USED TO DECIDE IF BROAD OBJECTIVES HAVE BEEN ACHIEVED
STANDARDS FOR EVALUATION
“EVALUATION IS THE PROCESS OF GIVING MEANING TO A MEASUREMENT BY JUDGING IT AGAINST SOME STANDARD”
• CRITERION-REFERENCED (C-R) STANDARD IS USED TO DETERMINE IF SOMEONE HAS ATTAINED A SPECIFIED STANDARD
• NORM-REFERENCED (N-R) STANDARD IS USED TO JUDGE AN INDIVIDUAL’S PERFORMANCE IN RELATION TO THE PERFORMANCES OF OTHER MEMBERS OF A WELL-DEFINED GROUP
• CRITERION-REFERENCED (C-R) STANDARDS ARE USEFUL FOR SETTING PERFORMANCE STANDARDS FOR ALL
• NORM-REFERENCED (N-R) STANDARDS ARE VALUABLE FOR COMPARISONS AMONG INDIVIDUALS WHEN THE SITUATION REQUIRES A DEGREE OF SENSITIVITY OR DISCRIMINATION IN ABILITY
• NORM-REFERENCED STANDARDS
- DEVELOPED BY TESTING A LARGE GROUP OF PEOPLE
- DESCRIPTIVE STATISTICS ARE USED TO DEVELOP STANDARDS
- PERCENTILE RANKS ARE A COMMON NORMING METHOD
• MAJOR CONCERN - GROUP CHARACTERISTICS USED TO DEVELOP NORMS MAY NOT RESULT IN DESIRABLE NORMS; EXAMPLES INCLUDE BODY COMPOSITION AND BLOOD CHOLESTEROL LEVELS, WHERE AVERAGE MAY NOT BE DESIRABLE
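As a rough illustration of percentile-rank norming, the sketch below ranks one score against a norm group. It assumes one common convention, in which ties count as half. The function name `percentile_rank` and the sit-up counts are made up for illustration and are not real norms.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)

# Hypothetical norm group: sit-up counts invented for this example.
norm_group = [20, 25, 28, 30, 30, 33, 35, 38, 40, 45]

print(percentile_rank(30, norm_group))  # 40.0
```

The rank only says where a score falls within this particular group. If the group itself is unfit, a high percentile rank may still not represent a desirable level.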
• CRITERION-REFERENCED STANDARDS
- PREDETERMINED STANDARD OF PERFORMANCE SHOWS THE INDIVIDUAL HAS ACHIEVED A DESIRED LEVEL OF PERFORMANCE
- PERFORMANCE OF INDIVIDUAL IS NOT COMPARED WITH THAT OF OTHER INDIVIDUALS
“COMMON PRACTICE TO APPLY A CRITERION-REFERENCED STANDARD TO A NORM-REFERENCED TEST”
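A criterion-referenced judgment reduces to comparing each score with a fixed cutoff, independent of how anyone else performed. The following is a minimal sketch under assumed values. The function name `meets_standard`, the 10.0-minute cutoff, and the run times are hypothetical.

```python
def meets_standard(score, cutoff, higher_is_better=True):
    """Return True if `score` reaches the predetermined `cutoff`."""
    if higher_is_better:
        return score >= cutoff
    return score <= cutoff

# Hypothetical data: mile-run times in minutes. The cutoff is made up,
# and lower times are better, so higher_is_better is False.
CUTOFF_MINUTES = 10.0
times = {"runner_a": 8.5, "runner_b": 10.0, "runner_c": 11.2}

results = {
    name: meets_standard(t, CUTOFF_MINUTES, higher_is_better=False)
    for name, t in times.items()
}
print(results)
```

Each runner is classified on their own time alone, so all of them could pass or all could fail.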
DETERMINING ACCURACY OF CRITERION-REFERENCED (C-R) STANDARDS
• ACCURACY EXAMINED BY USING A 2 X 2 CONTINGENCY TABLE
• C-R TEST RELIABILITY EXAMINES THE CONSISTENCY OF CLASSIFICATION
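The sketch below builds a 2 x 2 table from two sets of pass/fail classifications and summarizes it with the proportion of agreement and Cohen's kappa. These are two commonly used statistics, and the slide does not name a specific one. Used for reliability, the two lists are the same people classified on two occasions. Used for accuracy, the second list would be the true (criterion) state. All names and data here are hypothetical.

```python
def contingency_2x2(first, second):
    """Count the four cells of a 2 x 2 table from two lists of pass/fail booleans."""
    if len(first) != len(second):
        raise ValueError("both lists must classify the same people")
    pairs = list(zip(first, second))
    a = sum(1 for x, y in pairs if x and y)            # pass / pass
    b = sum(1 for x, y in pairs if x and not y)        # pass / fail
    c = sum(1 for x, y in pairs if not x and y)        # fail / pass
    d = sum(1 for x, y in pairs if not x and not y)    # fail / fail
    return a, b, c, d

def proportion_agreement(a, b, c, d):
    """Share of people classified the same way both times."""
    return (a + d) / (a + b + c + d)

def kappa(a, b, c, d):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = a + b + c + d
    p_observed = (a + d) / n
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if p_chance == 1:
        raise ValueError("kappa is undefined when everyone falls in one category")
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical data: ten people classified pass (True) or fail (False) on two days.
day_1 = [True, True, True, True, True, True, False, False, False, False]
day_2 = [True, True, True, True, True, False, True, False, False, False]

table = contingency_2x2(day_1, day_2)
print(table)                                   # (5, 1, 1, 3)
print(round(proportion_agreement(*table), 2))  # 0.8
print(round(kappa(*table), 2))                 # 0.58
```

Kappa is lower than the raw proportion of agreement because some agreement would occur by chance alone.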
LIMITATIONS OF CRITERION-REFERENCED (C-R) STANDARDS
• NOT ALWAYS POSSIBLE TO FIND A CRITERION THAT EXPLICITLY DEFINES MASTERY, PARTICULARLY IN SOME SKILLS
• ACCURACY OF C-R TEST VARIES WITH THE POPULATION BEING TESTED
EXAMPLE: ACCURACY OF AN EXERCISE STRESS TEST VARIES WITH THE DISEASE PREVALENCE IN THE GROUP STUDIED (I.E., THE PERCENTAGE OF PATIENTS WHO TRULY HAVE CORONARY ARTERY DISEASE)
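To see why prevalence matters, the sketch below holds a test's sensitivity and specificity fixed and varies only the prevalence. The sensitivity (0.70) and specificity (0.80) are assumed values chosen for illustration, not published figures for exercise stress testing, and the function names are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person with a positive test truly has the disease."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def overall_accuracy(sensitivity, specificity, prevalence):
    """Proportion of all people the test classifies correctly."""
    return sensitivity * prevalence + specificity * (1 - prevalence)

# Assumed test characteristics (illustrative only).
SENSITIVITY = 0.70
SPECIFICITY = 0.80

for prevalence in (0.10, 0.50):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    acc = overall_accuracy(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  accuracy={acc:.2f}")
```

Under these assumptions, a positive result is correct about 28% of the time when 10% of the group has the disease, and about 78% of the time when 50% does. The same test gives different practical accuracy in different populations.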
MODELS OF EVALUATION
EDUCATIONAL MODEL
ADULT FITNESS MODEL