Published by Derick Lindsey. Modified over 9 years ago.
2
SHOWTIME!!
3
EVALUATING ACHIEVEMENT
4
INTRODUCTION BOTH CHILDREN AND ADULTS WANT TO KNOW HOW THEY COMPARE TO OTHERS OR A STANDARD PRIMARY ROLE OF TEACHER OR PROGRAM LEADER IS TO PROMOTE DESIRABLE CHANGES IN PEOPLE FOR INSTRUCTIONAL OR PROGRAM PROCESS TO BE MEANINGFUL: -RELEVANT STATED OBJECTIVES -INSTRUCTION OR PROGRAM MUST BE DESIGNED TO ACHIEVE OBJECTIVES EFFECTIVELY -RELIABLE AND VALID EVALUATION PROCESS THAT ASSESSES ACHIEVEMENT TESTS ARE ADMINISTERED PRIMARILY TO FACILITATE THE ACHIEVEMENT OF INSTRUCTIONAL AND PROGRAM OBJECTIVES
5
INTRODUCTION EDUCATIONAL TESTS CAN BE USED FOR PLACEMENT, DIAGNOSIS, EVALUATION OF LEARNING, PREDICTION, PROGRAM EVALUATION, AND MOTIVATION (CHAPTER 1) EVALUATION IS NOT SYNONYMOUS WITH GRADING AND EVALUATION CAN OCCUR WITHOUT THE ASSIGNMENT OF GRADES A TEACHER WHO PASSES ALL STUDENTS REGARDLESS OF THEIR LEVEL OF ACHIEVEMENT OR A TRAINER WHO DOES NOT TELL HIS/HER CLIENT THAT THEY ARE NOT DOING WELL IS IGNORING HIS/HER PROFESSIONAL RESPONSIBILITIES
6
EVALUATION OFTEN FOLLOWS MEASUREMENT TAKING THE FORM OF A JUDGMENT ABOUT THE QUALITY OF A PERFORMANCE OBJECTIVITY OF EVALUATION INCREASES WHEN IT IS BASED ON DEFINED STANDARDS SUCH AS -REQUIRED LEVELS OF PERFORMANCE BASED ON TEACHER’S OR TRAINER’S EXPERIENCE AND/OR CONVICTIONS -THE RANKED PERFORMANCE OF THE REST OF THE GROUP -EXISTING STANDARDS CALLED NORMS
7
TYPES OF EVALUATION FORMATIVE EVALUATION THROUGHOUT THE PROGRAM MOTIVATES AND INFORMS PARTICIPANTS OF THEIR PROGRESS AS WELL AS ALLOWS FOR JUDGMENT REGARDING THE PROGRAM’S EFFECTIVENESS SUMMATIVE EVALUATION IS THE FINAL MEASUREMENT OF A PARTICIPANT’S PERFORMANCE AT THE END OF A PROGRAM WHICH OFTEN INVOLVES COMPARISON AMONG STUDENTS OR STUDENTS TO NORMS OR AN IDEAL STANDARD
8
STANDARDS FOR EVALUATION: CRITERION-REFERENCED STANDARDS REPRESENT THE LEVEL OF PERFORMANCE THAT ALL INDIVIDUALS SHOULD BE ABLE TO ACHIEVE GIVEN PROPER INSTRUCTION MUST BE USED WITH EXPLICIT OBJECTIVES USED IN FORMATIVE EVALUATION TO DIAGNOSE WEAKNESSES AND TO DETERMINE WHEN PARTICIPANTS ARE READY TO PROGRESS STANDARDS TEND TO BE PASS OR FAIL
9
EXAMPLES OF CRITERION-REFERENCED STANDARDS
10
PROCEDURES TO DEVELOP CRITERION-REFERENCED STANDARDS IDENTIFY THE SPECIFIC BEHAVIORS THAT MUST BE ACHIEVED TO ACCOMPLISH A BROAD OBJECTIVE DEVELOP CLEARLY DEFINED OBJECTIVES THAT CORRESPOND TO THE SPECIFIC BEHAVIORS DEVELOP STANDARDS THAT GIVE EVIDENCE OF SUCCESSFUL ACHIEVEMENT OF THE OBJECTIVE; THESE STANDARDS MAY BE BASED ON LOGIC, EXPERT OPINION, RESEARCH LITERATURE, AND/OR ANALYSIS OF TEST SCORES TRY THE SYSTEM AND EVALUATE THE STANDARDS; DETERMINE WHETHER THE STANDARDS MUST BE ALTERED AND DO SO IF NECESSARY “IF STANDARDS ARE TOO HIGH, VERY FEW PEOPLE WILL PASS AND RECEIVE POSITIVE REINFORCEMENT; IF STANDARDS ARE TOO LOW, MANY WILL PASS THAT MAY HAVE FALSE ILLUSIONS OF THEIR CAPABILITIES”
11
STANDARDS OF EVALUATION: NORM-REFERENCED STANDARDS COMPARE THE PERFORMANCES OF PEERS USED IN SUMMATIVE EVALUATION TO DETERMINE IF BROAD PROGRAM OBJECTIVES HAVE BEEN MET LEVELS OF PERFORMANCE ARE ESTABLISHED THAT DISTINGUISH BETWEEN ABILITY GROUPS RANGING FROM “HIGH ABILITY” TO “LOW ABILITY”
12
GRADING GRADING IS A TWO-FOLD PROCESS - THE SELECTION OF THE MEASUREMENTS (SUBJECTIVE OR OBJECTIVE) THAT FORM THE BASIS OF THE GRADE AND THE ACTUAL CALCULATION INSTRUCTIONAL PROCESS BEGINS WITH INSTRUCTIONAL OBJECTIVES AND CULMINATES WITH EVALUATION GRADES SHOULD BE BASED ON INSTRUCTIONAL OBJECTIVES AND THE SCORES FROM RELIABLE AND VALID TESTS SELECTION OF TESTING INSTRUMENTS SHOULD CONSIDER: -WHAT ARE THE INSTRUCTIONAL OBJECTIVES? -WERE THE STUDENTS TAUGHT IN ACCORDANCE WITH THESE OBJECTIVES? -DOES THE TEST YIELD SCORES THAT REFLECT ACHIEVEMENT OF THE OBJECTIVES?
13
GRADING ISSUES IS IT A MAJOR OBJECTIVE OF THE PHYSICAL EDUCATION PROGRAM? DO ALL STUDENTS HAVE IDENTICAL OPPORTUNITIES TO DEMONSTRATE THEIR ABILITY RELATIVE TO THE ATTRIBUTE? CAN THE ATTRIBUTE BE MEASURED SO THAT THE TEST SCORES ARE RELIABLE AND THE INTERPRETATIONS OF THE SCORES VALID? WERE THE GRADING POLICIES EXPLAINED AT THE BEGINNING OF THE PROGRAM? WERE THE GRADES BASED ON A SUFFICIENT AMOUNT OF VALID EVIDENCE? WHAT SHOULD THE RANGE IN GRADING BE? SHOULD THE RANGE IN GRADING BE THE SAME FOR A BEGINNING COURSE COMPARED TO AN ADVANCED COURSE? SHOULD THE OVERALL QUALITY OF THE CLASS AFFECT THE GRADING DISTRIBUTION? DOES THE GRADING REPRESENT ONLY ACHIEVEMENT OR ACHIEVEMENT AND STUDENT EFFORT AS WELL? IF PASS-FAIL GRADES ARE ASSIGNED WILL ANYONE FAIL?
14
GENERALLY ACCEPTED GRADING PHILOSOPHY GRADE A STUDENT RECEIVES SHOULD NOT DEPEND ON -THE SEMESTER OR YEAR IN WHICH THE CLASS IS TAKEN -THE INSTRUCTOR, PARTICULARLY IF SEVERAL INSTRUCTORS TEACH THE COURSE -OTHER STUDENTS IN THE COURSE
15
GRADING METHODS NATURAL BREAKS TEACHER’S STANDARD RANK ORDER NORMS
16
GRADING METHODS: NATURAL BREAKS SCORES ARE LISTED FROM BEST TO WORST EACH BREAK OR GAP IS A CUT-OFF POINT FOR A LETTER GRADE USEFUL METHOD FOR TEACHERS WHO DO NOT BELIEVE IN SPECIFYING THE POSSIBLE GRADES AND PERCENTAGES FOR THESE GRADES POOREST METHOD OF ASSIGNING GRADES NO SEMESTER-TO-SEMESTER CONSISTENCY EACH STUDENT’S GRADE IS DEPENDENT ON THE PERFORMANCE OF OTHER STUDENTS IN THE CLASS
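The natural-breaks idea can be sketched in a few lines of Python. This is only an illustration, assuming that a "break" means one of the largest gaps between adjacent scores once they are listed from best to worst; the function name `natural_break_bands` and the sample scores are hypothetical and do not come from the slides.

```python
def natural_break_bands(scores, n_grades):
    """Split scores into n_grades bands at the largest gaps (best band first)."""
    ordered = sorted(scores, reverse=True)  # best to worst, as on the slide
    # Gap between each score and the next-lower one, paired with the
    # position of the lower score (where a new band would start).
    gaps = [(ordered[i] - ordered[i + 1], i + 1) for i in range(len(ordered) - 1)]
    # Assumption: the n_grades - 1 largest gaps are used as the cut-offs.
    largest = sorted(gaps, reverse=True)[: n_grades - 1]
    break_positions = sorted(position for _, position in largest)

    bands, start = [], 0
    for position in break_positions + [len(ordered)]:
        bands.append(ordered[start:position])
        start = position
    return bands


# Hypothetical class scores.
scores = [95, 94, 92, 85, 84, 83, 71, 70, 58]
letters = ["A", "B", "C", "D"]
grades = dict(zip(letters, natural_break_bands(scores, len(letters))))
print(grades)
```

Because the cut-offs depend entirely on where the gaps fall in one class, a different class produces different cut-offs, which is the inconsistency the slide warns about.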
17
GRADING METHODS: NATURAL BREAKS
18
GRADING METHODS: TEACHER’S STANDARD GRADES ARE BASED ON THE TEACHER’S PERCEPTION OF WHAT IS FAIR AND APPROPRIATE, SOMETIMES WITHOUT ANALYZING ANY DATA EX.: 90-100 A, 80-89 B, ETC CONSISTENT STANDARDS FROM YEAR TO YEAR ARE POSSIBLE STUDENT’S PERFORMANCE IS NOT DEPENDENT ON THE PERFORMANCE OF OTHER STUDENTS GOOD METHOD FOR EXPERIENCED TEACHERS WHO HAVE REASONABLE STANDARDS OR EXPECTATIONS OF STUDENTS’ ABILITIES NORM-REFERENCED STANDARDS DEVELOPED USING THE CRITERION-REFERENCED STANDARDS SET BY THE TEACHER
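A fixed teacher's standard amounts to a simple lookup. The sketch below takes the 90-100 = A and 80-89 = B bands from the slide; the C and D bands and the F floor are assumptions filled in for illustration, since the slide only says "ETC". The function name is hypothetical.

```python
# (minimum percent, letter). Only the A and B bands come from the slide;
# the C and D bands are assumed to continue in steps of ten.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def teacher_standard_grade(percent):
    """Return the letter grade for a percent score under fixed cut-offs."""
    for minimum, letter in CUTOFFS:
        if percent >= minimum:
            return letter
    return "F"  # assumed floor for anything below the lowest band


for percent in (93, 89, 59):
    print(percent, teacher_standard_grade(percent))
```

The grade depends only on the student's own score and the fixed table, so the same standard can be reused from year to year.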
19
GRADING METHODS: RANK ORDER STRAIGHTFORWARD, NORM-REFERENCED METHOD OF GRADING TEACHER DECIDES WHICH LETTER GRADES WILL BE ASSIGNED AND WHAT PERCENTAGE OF THE CLASS SHOULD RECEIVE EACH LETTER GRADE SCORES ARE ORDERED AND GRADES ARE ASSIGNED ADVANTAGES INCLUDE THAT IT IS QUICK AND EASY TO USE AND ALLOWS GRADES TO BE DISTRIBUTED AS WANTED DISADVANTAGES INCLUDE THAT A STUDENT’S GRADE IS DEPENDENT ON THE GRADES OF OTHER STUDENTS AND THAT NO ALLOWANCE IS MADE FOR THE QUALITY OF THE CLASS WHICH RESULTS IN GRADES VARYING FROM SEMESTER TO SEMESTER
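The rank-order procedure can be sketched as follows. The percentage split (10% A, 20% B, 40% C, 20% D, 10% F), the student labels, and the scores are all hypothetical; the slide leaves the distribution to the teacher. Ties are not handled here, so two students with the same score could land on opposite sides of a cut-off.

```python
def rank_order_grades(scores, distribution):
    """Assign letters by rank.

    scores: dict of student -> score (higher is better).
    distribution: list of (letter, percent of class), percents summing to 100.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # best first
    n = len(ranked)
    grades, start, cumulative = {}, 0, 0
    for letter, percent in distribution:
        cumulative += percent
        end = round(cumulative * n / 100)  # last rank that gets this letter
        for student in ranked[start:end]:
            grades[student] = letter
        start = end
    return grades


# Hypothetical class of ten and a hypothetical distribution.
scores = {"S01": 98, "S02": 95, "S03": 90, "S04": 88, "S05": 85,
          "S06": 80, "S07": 77, "S08": 70, "S09": 65, "S10": 50}
distribution = [("A", 10), ("B", 20), ("C", 40), ("D", 20), ("F", 10)]
grades = rank_order_grades(scores, distribution)
print(grades)
```

Note that S10 receives an F purely because of rank: the same score of 50 in a weaker class could earn a higher letter, which is the semester-to-semester variation listed among the disadvantages.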
20
GRADING METHODS: RANK ORDER
21
GRADING METHODS: NORMS NORMS BASED ON ANALYSIS OF THE DATA, NOT ON SUBJECTIVE STANDARDS CHOSEN BY THE TEACHER DEVELOPED BY GATHERING SCORES FOR A LARGE NUMBER OF INDIVIDUALS WITH SIMILAR DEMOGRAPHICS DATA ARE STATISTICALLY ANALYZED AND PERFORMANCE STANDARDS ARE THEN CONSTRUCTED BASED ON THE ANALYSIS ADVANTAGES INCLUDE -THE STUDENT’S GRADE IS NOT BASED ON THE PERFORMANCE OF THE GROUP OR CLASS BEING EVALUATED -THE NORMS CAN BE USED FOR SEVERAL YEARS (THEREBY PROVIDING CONSISTENCY FROM SEMESTER TO SEMESTER) BEFORE THEY NEED TO BE RE-EVALUATED AND PERHAPS REVISED HOWEVER, THE TEACHER STILL NEEDS TO DECIDE HOW LETTER GRADES WILL BE ASSIGNED TO THE NORMS
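One common way to build norms is as percentile ranks against a reference group; the slides do not say which statistic is used, so percentile ranks are an assumption here. In the sketch below the norm group, the percentile cut-offs for each letter, and the function names are all hypothetical, and a real norm group would be far larger and drawn from similar individuals.

```python
from bisect import bisect_left


def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring strictly below `score`."""
    ordered = sorted(norm_scores)
    return 100 * bisect_left(ordered, score) / len(ordered)


def grade_from_norms(score, norm_scores, cutoffs):
    """Map a score to a letter using teacher-chosen percentile cut-offs."""
    rank = percentile_rank(score, norm_scores)
    for minimum_percentile, letter in cutoffs:
        if rank >= minimum_percentile:
            return letter
    return "F"


# Hypothetical norm group: scores 1..100 gathered in earlier years.
norm_group = list(range(1, 101))
# Hypothetical decision on how letters attach to the norms.
cutoffs = [(90, "A"), (70, "B"), (30, "C"), (10, "D")]

for score in (91, 50, 5):
    print(score, percentile_rank(score, norm_group), grade_from_norms(score, norm_group, cutoffs))
```

The current class never enters the calculation, which is why the grade does not depend on classmates; the `cutoffs` table is the decision the teacher still has to make.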
22
GRADING METHODS: NORMS HOW WOULD YOU ASSIGN A LETTER GRADE TO THESE NORMS?
23
FINAL GRADES ASSIGNMENT OF A FINAL GRADE OR FINAL CLASSIFICATION (FITNESS OR REHAB) MUST BE BASED ON ALL AVAILABLE INFORMATION TEACHER SHOULD CHOOSE AND EXPLAIN THE FINAL GRADING SYSTEM AT THE BEGINNING OF A PROGRAM THREE METHODS OF ASSIGNING FINAL GRADES -SUM OF LETTER GRADES -POINT SYSTEM -SUM OF THE T-SCORES
24
SUM OF THE LETTER GRADES USED WHEN TEST SCORES REFLECT DIFFERENT UNITS OF MEASURE THAT CANNOT BE SUMMED SCORES ON TESTS ARE CONVERTED TO LETTER GRADES LETTER GRADES ON EACH TEST ARE CONVERTED TO POINTS (A+ = 14, A = 13, A- = 12, B+ = 11, ETC. DOWN TO F = 1 AND F- = 0) POINTS ON ALL TESTS ARE ADDED TOGETHER AND DIVIDED BY THE NUMBER OF TESTS TO GET AN AVERAGE SCORE (POINT VALUE), WHICH IS CONVERTED BACK INTO A LETTER GRADE USING THE 14-POINT SCALE ABOVE
25
SUM OF THE LETTER GRADES WHEN TESTS ARE EQUALLY WEIGHTED USING TABLE 5.5 AS AN EXAMPLE THAT HAD 5 TESTS SUM = 45 / 5 TESTS = 9 AVERAGE SCORE (POINT VALUE) OF 9 = B-
26
SUM OF THE LETTER GRADES WHEN TESTS ARE EQUALLY WEIGHTED USING TABLE 5.6 AS AN EXAMPLE THAT HAD 5 TESTS SUM = 59 / 5 TESTS = 11.8 AVERAGE SCORE (POINT VALUE) OF 11.8 = B+ AS 12 IS NEEDED FOR AN “A-” DOES THIS SEEM FAIR LOOKING AT THE TEST SCORES?
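The equally weighted calculation can be checked with a short Python sketch. The scale runs from A+ = 14 down to F- = 0 as stated on the slides; F+ = 2 is inferred from the "ETC." and is an assumption. Tables 5.5 and 5.6 are not reproduced in this transcript, so the individual letter grades below are hypothetical and were chosen only so that they sum to 45 and 59. The average is truncated rather than rounded, because the slide treats 11.8 as a B+.

```python
import math

# Index in this list is the point value: F- = 0 ... A+ = 14.
# F+ = 2 is an inferred value; the slides only give the endpoints.
LETTERS = ["F-", "F", "F+", "D-", "D", "D+", "C-", "C", "C+",
           "B-", "B", "B+", "A-", "A", "A+"]
POINTS = {letter: value for value, letter in enumerate(LETTERS)}


def final_letter(test_letters):
    """Average the point values and convert back, truncating downward."""
    average = sum(POINTS[g] for g in test_letters) / len(test_letters)
    return average, LETTERS[math.floor(average)]


# Hypothetical grades that sum to 45 points (the Table 5.5 total).
print(final_letter(["B+", "B-", "C+", "B", "C"]))
# Hypothetical grades that sum to 59 points (the Table 5.6 total).
print(final_letter(["A", "A", "A-", "B+", "B"]))
```

The second case shows the fairness question raised on the slide: three of five grades are in the A range, yet truncation at 11.8 yields a B+.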
27
DRAWBACKS OF THE SUM OF THE LETTER GRADES METHOD LOSE INFORMATION BY CONVERTING TEST SCORES TO POINT VALUES 96% OR 93% ARE BOTH AN “A” OR 13 POINTS WASTE OF TIME TO CALCULATE THE MEAN NO ALLOWANCE IS MADE IN THE FINAL GRADE FOR THE REGRESSION EFFECT AND THUS VERY FEW HIGH OR LOW GRADES ARE GIVEN, MOST GRADES ARE IN THE MIDDLE OF THE RANGE REGRESSION EFFECT: A STUDENT WHO EARNS AN “A” OR AN “F” ON ONE TEST IS MORE LIKELY ON THE NEXT TO EARN A GRADE CLOSER TO “C” THAN TO REPEAT THE FIRST PERFORMANCE
28
SUM OF THE LETTER GRADES WHEN TESTS ARE UNEQUALLY WEIGHTED
30
POINT SYSTEMS OFTEN USED BY CLASSROOM TEACHERS SO THAT ALL TEST SCORES ARE IN THE SAME UNIT OF MEASURE AND CAN BE EASILY COMBINED
31
SUM OF THE T-SCORES CHANGE TEST SCORES TO T-SCORES AND SUM THE T-SCORES AS PREVIOUSLY DISCUSSED POSSIBLE TO WEIGHT EACH TEST DIFFERENTLY IN SUMMING THE T-SCORES BY USING THE PROCEDURES JUST OUTLINED FOR WEIGHTING LETTER-GRADE POINTS
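A T-score rescales each test to a mean of 50 and a standard deviation of 10, using T = 50 + 10(x - mean) / SD, so tests in different units can be added. The sketch below makes several assumptions: the population standard deviation is used (the slides do not say which form applies), weighting is done by multiplying each test's T-score by its weight before summing (the weighting slide is not in this transcript), and the test names, scores, and weights are hypothetical. For timed events where a lower score is better, the sign of the deviation would need to be reversed, which is not handled here.

```python
from statistics import mean, pstdev


def t_scores(raw_scores):
    """Convert raw scores on one test to T-scores (mean 50, SD 10)."""
    m, sd = mean(raw_scores), pstdev(raw_scores)  # sd must not be zero
    return [50 + 10 * (x - m) / sd for x in raw_scores]


def weighted_t_sums(tests, weights):
    """Sum weighted T-scores per student.

    tests: list of score lists, one per test, students in the same order.
    weights: one multiplier per test (assumed weighting scheme).
    """
    per_test = [t_scores(scores) for scores in tests]
    n_students = len(tests[0])
    return [sum(w * t[i] for w, t in zip(weights, per_test))
            for i in range(n_students)]


# Hypothetical data: three students, two tests in different units.
sit_ups = [30, 40, 50]       # repetitions
jump_cm = [150, 170, 190]    # centimetres
sums = weighted_t_sums([sit_ups, jump_cm], weights=[1, 2])
print([round(s, 2) for s in sums])
```

The summed values still have to be turned into final grades with one of the grading methods described earlier.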
32
OTHER EVALUATION TECHNIQUES BEST 5 PEOPLE RECEIVE A SCHOLARSHIP, JOB, PROMOTION, ETC –RANK-ORDER SITUATION WHEN THE 5 BEST PEOPLE ARE REWARDED (PASS) AND REST GET NOTHING (FAIL) NUMBER OF PEOPLE AWARDED OR RECOGNIZED IS NOT LIMITED –CRITERION-REFERENCED SITUATION THAT IDEALLY NEEDS A GOLD STANDARD OR A STANDARD ESTABLISHED BY EXPERT(S) PHYSICAL THERAPIST OR ATHLETIC TRAINER SETS A STANDARD FOR RELEASING PEOPLE FROM THERAPY PROGRAM -CRITERION-REFERENCED STANDARD WHERE STANDARD SHOULD BE BASED ON MINIMUM STRENGTH OR ABILITY NEEDED TO FUNCTION IN DAILY LIFE
33
AUTHENTIC ASSESSMENT “AN ATTEMPT TO EVALUATE PEOPLE IN A REAL-LIFE OR MORE “AUTHENTIC” SETTING”
34
CHARACTERISTICS OF AUTHENTIC ASSESSMENT AUTHENTIC ASSESSMENTS PRESENT CHALLENGES THAT ARE REPRESENTATIVE OF REAL LIFE AUTHENTIC ASSESSMENTS REQUIRE STUDENTS TO DEMONSTRATE HIGHER-LEVEL THINKING STUDENTS KNOW THE STANDARDS FOR ASSESSMENT FROM THE BEGINNING ALLOWING THEM TO CONSTANTLY RECEIVE FEEDBACK ABOUT THEIR PROGRESS AUTHENTIC ASSESSMENTS BECOME PART OF THE CURRICULUM RESULTING IN TEACHERS TEACHING TO THE TEST STUDENTS OFTEN PRESENT THE CULMINATION OF THE AUTHENTIC ASSESSMENT PUBLICLY THERE IS AN EMPHASIS ON PROCESS (HOW STUDENTS ARRIVE AT THE CORRECT ANSWER) AND NOT JUST PRODUCT (CORRECT ANSWER)
35
TYPES OF AUTHENTIC ASSESSMENT STUDENT PROJECTS STUDENT LOGS STUDENT JOURNALS PEER OBSERVATION SELF-ASSESSMENT GROUP PROJECTS PORTFOLIOS EVENT TASKS TEACHER OBSERVATION
36
RUBRICS OFTEN USED IN AUTHENTIC ASSESSMENT PERSON’S PERFORMANCE IS COMPARED TO CRITERIA SPECIFIED IN THE RUBRIC USING A SCALE THAT RANGES FROM 3 (OUTSTANDING, ACCEPTABLE, AND DEFICIENT) TO 5 (EXCELLENT, GOOD, SATISFACTORY, FAIR, AND POOR) LEVELS WHEN DESIGNING THE RUBRIC: –DECIDE WHICH ERRORS WOULD BE MOST JUSTIFIABLE FOR DISCRIMINATING BETWEEN ABILITY LEVELS –BE AS SPECIFIC AS POSSIBLE WHEN DESIGNING RUBRICS AS THIS WILL INCREASE OBJECTIVITY
39
CONCERNS WITH AUTHENTIC ASSESSMENT QUALITY (VALIDITY, RELIABILITY, AND OBJECTIVITY) OF AUTHENTIC ASSESSMENT HOW WELL DOES THE AUTHENTIC ASSESSMENT TEST RELATE TO OTHER MEASURES (CRITERION-RELATED VALIDITY) - ONE MEASURE OF VOLLEYBALL SKILL SHOULD BE RELATED TO OTHER MEASURES OF VOLLEYBALL SKILL ABILITY OF THE ASSESSMENT TO PREDICT FUTURE PERFORMANCE (PREDICTIVE VALIDITY) - CAN AUTHENTIC ASSESSMENT OF CURRENT FITNESS PREDICT FUTURE FITNESS BEHAVIOR? DOES THE AUTHENTIC ASSESSMENT COVER ALL AREAS OF THE ACTIVITY (CONTENT VALIDITY) - IS THE AUTHENTIC ASSESSMENT OF SOME SOFTBALL SKILLS REFLECTIVE OF ALL THE COMPONENTS OF SOFTBALL? DETAILED RUBRIC AND PRACTICE SCORING WITH THE RUBRIC CAN ENHANCE THE RELIABILITY AND OBJECTIVITY OF AUTHENTIC ASSESSMENT
40
CHARACTERISTICS OF GOOD AUTHENTIC ASSESSMENT MEANINGFUL FOR BOTH TEACHERS AND STUDENTS SERVES AS MOTIVATION FOR PERFORMANCE EVALUATES ATTRIBUTES THAT ARE IMPORTANT TO BOTH TEACHERS AND STUDENTS REQUIRES DEMONSTRATION OF COMPLEX COGNITION EXEMPLIFIES CURRENT STANDARDS OF CONTENT QUALITY MINIMIZES THE EFFECTS OF IRRELEVANT SKILLS POSSESSES EXPLICIT STANDARDS FOR RATING OR JUDGMENT
41
PROGRAM EVALUATION SUCCESS OF A PROGRAM DEPENDS LESS ON ITS PHYSICAL CHARACTERISTICS (E.G., FACILITIES AND EQUIPMENT) AND MORE ON THE MANNER IN WHICH THEY ARE USED IN THE INSTRUCTIONAL OR PROGRAM PROCESS ARE STUDENTS ACHIEVING IMPORTANT INSTRUCTIONAL OBJECTIVES? ARE PARTICIPANTS BENEFITING FROM THE PROGRAM? ARE PROGRAM OBJECTIVES BEING MET? BOTH FORMATIVE AND SUMMATIVE EVALUATION ARE REQUIRED FOR PROGRAM EVALUATION REQUIRES PLANNED DATA COLLECTION FROM TESTING AND/OR GOOD DAILY RECORD KEEPING
42
PROGRAM EVALUATION FORMATIVE EVALUATION IS THE PROCESS OF JUDGING PERFORMANCE WITH REFERENCE TO AN ESTABLISHED STANDARD (CRITERION) FORMATIVE EVALUATION REQUIRES SELECTION OF WELL-DEFINED PROGRAM OBJECTIVES AND ESTABLISHMENT OF REALISTIC STANDARDS VALUE OF FORMATIVE EVALUATION IS THAT IF IT SIGNALS THAT SOMETHING IS WRONG, ACTION CAN STILL BE TAKEN TO ADJUST AND IMPROVE THE PROGRAM
43
PROGRAM EVALUATION SUCCESS OF A PROGRAM IS REFLECTED IN TERMS OF HOW WELL A PROGRAM ACHIEVES ITS BROAD, OVERALL OBJECTIVES –SCHOOL PERFORMANCES ARE OFTEN COMPARED TO NATIONAL, STATEWIDE, OR LOCAL NORMS –IN FITNESS PROGRAMS PARTICIPANT PERFORMANCE IS OFTEN COMPARED TO NATIONAL OR LOCAL STANDARDS OR PERHAPS TO LONG-TERM EXERCISE ADHERENCE PATTERNS
44
PROGRAM IMPROVEMENT EVALUATION IS A DYNAMIC DECISION-MAKING PROCESS THAT WORKS TOWARD PROGRAM IMPROVEMENT FORMATIVE EVALUATION LEADS TO HIGHER-LEVEL ACHIEVEMENT OF OBJECTIVES EVALUATED SUMMATIVELY PRIMARY OBJECTIVE OF PROGRAM DEVELOPERS SHOULD BE IMPROVED PARTICIPANT PERFORMANCE OVER TIME
45
COMMENTS OR QUESTIONS?? THANK YOU, THANK YOU VERY MUCH!!