Michigan Assessment Consortium Common Assessment Development Series Module 14 – Presenting the Results of an Assessment.

Presentation transcript:

Michigan Assessment Consortium Common Assessment Development Series Module 14 – Presenting the Results of an Assessment

Developed by Bruce R. Fay, PhD, & Ellen Vorenkamp, EdD, Assessment Consultants, Wayne RESA

Support: The Michigan Assessment Consortium professional development series in common assessment development is funded in part by the Michigan Association of Intermediate School Administrators in cooperation with …

In Module 14 you will learn about: Score types… Standards-based reports… Graphical representations…

So, you've… Developed a test (for use as a common assessment); Pilot / field-tested it (right?); Looked at the field test results (of course). Now what?

Presenting Your Results Before you present the results of your test, you need to be clear about: Who the audience is; Why they are seeing this data (What?); Why they should care about it (So what?); What you want them to do as a result of seeing it (Now what?)

SCORE TYPES

A score by any other name Many score types that you may have heard of are really only appropriate for Norm-Referenced Tests (NRTs), such as percentile rank, stanine, and grade level equivalent. Your common assessment is a Criterion-Referenced Test (CRT), so let's focus on score types that are appropriate for that.

Raw Scores Number of items correct, or number of points earned. Q? What's the difference? A! None, if each item has the same point value; otherwise…

Scaled Score (equal weight) If each test item has the same weight, say 1 point (1 if correct, 0 if wrong), then % correct is: the simplest scaled score you can create; the same as % of points earned; it puts the raw score on a scale of 0 – 100.

Scaled Score (unequal weight) If each test item does not have the same number of points (there are weighted and/or partial-credit items on the test), then % correct becomes % of total possible points earned. You still end up with a 0 – 100 scale.
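In either case the arithmetic is the same: divide the points earned by the points possible and multiply by 100. A minimal Python sketch (the items and responses are invented for illustration, not the module's data):

```python
# A minimal sketch of the 0-100 scaled score described above:
# percent of total possible points earned. Items and responses are hypothetical.

def scaled_score(points_earned, points_possible):
    """Return the percent of total possible points earned (0-100 scale)."""
    return 100.0 * sum(points_earned) / sum(points_possible)

# Equal-weight case: five 1-point items, four answered correctly -> 80.0
print(scaled_score([1, 1, 0, 1, 1], [1, 1, 1, 1, 1]))

# Unequal-weight case: full credit on a 2-point item, partial credit (1 of 3)
# on a 3-point item, one 1-point item missed -> 62.5
print(scaled_score([1, 1, 0, 2, 1], [1, 1, 1, 2, 3]))
```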

% Correct – Features and Issues. Features: a common scale, as in widely used; a common scale, as in the same regardless of raw score points; intuitively interpretable (maybe); permits comparisons between different tests. Issues: can/will be misinterpreted; can make a 10-point test and a 100-point test appear equally important; widely held belief that scores in certain ranges (60-70, 70-80, etc.) have some inherent meaning.

Interpretation of % Correct Q? Is 50% correct good or bad? A! We don't know yet; we don't discuss standard setting until the next module (15). But most people think it is intuitively obvious that this is a bad score.

Other ways to scale? Yes, but we don't really need them…

STANDARDS-BASED REPORTS

Two kinds of standards Content Standards: the definition of the content to be learned; what students are to know and be able to do. Performance Standards: the definition of how good is good enough on a test to determine if, or the extent to which, students know and can do.

Reporting by Content Standards This is our concern in this module. The next module (15) deals with performance standards.

Let's consider… A test covering 5 GLCEs with 5 selected-response items per GLCE, with each item worth 1 point (25 points total). Q? What does a raw score of 20 (a % correct scaled score of 80%) mean? A! It depends…

Depends on What? Student A: GLCE 1: 4/5; GLCE 2: 4/5; GLCE 3: 4/5; GLCE 4: 4/5; GLCE 5: 4/5. Student B: GLCE 1: 5/5; GLCE 2: 5/5; GLCE 3: 5/5; GLCE 4: 3/5; GLCE 5: 2/5. Same or different?

How about these two? Student C: GLCE 1: 5/5; GLCE 2: 5/5; GLCE 3: 4/5; GLCE 4: 3/5; GLCE 5: 3/5. Student D: GLCE 1: 5/5; GLCE 2: 5/5; GLCE 3: 5/5; GLCE 4: 5/5; GLCE 5: 0/5. These four examples all have a raw score of 20 (80% correct) but represent four different performances by the students.

Another way to see it:

GLCE     A #   A %    B #   B %    C #   C %    D #   D %
1         4    80      5   100      5   100      5   100
2         4    80      5   100      5   100      5   100
3         4    80      5   100      4    80      5   100
4         4    80      3    60      3    60      5   100
5         4    80      2    40      3    60      0     0
Total    20    80     20    80     20    80     20    80
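A short Python sketch (using the four students above) makes the same point: identical raw scores, different per-GLCE patterns.

```python
# A minimal sketch reproducing the per-GLCE breakdown above for students A-D.
# Each GLCE is worth 5 points; every student's raw score works out to 20.

points_by_glce = {
    "A": [4, 4, 4, 4, 4],
    "B": [5, 5, 5, 3, 2],
    "C": [5, 5, 4, 3, 3],
    "D": [5, 5, 5, 5, 0],
}

for student, points in points_by_glce.items():
    raw = sum(points)                                  # 20 for all four students
    pct_by_glce = [100 * p // 5 for p in points]       # percent correct per GLCE
    print(f"Student {student}: raw={raw}, % by GLCE={pct_by_glce}")
```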

Scores by Standard Remember, we haven't set performance standards yet, so we really can't say what these scores mean. Even so, 5 out of 5 may suggest that a student knows the material and 0 out of 5 may suggest that they don't (depends on item-GLCE match). However… even though this is a CRT, you can't make instructional decisions without the context of the overall pattern of scores.

Say what? There will often be extreme scores (outliers) that are not representative of most of the scores in a set. Q? What if most of the students scored a 0 or a 1 on GLCE 5 in the example? A! Maybe a picture would help…

GRAPHICAL REPRESENTATIONS Or, I can see clearly now

Guidelines for Good Graphs Title & subtitles; Data source and time frame; Axis labels; Legend; Viewable colors; Readability (3-D doesn't make it better).

Appropriate Type Bar graphs; Line graphs; Scatterplots; Stem & leaf; Pie charts (evil).
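To make the guidelines concrete, here is a minimal matplotlib sketch of a grouped bar graph with a title, axis labels, and a legend; the counts are hypothetical stand-ins, not the module's actual data.

```python
# A minimal sketch of a bar graph that follows the guidelines above:
# title, axis labels, legend, readable 2-D bars. Counts are hypothetical.
import matplotlib.pyplot as plt

score_points = [0, 1, 2, 3, 4, 5]
glce1_counts = [2, 3, 5, 7, 5, 3]   # hypothetical: students at each score point
glce2_counts = [6, 5, 4, 4, 3, 3]   # hypothetical second GLCE for comparison

fig, ax = plt.subplots()
width = 0.4
ax.bar([s - width / 2 for s in score_points], glce1_counts, width, label="GLCE 1")
ax.bar([s + width / 2 for s in score_points], glce2_counts, width, label="GLCE 2")
ax.set_title("Students at Each Score Point by GLCE (n = 25, Spring common assessment)")
ax.set_xlabel("Score points earned (of 5)")
ax.set_ylabel("Number of students")
ax.set_xticks(score_points)
ax.legend()
plt.show()
```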

Results for 25 students (# scoring at each score point for each GLCE)

The Data Here's how the spreadsheet is set up: one column per GLCE (GLCE 1 through GLCE 5), with the number of students at each score point. Note: this will be replaced with a table so it looks better.

Let's Assume… We have established that 3 out of 5 on each standard is an acceptable standard of evidence that a student understands the GLCE in question (hey, these were hard items). Then students who score a 3, 4, or 5 on the cluster of items for a GLCE can be considered proficient, while students with a 2, 1, or 0 are not.
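A minimal Python sketch of that rule, applied to a hypothetical set of 25 GLCE 5 scores (not the module's data):

```python
# Apply the assumed cut score: a student is proficient on a GLCE when they earn
# at least 3 of the 5 points for that GLCE's item cluster. Scores are hypothetical.

CUT_SCORE = 3

# Hypothetical points earned on GLCE 5 by a class of 25 students.
glce5_points = [5, 4, 4, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 0,
                0, 0, 0, 0, 0, 0, 0, 1, 2, 3]

n_prof = sum(1 for p in glce5_points if p >= CUT_SCORE)
n_total = len(glce5_points)
print(f"{n_prof} of {n_total} proficient ({100 * n_prof / n_total:.0f}%)")  # 8 of 25 (32%)
```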

Proficiency by Standard (for 25 Students) Table columns: GLCE, # Not Prof, % Not Prof, # Prof, % Prof. This is what the previous data looks like in table form. Would a picture help?

Proficiency by Standard (for 25 Students)

Here's the data – columns: GLCE, NP (not proficient), Prof (proficient). Note: this will be replaced with a table so it looks better.

Repeated Measures If you test the same content on more than one occasion, you can look at your test results over time. As an example, let's look at test results for our class of 25 students on a pre-test, two intermediate tests, and a post-test covering the same five GLCEs. We will look only at GLCE 1, with 5 points possible each time.

The Data – Results for 25 students on GLCE 1 on 4 test administrations, by score point. Columns: Score Points, Pre-Test, Test 1, Test 2, Post-Test. (This is a somewhat idealized example, so interpret it with caution!)

And here's the picture – Results for 25 students on 4 tests by score point

The Excel spreadsheet – columns: Score, Pre-test, Test 1, Test 2, Post-test. Note: this will be replaced with a table for better viewing.
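The same picture can also be drawn in code. A minimal matplotlib sketch, with hypothetical counts standing in for the spreadsheet values (each administration sums to 25 students):

```python
# A minimal sketch of the repeated-measures line graph described above: for
# GLCE 1, the number of students at each score point on each of the four
# administrations. Counts are hypothetical; each column sums to 25.
import matplotlib.pyplot as plt

administrations = ["Pre-Test", "Test 1", "Test 2", "Post-Test"]
counts_by_score = {
    0: [10, 6, 3, 1],
    1: [7, 6, 4, 2],
    2: [4, 5, 5, 3],
    3: [2, 4, 5, 6],
    4: [1, 3, 5, 7],
    5: [1, 1, 3, 6],
}

fig, ax = plt.subplots()
for score, counts in counts_by_score.items():
    ax.plot(administrations, counts, marker="o", label=f"Score {score}")
ax.set_title("GLCE 1: Students at Each Score Point Across 4 Administrations")
ax.set_xlabel("Test administration")
ax.set_ylabel("Number of students")
ax.legend()
plt.show()
```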

Conclusions Audience; Purpose; Technical considerations; What? So what? Now what?

Next Module