1
Presentation 3: Developing Quality Assessment Items: Principles of Reliability and Validity
Professor Jim Tognolini
2
Introduction to Modern Assessment Theory: A basis for all assessments
During this session we will:
- define reliability;
- define measurement error;
- examine the sources of measurement error;
- define validity and identify threats to validity;
- build assessment frameworks;
- operationalise frameworks with Tables of Specification.
3
Reliability
The reliability of results is the extent to which the results are consistent or error-free. The concept of reliability is closely associated with the idea of consistency. Reliability is not an all-or-nothing concept; there can be degrees of reliability.
- How similar are results if students are assessed at different times?
- How similar are results if students are assessed with a different sample of equivalent tasks?
- How similar are results if essays have been marked by different markers?
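The first question above (consistency across occasions) is often examined by correlating the two sets of results. The following is a minimal sketch of that idea, not a method taken from the presentation: the function name `pearson_r` and the marks for five students are hypothetical, and the Pearson correlation is only one of several possible consistency indices.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares around each mean
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical marks for the same five students on two occasions
occasion_1 = [12, 15, 9, 18, 14]
occasion_2 = [13, 14, 10, 17, 15]

r = pearson_r(occasion_1, occasion_2)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests that students keep roughly the same rank order across occasions; the same calculation could be applied to marks awarded by two different markers to the same essays.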
4
Measurement Error
5
Sources of Measurement Error
The following are some of the sources of measurement error:
- Test-taking skills
- Comprehension of instructions
- Sampling variance of items
- Temporary factors such as health, fatigue, motivation and testing conditions
- Memory fluctuations
- Marking bias (especially in essays)
- Guessing
- Item types
The aim for test developers is to identify sources of measurement error and minimise their impact.
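One common way to make the idea of measurement error concrete is the classical test theory decomposition of a score. The slides do not state a formula, so the notation below is an assumption added for illustration: X is the observed score, T the true score and E the error, with T and E assumed to be uncorrelated.

```latex
X = T + E, \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under this reading, each source listed above contributes to the error variance, and minimising its impact raises the proportion of observed-score variance that reflects true differences between students.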
6
CSSA HSC Trial Examinations 2013 Convenors Forum
Validity (Jim Tognolini, September 2012)
7
Validity
The validity of the results of a test can best be defined as the extent to which the results measure what they purport to measure.
- It is the interpretation (including inferences and decisions) that is validated, not the test or the test score.
- Messick (1989) also argued that validation can include the evaluation of the consequences of the test: are the specific benefits likely to be realised?
- In 1999 the Standards (AERA, APA and NCME) suggested that validation can be viewed as developing scientifically sound validity arguments to support the intended interpretation of test scores and their relevance to the proposed use.
8
Threats to Validity
Factors in the test itself:
- Unclear directions (e.g. how to respond to guessing; recording answers)
- Reading vocabulary and sentence structure too difficult
- Inappropriate level of difficulty of test items (e.g. guessing)
- Poorly constructed test items
- Ambiguity
- Test items (tasks) inappropriate for the content being assessed
- Test too short
- Improper arrangement of items
- Identifiable pattern of answers
Factors in test administration and scoring:
- Insufficient time
- Cheating
- Unreliable scoring
9
Relationship between Validity and Reliability
Reliability is a necessary but insufficient condition for validity.
10
Some basic assessment theory
- Validity and reliability are not deterministic: aim to maximise both.
- Validity is paramount.
- Ways to minimise threats to validity and reliability:
  - Breadth of material sampled (increases validity)
  - Guessing
  - Quality of items
11
Assessment frameworks
12
Preliminary Questions
Why are we assessing? What are we assessing? What is the most appropriate way to assess these outcomes?
13
Definition of construct
Assessment framework
- Definition of construct
  - Domains/strands
    - Sub-domains/sub-strands
      - Outcomes/content standards
        - …
14
Example 1 - Mathematics
Construct: Mathematics
Domains/strands: Number; Measurement; Space; Chance
Sub-domains/sub-strands (under Number): Addition & subtraction; Multiplication and Division; Fractions and Decimals
Outcomes/content standards: …
15
Example 1 - Mathematics
Domain/strand: Number

Sub-domain/sub-strand: Addition & subtraction
Outcome/content standard: Students develop facility with number facts and computation with larger numbers in addition and subtraction and an appreciation of the relationship between those facts
Progress levels:
- Early Stage 1: Combines, separates and compares collections of objects, describes using everyday language and records using informal methods
- Stage 1: Uses concrete materials and mental strategies for addition and subtraction involving one- and two-digit numbers

Sub-domain/sub-strand: Multiplication and Division
Outcome/content standard: Students develop facility with number facts and computation with larger numbers in multiplication and division and an appreciation of the relationship between those facts
Progress levels:
- Early Stage 1: Groups and shares collections of objects, describes using everyday language and records using informal methods
- Stage 1: Models and uses strategies for multiplication and division
16
Example 1 - Mathematics
Domain/strand: Number

Sub-domain/sub-strand: Addition & subtraction
Outcome/content standard: Students develop facility with number facts and computation with larger numbers in addition and subtraction and an appreciation of the relationship between those facts
Progress levels:
- Stage 2: Uses mental and written strategies for addition and subtraction involving two-, three- and four-digit numbers
- Stage 3: Selects and applies appropriate strategies for addition and subtraction with numbers of any size

Sub-domain/sub-strand: Multiplication and Division
Outcome/content standard: Students develop facility with number facts and computation with larger numbers in multiplication and division and an appreciation of the relationship between those facts
Progress levels:
- Stage 2: Uses mental and written strategies for multiplication and division
- Stage 3: Selects and applies appropriate strategies for multiplication and division
17
Example developmental continuum for mathematics
18
Definition of construct
Assessment framework
- Definition of construct
  - Domains/strands
    - Sub-domains/sub-strands
      - Outcomes/content standards
        - …
19
Example 2 – Scientific literacy
Domains/strands and their outcomes/content standards:
- Formulating: Formulating or identifying investigable questions and hypotheses, planning investigations and collecting evidence
- Interpreting: Interpreting evidence and drawing conclusions, critiquing the trustworthiness of evidence and claims made by others, and communicating findings
- Using: Using understandings for describing and explaining natural phenomena, making sense of reports, and for decision-making
20
Example 2 – Scientific literacy
Domains/strands, their outcomes/content standards, and progress levels:
- Formulating: Formulating or identifying investigable questions and hypotheses, planning investigations and collecting evidence
  - Level 1 (Year 2): Responds to the teacher’s questions, observes and describes
  - Level 2 (Year 4): Given a question in a familiar context, identifies a variable to be considered, observes and describes or makes non-standard measurements and limited records of data
- Interpreting: Interpreting evidence and drawing conclusions, critiquing the trustworthiness of evidence and claims made by others, and communicating findings
  - Level 1 (Year 2): Describes what happened
  - Level 2 (Year 4): Makes comparisons between objects or events observed
- Using: Using understandings for describing and explaining natural phenomena, making sense of reports, and for decision-making
  - Level 1 (Year 2): Describes an aspect or property of an individual object or event that has been experienced or reported
  - Level 2 (Year 4): Describes changes to, differences between or properties of objects or events that have been experienced or reported
21
Example developmental continuum for scientific literacy
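[Figure: developmental continuum with three levels (Level 1 - Year 2, Level 2 - Year 4, Level 3 - Year 6) and three domains (Domain A: Formulating, Domain B: Interpreting, Domain C: Using), with tasks T3, T10, T1, T8, T4, T6, T2, T11, T7, T5 and T12 located along the continuum.]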
22
Building a table of specifications
Building a table of specifications involves:
- Preparing a list of learning outcomes. These describe the types of performances the students are expected to demonstrate (e.g. Knows basic terms: “Writes a definition of each term”; “Identifies the term that represents each weather element”; etc.).
- Outlining the course content. The content describes the area in which each type of performance is to be demonstrated (e.g. “air pressure”; “wind”; “temperature”; etc.).
- Preparing a chart that relates the relative emphasis of the learning objectives to the content through the number, type and percentage of items.
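The third step above, relating outcomes to content through the number and percentage of items, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the presenter's procedure: the function name `build_specification`, the second outcome label ("Interprets weather maps") and all item counts are hypothetical; only the content areas and "Knows basic terms" come from the examples above.

```python
def build_specification(item_counts):
    """Summarise a table of specifications.

    item_counts maps content area -> {learning outcome -> number of items}.
    Returns (row summary per content area, item total per outcome, grand total).
    """
    total = sum(sum(row.values()) for row in item_counts.values())
    if total == 0:
        raise ValueError("the plan must contain at least one item")

    # Row totals and the relative emphasis of each content area
    rows = {}
    for area, row in item_counts.items():
        row_total = sum(row.values())
        rows[area] = {
            "items": row_total,
            "percent": round(100 * row_total / total, 1),
        }

    # Column totals: how many items target each learning outcome
    columns = {}
    for row in item_counts.values():
        for outcome, n_items in row.items():
            columns[outcome] = columns.get(outcome, 0) + n_items

    return rows, columns, total


# Hypothetical item counts for a short weather unit
plan = {
    "air pressure": {"Knows basic terms": 3, "Interprets weather maps": 2},
    "wind": {"Knows basic terms": 2, "Interprets weather maps": 3},
    "temperature": {"Knows basic terms": 4, "Interprets weather maps": 6},
}

rows, columns, total = build_specification(plan)
for area, summary in rows.items():
    print(f"{area}: {summary['items']} items ({summary['percent']}%)")
print("items per outcome:", columns, "| total:", total)
```

Comparing the resulting percentages with the emphasis given to each area during teaching is one way to check that the planned test samples the content as intended.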
23
Table of specifications
Learning Outcomes Content Area Basic Skills Application Problem Solving Total Percentage Fractions 5 15 Mixed numbers 10 20 Decimals 30 Decimal to Fraction conversions 35 Total Percentage Points 40 100
24
Table of specifications - English
CONTENT COGNITIVE LEVELS WEIGHTAGE MARKS Identifies Interprets Infers Weightage Marks PLOT 1 SA (2marks) 1 Essay (3marks) 1SA 28% 7marks CHARACTER (4marks) 24% 6marks CRISIS 1 Performance task (8 marks) 32% 8marks LANGUAGE (1mark) 16% 4marks 20% 48% 100 5marks 12marks 25marks
25
Table of specifications - Geography
CONTENT COGNITIVE LEVELS WEIGHTAGE MARKS Basic map skills & Understanding Application Extended Understanding Weightage Marks Physical Landforms 1 SA (2 marks) 1 Essay (6 marks) 2 SAs (4 marks) 24% 12 marks Location 4 SA (8 marks) 16% 8 marks Climate 1 Perform. task (16 marks) 40% 20 marks Vegetation 20% 10 marks 32% 44% 100 16 marks 22 marks 50 marks
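The weightage column in a table like the Geography one is simply each content area's share of the total marks. The sketch below recomputes it from the row totals that appear in the table (12, 8, 20 and 10 marks out of 50) and compares the result with the stated percentages; the function name `weightage` and the 0.5-point tolerance are assumptions made for illustration.

```python
def weightage(marks_by_content):
    """Return each content area's percentage share of the total marks."""
    total = sum(marks_by_content.values())
    if total <= 0:
        raise ValueError("total marks must be positive")
    return {area: 100 * marks / total for area, marks in marks_by_content.items()}


# Row totals and stated weightages as they appear in the Geography table
geography_marks = {
    "Physical Landforms": 12,
    "Location": 8,
    "Climate": 20,
    "Vegetation": 10,
}
stated_percent = {
    "Physical Landforms": 24,
    "Location": 16,
    "Climate": 40,
    "Vegetation": 20,
}

computed = weightage(geography_marks)

# Flag any content area whose stated weightage disagrees with its marks
mismatches = {
    area
    for area, percent in stated_percent.items()
    if abs(computed[area] - percent) > 0.5
}
print("total marks:", sum(geography_marks.values()))
print("mismatched rows:", mismatches or "none")
```

The same check can be run on the column totals (cognitive levels) to confirm that the marks per level match the intended emphasis.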
26
Constructing a test that operationally defines the scale.
Test constructors are challenged by the need to:
- define items that enable students at different stages along the scale to demonstrate that they have enough of the subject (construct) to correctly answer the item;
- ensure that the items are assessing the outcomes for the particular location on the scale;
- ensure that, as the items are being written, the ones intended to be located towards the top of the scale are, in fact, more demanding than those located towards the bottom of the scale; and
- ensure that the reason the items are more demanding is a function of the property/variable being measured and not of some other extraneous feature (validity).
27
Assessment Literacy: Question 1
What is the most important thing to consider when selecting a method for assessing performance against learning objectives?
- how easy the assessment is to score
- how easy the assessment is to prepare
- how useful the assessment is at assessing the learning objective
- how well the assessment is accepted by the school administration
28
Assessment Literacy: Question 2
What does it mean when you are told that the test is “reliable”?
- student scores from the assessment can be used for a large number of decisions
- students who take the same test are likely to get similar scores next time
- the test score accurately assesses the content
- the test score is more valid than teacher-based assessments
29
Assessment Literacy: Question 3
Class teachers in a school want to assess their students’ understanding of the method for solving problems that they have been teaching. Which one of the following would be the most appropriate method for seeing whether the teaching had been effective? Justify your answer.
- select a problem solving book with a problem solving test already in it
- develop an assessment method consistent with what has actually been taught in class
- select a problem solving test (like the PSA) that will give a problem solving mark
- select an assessment that measures students’ attitudes to problem solving strategies
30
The following Table of Specifications for a Mathematics assessment was prepared by the classroom teacher. Use this Table to answer items 4 and 5. Note: The numbers in the cells refer to the number of items. Content Area Bloom’s Taxonomy Knowledge Comprehension Application Synthesis Analysis Total Place values and number sense 1 2 7 Space 3 10 Addition and subtraction 4 5 16 Multiplication & Division Measurement 13 8 14 56
31
Assessment Literacy: Question 4
How many items did the teacher aim to use to assess higher order thinking skills, where higher order thinking skills are those that assess items at or above Application in Bloom’s Taxonomy?
- 14
- 34
- 7
- None of the above
32
Assessment Literacy: Question 5
Which one of the following statements BEST DEFINES a Table of Specifications?
- It ensures that the total number of marks for the assessment will equal 100.
- It classifies educational goals, learning objectives and standards.
- It relates the content to the cognitive level of the learning objectives for the purpose of improving the validity of the instrument.
- It is a table that is used by teachers to reliably assess students.