Presentation 3: Developing Quality Assessment Items: Principles of Reliability and Validity
Professor Jim Tognolini

Introduction to Modern Assessment Theory: A basis for all assessments
During this session we will:
- define reliability;
- define measurement error;
- examine the sources of measurement error;
- define validity and identify threats to validity;
- build assessment frameworks; and
- operationalise frameworks with Tables of Specification.

Reliability
The reliability of a set of results is the extent to which those results are consistent, or free from error. The concept of reliability is closely associated with the idea of consistency, and it is not an all-or-nothing property: there are degrees of reliability.
- How similar are results if students are assessed at different times?
- How similar are results if students are assessed with a different sample of equivalent tasks?
- How similar are results if essays have been marked by different markers?
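The consistency questions above are usually quantified with a reliability coefficient. The sketch below is illustrative only (the score matrix and function name are invented, not taken from the slides); it estimates internal consistency with Cronbach's alpha, one standard index:

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for a (students x items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = item_scores.shape[1]                         # number of items
    item_vars = item_scores.var(axis=0, ddof=1)      # per-item variance
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 students answering 4 dichotomously scored items.
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```

Values near 1 indicate that the items rank students consistently; marker-to-marker consistency is usually checked separately with an inter-rater statistic.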

Measurement Error
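As background to this slide (the standard classical test theory formalisation, added here rather than quoted from the presentation): each observed score X is modelled as a true score T plus a random error E, reliability is the proportion of observed-score variance that is true-score variance, and the standard error of measurement (SEM) follows from both:

$$X = T + E, \qquad \rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2}, \qquad \text{SEM} = \sigma_X \sqrt{1 - \rho_{XX'}}$$

On this view a perfectly reliable test has \(\sigma_E^2 = 0\); every source of error listed on the next slide inflates \(\sigma_E^2\) and hence the SEM.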

Sources of Measurement Error
The following are some of the sources of measurement error:
- test-taking skills;
- comprehension of instructions;
- sampling variance of items;
- temporary factors such as health, fatigue, motivation and testing conditions;
- memory fluctuations;
- marking bias (especially in essays);
- guessing; and
- item types.
The aim for test developers is to identify sources of measurement error and minimise their impact.


Validity
The validity of the results of a test can best be defined as the extent to which the results measure what they purport to measure. It is the interpretation of the results (including the inferences and decisions based on them) that is validated, not the test or the test score. Messick (1989) also argued that validation can include evaluating the consequences of the test: are the specific benefits likely to be realised? In 1999 the Standards (AERA, APA and NCME) suggested that validation can be viewed as developing scientifically sound validity arguments to support the intended interpretation of test scores and their relevance to the proposed use.

Threats to Validity
Factors in the test itself:
- unclear directions (e.g. how to respond to guessing; how to record answers);
- reading vocabulary and sentence structure that are too difficult;
- an inappropriate level of difficulty of test items (e.g. encouraging guessing);
- poorly constructed test items;
- ambiguity;
- test items (tasks) inappropriate for the content being assessed;
- a test that is too short;
- improper arrangement of items; and
- an identifiable pattern of answers.
Factors in test administration and scoring:
- insufficient time;
- cheating; and
- unreliable scoring.

Relationship between Validity and Reliability
Reliability is a necessary but insufficient condition for validity.
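One classical way to make this precise (a standard psychometric result, added here as background rather than taken from the slides): the correlation between test scores X and any criterion Y is capped by the test's reliability,

$$r_{XY} \le \sqrt{\rho_{XX'}}$$

so an unreliable test cannot support highly valid score interpretations, while a highly reliable test can still measure the wrong thing.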

Some basic assessment theory
- Validity and reliability are not deterministic: the aim is to maximise both.
- Validity is paramount.
- Ways to minimise threats to validity and reliability include the breadth of material sampled (which increases validity), controlling for guessing, and the quality of the items (see the formula below).
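Broader sampling also helps reliability. This can be made concrete with the Spearman-Brown formula (standard theory, not shown on the slide): lengthening a test by a factor of k with comparable items raises reliability from \(\rho_1\) to

$$\rho_k = \frac{k\rho_1}{1 + (k-1)\rho_1}$$

For example, doubling a test with reliability 0.60 lifts it to \(2(0.60)/(1+0.60) = 0.75\).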

Assessment frameworks

Preliminary Questions
- Why are we assessing?
- What are we assessing?
- What is the most appropriate way to assess these outcomes?

Assessment framework
Definition of construct → Domains/strands → Sub-domains/sub-strands → Outcomes/content standards → …

Example 1: Mathematics
Construct: Mathematics
Domains/strands: Number; Measurement; Space; Chance
Sub-domains/sub-strands of Number: Addition & subtraction; Multiplication and division; Fractions and decimals

Example 1: Mathematics (Number strand, early progress levels)

Sub-strand: Addition & subtraction
Outcome: students develop facility with number facts and computation with larger numbers in addition and subtraction, and an appreciation of the relationship between those facts.
- Early Stage 1: combines, separates and compares collections of objects, describes using everyday language and records using informal methods.
- Stage 1: uses concrete materials and mental strategies for addition and subtraction involving one- and two-digit numbers.

Sub-strand: Multiplication and division
Outcome: students develop facility with number facts and computation with larger numbers in multiplication and division, and an appreciation of the relationship between those facts.
- Early Stage 1: groups and shares collections of objects, describes using everyday language and records using informal methods.
- Stage 1: models and uses strategies for multiplication and division.

Example 1: Mathematics (Number strand, later progress levels)

Sub-strand: Addition & subtraction
Outcome: students develop facility with number facts and computation with larger numbers in addition and subtraction, and an appreciation of the relationship between those facts.
- Stage 2: uses mental and written strategies for addition and subtraction involving two-, three- and four-digit numbers.
- Stage 3: selects and applies appropriate strategies for addition and subtraction with numbers of any size.

Sub-strand: Multiplication and division
Outcome: students develop facility with number facts and computation with larger numbers in multiplication and division, and an appreciation of the relationship between those facts.
- Stage 2: uses mental and written strategies for multiplication and division.
- Stage 3: selects and applies appropriate strategies for multiplication and division.

Example developmental continuum for mathematics

Assessment framework
Definition of construct → Domains/strands → Sub-domains/sub-strands → Outcomes/content standards → …

Example 2: Scientific literacy
Domains/strands and outcomes:
- Formulating: formulating or identifying investigable questions and hypotheses, planning investigations and collecting evidence.
- Interpreting: interpreting evidence and drawing conclusions, critiquing the trustworthiness of evidence and claims made by others, and communicating findings.
- Using: using understandings for describing and explaining natural phenomena, making sense of reports, and for decision-making.

Example 2: Scientific literacy (progress levels)

Formulating
- Level 1 (Year 2): responds to the teacher’s questions, observes and describes.
- Level 2 (Year 4): given a question in a familiar context, identifies a variable to be considered, observes and describes or makes non-standard measurements and limited records of data.

Interpreting
- Level 1 (Year 2): describes what happened.
- Level 2 (Year 4): makes comparisons between objects or events observed.

Using
- Level 1 (Year 2): describes an aspect or property of an individual object or event that has been experienced or reported.
- Level 2 (Year 4): describes changes to, differences between or properties of objects or events that have been experienced or reported.

Example developmental continuum for scientific literacy: tasks T1–T12 mapped against the three domains (Formulating, Interpreting, Using) across Level 1 (Year 2), Level 2 (Year 4) and Level 3 (Year 6).

Building a table of specifications
1. Prepare a list of learning outcomes. These describe the types of performances the students are expected to demonstrate (e.g. knows basic terms: “writes a definition of each term”; “identifies the term that represents each weather element”; etc.).
2. Outline the course content. The content describes the area in which each type of performance is to be demonstrated (e.g. “air pressure”; “wind”; “temperature”; etc.).
3. Prepare a chart that relates the relative emphasis of the learning objectives to the content through the number, type and percentage of items (see the example tables and the sketch that follow).

Table of specifications (Mathematics example)

Content Area                     | Basic Skills | Application | Problem Solving | Total | Percentage
Fractions                        |              |             |                 | 5     | 15
Mixed numbers                    |              |             |                 | 10    | 20
Decimals                         |              |             |                 |       | 30
Decimal to fraction conversions  |              |             |                 |       | 35
Total                            |              |             |                 | 40    | 100
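Step 3 of the procedure above can be sanity-checked programmatically. The sketch below is illustrative only: the cell counts are invented so that the row weights come out to 15/20/30/35% of a 40-item test, mirroring this slide's Percentage column; they are not read from the slide.

```python
# Hypothetical table of specifications: content areas x learning outcomes,
# with planned item counts in each cell. All counts invented for illustration.
blueprint = {
    "Fractions":                       {"Basic Skills": 3, "Application": 2, "Problem Solving": 1},
    "Mixed numbers":                   {"Basic Skills": 4, "Application": 3, "Problem Solving": 1},
    "Decimals":                        {"Basic Skills": 5, "Application": 4, "Problem Solving": 3},
    "Decimal to fraction conversions": {"Basic Skills": 4, "Application": 6, "Problem Solving": 4},
}

total_items = sum(sum(row.values()) for row in blueprint.values())
print(f"{'Content area':<34}{'Items':>6}  {'Weight':>6}")
for area, row in blueprint.items():
    n = sum(row.values())
    print(f"{area:<34}{n:>6}  {n / total_items:>6.0%}")
print(f"{'Total':<34}{total_items:>6}  {1:>6.0%}")
```

The same check, run over the intended column totals, verifies that the cognitive-level emphasis matches the plan before any items are written.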

Table of specifications (English example; SA = short answer)

Content    | Identifies      | Interprets         | Infers                        | Weightage | Marks
Plot       | 1 SA (2 marks)  | 1 essay (3 marks)  | 1 SA                          | 28%       | 7 marks
Character  | (4 marks)       |                    |                               | 24%       | 6 marks
Crisis     |                 |                    | 1 performance task (8 marks)  | 32%       | 8 marks
Language   | (1 mark)        |                    |                               | 16%       | 4 marks
Total      | 20% (5 marks)   | 48% (12 marks)     |                               | 100%      | 25 marks

Table of specifications (Geography example)

Content            | Basic map skills & understanding | Application                    | Extended understanding | Weightage | Marks
Physical landforms | 1 SA (2 marks)                   | 1 essay (6 marks)              | 2 SAs (4 marks)        | 24%       | 12 marks
Location           | 4 SAs (8 marks)                  |                                |                        | 16%       | 8 marks
Climate            |                                  | 1 performance task (16 marks)  |                        | 40%       | 20 marks
Vegetation         |                                  |                                |                        | 20%       | 10 marks
Total              | 32% (16 marks)                   | 44% (22 marks)                 | 24% (12 marks)         | 100%      | 50 marks

Constructing a test that operationally defines the scale
Test constructors are challenged by the need to:
- define items that enable students at different stages along the scale to demonstrate that they have enough of the subject (construct) to correctly answer the item;
- ensure that the items are assessing the outcomes for the particular location on the scale;
- ensure that, as the items are being written, the ones intended to be located further towards the top of the scale are, in fact, more demanding than those located towards the bottom of the scale; and
- ensure that the reason the items are more demanding is a function of the property/variable being measured and not a function of some other extraneous feature (validity).
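One standard way to make “location on the scale” precise (a common item response model, offered here as background rather than anything stated on the slide) is the Rasch model: the probability that a student of ability \(\theta\) answers an item of difficulty \(b\) correctly is

$$P(\text{correct} \mid \theta, b) = \frac{e^{\theta - b}}{1 + e^{\theta - b}}$$

Items intended to sit higher on the scale should show larger estimated \(b\); checking the empirical difficulty ordering against the intended ordering is a direct test of the third and fourth points above.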

Assessment Literacy: Question 1
What is the most important thing to consider when selecting a method for assessing performance against learning objectives?
- how easy the assessment is to score
- how easy the assessment is to prepare
- how useful the assessment is at assessing the learning objective
- how well the assessment is accepted by the school administration

Assessment Literacy: Question 2
What does it mean when you are told that a test is “reliable”?
- student scores from the assessment can be used for a large number of decisions
- students who take the same test are likely to get similar scores next time
- the test score accurately assesses the content
- the test score is more valid than teacher-based assessments

Assessment Literacy: Question 3
Class teachers in a school want to assess their students’ understanding of the method for solving problems that they have been teaching. Which one of the following would be the most appropriate method for seeing whether the teaching had been effective? Justify your answer.
- select a problem-solving book with a problem-solving test already in it
- develop an assessment method consistent with what has actually been taught in class
- select a problem-solving test (like the PSA) that will give a problem-solving mark
- select an assessment that measures students’ attitudes to problem-solving strategies

The following Table of Specifications for a Mathematics assessment was prepared by the classroom teacher. Use this table to answer Questions 4 and 5. Note: the numbers in the cells refer to the number of items.

Content Area                   | Knowledge | Comprehension | Application | Synthesis | Analysis | Total
Place values and number sense  | 1         | 2             |             |           |          | 7
Space                          | 3         |               |             |           |          | 10
Addition and subtraction       | 4         | 5             |             |           |          | 16
Multiplication & division      |           |               |             |           |          |
Measurement                    |           |               |             |           |          | 13
Total                          | 8         | 14            |             |           |          | 56

Assessment Literacy: Question 4
How many items did the teacher aim to use to assess higher-order thinking skills, where higher-order thinking skills are those at or above Application in Bloom’s Taxonomy?
- 14
- 34
- 7
- none of the above

Assessment Literacy: Question 5
Which one of the following statements BEST DEFINES a Table of Specifications?
- It ensures that the total number of marks for the assessment will equal 100.
- It classifies educational goals, learning objectives and standards.
- It relates the content to the cognitive level of the learning objectives for the purpose of improving the validity of the instrument.
- It is a table that is used by teachers to reliably assess students.