Validity & Practicality


Lesson Five: Validity & Practicality

Contents
- Introduction
- Definition of Validity
- Types of validity
  - Non-empirical: Face Validity, Content Validity
  - Empirical: Construct Validity, Criterion-related Validity
- Practicality

Introduction
A writing test asks test takers to write on the following topic: "Is Photography an Art or a Science?" Is this a valid writing test? Why or why not?
You should be clear about exactly what you want to test (i.e., no other irrelevant abilities or knowledge).
Validity concerns what a test measures and how well it measures what it is intended to measure.

Definition of Validity
"The extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment" (cited in Brown 22).
A valid test measures what it is intended to measure, and nothing else (i.e., no external knowledge or other skills are measured at the same time).
E.g., a listening test measures listening skill and nothing else; it should not favor any particular students.

Non-empirical Validity
Involves inspection, intuition, and common sense. Includes:
- Consequential validity: face validity
- Content validity

Consequential Validity
Encompasses all the consequences of a test (Brown 26):
- Its accuracy in measuring intended criteria
- Its impact on the preparation of test-takers
- Its effect on the learner
- The social consequences of a test's interpretation and use
- The effect on students' motivation, subsequent performance in a course, independent learning, study habits, and attitude toward school work

Face Validity (1)
You know whether the test is valid by "looking" at it: it "looks right" to other testers, teachers, testees, the general public, etc.
It "appears" to measure the knowledge or abilities it claims to measure.

Face Validity (2)
Face validity asks the question: "Does the test, on the 'face' of it, appear from the learner's perspective to test what it is designed to test?" (Brown 27)
Face validity cannot be empirically tested.
It is essential to all kinds of tests, but it is not enough on its own.

Content Validity (1)
Also called rational or logical validity.
"A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned." (Hughes 1989)
Especially important for achievement, progress, and diagnostic tests.
A valid test contains appropriate and representative content.

Content Validity (2)
A test with content validity contains a representative sample of the course (objectives), with the test components quantified and balanced (given a percentage weighting).
Check against:
- Test specifications (test plan)
- Notes, textbooks
- Course syllabus/objectives
- Another teacher or subject-matter experts
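The weighting check described above can be sketched in code. This is a minimal illustration with entirely hypothetical component names, specification weights, and item counts; the 5% tolerance is an assumption, not a standard.

```python
# Audit a draft test's component balance against the percentage
# weightings in its test specifications (hypothetical figures).
spec_weights = {"grammar": 0.30, "vocabulary": 0.20,
                "reading": 0.30, "listening": 0.20}   # from the test plan
item_counts = {"grammar": 15, "vocabulary": 10,
               "reading": 15, "listening": 10}        # items on the draft test

total = sum(item_counts.values())
for component, target in spec_weights.items():
    actual = item_counts[component] / total
    # Flag any component that drifts more than 5 points from the plan
    status = "OK" if abs(actual - target) <= 0.05 else "REWEIGHT"
    print(f"{component}: planned {target:.0%}, actual {actual:.0%} -> {status}")
```

A check like this only audits the balance of components; whether each item is a representative sample of the course content still requires the human review against syllabus and specifications listed above.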

Content Validity (3)
An example: a fill-in quiz on the use of articles (see Brown 23). Does it have content validity if used as a listening/speaking test?
Classroom tests should always have content validity.
Rule of thumb for achieving content validity: always use direct tests.

Criterion-related Validity (1)
The extent to which the "criterion" of the test has actually been reached: "how far results on the test agree with those provided by some independent and highly dependable assessment of the candidate's ability."

Criterion-related Validity (2)
Two kinds of criterion-related validity:
Concurrent validity: how closely the test result parallels test takers' performance on another valid test, or criterion, thought to measure the same or similar abilities.
- Test and criterion are administered at about the same time.
- Possible criteria: an established test or some other measure within the same domain (e.g., course grades, teachers' ratings).

Criterion-related Validity (3)
Example situation: a conversation class whose objectives cover a large number of functions; testing all of them would take 45 minutes per student. Question: is a 10-minute test a valid measure?
Method: give a random sample of students the full 45-minute test (the criterion test), and compare their scores on the short version with their scores on the criterion test. If there is a high level of agreement, the short version is a valid test.

Criterion-related Validity (4)
Validity coefficient: a mathematical measure of similarity (a correlation).
Perfect agreement → validity coefficient = 1.
E.g., a coefficient of 0.7 gives (0.7)² = 0.49, i.e., the two tests share about 49% of their variance (almost 50% agreement).
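The concurrent-validation method of the previous slides can be sketched as code. The validity coefficient is a Pearson correlation; the score lists below are invented for illustration.

```python
# Compute a validity coefficient: the Pearson correlation between
# scores on a short test and scores on the full criterion test.
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

short_test = [62, 70, 75, 80, 88]   # hypothetical 10-minute version
criterion  = [60, 72, 74, 83, 90]   # hypothetical 45-minute criterion test

r = pearson(short_test, criterion)
print(round(r, 2))       # validity coefficient
print(round(r ** 2, 2))  # squared: the shared variance ("agreement")
```

Squaring the coefficient, as on the slide above, turns it into the proportion of variance the two tests share, which is why a coefficient of 0.7 corresponds to only about 49% agreement.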

Criterion-related Validity (5)
Predictive validity: how well the test result predicts future performance/success; the correlation is computed at a future time.
Important for the validation of aptitude tests, placement tests, and admissions tests.
Criterion: outcome of the course (pass/fail), or teachers' ratings later.

Construct Validity (1)
Construct: any underlying ability (trait) hypothesized in a theory of language ability; more generally, any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perceptions (Brown 25).

Construct Validity (2)
Originated with psychological tests.
Refers to the extent to which the test may be said to measure a theoretical construct or trait that is normally unobservable and abstract, at different levels (e.g., personality, self-esteem; proficiency, communicative competence).
It examines whether the test is a true reflection of the theory of the trait being measured.

Construct Validity (3)
A test has construct validity if it can be demonstrated that it measures just the ability it is supposed to measure. Two examples:
1. Reading ability: involves a number of sub-abilities, e.g., skimming, scanning, guessing the meaning of unknown words, etc.

Construct Validity (4)
Empirical research is needed to establish whether such a distinct ability exists and can be measured.
Construct validity is needed because we have to demonstrate that we are indeed measuring just that ability in a particular test.

Construct Validity (5)
2. When measuring an ability indirectly, e.g., writing ability:
We need to look to a theory of writing ability for guidance on the form (i.e., content, techniques) an indirect test should take.
The theory tells us that writing ability rests on a number of sub-abilities, e.g., punctuation, organization, word choice, grammar, etc.
Based on the theory, we construct multiple-choice tests to measure these sub-abilities.

Construct Validity (6)
But how do we know this test really is measuring writing ability? Validation methods:
- Compare scores on the pilot test with scores on a direct writing test; a high level of agreement supports construct validity.
- Administer a number of tests, each measuring one construct; score the composition (direct test) separately for each construct; then compare the scores.

Construct Validity (7): Summary
Construct validity examines whether the test is a true reflection of the theory of the trait being measured.
In language testing, a construct is any underlying ability/trait hypothesized in a theory of language ability.
Construct validation is necessary in cases of indirect testing.
It can be assessed by comparing the scores of a group of students on two tests.

Practicality
Practical considerations when planning tests or ways of measurement include cost and the time/effort required:
- Economy (cost; time for administration and scoring)
- Ease of scoring and score interpretation
- Ease of administration
- Ease of test compilation
A test should be practical to use, but also valid and reliable.