Testing What You Teach: Eliminating the “Will this be on the final?” Ideology


Testing What You Teach: Eliminating the “Will this be on the final?” Ideology
Dr. Barry Lee Reynolds, National Yang-Ming University, Education Center for Humanities and Social Sciences

Outline
Introduction, Backwash, Reliability, Validity. (Time is limited, so this talk can focus on only a few issues; for a fuller treatment, see Hughes's Testing for Language Teachers.)

Introduction

Why students ask: “Will this be on the final exam?”

The distrust of tests
Who distrusts tests? Language teachers and language students.
Why? Because of their negative effects on learning, tests are often considered more harmful than helpful. Sometimes the teaching is good, but the test does not reflect the teaching. The effect of testing on teaching is known as backwash, and it can be harmful or beneficial (Hughes, 2003).

Tests are often inaccurate measurements
Testing technique matters:
- If you want to know how well someone writes, you must ask them to write (a matter of validity).
- The test must consistently measure the “construct” in question (e.g., the past tense, vocabulary, writing) (a matter of reliability).

Backwash How can a teacher achieve beneficial backwash?

Backwash
Harmful backwash: e.g., multiple-choice items used to test writing. If the skill of writing is tested only by multiple-choice items, there is great pressure to practise such items rather than to practise the skill of writing itself.
Beneficial backwash: e.g., writing used to test writing.
Backwash can be more contextualized in a low-stakes exam (e.g., the final exam for a course) or more global in a high-stakes exam (e.g., a university entrance exam such as the TOEFL).

How can a teacher achieve beneficial backwash? (1/2)
Test the abilities whose development you want to encourage: if you want to encourage oral ability, then test oral ability.
Sample widely and unpredictably: the sample taken should represent, as far as possible, the full scope of what is specified.
Use direct testing: if we test directly the skills that we are interested in fostering, then practice for the test represents practice in those skills.

How can a teacher achieve beneficial backwash? (2/2)
Make testing criterion-referenced: if the test specifications make clear just what students have to be able to do (and with what degree of success), then students will have a clear picture of what they have to achieve.
Base tests on objectives: if tests are based on objectives, rather than on detailed teaching and textbook content, they will provide a truer picture of what has actually been achieved.
Ensure the test is known and understood by students and teachers: students need to understand what the test demands of them. Explain the rationale for the test and its specifications, and provide sample items.

Validity How can teachers ensure the validity of an assessment?

Construct validity
An assessment is said to be valid, and to have construct validity, if it measures accurately what it is intended to measure, e.g., “reading ability,” “speaking fluency,” or “grammar.” Does the assessment really test the “construct” it has set out to test? “Construct validity” is often used to refer to an overarching notion of validity. Teachers must ensure that their tests truly assess the skills they have taught in their classrooms.

Content validity
If you wish to test “reading ability,” the assessment must be made up of items that test the language skills associated with “reading ability.” To ensure content validity, it is not enough just to have students “read” and require them to answer questions; the questions must constitute a proper sample of all the language skills that have been taught in the course. Areas that are not tested tend to be ignored by teachers in their teaching and by students in their learning. Unfortunately, the content of tests is usually made up of whatever is easiest to test. Match assessment content to the specifications written for the course (i.e., class goals and objectives).

Criterion-related validity
Criterion-related validity refers to the degree to which one assessment correlates with another. It includes concurrent validity and predictive validity.
Concurrent validity is established when the test and the criterion are administered at about the same time (e.g., testing oral and written language abilities).
Predictive validity concerns the degree to which a test can predict students' future performance (e.g., success in a prerequisite course or in internship opportunities).
Criterion-related validity is usually investigated through the use of correlation coefficients.

Validity in scoring
An assessment should not test more than one ability (unless it was designed with the intention to do so!). Example: a reading test that also assesses spelling and grammar, or a writing test that over-emphasizes punctuation.

Face validity A test is said to have face validity if it looks as if it measures what it is supposed to measure.

Reliability How can teachers ensure the reliability of an assessment?

Reliability
Reliability refers to the degree to which an assessment produces stable and consistent results: giving the assessment on day X should yield much the same results as giving it on day Y. It is quantified with a “reliability coefficient,” estimated by methods such as the test-retest method and the split-half method.
Lado (1961) provides benchmarks to follow: vocabulary, grammar, and reading assessments, .90–.99; listening, .80–.89; speaking, .70–.79.
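The split-half method named above can be sketched in a few lines of Python: split the items into odd and even halves, correlate the two half-scores, then apply the Spearman-Brown correction to estimate full-test reliability. The student data are hypothetical:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one list of per-item scores per student.
    Correlates odd-item and even-item half-scores, then applies the
    Spearman-Brown correction for the full test length."""
    odd = [sum(s[0::2]) for s in item_scores]
    even = [sum(s[1::2]) for s in item_scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

# Hypothetical results of four students on a six-item test (1 = correct):
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
print(round(split_half_reliability(scores), 2))
```

The resulting coefficient can then be checked against Lado's benchmarks for the skill being tested.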

Scorer reliability
Quantifying, by means of a coefficient, the level of agreement given by the same or different scorers on different occasions can help ensure scorer reliability (e.g., when grading essays).
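One common agreement coefficient (the talk does not name a specific one) is Cohen's kappa, which measures how far two scorers agree beyond what chance alone would produce. A minimal sketch with hypothetical essay grades:

```python
from collections import Counter

def cohens_kappa(scorer_1, scorer_2):
    """Cohen's kappa: agreement between two scorers beyond chance."""
    n = len(scorer_1)
    observed = sum(a == b for a, b in zip(scorer_1, scorer_2)) / n
    c1, c2 = Counter(scorer_1), Counter(scorer_2)
    # Chance agreement expected from each scorer's grade distribution:
    expected = sum(c1[g] * c2[g] for g in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical grades assigned to the same eight essays by two scorers:
scorer_1 = ["A", "A", "B", "B", "C", "C", "A", "B"]
scorer_2 = ["A", "B", "B", "B", "C", "A", "A", "B"]
print(round(cohens_kappa(scorer_1, scorer_2), 2))
```

Kappa of 1 means perfect agreement; 0 means agreement no better than chance, a sign that the scoring key or scorer training needs attention.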

How to make tests more reliable? (1/3)
Take enough samples of behaviour: it is not enough simply to include many items; each item should also be a “fresh start” for the students.
Exclude items that do not discriminate well between weaker and stronger students.
Do not allow candidates too much freedom.
Write unambiguous items.
Provide clear and explicit instructions.

How to make tests more reliable? (2/3)
Ensure that tests are well laid out and perfectly legible.
Make students familiar with the format and testing techniques.
Provide uniform and non-distracting conditions of administration.
Use items that permit scoring that is as objective as possible.
Make comparisons between students as direct as possible (related to not allowing students too much freedom).

How to make tests more reliable? (3/3)
Create a detailed scoring key.
Train scorers (if you are not scoring the tests yourself).
Agree on acceptable responses and appropriate scores at the outset of scoring.
Identify candidates by number, not name.
Employ multiple, independent scoring where possible.

Relationship between reliability and validity
To be valid, an assessment must be reliable; however, an assessment can be reliable but not valid (e.g., a writing test that actually assesses translation). Be careful not to sacrifice validity while ensuring reliability.

Thank You For Your Attention

References
Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
Lado, R. (1961). Language testing: The construction and use of foreign language tests. A teacher's book.