Basic Issues in Language Assessment
袁韻璧, Department of English, Fu Jen Catholic University (輔仁大學英文系)

Contents
- Introduction: relationship between teaching & testing
- Forms of test delivery
- Characteristics of a good test: validity, reliability, practicality, positive washback
- Multiple-choice reading tests
- Computer-based testing: advantages and disadvantages
- Conclusion

Relationship between Teaching & Testing
- Subordinate → partnership (supportive, corrective)

Forms of Test Delivery
- Alternative assessment
- Paper-&-pencil tests
- Computer-based testing

Characteristics of a Good Test
- Validity
- Reliability
- Practicality (feasibility)
- Positive washback: the effect of tests on teaching & learning

Validity
- Definition: a test should measure what it is intended to measure, and nothing else (i.e., no external knowledge or other skills measured at the same time).
- Types of validity: face validity, content validity, construct validity, criterion-related validity

Face Validity
- You can judge whether the test is valid simply by "looking" at it.
- It "looks right" to other testers, teachers, testees, etc.
- Essential to all kinds of tests, but not sufficient on its own.

Content Validity
- "A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned." (Hughes, 1989, p. 22)
- Also called rational or logical validity.
- Check against:
  - Test specification (test plan)
  - Teaching materials, textbooks
  - Course syllabus/objectives
  - Another teacher or subject-matter experts

Definition of Reliability
- "The consistency of measures across different times, test forms, raters, and other characteristics of the measurement context" (Bachman, 1990, p. 24).
- The accuracy or precision with which a test measures something; the consistency, dependability, or stability of test results.
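To make "consistency of measures" concrete, one common check is to correlate scores from two administrations of the same test (test-retest) or from two parallel forms. The sketch below is illustrative only and is not taken from the slides; the score lists are invented.

```python
# Minimal sketch: test-retest reliability estimated as the Pearson correlation
# between two administrations of the same test to the same learners.
# The score lists are hypothetical illustration data.
from statistics import correlation  # available in Python 3.10+

first_sitting = [72, 65, 88, 90, 54, 61, 77, 83]    # scores at time 1
second_sitting = [70, 68, 85, 92, 50, 60, 80, 81]   # same learners at time 2

reliability = correlation(first_sitting, second_sitting)
print(f"Test-retest reliability estimate: {reliability:.2f}")  # closer to 1.0 = more consistent
```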

How to Make a Test More Reliable (for teachers)
- Take enough samples of behavior
- Avoid ambiguous items
- Provide clear and explicit instructions
- Use a clear, well-designed layout
- Provide uniform, non-distracting test conditions
- Use objective test formats where possible
- Use direct testing where possible
- Have independent, trained raters
- Identify test takers by number, not by name
- Use multiple independent scorings for subjective tests (Hughes, 1989)
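When a test contains many items, internal consistency offers another way to check how reliably it measures ("take enough samples of behavior"). The sketch below uses Cronbach's alpha, a coefficient not mentioned on the slide itself but widely used for this purpose; all of the item scores are invented.

```python
# Minimal sketch (not from the slides): Cronbach's alpha as a common
# internal-consistency estimate. All data below are invented.
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one inner list of scores per item, aligned by test taker."""
    k = len(item_scores)
    sum_item_variances = sum(pvariance(item) for item in item_scores)
    totals = [sum(person) for person in zip(*item_scores)]  # per-person total scores
    return (k / (k - 1)) * (1 - sum_item_variances / pvariance(totals))

# Four items scored 0-5, six test takers (hypothetical data)
items = [
    [4, 3, 5, 2, 4, 3],
    [5, 3, 4, 2, 5, 3],
    [4, 2, 5, 1, 4, 2],
    [3, 3, 4, 2, 4, 3],
]
print(f"Cronbach's alpha: {cronbach_alpha(items):.2f}")  # about 0.94 for this toy data
```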

Practicality
- Practical considerations when planning tests or other ways of measurement, including cost and the time/effort required
- Economy
- Ease of:
  - scoring and score interpretation
  - administration
  - test compilation
- A test should be practical to use, but also valid and reliable.

Multiple-choice Reading Tests
- Comprehension: being able to find meaning in what is read
- Three levels of comprehension: literal, interpretive (or referential), and critical
- Problems of multiple-choice reading tests:
  - Recall of information / text recycling
  - Ambiguous, flawed texts and items
  - Information gaps in passages
  - Unfair, tricky tasks (e.g., full of unfamiliar words)
  - Too much background knowledge assumed
  - Items scored right for the wrong reason, or vice versa
  - Test-taking techniques

Advantages of CBT
- Scoring is done automatically and immediately
- Tests can be tailored to the particular abilities of each test taker
- Tests can be provided on demand
- Many item combinations are possible → better test security
- Multimedia → learning that engages multiple intelligences
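The first two advantages (automatic, immediate scoring and tailoring items to the test taker) can be shown in a toy sketch. The item bank, field names, and the crude up/down difficulty rule below are all invented for illustration and are far simpler than real computer-adaptive testing algorithms.

```python
# Toy sketch of automatic scoring plus crude item tailoring for a CBT.
# The item bank and answer keys are invented illustration data.
item_bank = [
    {"id": "q1", "difficulty": 1, "key": "B"},
    {"id": "q2", "difficulty": 2, "key": "D"},
    {"id": "q3", "difficulty": 3, "key": "A"},
    {"id": "q4", "difficulty": 4, "key": "C"},
]

def score_response(item, response):
    """Automatic, immediate scoring: 1 point if the chosen option matches the key."""
    return 1 if response == item["key"] else 0

def next_item(current_difficulty, was_correct, bank):
    """Crude adaptivity: step difficulty up after a correct answer, down after a miss."""
    target = current_difficulty + (1 if was_correct else -1)
    return min(bank, key=lambda item: abs(item["difficulty"] - target))

first = item_bank[1]                    # start with a mid-difficulty item
point = score_response(first, "D")      # the test taker chose option "D"
print("Score so far:", point)
print("Next item:", next_item(first["difficulty"], point == 1, item_bank)["id"])
```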

Disadvantages of CBT
- Writing tests:
  - Do raters react differently to printed vs. handwritten texts?
  - For testees: different composing processes
- Reading tests:
  - Do testees react in the same way to texts presented on a computer screen as to texts printed on paper?
- Speaking tests (semi-direct tests):
  - The nature of communication: a shared human activity, involving interlocutors and interaction

Conclusion
- Variables affecting test performance: types/formats of tasks, nervousness, physical condition of testees, rater factors, etc.
- Adopt multiple methods of assessment, including alternative assessment
- Use valid, reliable paper-and-pencil tests that have positive washback
- CBT for classroom teachers: depends on the testing purpose and needs