Dimensions of Test Washback

Presentation to the BILC Conference in Prague
David Oglesby, Defense Language Institute English Language Center, May 2012

This presentation addresses the topic of washback, beginning with some first principles.

What is Washback?
* Backwash is "the effect of testing on teaching and learning" (Hughes, 1994, p. 53).
* Washback is "the extent to which the introduction and use of a test influences language teachers and learners to do things they would not otherwise do that promote or inhibit language learning" (Messick, 1996, p. 241).
Wall, D., & Alderson, J. C. (1993). Examining washback: The Sri Lankan impact study. Language Testing 10(1), 41-69.

Washback is Real
* “It has frequently been noted that teachers will teach to a test: that is, if they know the content of a test and/or the format of a test, they will teach their students accordingly...” (Swain, 1985, p. 43)
* “…a case of the examination tail wagging the education dog” (Fullilove, 1992, p. 31)
Fullilove, J. (1992). The tail that wags. Institute of Language in Education Journal 9, 131-147.
Swain, M. (1985). Large-scale communicative testing: A case study. In Y. P. Lee, A. C. Y. Fok, R. Lord, & G. Low (Eds.), New directions in language testing (pp. 35-46). Oxford: Pergamon Press.

Washback & Test Validity
The Standards for Educational and Psychological Testing (AERA, APA, NCME, 1999) suggest that five kinds of evidence may be useful in evaluating high-stakes examinations:
* Test content
* Response processes
* Internal structure
* Relations to other variables
* Consequences of testing

Consequences of Testing
“Tests are commonly administered in the expectation that some benefit will be realized from the intended use of the scores... A fundamental purpose of validation is to indicate whether these specific benefits are realized.” (AERA, APA, NCME, 1999, p. 16)

Whether the intended use of the scores actually brings those benefits is the question referred to as consequential validity. The consequences of tests and test scores are clearly important, and they can be both positive and negative.

For example, suppose that scientists developed a new and very accurate test for detecting a type of cancer. Suppose that the test came into wide use, and it was then noticed that many people who received a positive result were committing suicide. That is obviously an unintended negative consequence, but it has absolutely no bearing on the accuracy of the test: the test still detects the cancer accurately.

It is incumbent on the test developer to check for consequences. It should be clear, however, that consequences are not part of the inference drawn from the scores, and therefore they have no part in validity.

Consequences & Impact
* Bachman and Palmer’s test usefulness framework (1996): Reliability + Construct Validity + Authenticity + Interactiveness + Impact + Practicality
* Kunnan’s test fairness framework (2004): Validity + Absence of Bias + Access + Administration + Social Consequences
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Kunnan, A. J. (2004). Test fairness. In M. Milanovic & C. Weir (Eds.), European language testing in a global context (pp. 27-48). Cambridge: Cambridge University Press.

Scope of Influence
* Washback (micro/local): teacher, learner
* Impact (macro/social): national, international

Characteristics of Washback
* Scale: individual vs. social
* Value: positive vs. negative
* Focus: narrow vs. broad
* Intentionality: intended vs. unintended
* Length: short term vs. long term
* Stimulus: perceptions vs. actions
* Stakes: low vs. high

Components of Washback
* Participants: students, teachers, administrators, materials developers, researchers, selecting officials
* Processes: using, studying, speaking the language, worrying, memorizing, cheating, (de)emphasizing, pacing, tailoring, tutoring
* Products: course content, methodology, curricula, materials

Bailey’s Model of Washback

Green’s Model of Washback
[Diagram. Labels recovered from the slide:]
* Washback direction: overlap between target task characteristics and test design characteristics (positive washback, negative washback)
* Washback variability: participant characteristics and values; knowledge/understanding of test demands; resources to meet test demands
* Washback intensity: importance (important, unimportant) and difficulty (easy, challenging, unachievable), ranging from no washback to intense washback

Lam’s Types of Washback
[Diagram. Labels recovered from the slide: Timetable, Performance, Methodology, Learner, Teacher, Content, Curriculum, Developer, Attitude, Textbook, Proofreading]

Stakeholders in the Testing Community
* Stakeholders who provide input to test design: Learners, Teachers, Administrators, Military Hierarchy, Government Agencies, Receiving Institutions, Course Writers, Testing Centers, Test Writers, Examiners, Consultants, (A)LTS, BILC
* STANAG 6001: Test Construct, Test Specs, Test Conditions, Assessment Criteria, Test Scores
* Stakeholders who use test scores: Learners, Teachers, Administrators, Military Hierarchy, Government Agencies, Receiving Institutions, Professional Orgs, Researchers, (A)LTS, BILC
Saville, N. (2009). Developing a model for investigating the impact of language assessment within educational contexts by a public examination provider. Unpublished PhD thesis.

Washback Works Both Ways
Teachers and Teaching / Tests and Testing
How can teaching affect testing?
* Construct under-representation: narrowed domain, limited tasks/content
* Construct-irrelevant variance: background knowledge, testwiseness

Promoting Beneficial Backwash
Hughes suggests some salutary practices:
1. Test the abilities whose development you want to encourage.
2. Sample widely and unpredictably.
3. Use direct testing.
4. Make testing criterion-referenced.
5. Base achievement on objectives.
6. Ensure [that the] test is known and understood by students and teachers.
7. Where necessary, provide assistance to teachers.
Hughes, A. (1989). Testing for language teachers. Cambridge: Cambridge University Press.

Questions?

References
Alderson, J. C., & Banerjee, J. (1996). How might impact study instruments be validated? Paper commissioned by the University of Cambridge Local Examinations Syndicate (UCLES) as part of the IELTS Impact Study.
Alderson, J. C., & Wall, D. (1993). Does washback exist? Applied Linguistics 14(2), 115-129.
Alderson, J. C., & Wall, D. (1996). Editorial. Language Testing 13(3), 239-240.
Andrews, S., & Fullilove, J. (1994). Assessing spoken English in public examinations: Why and how? In J. Boyle & P. Falvey (Eds.), English language testing in Hong Kong (pp. 57-86). Hong Kong: Chinese University Press.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly 2(1), 1-34.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Bailey, K. M. (1999). Washback in language testing. TOEFL Monograph Series, Ms. 15. Princeton, NJ: Educational Testing Service.
Buck, G. (1988). Testing listening comprehension in Japanese university entrance examinations. JALT Journal 10, 12-42.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics 1, 1-47.
Cheng, L. (2004). The washback effect of a public examination change on teachers’ perceptions toward their classroom teaching. In L. Cheng, Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research contexts and methods (pp. 146-170). Mahwah, NJ: Lawrence Erlbaum Associates.
Fullilove, J. (1992). The tail that wags. Institute of Language in Education Journal 9, 131-147.
G-TELP (General Test of English Proficiency) Information Bulletin. (1990). San Diego: San Diego State University, International Testing Service Center, College of Extended Studies.
G-TELP (General Test of English Proficiency). (undated). Seoul: G-TELP Committee of Korea.
Hamp-Lyons, L. (1997). Washback, impact and validity: Ethical concerns. Language Testing 14(3), 295-303.
Hughes, A. (1989). Testing for language teachers. Cambridge: Cambridge University Press.
Hughes, A. (1993). Backwash and TOEFL 2000. Unpublished manuscript, University of Reading.
Kane, M. T. (2006). Validation. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 17-64). Westport, CT: Praeger.
Kunnan, A. J. (Ed.). (2000). Fairness and validation in language assessment: Selected papers from the 19th Language Testing Research Colloquium, Orlando, Florida. Studies in Language Testing, Vol. 9. Cambridge: UCLES/Cambridge University Press.
Kunnan, A. J. (2004). Test fairness. In M. Milanovic & C. Weir (Eds.), European language testing in a global context (pp. 27-48). Cambridge: Cambridge University Press.
Lam, H. P. (1994). Methodology washback: An insider's view. In D. Nunan, R. Berry, & V. Berry (Eds.), Bringing about change in language education: Proceedings of the International Language in Education Conference 1994 (pp. 83-102). Hong Kong: University of Hong Kong.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). New York: Macmillan.
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher 23(2), 13-23.
Messick, S. (1996). Validity and washback in language testing. Language Testing 13(3), 241-256.
Popham, W. J. (1991). Appropriateness of teachers' test preparation practices. Educational Measurement: Issues and Practice 10(1), 12-15.
Reckase, M. (1998). Consequential validity from the test developer’s perspective. Educational Measurement: Issues and Practice 17, 13-16.
Saville, N. (2009). Developing a model for investigating the impact of language assessment within educational contexts by a public examination provider. Unpublished PhD thesis.
Shepard, L. A. (1993). The place of testing reform in educational reform: A reply to Cizek. Educational Researcher 22, 10-14.
Shohamy, E. (2005). The power of tests over teachers: The power of teachers over tests. In D. J. Tedick (Ed.), Second language teacher education: International perspectives (pp. 101-111). Mahwah, NJ: Lawrence Erlbaum Associates.
Swain, M. (1984). Large-scale communicative testing: A case study. In S. L. Savignon & M. Berns (Eds.), Initiatives in communicative language teaching (pp. 185-201). Reading, MA: Addison Wesley.
Swain, M. (1985). Large-scale communicative testing: A case study. In Y. P. Lee, A. C. Y. Fok, R. Lord, & G. Low (Eds.), New directions in language testing (pp. 35-46). Oxford: Pergamon Press.
Wall, D., & Alderson, J. C. (1993). Examining washback: The Sri Lankan impact study. Language Testing 10(1), 41-69.
Watanabe, Y. (1996). Does grammar translation come from the entrance examination? Preliminary findings from classroom-based research. Language Testing 13(3), 318-333.
Weir, C. (2005). Language testing and validation: An evidence-based approach. Basingstoke: Palgrave Macmillan.