PRINCIPLES OF LANGUAGE ASSESSMENT
Riko Arfiyantama, Ratnawati, Olivia

JOB DESCRIPTION
Speaker I: Practicality, Reliability, Validity
Speaker II: Authenticity, Washback
Speaker III: Applying the principles to the evaluation of classroom tests

HOW DO YOU KNOW IF A TEST IS EFFECTIVE?
1. Practicality
2. Reliability
3. Validity
4. Authenticity
5. Washback

PRACTICALITY
An effective test is practical: it
- is not excessively expensive,
- stays within appropriate time constraints,
- is relatively easy to administer, and
- has a scoring/evaluation procedure that is specific and time-efficient.

RELIABILITY
A reliable test is consistent and dependable: if you give the same test to the same student or matched students on two different occasions, the test should yield similar results.
[Diagram: the same test given on a first occasion (Test I) and a second occasion (Test II) should produce similar scores.]
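
As a quick illustration (not part of the original slides), this consistency can be estimated as the Pearson correlation between the scores from the two occasions; the scores below are hypothetical.

    # Test-retest reliability: correlate scores from two administrations
    # of the same test. Hypothetical data; pure Python, no dependencies.
    def pearson_r(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    first_occasion = [72, 85, 64, 90, 78]    # Test I
    second_occasion = [70, 88, 61, 92, 75]   # Test II (same test, later)
    print(f"Test-retest reliability: {pearson_r(first_occasion, second_occasion):.2f}")

A coefficient close to 1.0 indicates that the two occasions rank the students almost identically.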

POSSIBLE SOURCES OF UNRELIABILITY
Reliability can fluctuate because of:
- the students
- the scoring (the rater)
- the test administration
- the test itself

STUDENT-RELATED RELIABILITY
Fluctuation in the student's performance can be caused by:
- temporary illness
- fatigue
- a "bad day"
- anxiety
- other physical and psychological factors

RATER RELIABILITY
Fluctuation in scoring can be caused by:
- human error (e.g., the teacher's fatigue)
- subjectivity
- bias (toward "good" or "bad" students)
- lack of attention to scoring criteria
- inexperience
- inattention
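
One way to check rater reliability in practice (a sketch with hypothetical ratings, not from the original slides) is to have two raters score the same set of essays and compute Cohen's kappa, which corrects raw agreement for chance:

    # Inter-rater reliability: Cohen's kappa for two raters assigning
    # band scores (A/B/C) to the same essays. Hypothetical ratings.
    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        n = len(rater_a)
        observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        fa, fb = Counter(rater_a), Counter(rater_b)
        # Expected agreement if both raters assigned categories at random,
        # in proportion to how often each rater actually used each category.
        expected = sum((fa[c] / n) * (fb[c] / n) for c in set(rater_a) | set(rater_b))
        return (observed - expected) / (1 - expected)

    rater_a = ["B", "A", "C", "B", "B", "A"]
    rater_b = ["B", "A", "B", "B", "C", "A"]
    print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.2f}")  # ~0.45

Kappa near 1.0 means the raters apply the scoring criteria consistently; values well below that signal the subjectivity or inattention listed above.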

TEST ADMINISTRATION RELIABILITY
Fluctuation in administration can be caused by:
- the conditions of the venue (e.g., a listening test becomes unclear because of street noise)
- photocopying variations
- the amount of light in different parts of the room
- variations in temperature
- the condition of desks and chairs

TEST RELIABILITY
Fluctuation in the test itself can be caused by:
- the time limit of the test
- a test that takes too long to administer, so that test-takers become fatigued

VALIDITY
"The extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment" (Gronlund, 1998: 226).
For example: a valid test of reading ability actually measures reading ability; a valid test of writing ability actually measures writing ability, not just grammar.

CONTENT-RELATED EVIDENCE
A test shows content-related evidence of validity when its content matches its purpose: the tasks must actually sample the ability the test claims to measure. For example, a valid speaking test asks students to perform orally (a direct test), giving them the chance to demonstrate their speaking ability, rather than giving them a paper-and-pencil test.

CRITERION-RELATED EVIDENCE
Criterion-related evidence usually falls into one of two categories:
- Concurrent validity: a test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself, e.g., a high score on the final exam of a foreign language course should be corroborated by actual proficiency in the language.
- Predictive validity: the predictive validity of an assessment becomes important in the case of placement tests, admissions assessment batteries, and the like. The criterion in such cases is not concurrent ability but the test-taker's likelihood of future success, which the assessment tries to predict.
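
Both kinds of criterion-related evidence reduce to the same computation, sketched below with hypothetical numbers: correlate the test scores with an external criterion measured either at the same time (concurrent) or later (predictive).

    # Criterion-related evidence: correlate test scores with an external
    # criterion. Uses statistics.correlation (Python 3.10+). Hypothetical data.
    from statistics import correlation

    placement_scores = [55, 68, 72, 80, 90, 62]   # test under validation
    course_grades = [58, 70, 69, 85, 88, 60]      # criterion: later achievement

    # A strong positive correlation here is evidence of predictive validity;
    # with a simultaneous criterion (e.g., teacher proficiency ratings),
    # the same figure would be read as concurrent validity.
    print(f"Validity coefficient: {correlation(placement_scores, course_grades):.2f}")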

CONSTRUCT-RELATED EVIDENCE
A construct is any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perceptions. For example, linguistic constructs include "proficiency" and "communicative competence"; psychological constructs include "self-esteem" and "motivation".

CONSEQUENTIAL VALIDITY
Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its impact on the preparation of test-takers, its effect on the learner, and the (intended and unintended) social consequences of a test's interpretation and use. McNamara (2000: 54) cautions against test results that may reflect socioeconomic conditions such as opportunities for coaching that are "differentially available to the students being assessed (for example, because only some families can afford coaching)".