Language Assessment
- Evaluation: the broadest term; looking at all factors that influence the learning process (syllabus, materials, learner achievements, etc.).
- Assessment: part of evaluation; a variety of ways of collecting information on language ability or achievement.
- Testing: a sub-category of assessment; a formal, systematic procedure used to gather information.
TYPES OF TESTS

Placement Tests
- Purpose: to place a student into a particular level or section of a language curriculum, i.e. to create groups of learners that are homogeneous in level.
- Some proficiency tests can act in the role of placement tests.
- A placement test usually includes a sampling of the material to be covered in the course.
- Placement tests come in many varieties: written and oral performance, open-ended and limited response, multiple-choice and gap-filling, etc.
Language Aptitude Tests
- Designed to measure capacity and general ability to learn a foreign language.
- Two standard examples: the Modern Language Aptitude Test (MLAT) and the Pimsleur Language Aptitude Battery (PLAB).
- They are language tests and involve language-related tasks.
- They do NOT predict communicative success in a language, so they are seldom used today.
Diagnostic Tests
- Designed to diagnose specified aspects of a language so that further help can be given. E.g. a pronunciation test might diagnose the phonological features of English that are difficult for learners.
- Generally, such tests offer a checklist of features for the teacher to use in pinpointing difficulties.
- A test can have dual purposes: both placement and diagnosis.
Achievement Tests / Progress Tests
- Directly related to classroom lessons, units, and the curriculum: limited to the material in the curriculum.
- Teacher-produced.
- Administered at various stages throughout a course to see what students have learned: achievement tests are generally given at the mid- and end-point of the semester, while progress tests are narrower in focus.
- Can also serve diagnostic purposes.
Proficiency Tests
- Test global competence in a language: not limited to one course, curriculum, or single skill; they test overall ability.
- Generally consist of standardized multiple-choice items on grammar, vocabulary, reading comprehension, and aural comprehension, sometimes with a writing component.
- Summative and generally norm-referenced. TOEFL is an example.
OTHER WAYS OF LABELING TESTS

Objective & Subjective Tests
Distinguished by the way a test is scored:
- Objective test: scored against an established set of correct responses on an answer key; not affected by the scorer's judgment.
- Subjective test: requires scoring by personal judgment, as in essays.
Norm-Referenced & Criterion-Referenced Tests
- Norm-referenced: used to place test-takers along a mathematical continuum. Generally administered to large audiences, with each test-taker's score reported as a numerical score (e.g. 200 out of 250). Such tests have fixed, predetermined responses; the mean, median, and standard deviation are important in interpreting the scores.
- Criterion-referenced: generally designed to give test-takers feedback on what they have done, so most classroom tests are criterion-referenced. The distribution of scores is not very important.
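To make the norm-referenced idea concrete, here is a minimal sketch in Python (the score data and the individual raw score are invented for illustration; they are not from the original slides) that places one test-taker's score on the group's continuum using the mean and standard deviation.

```python
import statistics

# Hypothetical scores (out of 250) from a norm-referenced administration.
group_scores = [182, 195, 201, 210, 178, 224, 190, 205, 199, 216]
raw_score = 210  # one test-taker's score

mean = statistics.mean(group_scores)
median = statistics.median(group_scores)
sd = statistics.stdev(group_scores)  # sample standard deviation

# A z-score expresses how far a score sits from the group mean,
# in standard-deviation units: the "mathematical continuum".
z = (raw_score - mean) / sd

print(f"mean={mean:.1f}, median={median:.1f}, sd={sd:.1f}, z={z:+.2f}")
```

A criterion-referenced test would instead compare raw_score against a fixed performance standard (the criterion), so the group's score distribution would not matter.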
Formative & Summative Assessment
This distinction concerns the function of assessment: how is the procedure to be used?
- Formative assessment: evaluating students in the process of forming their competencies and skills; in other words, all kinds of informal, ongoing assessment are formative.
- Summative assessment: measuring what a student has learned; e.g. a final exam is a summative assessment.
High-Stakes & Low-Stakes Tests
- High-stakes tests have a major effect on a large number of test-takers (e.g. TOEFL).
- Low-stakes tests have a minor effect on a small number of test-takers (e.g. classroom exams).
Traditional & Alternative Assessment

Traditional Assessment        | Alternative Assessment
----------------------------- | -----------------------------------
One-shot standardized exams   | Continuous long-term assessment
Timed, multiple-choice format | Untimed, free-response format
Decontextualized test items   | Contextualized communicative tasks
Scores suffice for feedback   | Individualized feedback, washback
Norm-referenced scores        | Criterion-referenced scores
Focus on the right answer     | Open-ended, creative answers
Summative                     | Formative
Oriented to product           | Oriented to process
Non-interactive performance   | Interactive performance
Fosters extrinsic motivation  | Fosters intrinsic motivation
Types of Alternative Assessment
- Self-assessment
- Portfolio assessment
- Student-designed tests
- Learner-centered assessment
- Projects
- Presentations
Cornerstones/Principles of Testing

Usefulness
What are we going to use the test for? Each test must have a specific purpose, a particular group of test-takers, and a specific language use in mind.
Validity
Validity asks to what extent the inferences made from an assessment are appropriate, meaningful, and useful in terms of the purpose of the assessment. Test what you teach, and how you teach it! E.g. a valid test of reading ability measures only reading ability, not previous knowledge and not vision!
Types of Validity
- Content validity: the test assesses the course content and outcomes that are familiar to students.
- Construct validity: the fit between the underlying theories and methodology of language learning and the type of assessment (e.g. teaching writing as a process but assessing only the product undermines construct validity).
- Face validity: the test looks as if it measures what it is supposed to measure (it should look professional).
Reliability
A reliable test is consistent and dependable: if the test is given to the same group of students (or matched students) at two different times, it should yield similar results.
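As a rough illustration of this consistency idea, the sketch below (with invented scores, not from the original slides) estimates test-retest reliability as the Pearson correlation between two administrations of the same test to the same students.

```python
import statistics

# Hypothetical scores from the same ten students on two administrations.
first_admin  = [72, 85, 64, 90, 78, 55, 88, 70, 81, 60]
second_admin = [70, 88, 66, 87, 80, 58, 85, 73, 79, 63]

# A Pearson r close to +1.0 means the test ranks students consistently
# across the two sittings, i.e. it is reliable in the test-retest sense.
r = statistics.correlation(first_admin, second_admin)  # Python 3.10+
print(f"test-retest reliability estimate: r = {r:.2f}")
```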
Factors That Affect Reliability
- Student-related reliability: a temporary illness, a bad day, anxiety, etc. may affect the test-taker's 'true score'.
- Rater reliability: human error and subjectivity can affect the score. This covers inter-rater reliability (consistency between two or more scorers) and intra-rater reliability (consistency of the same scorer over time); see the sketch after this list.
- Test administration reliability: the conditions in which the test is administered, e.g. noise, weather, copy quality.
- Test reliability: the nature of the test itself, e.g. the length of the test, too many or too few items, or poorly written test items.
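To make inter-rater reliability concrete, here is a minimal sketch using simple exact-agreement counting (the ratings are invented; real rater-reliability studies often report Cohen's kappa or a correlation instead).

```python
# Hypothetical band scores (1-5) from two raters on the same ten essays.
rater_a = [4, 3, 5, 2, 4, 3, 5, 4, 2, 3]
rater_b = [4, 3, 4, 2, 4, 2, 5, 4, 3, 3]

# Exact-agreement rate: the share of essays given identical scores.
# The higher the agreement, the better the inter-rater reliability.
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
print(f"exact agreement: {agreements}/{len(rater_a)} "
      f"= {agreements / len(rater_a):.0%}")
```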
Practicality
A practical test:
- is not excessively expensive
- stays within appropriate time constraints
- is relatively easy to administer
- has a scoring procedure that is specific and time-efficient
Washback
The effect of testing on teaching (a facet of consequential validity).
- Washback in large-scale tests: students prepare for these tests, so the test shapes instruction ("teaching to the test!").
- Washback in classroom assessment: diagnosing students' weaknesses and strengths; preparing students for the assessment.
- Formative assessment and performance assessment naturally have a positive washback effect because of the feedback given to students.
- Formal assessment can also have a positive washback effect if students receive more than just a simple grade.
- Even summative assessment, given at the end of a course, can have a washback effect, because it can mean something for the student's future.
Authenticity
If the characteristics of a given language test task correspond to the features of the target language task, the test is authentic. Authenticity is present in the following ways:
- Language is as natural as possible
- Items are contextualized (not isolated)
- Topics are meaningful for the learner
- Some thematic organization is present (e.g. a story line)
- Tasks represent real-world tasks
Transparency
Providing students with clear, accurate information about testing:
- the outcomes to be evaluated
- the formats used
- the weighting of items and sections
- the time allowed to complete the test
- the grading criteria
Transparency involves students in the assessment process.
Security
Security is not only an issue for large-scale, high-stakes tests. Good test items can be used again, so their security must be protected.