Vocabulary Assessment Norbert Schmitt University of Nottingham



Vocabulary Assessment Nearly all teachers do vocabulary assessment of some sort, ranging from informal observation, to short quizzes, to more formal examinations. While informal assessment may not be difficult, designing good vocabulary measures for higher-stakes purposes requires a considerable amount of expertise. Most teachers (and educators and researchers in general!) lack this expertise.

Vocabulary Assessment I’ve been thinking about vocabulary measurement since the early 1990s. Here are 4 questions on test development which I came up with in 1994 (Thai TESOL Bulletin).

Vocabulary Assessment
1. WHY DO YOU WANT TO TEST?
2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?)
3. WHAT ASPECTS OF THESE WORDS DO YOU WANT TO TEST?
4. HOW WILL YOU ELICIT STUDENTS' KNOWLEDGE OF THESE WORDS?

Vocabulary Assessment 1. WHY DO YOU WANT TO TEST?
To see if students have learned taught words (achievement)
To see if students have vocabulary gaps (diagnostic)
Placement
Part of a proficiency test
Motivation
Washback (tests reflect educator goals)

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) It depends on the purpose of the test

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Achievement = ?

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Achievement = lexical items that have been taught

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Diagnostic = ?

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Diagnostic = The lexical items a student is expected to know, or should know at a certain level

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Placement = ?

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Placement = The lexical items that will be taught in a course, or that a student may know at the level being taught in the course. Also the foundation vocabulary expected to be learned before entering the course.

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Proficiency = ?

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Proficiency = A range of vocabulary, especially some that will be challenging for the best students

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Motivation = ?

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Motivation = Lexical items that were recently taught, or the items that the students see as useful for reaching their goals (e.g. TOEFL, university entrance exam). (Or any vocabulary: testing always makes students study?)

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Washback = ?

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) Washback = Any vocabulary, as the act of putting vocabulary on a test shows that it is important. It is a way of highlighting education goals.

Vocabulary Assessment 2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?) It depends. How long should the test be (low/high stakes)? Longer is better, but it must be a practical length. What sampling rate will you accept?

Vocabulary Assessment Sampling Rate You typically cannot test every lexical item, so you need to extract a representative sample. The feasible rate depends on item format: a checklist format allows more items than multiple-choice. 1/5, 1/10, 1/100, 1/1,000? Many vocabulary tests have very low sampling rates (e.g. the Vocabulary Levels Test, VLT, samples only 3/100).
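As a rough illustration of the arithmetic behind a sampling rate, the sketch below works out how many test items a given rate implies. The function name `items_needed` and the 2,000-word pool are assumptions made up for this example, not figures from the talk.

```python
import math
from fractions import Fraction


def items_needed(pool_size: int, rate: Fraction) -> int:
    """Number of test items implied by sampling `pool_size` words at `rate`.

    Rounds up, so that a fractional result still yields a whole item.
    """
    return math.ceil(pool_size * rate)


pool = 2000  # hypothetical pool of candidate words
for rate in (Fraction(1, 5), Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(f"rate {rate}: {items_needed(pool, rate)} items")
```

Seen this way, the trade-off is easy to read off: a high rate gives better coverage but quickly produces a test that is too long to be practical.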

Vocabulary Assessment How to Sample?
Randomly
Systematically: every nth item, every nth page, etc.
Equal proportions of different word classes (nouns, verbs, etc.)
Only the most difficult (least frequent?) items, on the assumption that these are the items which will not be known
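The sampling options listed above can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the seeded random generator, and the placeholder word list are all assumptions, and the last function assumes the input list is already ordered from most to least frequent.

```python
import random


def random_sample(words, k, seed=0):
    """Draw k items at random (seeded so the draw can be repeated)."""
    return random.Random(seed).sample(words, k)


def systematic_sample(words, n, start=0):
    """Take every nth item from an ordered list, beginning at index `start`."""
    return words[start::n]


def stratified_sample(words_by_class, per_class, seed=0):
    """Draw the same number of items from each word class."""
    rng = random.Random(seed)
    return {cls: rng.sample(items, per_class)
            for cls, items in words_by_class.items()}


def least_frequent(words_by_frequency, k):
    """Return the last k items of a list ordered most-to-least frequent (k > 0)."""
    return words_by_frequency[-k:]


# Placeholder word list standing in for a real syllabus or frequency list.
word_list = [f"word{i:03d}" for i in range(100)]
print(systematic_sample(word_list, 10))
print(random_sample(word_list, 5))
```

Which of these is appropriate depends on the purpose of the test; for instance, sampling only the least frequent items gives no information about easier words.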

Vocabulary Assessment 3. WHAT ASPECTS OF THESE WORDS DO YOU WANT TO TEST? Which word knowledge aspects will you cover? The form-meaning link is the minimum specification. It is also the typical specification. (Why do you think this is so?)

Vocabulary Assessment 4. HOW WILL YOU ELICIT STUDENTS' KNOWLEDGE OF THESE WORDS? Which item format will you use?

Item Formats Let's look at a number of item formats.
What word knowledge aspects do they address?
Are they receptive or productive?
Are they size or depth tests?
What are their advantages and disadvantages?
For what testing purposes might they be most useful? Least useful?

Size & Depth Test Formats Next, let's look at a number of (semi-)established test formats.
Vocabulary size test formats: multiple-choice formats, Vocabulary Levels Test
Vocabulary depth formats: developmental scales (Vocabulary Knowledge Scale, Schmitt and Zimmerman Scale), Word Associates Format

Checklist (Yes-No) Tests Checklist tests are straightforward to take. Learners just check (✓) which words they think they know. Here is a checklist test from one of the best-known studies into the vocabulary size of native English speakers (NZ university students).

Checklist (Yes-No) Tests Checklist tests are an efficient way of testing a lot of lexical items. This allows a high sampling rate. They are easy to build and easy to mark. But learners sometimes overestimate their knowledge (i.e. they check words they don't actually know). How to control for this? [Figure: Meara's 1992 Checklist Tests]

Checklist (Yes-No) Tests The most common way is to add nonwords to the test and see whether learners check them as known. If so, their scores are adjusted down. [Figure: Meara's adjustment table] However, the adjustment formulas are all a bit wonky. In some research, data is deleted if a certain number of nonwords are checked as known. In the end, checklist tests don't work very well if examinees are not honest and careful, so the usefulness of the test format depends to a large extent on the examinees' behavior.

Adjusting Checklist (Yes-No) Tests Reaction time (speed of response) is a viable way of adjusting accuracy. Faster responses are usually more sure. Pellicer-Sanchez and Schmitt (2012), Language Testing.

Best adjustment formula by individual result and False Alarm (FA) rate

FA rate    NS             NNS
0          RT             RT
1          H − FA > RT    RT = Δm
2          H − FA = RT    H − FA
3          -              H − FA
4          -              H − FA
8          -              Isdt > H − FA
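The H − FA adjustment mentioned above can be sketched as follows, on the usual reading that H is the hit rate (proportion of real words checked as known) and FA is the false alarm rate (proportion of nonwords checked as known). The function names and the example counts are illustrative assumptions, not taken from the study.

```python
def hit_rate(real_words_checked: int, real_words_total: int) -> float:
    """Proportion of real words the learner checked as known (H)."""
    return real_words_checked / real_words_total


def false_alarm_rate(nonwords_checked: int, nonwords_total: int) -> float:
    """Proportion of nonwords the learner checked as known (FA)."""
    return nonwords_checked / nonwords_total


def h_minus_fa(real_words_checked, real_words_total,
               nonwords_checked, nonwords_total) -> float:
    """Adjusted score: hit rate minus false alarm rate."""
    return (hit_rate(real_words_checked, real_words_total)
            - false_alarm_rate(nonwords_checked, nonwords_total))


# Hypothetical learner: checks 40 of 50 real words and 2 of 10 nonwords.
score = h_minus_fa(40, 50, 2, 10)
print(f"H = {40 / 50:.2f}, FA = {2 / 10:.2f}, adjusted = {score:.2f}")
```

A learner who checks no nonwords keeps the raw hit rate, while every nonword checked pulls the adjusted score down.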

Vocabulary Knowledge Scale Often used as a depth test. It is a developmental type of measurement. But there are many problems with this scale (see Researching Vocabulary for a full critique):
How many stages should the scale have?
It is not an interval scale
Inferential statistics can't be used with it
The sentences learners write are often not informative
It is not clear what the VKS is measuring

Schmitt & Zimmerman Scale Suffers from many of the same problems as the VKS.
Fewer stages make it more transparent?
Written in a 'can-do' manner: it is easier for learners to say what they can do than what they know
More closely connected to receptive vs. productive mastery
The test uses nonwords (artivious, ploat) to ensure honesty of response
Which is better?

Word Associates Format (Read, 2000) One of the most used depth test formats. It comes in 8-word and 6-word versions, some with boxes and some with words in lists. Learners circle all of the words which are associated with the target word. The left box is meaning-based; the right box has collocations. The ratio of answers per box can vary to make guessing more difficult.

Word Associates Format (Read, 2000) If a learner correctly selects all of the correct associations and none of the distractors, this shows good knowledge of the target word. If a learner selects none of the correct options, this indicates little or no knowledge of the word. But what about 'split' answers: some correct options and some incorrect ones? MA research at Nottingham (Schmitt, Ng, & Garras, 2011) shows that this actually corresponds to little real knowledge of the words. That is, split scores do not indicate reliable knowledge.
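The three response patterns described above can be told apart mechanically. The sketch below is one possible way to do so, not an established scoring procedure: the function name, the labels, and the test item for "bright" are invented for illustration, and any response that falls between the two clear cases is simply labelled "split".

```python
def classify_waf_response(selected, associates, distractors):
    """Label a Word Associates response as 'full', 'none', or 'split'.

    full  : every associate selected and no distractor selected
    none  : no associate selected
    split : anything in between (a simplifying assumption)
    """
    selected = set(selected)
    hits = selected & set(associates)
    errors = selected & set(distractors)
    if hits == set(associates) and not errors:
        return "full"
    if not hits:
        return "none"
    return "split"


# Hypothetical 8-word item for the target word "bright".
associates = {"shining", "clever", "light", "colour"}
distractors = {"heavy", "sleepy", "spoon", "ceiling"}

print(classify_waf_response({"shining", "clever", "light", "colour"},
                            associates, distractors))  # all associates, no distractors
print(classify_waf_response({"shining", "spoon"}, associates, distractors))
print(classify_waf_response({"heavy", "spoon"}, associates, distractors))
```

Given the finding reported above, a cautious scorer might treat "split" responses as weak evidence of knowledge rather than awarding partial credit.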

Vocabulary Website Most research by Schmitt (and colleagues) is available on Norbert Schmitt's personal website: www.norbertschmitt.co.uk. The site also offers vocabulary resources, including vocabulary tests.