Assisting cloze test making with a web application Ayako Hoshino ( 星野綾子 ) Hiroshi Nakagawa ( 中川裕志 ) University of Tokyo ( 東京大学 ) Society for Information Technology & Teacher Education 2007


Assisting cloze test making with a web application Ayako Hoshino ( 星野綾子 ) Hiroshi Nakagawa ( 中川裕志 ) University of Tokyo ( 東京大学 ) Society for Information Technology & Teacher Education 2007

Outline: Introduction · Methods · System Outline · User Evaluation · Conclusions 2008/11/10

Introduction (1/10) Grammar and vocabulary are vital elements in SLA (Second Language Acquisition), so grammar and vocabulary testing is necessary for a teacher to assess students' proficiency. Normally, these components are taught and assessed by non-native teachers. In such cases it is hard to use new materials and bring in current topics, which would make the class more exciting and closer to daily life, because developing new materials takes too much labor.

Introduction (2/10) A cloze test is preferred in such situations. – A cloze test, in general, is a question format in which one or more words are removed from a passage and the test takers are asked to supply the missing words. – The advantages of this format are that it is easy to make and easy to mark. In this study, we focus on the multiple-choice fill-in-the-blank test, a subcategory of the cloze test. – The questions we focus on test grammar and vocabulary rather than reading comprehension or discourse structure.
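A minimal sketch of such a multiple-choice fill-in-the-blank item; the function and field names here are hypothetical, chosen only for illustration:

```python
def make_cloze(sentence, target, distractors):
    """Blank out `target` in `sentence` and pool it with the distractors."""
    stem = sentence.replace(target, "_____", 1)
    choices = sorted([target] + distractors)  # deterministic order for display
    return {"stem": stem, "choices": choices, "answer": target}

item = make_cloze(
    "The committee has reached a decision.",
    "reached",
    ["arrived", "gone", "come"],
)
print(item["stem"])     # The committee has _____ a decision.
print(item["choices"])  # ['arrived', 'come', 'gone', 'reached']
```

The ease of making and marking noted above is visible even in this toy form: the stem is produced mechanically, and marking reduces to comparing the chosen option with `answer`.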

Introduction (3/10) We explore the use of NLP (Natural Language Processing) for cloze test making. For grammar and vocabulary test making, the technologies and resources we employ include sentence parsers, machine-readable dictionaries and thesauri, and online texts and corpora. One of our motivations is to see whether state-of-the-art NLP is practical for an educational purpose.

Introduction (4/10) The system is designed to semi-automatically make questions from a given input text. We use online news articles as source texts. – News texts are properly written compared with other sources on the web. – Their topics are suitable for English education. – They are up to date, so quick question making pays off.

Introduction (5/10) Related studies attempt automated question generation, but we find several problems in putting a totally automatic system to practical use. The main reason is that current NLP is not one hundred percent accurate, and it is not appropriate to use in a classroom if errors can occur. So we currently provide a semi-automatic way of making questions, with a view to fully automating the process.

Introduction (6/10) In principle, the system presented here works as an assistant for a teacher. – The system helps the user choose an article, highlights grammar targets, suggests possible choices for the wrong alternatives, and formats the questions in a printer-friendly way. The system assists the user in making certain types of questions, on the assumption that the alternatives of a good four-choice question should be either flat or symmetric.

Introduction (7/10) Flat alternatives are a set of alternatives that differ in a single element (either grammar or vocabulary). – For example, a set of alternatives that all have the same part of speech and the same inflection form (tense etc.) but are different vocabulary items is flat. – Likewise, a set of alternatives that have different inflection forms of the same vocabulary item is flat. Symmetric alternatives are those that vary in two different elements, one from vocabulary and one from grammar.
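As a sketch, the flat/symmetric distinction can be modelled by treating each alternative as a (lemma, inflection) pair; the classifier below is illustrative, not the authors' implementation:

```python
def classify(alternatives):
    """Classify a set of (lemma, inflection) alternatives as flat or symmetric."""
    lemmas = {lemma for lemma, _ in alternatives}
    forms = {form for _, form in alternatives}
    if len(lemmas) == 1 or len(forms) == 1:
        return "flat"        # varies in a single element only
    if len(lemmas) == 2 and len(forms) == 2 and len(set(alternatives)) == 4:
        return "symmetric"   # two lemmas crossed with two inflection forms
    return "other"

# Flat: same inflection (past tense), different vocabulary items.
flat = [("walk", "past"), ("talk", "past"), ("jump", "past"), ("look", "past")]
# Symmetric: two lemmas crossed with two forms (past tense / past participle).
sym = [("write", "past"), ("write", "pp"), ("sing", "past"), ("sing", "pp")]

print(classify(flat))  # flat
print(classify(sym))   # symmetric
```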

Introduction (8/10)

Introduction (9/10) In our approach, once the blank position is specified, the system provides candidates for the wrong alternatives, or distractors, and lets the user decide which ones to use. The current version of the system provides seven choices for vocabulary items and three for grammar variations, where available. The user can enter or edit alternatives when the needed ones are not among the suggestions.

Introduction (10/10) To our knowledge, this is the first study that uses NLP techniques to generate grammar-type questions. In particular, the use of parse results makes this study a unique attempt. While other studies present automatic systems, we provide an interactive system that works as an assistant for the teacher. The resource we used for vocabulary distractors is WordNet.

Methods (1/3) In the method proposed by Coniam (1997) and adapted by Brown et al. (1996), words of similar frequency to the right answer are selected as distractors. – We also use the frequency-based method. While they select distractors from the words appearing in the same passage, we select from the words appearing in the article plus their synonyms. Our system also provides distractors with the same part of speech and inflection form as the right answer.
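A minimal sketch of frequency-based distractor selection in this spirit: rank candidate words (assumed already filtered to the answer's part of speech and inflection form) by how close their frequency is to the answer's. The frequency table and candidate words are invented for illustration:

```python
def pick_distractors(answer, candidates, freq, n=3):
    """Return the n candidates whose frequency is closest to the answer's."""
    ranked = sorted(
        (w for w in candidates if w != answer),
        key=lambda w: abs(freq[w] - freq[answer]),
    )
    return ranked[:n]

# Hypothetical per-million frequencies for the answer and its candidates.
freq = {"reached": 120, "gained": 110, "earned": 95, "grabbed": 30, "attained": 12}
candidates = ["gained", "earned", "grabbed", "attained"]

print(pick_distractors("reached", candidates, freq))
# ['gained', 'earned', 'grabbed']
```

The intuition behind the frequency criterion is that a very rare distractor is easy to eliminate on unfamiliarity alone, while one of similar frequency to the answer forces a genuine vocabulary judgment.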

Methods (2/3)

Methods (3/3)

System Outline (1/6)

System Outline (2/6) The preprocessing program extracts article text from an HTML document, analyzes the sentences, and adds synonyms from a thesaurus. Its output is added to the pool of preprocessed articles, and an index file containing summarized information on all articles is updated. Preprocessing is fully automated; it takes about 10 minutes to process one article of about 500 words.

System Outline (3/6) The input documents (in HTML) go through a series of subprocesses:
1. Text extraction
2. Sentence splitting
3. Tagging and lemmatizing
4. Synonym lookup
5. Frequency annotation
6. Inflection generation
7. Grammar target mark-up and grammar distractor generation
8. Selection of vocabulary distractors
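The shape of this pipeline, composed stages applied in sequence, can be sketched as follows. The stage bodies are placeholder stubs for the first three steps only; the real system uses a parser, a thesaurus, and the other components listed above:

```python
import re

def extract_text(html):
    """Stage 1: strip HTML tags (crude placeholder for real text extraction)."""
    return re.sub(r"<[^>]+>", "", html)

def split_sentences(text):
    """Stage 2: split on sentence-final punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tag_and_lemmatize(sentences):
    """Stage 3 (stubbed): pair each token with a mock lowercase 'lemma'."""
    return [[(w, w.lower()) for w in s.split()] for s in sentences]

def preprocess(html):
    text = extract_text(html)
    sentences = split_sentences(text)
    return tag_and_lemmatize(sentences)
    # Stages 4-8 (synonym lookup, frequency annotation, inflection
    # generation, grammar mark-up, distractor selection) would follow here.

doc = "<p>Prices rose sharply. Analysts were surprised.</p>"
print(preprocess(doc))
```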

System Outline (4/6)

System Outline (5/6)

System Outline (6/6) On the first screen, the system helps the user choose an article on which to make questions. – It shows the list of articles it has preprocessed. – Each article appears as a list item with its news source, date, title, and number of words. – By placing the cursor on a list item, the user can see more information about the article, such as its lead (the first 10 words), its vocabulary level (Kincaid score), and the grammar targets it contains with their numbers of occurrences. – The user can narrow down the list with conditions on grammar targets and/or vocabulary levels.
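The Kincaid score shown in the article list is the Flesch-Kincaid grade level, computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. A rough sketch follows; the vowel-group syllable counter is a heuristic assumption, not necessarily the system's method:

```python
import re

def count_syllables(word):
    """Rough heuristic: count runs of vowel letters, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def kincaid(text):
    """Flesch-Kincaid grade level of a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

score = kincaid("The cat sat on the mat. It was warm.")
print(round(score, 2))
```

Low (even negative) scores correspond to very simple text; news articles typically score in the high-school grade range, which is why the score is a useful filter for class level.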

User Evaluation (1/7) We conducted a user test of the system, with three aims: – Evaluating the appropriateness of the questions teachers make with this application – Evaluating the usefulness and usability of the prototype version of the system – Gathering implications for how to improve the system. The participants were teachers who teach or have taught English. Ten users participated (all Japanese, teaching undergraduate classes).

User Evaluation (2/7) The user test consisted of a task and questionnaires (pre-task and post-task). – The pre-task questionnaire asked about the participant's background as an English teacher; each participant was asked to describe a particular class that she teaches or has taught recently. – Then, as the task, the participants were asked to make a test for that class and evaluate each question. We asked them to make questions at a difficulty level as appropriate as possible for the class. – The post-task questionnaire covered usefulness, usability, and an overall evaluation of the system.

User Evaluation (3/7)

User Evaluation (4/7) The task took about 30 minutes to one hour. – After reading the user's manual, the participants were asked to make three questions in each of two articles. – The participants then rated the resulting questions on a three-grade scale. – In total, 60 questions were collected from the 10 participants.

User Evaluation (5/7) The results show that most of the questions made with this application are of appropriate quality: 80% of all questions, and 79% of unedited questions, were rated as "appropriate". The reasons for "inappropriate" ratings included a too-obvious answer, multiple right answers, and a too-long blank caused by a sentence-splitting error. There were more so-so questions among grammar and grammar-vocabulary questions, which could be attributed to the difficulty of determining the right range for the blank.

User Evaluation (6/7) All participants completed the task in a reasonable time, despite the fact that some of them were not familiar with web browsers.

User Evaluation (7/7) The results show that this application is useful and usable, with every function receiving a "good" rating from more than half of the participants. In the overall evaluation, all participants gave a rating of four on a five-grade scale. The grammar target highlighting and suggestions received lower scores from some participants. – A major complaint about the grammar targets concerned their length. – Another problem is the response time in the browser.

Conclusions (1/2) We have presented an assisting system for cloze, or multiple-choice fill-in-the-blank, questions. As opposed to related studies, we built an assistant system that allows teachers to choose distractors from the suggestions or enter them themselves. The result, 80% appropriate questions overall and 79% appropriate among unedited questions, is very promising even by the standard of automatic systems.

Conclusions (2/2) The usefulness of each function was approved by most of the participants; likewise, the usability of the tested version was satisfactory. Future work includes putting the improved version through a more rigorous evaluation, such as distractor efficiency analysis.