A Report on the First Native Language Identification Shared Task

A Report on the First Native Language Identification Shared Task
Joel Tetreault, Nuance Communications
Daniel Blanchard, Educational Testing Service
Aoife Cahill, Educational Testing Service

Native Language Identification
Task of automatically identifying a speaker's first language based solely on the speaker's writing in another language
Applications:
◦ Authorship profiling (Estival et al., 2007)
◦ Education: more targeted feedback to language learners (Leacock et al., 2010)

Sample Essay 1 (L1: German)
No risk no fun I agree the statement "Successful people try new things and take risk".In my mind it is so, to. When you thing you like do new stuff you need a liddelbit the kick. That is the big point what I need. For exsample I like to go to a big city like New York. I was never in this town I dont no from the city. But I like go to the city. Thats fun I stay every time for proplems. I need eat a hood offer my head. The ather side I can go dow. I dont gat waht I need…Next exsample the wall street you put money in funds, well you this make a good job. Dont for get the risk look like lose money.

Sample Essay 2 (L1: Arabic)
For example, if you take a look at an ordinary school, you have different teachers for every subject. Your calculus teacher is different than your literature teacher. Each teacher must specialize in a specific subject in order to convey suffiecient and proper information to the students. However, that doesn't mean that the teacher is narrow-minded and has a limited perspective in life because to specialize in one subject doesn't hinder you or stop you from exploring other subjects.

Motivation
Lots of work in NLI, but it has been hard to compare different approaches:
1. ICLEv2 (Granger et al., 2009): the de facto train/test data is small and has NLI-unfriendly idiosyncrasies
2. No consensus on evaluation:
   - Which L1s, and how many?
   - Train/test splits?
   - Best features?

Contributions
Goal: to unify the community and help the field progress
Provide a larger, more NLI-friendly corpus that improves upon ICLEv2
Common evaluation framework
◦ Everyone evaluates using the same train/dev/test splits and the same L1s
Corpus and scripts to be made public to further promote the field

Outline
◦ Prior Work
◦ Data
◦ Shared Task Overview
◦ Results
◦ NLI Shared Task in the Future

Prior Work
Treat NLI as a classification task
◦ Koppel et al. (2005): POS n-grams, content and function words, spelling and grammatical errors
◦ Syntactic features (Wong and Dras, 2011)
◦ Tree Substitution Grammars (Swanson and Charniak, 2012)
◦ Adaptor Grammars (Wong et al., 2012)
◦ Data size effects (Brooke and Hirst, 2012)
◦ Word n-grams (Bykh and Meurers, 2012)
◦ LMs and ensemble classifiers (Tetreault et al., 2012)

Data: TOEFL11 Corpus
12,100 essays from the ETS Test of English as a Foreign Language (TOEFL)
11 L1s:
◦ Arabic, Chinese, French, German, Hindi, Italian, Japanese, Korean, Spanish, Telugu, Turkish
◦ 900 train / 100 dev / 100 test essays per L1
Sampled for equal representation of L1s across topics as much as possible
Includes a 3-tier proficiency level
Public release via LDC this summer?

Shared Task Description: 3 Sub-tasks
1. Closed-Training: 11-way classification task using only TOEFL11-TRAIN and DEV
2. Open-Training-1: use of any amount or type of training data excluding TOEFL11
3. Open-Training-2: use of any amount or type of training data combined with TOEFL11
* All sub-tasks use TOEFL11-TEST for the final evaluation set

Shared Task Description
Each team was allowed to submit up to 5 different systems per task
Teams submitted a CSV file for each system to the NLI organizers
An evaluation script automatically compares each prediction file to the gold standard and creates a performance report and contingency tables
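The evaluation step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the organizers' script: the two-column CSV layout (essay ID, L1 label), the essay IDs, the `GER` label code, and the function names `read_labels` and `evaluate` are all assumptions made for the example.

```python
import csv
import io
from collections import Counter


def read_labels(handle):
    """Read rows of (essay_id, L1_label) into a dict (assumed CSV layout)."""
    return {row[0]: row[1] for row in csv.reader(handle) if row}


def evaluate(pred, gold):
    """Return overall accuracy and a contingency table keyed by (gold, predicted)."""
    # Essays missing from the prediction file are counted as errors.
    table = Counter((gold[k], pred.get(k, "MISSING")) for k in gold)
    correct = sum(n for (g, p), n in table.items() if g == p)
    return correct / len(gold), table


# Tiny made-up gold standard and prediction file, held in memory for the demo.
gold_csv = "e1,GER\ne2,ARA\ne3,GER\ne4,TEL\n"
pred_csv = "e1,GER\ne2,ARA\ne3,ARA\ne4,HIN\n"

gold = read_labels(io.StringIO(gold_csv))
pred = read_labels(io.StringIO(pred_csv))
accuracy, table = evaluate(pred, gold)

print(f"accuracy = {accuracy:.3f}")
for (g, p), n in sorted(table.items()):
    print(f"gold={g} predicted={p} count={n}")
```

Keying the contingency table by (gold, predicted) pairs keeps the sketch short; a real report would more likely print it as an 11 x 11 matrix.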

29 Teams
Bobicev                    Eurac                 MITRE "Carnie"   UKP
Chonger                    HAUTCS                MQ               Unibuc
CMU-Haifa                  ItaliaNLP             NAIST            UNT
Cologne-Nijmegen           Jarvis                NRC              UTD
CoRAL UAB                  Kyle et al.           Oslo NLI         VTEX
CUNI (Charles University)  LIMSI                 Toronto
Cywu                       LTRC IIIT Hyderabad   Tuebingen
Dartmouth                  Michigan              Ualberta

RESULTS

Sub-Task Participation Statistics
Sub-task   # Teams Competing   # Submissions
Closed     29                  116
Open-1     3                   13
Open-2     4                   15

Closed Sub-Task
See Table 3 of Report for full results
No statistically significant differences between top 5 teams
Team Name        Abbreviation   Overall Accuracy
Jarvis           JAR            0.836
Oslo NLI         OSL            0.834
Unibuc           BUC            0.827
MITRE "Carnie"   CAR            0.826
Tuebingen        TUE            0.822
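A rough back-of-the-envelope check shows why a gap of this size is small on TOEFL11-TEST, which holds 11 x 100 = 1,100 essays. The sketch below uses an unpaired two-proportion z statistic, which is an assumption chosen for simplicity: it ignores that all systems were scored on the same essays, and it is not necessarily the test used in the report. The function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(acc_a, acc_b, n):
    """Unpaired two-proportion z statistic for two accuracies measured on n items each."""
    pooled = (acc_a + acc_b) / 2
    standard_error = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return (acc_a - acc_b) / standard_error


N_TEST = 11 * 100  # 11 L1s, 100 test essays each

# Top-ranked (JAR) versus fifth-ranked (TUE) overall accuracy from the table.
z = two_proportion_z(0.836, 0.822, N_TEST)
print(f"z = {z:.2f}")
```

Under these assumptions the statistic stays below the conventional 1.96 threshold, which is in line with the slide's statement about the top 5 teams.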

Open Sub-tasks
Challenge: finding new data to cover each L1
Data sources for HIN & TEL:
◦ ICNALE Pakistani essays → HIN (TUE team)
◦ Bilingual blogs (TOR & TUE teams)

Corpus   Description
ICLE     All L1s except ARA, HIN, TEL
FCE      All L1s except ARA, HIN, TEL
ICNALE   CHI, JPN, KOR essays only
Lang8    All L1s, but mostly Asian L1s

Discussion of Approaches
Machine Learning
◦ SVM overwhelmingly the most popular approach
◦ 4 teams also tried ensemble classifiers
◦ String kernels (BUC) using character-level n-grams

Discussion of Approaches
Features
◦ N-grams: word, POS, character, function
◦ Syntactic features: dependencies, TSG, CF productions, Adaptor Grammars
◦ Spelling features
4 of the top 5 teams used n-grams of order 4 or higher; some went up to 9-grams
2 of the top 10 teams used syntactic features
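To make the character n-gram features concrete, here is a minimal sketch of how such counts could be extracted before being passed to a classifier such as an SVM. It does not reproduce any team's pipeline; the function name `char_ngrams` and its default orders are assumptions, and the input is a misspelled word taken from Sample Essay 1.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=4):
    """Count every character n-gram in text for orders n_min through n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# "liddelbit" is a learner spelling from the German sample essay.
features = char_ngrams("liddelbit", n_min=2, n_max=3)
for ngram, count in sorted(features.items()):
    print(ngram, count)
```

Character n-grams need no tokenizer or tagger and can pick up misspellings like the one above, which may be part of why they appear so often among the top systems.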

Future of NLI Shared Task
Ideas to expand scope of task
◦ Use a new set of TOEFL essays for test
◦ Expand genres: blogs? Tweets?
◦ Number of L1s
◦ Do different L2
   - ItaliaNLP: preparing Italian NLI corpus with CNR Pisa
   - Also a corpus of Finnish with L1 (Turku Uni)
◦ Add Slavic languages
Logistics
◦ Hold another shared task in 2014? Or 2015?
◦ Merge with PAN Shared Task?
Tell us your thoughts!

Useful Resources
◦ Tech Report on TOEFL11 Corpus
◦ TOEFL11 Release: summer 2013?
◦ Scripts and tools: put on website?

Acknowledgments
◦ Derrick Higgins (ETS)
◦ ETS TOEFL
◦ Patrick Houghton (ETS)
◦ BEA8 Organizers
◦ All the NLI Participants!

Questions?