Learning part-of-speech taggers with inter-annotator agreement loss (EACL 2014). Barbara Plank, Dirk Hovy, Anders Søgaard, University of Copenhagen. Presentation: Ben Glass, Cincinnati Children’s Hospital.

- Inter-annotator agreement
- The idea
- Defining agreement scores
- Baseline classifier
- Adding the agreement scores
- Results
- Conclusion
- Discussion

Inter-annotator Agreement

The Idea
1) Measure inter-annotator agreement per class over a portion of the training set
2) During training, use the agreement score for the current label (gold / predicted / both?) in the loss function

Agreement Score 1: Annotator F1
Harmonic mean of:
Prec_T(A_1(X), A_2(X)) = #(A_1 and A_2 both give label T) / #(A_1 gives label T)
and
Rec_T(A_1(X), A_2(X)) = #(A_1 and A_2 both give label T) / #(A_2 gives label T)
In other words: normal precision and recall, but using the 2nd annotator instead of the gold standard.
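To make this concrete, here is a minimal Python sketch (my own illustration, not the authors' code) that computes per-tag annotator F1 from two parallel tag sequences over the same doubly-annotated tokens; the function name and data layout are assumptions.

```python
# Sketch (assumed, not from the paper): per-tag "annotator F1" between two
# annotators A1 and A2 over the same doubly-annotated tokens.
from collections import Counter

def annotator_f1(tags_a1, tags_a2):
    """Return {tag: F1}, treating annotator 2 as the 'reference'."""
    assert len(tags_a1) == len(tags_a2)
    agree = Counter(t1 for t1, t2 in zip(tags_a1, tags_a2) if t1 == t2)
    count_a1 = Counter(tags_a1)   # how often A1 gives each tag
    count_a2 = Counter(tags_a2)   # how often A2 gives each tag
    f1 = {}
    for tag in set(count_a1) | set(count_a2):
        prec = agree[tag] / count_a1[tag] if count_a1[tag] else 0.0
        rec = agree[tag] / count_a2[tag] if count_a2[tag] else 0.0
        f1[tag] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return f1
```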

Agreement Score 2: Confusion Probability
P({A_1(x), A_2(x)} = {t_1, t_2}) = probability of confusing tags t_1 and t_2
= mean of P(A_1(x) = t_1, A_2(x) = t_2) and P(A_1(x) = t_2, A_2(x) = t_1)
In other words: the probability that A_1 assigns one of the two tags and A_2 the other, and vice versa.
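The same kind of sketch for the symmetric confusion probability of a tag pair, again assuming two parallel tag sequences over doubly-annotated tokens (names are illustrative, not the paper's code):

```python
# Sketch: symmetric confusion probability of tag pairs, i.e. the mean of
# P(A1 = t1, A2 = t2) and P(A1 = t2, A2 = t1), estimated from token counts.
from collections import Counter
from itertools import product

def confusion_probability(tags_a1, tags_a2):
    n = len(tags_a1)
    joint = Counter(zip(tags_a1, tags_a2))        # counts of (A1 tag, A2 tag) pairs
    tags = set(tags_a1) | set(tags_a2)
    return {(t1, t2): (joint[(t1, t2)] + joint[(t2, t1)]) / (2 * n)
            for t1, t2 in product(tags, repeat=2)}
```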

Confusion Probability

Baseline Classifier
Online structured perceptron
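For orientation, a heavily simplified sketch of an online perceptron tagger with greedy left-to-right decoding and toy features. The paper's baseline is a structured perceptron, so this is only an illustrative stand-in; the feature templates and names below are assumptions.

```python
# Simplified online perceptron tagger (greedy decoding, toy features); the
# paper's baseline is a structured perceptron, so treat this only as a sketch.
from collections import defaultdict

def features(words, i, prev_tag):
    return [f"w={words[i]}", f"suffix={words[i][-3:]}", f"prev_tag={prev_tag}"]

def predict(weights, feats, tagset):
    return max(tagset, key=lambda t: sum(weights[(f, t)] for f in feats))

def train(sentences, tagset, epochs=5):
    """sentences: list of (words, gold_tags) pairs; returns a feature-weight dict."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in sentences:
            prev = "<s>"
            for i, y in enumerate(gold):
                feats = features(words, i, prev)
                y_hat = predict(weights, feats, tagset)
                if y_hat != y:            # baseline: every error updates with weight 1
                    for f in feats:
                        weights[(f, y)] += 1.0
                        weights[(f, y_hat)] -= 1.0
                prev = y
    return weights
```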

Adding the Agreement Measures
Annotator F1: γ(y_j, y_i) = F1_{y_i}(A_1(X), A_2(X))  (gold label only)
Inverse confusion probability: γ(y_j, y_i) = 1 − P({A_1(X), A_2(X)} = {y_j, y_i})  (both gold and predicted)
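One plausible way to let the agreement score enter training (a sketch under assumptions, not the authors' exact formulation): replace the constant update weight of 1.0 in the baseline sketch above with γ(gold, predicted), computed from the per-tag F1 dictionary or the confusion-probability table built earlier.

```python
# Sketch: agreement-weighted perceptron update. `f1` and `conf` are the outputs
# of annotator_f1() and confusion_probability() above; all names are assumptions.
def gamma_annotator_f1(f1, y_gold, y_pred):
    return f1.get(y_gold, 1.0)                     # depends on the gold label only

def gamma_inverse_confusion(conf, y_gold, y_pred):
    return 1.0 - conf.get((y_gold, y_pred), 0.0)   # depends on gold and predicted

def weighted_update(weights, feats, y_gold, y_pred, g):
    # g replaces the constant 1.0 of the baseline update: errors on tag pairs
    # that human annotators also confuse receive a smaller weight change.
    for f in feats:
        weights[(f, y_gold)] += g
        weights[(f, y_pred)] -= g
```

In the training loop sketched earlier, the two constant-weight update lines would then become a call such as weighted_update(weights, feats, y, y_hat, gamma_inverse_confusion(conf, y, y_hat)).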

Results

Results: regularization

Results: downstream
- Chunking
- Named entity recognition

Conclusion
- Taking inter-annotator agreement into account during training can improve classifier performance
- While this paper focused on POS tagging, the methods described could potentially be applied to biomedical tasks involving human-annotated training data

Discussion
- Confusion probability for > 2 annotators?
- Asymmetric confusion probability?
- Results for individual classes?
- Compare to the "remove hard cases" method?
- Other existing cost-sensitive algorithms?

Plank, B., Hovy, D., & Søgaard, A. (2014). Learning part-of-speech taggers with inter-annotator agreement loss. In Proceedings of EACL.