Training dependency parsers by jointly optimizing multiple objectives
Keith Hall, Ryan McDonald, Jason Katz-Brown, Michael Ringgaard

Evaluation
– Intrinsic: how well does the system replicate gold annotations? (precision/recall/F1, accuracy, BLEU, ROUGE, etc.)
– Extrinsic: how useful is the system for some downstream task?
High performance on one doesn't necessarily mean high performance on the other, and extrinsic evaluation can be hard to carry out.

Dependency Parsing
Given a sentence, label the dependencies between its words (example from nltk.org).
– The output is useful for downstream tasks like machine translation
– It is also of interest to NLP researchers in its own right

Overview of the paper
Optimize the parser for two metrics:
– Intrinsic evaluation
– A downstream task (here, a reranker in a machine translation system)
The paper gives an algorithm to do this, plus experiments.

Perceptron Algorithm
Takes a set of labeled training examples and a loss function. For each example, it predicts an output and updates the model if the output is incorrect:
– Rewards features that fire in the gold-standard structure
– Penalizes features that fire in the predicted output
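As a minimal sketch, the update just described can be written as follows; the feature extractor and the structures it consumes are hypothetical stand-ins for the parser's real feature function and trees.

```python
def perceptron_update(weights, features, gold, predicted, lr=1.0):
    """One structured-perceptron step: if the prediction is wrong, reward
    features firing in the gold structure and penalize those firing in the
    predicted one."""
    if predicted == gold:
        return weights                      # correct prediction: no update
    updated = dict(weights)
    for f in features(gold):                # features of the gold structure
        updated[f] = updated.get(f, 0.0) + lr
    for f in features(predicted):           # features of the predicted output
        updated[f] = updated.get(f, 0.0) - lr
    return updated
```

Features shared by both structures cancel out, so only the disagreeing parts of the two structures move the weights.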

Augmented Loss Perceptron (ALP)
Similar to the perceptron, except it takes multiple loss functions, multiple datasets (one for each loss function), and a scheduler to weight the loss functions.
– The standard perceptron is an instance of ALP with one loss function, one dataset, and a trivial scheduler
– This talk looks at ALP with 2 loss functions
– An extrinsic evaluator can be used as a loss function
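A minimal sketch of that control loop, assuming a hypothetical `step` function that performs one perceptron-style update against the chosen loss:

```python
def alp_train(datasets, losses, schedule, n_rounds, step, weights=None):
    """Augmented-loss perceptron sketch: each round, the scheduler picks which
    (dataset, loss) pair drives the next update. With one dataset, one loss,
    and a constant scheduler this reduces to the standard perceptron."""
    weights = {} if weights is None else weights
    cursors = [0] * len(datasets)           # per-dataset position
    for t in range(n_rounds):
        i = schedule(t)                     # scheduler picks the objective
        example = datasets[i][cursors[i] % len(datasets[i])]
        cursors[i] += 1
        weights = step(weights, example, losses[i])
    return weights
```

A scheduler like `lambda t: 0 if t % 3 else 1` would weight the primary loss twice as heavily as the secondary one, which is the knob the experiments below vary.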

Reranker loss function
– Takes the k-best output from the parser
– Assigns a cost to each parse
– Takes the lowest-cost parse to be the "correct" parse
– If the 1-best parse has the lowest cost, does nothing; otherwise updates parameters based on the correct parse
The standard loss function is the instance of this in which the cost is always lowest for the 1-best parse.
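The same update rule, sketched over an abstract k-best list; the cost and feature functions here are placeholders for whatever extrinsic evaluator and parser features are plugged in.

```python
def reranker_update(weights, kbest, cost, features, lr=1.0):
    """Reranker-style loss: treat the lowest-cost parse in the k-best list as
    'correct'; update only if it isn't already ranked first."""
    best = min(kbest, key=cost)             # extrinsically cheapest parse
    if best == kbest[0]:
        return weights                      # 1-best already lowest cost
    updated = dict(weights)
    for f in features(best):                # reward the cheapest parse
        updated[f] = updated.get(f, 0.0) + lr
    for f in features(kbest[0]):            # penalize the current 1-best
        updated[f] = updated.get(f, 0.0) - lr
    return updated
```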

Experiment 1
English-to-Japanese MT system, specifically the word-reordering step:
– Given a parse, reorder the English sentence into Japanese word order
– Transition-based and graph-based dependency parsers
– 17,260 manually annotated word reorderings (10,930 training, 6,338 test)
– These annotations are cheaper to produce than dependency parses

Experiment 1 (cont.)
The 2nd loss function is based on METEOR:
– score = 1 – (#chunks – 1) / (#unigrams matched – 1)
– cost = 1 – score
– Matched unigrams are those appearing in both the reference and the hypothesis
– Chunks are sets of matched unigrams that are adjacent in both the reference and the hypothesis
Vary the weights of the primary and secondary losses.
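Plugging the slide's two formulas together, the cost simplifies to (#chunks – 1) / (#unigrams matched – 1); a sketch, with a guard for the degenerate one-match case (the guard is my assumption, not stated on the slide):

```python
def meteor_based_cost(n_chunks, n_matched):
    """Cost from the slide's METEOR-based score:
    score = 1 - (#chunks - 1) / (#unigrams matched - 1), cost = 1 - score."""
    if n_matched <= 1:
        return 0.0                          # assumed guard: avoid division by zero
    score = 1.0 - (n_chunks - 1) / (n_matched - 1)
    return 1.0 - score
```

The cost is 0 when all matched unigrams form a single chunk (perfect ordering) and 1 when every matched unigram is its own chunk (maximally fragmented).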

Experiment 1: Results
As the ratio of extrinsic to intrinsic loss increases, performance on the reordering task improves.
[Table: transition-based parser; intrinsic : extrinsic weight ratios vs. % correctly reordered and reordering scores]

Experiment 2
Semi-supervised adaptation: Penn Treebank (PTB) to Question Treebank (QTB).
– A PTB-trained parser performs very poorly on the QTB; a QTB-trained parser does much better
– Ask annotators a simple question about each QTB sentence: what is the main verb? (ROOT usually attaches to the main verb)
– Use the answers, together with the PTB, to adapt to the QTB

Experiment 2 (cont.)
– Augmented-loss dataset: QTB sentences with ROOT attached to the main verb, and no other labels
– Loss function: 0 if the ROOT dependency is correct, 1 otherwise
– The secondary loss function looks at the k-best list and chooses the highest-ranked parse with the correct ROOT dependency
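A sketch of that 0/1 loss and the k-best selection, assuming a hypothetical parse representation in which a dict's "ROOT" key holds the word ROOT attaches to:

```python
def root_loss(parse, main_verb):
    """0/1 loss on the ROOT attachment alone (other labels are unknown)."""
    return 0 if parse.get("ROOT") == main_verb else 1

def pick_root_correct(kbest, main_verb):
    """Secondary-loss selection: the highest-ranked parse in the k-best list
    whose ROOT dependency is correct; fall back to the 1-best otherwise."""
    for parse in kbest:                     # kbest is ordered best-first
        if root_loss(parse, main_verb) == 0:
            return parse
    return kbest[0]
```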

Experiment 2: Results
Results for the transition-based parser show a huge improvement from data that is very cheap to collect – it is cheaper to get Turkers to annotate main verbs than grad students to manually parse sentences.
[Table: LAS, UAS, and ROOT-F1 for the PTB-trained, QTB-trained, and augmented-loss setups]

Experiment 3
Improving accuracy on labeled and unlabeled dependency parsing (all intrinsic).
– Primary loss function: labeled attachment score (LAS)
– Secondary loss function weights the lengths of incorrect and correct arcs; one version uses labeled arcs, the other unlabeled
– The idea is to have the model account for arc length, since parsers tend to do poorly on long dependencies (McDonald and Nivre, 2007)

Experiment 3 (cont.)
Weighted Arc Length Score (ALS): the sum of the lengths of all correctly predicted arcs divided by the sum of the lengths of all arcs.
– In the unlabeled version, only the head (and dependent) need to match
– In the labeled version, the arc label must match too
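A sketch of ALS under an assumed arc representation: (dependent, head, label) triples over word positions, with an arc's length taken as |head − dependent|.

```python
def arc_length_score(gold, predicted, labeled=True):
    """Weighted Arc Length Score sketch: summed lengths of correctly predicted
    gold arcs divided by the summed lengths of all gold arcs."""
    # In the unlabeled variant, only (dependent, head) must match.
    key = (lambda a: a) if labeled else (lambda a: a[:2])
    correct = {key(a) for a in predicted}
    total = sum(abs(h - d) for d, h, _ in gold)
    matched = sum(abs(h - d) for d, h, l in gold if key((d, h, l)) in correct)
    return matched / total if total else 0.0
```

Because each arc contributes its length rather than a flat count, getting a long dependency wrong hurts this score much more than missing a short one, which is exactly the behavior the secondary loss is meant to encourage.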

Experiment 3: Results
Results with the transition-based parser show a small improvement, likely because ALS is similar to LAS and UAS.
[Table: LAS, UAS, and ALS for the baseline, unlabeled augmented-loss, and labeled augmented-loss setups]

Conclusions
– It is possible to train tools for particular downstream tasks; you might not want to use the same parses for MT as for information extraction
– Cheap(er) data can be leveraged to improve task performance: Japanese translations/word orderings for MT, and main-verb identification instead of full dependency parses for domain adaptation
– It is not necessarily easy to define the task or a good extrinsic evaluation metric (e.g., reducing MT to a word-reordering score, or the METEOR-based metric)