An SVMs Based Multi-lingual Dependency Parsing
Yuchang CHENG, Masayuki ASAHARA and Yuji MATSUMOTO
Nara Institute of Science and Technology


Approaches to Dependency Parsing
Bottom-up deterministic (local discrimination)
  – Iterative, projective [Kudo & Matsumoto 02] [Yamada & Matsumoto 03] [Cheng, Asahara, Matsumoto 04]
  – Shift-reduce, projective [Nivre, Scholz 04]
  – Shift-reduce, pseudo-projective [Nivre, Nilsson 05]
N-best + large-margin discrimination (global discrimination)
  – Projective [McDonald, Crammer, Pereira 05]
  – Non-projective [McDonald, Pereira, Ribarov, Hajič 05]

Comparison between Iterative and Shift-reduce methods
Nivre algorithm (Shift-reduce)
  – depth first
  – O(n)
Iterative
  – breadth first
  – O(n^2) in the worst case, empirically near linear
+ efficient
- limited look-ahead
Training and parsing are done in the same process ⇒ number of training instances = number of parsing steps
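
The linear bound and the "training instances = parsing steps" point can be illustrated with a small sketch. It is not the authors' implementation: it assumes an arc-eager style transition set (SHIFT, LEFT-ARC, RIGHT-ARC, REDUCE), a projective gold tree, and an oracle that reads the action from the gold heads where a trained SVM would predict it. The function name `oracle_parse` and the example tree are hypothetical.

```python
from collections import deque


def oracle_parse(heads):
    """Replay a projective gold tree with arc-eager style transitions.

    heads[i] is the gold head of word i (1-based); heads[0] is a dummy
    entry for the artificial root (word 0). Returns the recovered heads
    and one (configuration, action) pair per transition, i.e. one
    training instance per parsing step.
    """
    n = len(heads) - 1
    stack, buffer = [0], deque(range(1, n + 1))
    predicted, instances = {}, []
    while buffer:
        s, b = stack[-1], buffer[0]
        if s != 0 and s not in predicted and heads[s] == b:
            action = "LEFT-ARC"            # stack top depends on next word
            predicted[s] = b
            stack.pop()
        elif heads[b] == s:
            action = "RIGHT-ARC"           # next word depends on stack top
            predicted[b] = s
            stack.append(buffer.popleft())
        elif s in predicted and any(
            heads[b] == k or heads[k] == b for k in stack[:-1]
        ):
            action = "REDUCE"              # stack top is finished, pop it
            stack.pop()
        else:
            action = "SHIFT"
            stack.append(buffer.popleft())
        instances.append(((s, b), action))
    return predicted, instances


# Assumed analysis of "I saw a girl with a telescope ." (index 0 = root).
gold = [-1, 2, 0, 4, 2, 2, 7, 5, 2]
arcs, steps = oracle_parse(gold)
print(len(steps), "transitions for", len(gold) - 1, "words")
```

Each word is pushed at most once and popped at most once, so the number of transitions never exceeds 2n, which is where the O(n) figure for the shift-reduce method comes from.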

Limited right-side contextual info.
[Figure: a configuration in the Nivre method and a configuration in the Y&M method for the sentence "I saw a girl with a telescope.", with the consulted context marked.]

Preliminary comparison
English dependency parsing (Penn Treebank 02-06: training, 23: test)
  – right context = 2: Dep. Acc. and Root Acc., Iterative vs. Nivre [table values not preserved]
  – right context = 4: Dep. Acc. and Root Acc., Iterative vs. Nivre [table values not preserved]
Chinese case: almost no difference, or a slightly better result with the Nivre method

Common Disadvantage
Local discrimination
Single model throughout the whole sentence
  – local (near leaves) and long-distance (near top) parsing should use different models
Distinct model at the lowest level
  – dependency between adjacent words
  – implemented as a pre-processing step

Shallow pre-processing + Nivre method
[Figure: the sentence "I saw a girl with a telescope." before and after preprocessing, with the consulted context marked.]
Preprocessing of adjacent words
Then, apply the Nivre method
Labels are decided by MaxEnt classifiers
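
A minimal sketch of the two-stage idea, under stated assumptions: a first pass attaches words to an adjacent neighbour, and only the remaining words are handed to the shift-reduce stage. The function `attach_adjacent`, the label names, and the toy rule `toy_decide` are hypothetical stand-ins; in the slides the adjacent-word decision is made by a separately trained model, not by a word list.

```python
def attach_adjacent(words, decide):
    """One left-to-right pass over adjacent word pairs.

    decide(left, right) returns "DEP_ON_RIGHT" (left word depends on the
    right one), "DEP_ON_LEFT" (right word depends on the left one) or None.
    Returns the arcs found (dependent index -> head index) and the indices
    of the words left over for the second-stage parser.
    """
    arcs = {}
    for i in range(len(words) - 1):
        label = decide(words[i], words[i + 1])
        if label == "DEP_ON_RIGHT" and i not in arcs:
            arcs[i] = i + 1
        elif label == "DEP_ON_LEFT" and (i + 1) not in arcs:
            arcs[i + 1] = i
    remaining = [i for i in range(len(words)) if i not in arcs]
    return arcs, remaining


def toy_decide(left, right):
    # Toy rule standing in for the learned adjacent-word model.
    return "DEP_ON_RIGHT" if left in {"I", "a"} else None


sentence = ["I", "saw", "a", "girl", "with", "a", "telescope", "."]
arcs, remaining = attach_adjacent(sentence, toy_decide)
print([sentence[i] for i in remaining])
```

With this toy rule the words left for the second stage are "saw girl with telescope .", which is how the figure residue on this slide appears to read; that interpretation of the figure is an assumption.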

Language: with preprocessing (LAS, UAS, LAcc.) vs. without preprocessing (LAS, UAS, LAcc.)
Rows: Arabic, Chinese, Czech, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish, Turkish, AV, Bulgarian
[Table values not preserved.]
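
For readers unfamiliar with the three column headings, here is a short sketch of how such scores are usually computed: UAS counts tokens with the correct head, LAcc. counts tokens with the correct dependency label, and LAS counts tokens where both are correct. The function name and the toy data are hypothetical, and this sketch scores every token; whether punctuation was excluded in the reported results is not stated on the slide.

```python
def attachment_scores(gold, pred):
    """Return (LAS, UAS, LAcc) for one or more sentences.

    gold and pred are equal-length lists of (head_index, label) pairs,
    one pair per token.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n   # head only
    lacc = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n  # label only
    las = sum(g == p for g, p in zip(gold, pred)) / n         # head and label
    return las, uas, lacc


# Toy example: token 3 has a wrong head, token 4 has a wrong label.
gold = [(2, "SBJ"), (0, "ROOT"), (4, "NMOD"), (2, "OBJ")]
pred = [(2, "SBJ"), (0, "ROOT"), (2, "NMOD"), (2, "NMOD")]
print(attachment_scores(gold, pred))
```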

Speed-up of Kernel SVM
Fast methods for kernel-based text analysis [Kudo & Matsumoto 04]
  – Training with a 3rd-degree polynomial kernel
  – Mining of feature combinations in positive/negative support vectors
  – Linearization with the obtained feature combinations ( times speed up)
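
The linearization step can be sketched as follows. This is an illustration, not the cited method as implemented: it assumes binary features, assumes the kernel has the form (1 + x·y)^3, and enumerates every feature combination of size up to 3 exhaustively, whereas the slide refers to mining the combinations from the support vectors. All function names and the toy support vectors are hypothetical.

```python
from itertools import combinations
from math import comb

D = 3  # kernel degree


def surjections(l, r):
    # Number of ways to map l positions onto all r items (inclusion-exclusion).
    return sum((-1) ** (r - m) * comb(r, m) * m ** l for m in range(r + 1))


# Weight of a feature combination of size r in the expansion of (1 + x.y)^D.
COEF = [
    sum(comb(D, l) * surjections(l, r) for l in range(r, D + 1))
    for r in range(D + 1)
]


def score_kernel(support_vectors, x):
    # Standard kernel evaluation: one kernel call per support vector.
    return sum(a * (1 + len(sv & x)) ** D for a, sv in support_vectors)


def linearize(support_vectors):
    # Fold all support vectors into one weight per feature combination.
    weights = {}
    for a, sv in support_vectors:
        for r in range(D + 1):
            for subset in combinations(sorted(sv), r):
                weights[subset] = weights.get(subset, 0.0) + a * COEF[r]
    return weights


def score_linear(weights, x):
    # Linear evaluation: look up the combinations present in x.
    return sum(
        weights.get(subset, 0.0)
        for r in range(D + 1)
        for subset in combinations(sorted(x), r)
    )


# Each support vector: (alpha_i * y_i, set of active binary features).
svs = [(0.5, {"w=saw", "p=VB", "w+1=a"}), (-0.25, {"w=girl", "p=NN", "w+1=a"})]
x = {"w=saw", "p=NN", "w+1=a"}
print(score_kernel(svs, x), score_linear(linearize(svs), x))
```

The two scores agree because, for binary vectors sharing k features, (1 + k)^3 equals the sum of `COEF[r]` times the number of shared combinations of size r. After linearization the cost of scoring depends on the number of active features in x rather than on the number of support vectors.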