David Mareček and Zdeněk Žabokrtský


Gibbs Sampling with Treeness Constraint in Unsupervised Dependency Parsing
David Mareček and Zdeněk Žabokrtský
Institute of Formal and Applied Linguistics, Charles University in Prague
September 15, 2011, Hissar, Bulgaria

Motivations for unsupervised parsing
- We want to parse texts for which we do not have any manually annotated treebanks
  - texts from different domains
  - different languages
- We want to learn sentence structures from the corpus only
  - What if the structures produced by linguists are not suitable for NLP?
- Annotations are expensive
- It is a challenge: can we beat the supervised techniques in some application?

Outline
- Parser description
  - Priors
  - Models
  - Sampling
- Sampling constraints
  - Treeness
  - Root fertility
  - Noun-root dependency repression
- Evaluation
  - on the Czech treebank
  - on all 19 treebanks from the CoNLL 2006 and 2007 shared tasks
- Conclusions

Basic features of our approach
- Learning is based on Gibbs sampling
- We approximate the probability of a tree by the product of the probabilities of its individual edges
- We use only POS tags for predicting dependency relations, but we plan to use lexicalization and unsupervised POS tagging in the future
- We introduce treeness as a hard constraint in the sampling procedure
- The parser allows non-projective edges
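The slide states the edge-factorized approximation only in words; written out in our own notation (an assumption, not a formula from the talk), with g the governor, d the dependent, and t_g, t_d their POS tags, it would read:

```latex
P(T) \;\approx\; \prod_{(g,\,d)\,\in\, T} P_{\mathrm{tag}}(t_g \mid t_d)\cdot P_{\mathrm{dist}}(g-d \mid t_d)
```

The two factors correspond to the two models introduced on the next slide.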

Models
We use two simple models in our experiments:
- the parent POS tag, conditioned on the child POS tag
- the edge length (the signed distance between the two words), conditioned on the child POS tag
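A minimal sketch of how such count-based models can be kept, assuming Dirichlet-smoothed conditional counts; the class name EdgeModel, the single hyperparameter per model, and the normalization by the number of possible outcomes are our assumptions, not the authors' implementation:

```python
from collections import defaultdict

class EdgeModel:
    """Dirichlet-smoothed conditional count model P(event | child_tag)."""

    def __init__(self, alpha, n_outcomes):
        self.alpha = alpha                  # Dirichlet hyperparameter
        self.n = n_outcomes                 # number of possible outcomes
        self.counts = defaultdict(int)      # c(event, child_tag)
        self.totals = defaultdict(int)      # c(child_tag)

    def prob(self, event, child_tag):
        # Standard Dirichlet-multinomial predictive probability
        return ((self.counts[(event, child_tag)] + self.alpha)
                / (self.totals[child_tag] + self.alpha * self.n))

    def add(self, event, child_tag, delta=1):
        self.counts[(event, child_tag)] += delta
        self.totals[child_tag] += delta

def edge_prob(g, d, tags, tag_model, dist_model):
    """Edge probability: parent-tag model times signed-distance model.

    `tags` is a list of POS tags indexed by word position; position 0
    is assumed to carry an artificial <ROOT> tag.
    """
    return (tag_model.prob(tags[g], tags[d])
            * dist_model.prob(g - d, tags[d]))
```

One would instantiate, say, `tag_model = EdgeModel(alpha, n_tags)` and `dist_model = EdgeModel(alpha, n_distances)`, with the distance range bounded in practice.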

Gibbs sampling
- We sample each dependency edge independently
- 50 iterations
- The rich get richer (self-reinforcing behavior)
  - counts are taken from the history
- Exchangeability
  - we can treat each edge as if it were the last one in the corpus
  - numerators and denominators in the product are exchangeable
- The Dirichlet hyperparameters α1 and α2 were set experimentally
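A sketch of one sampling sweep under these assumptions, reusing the hypothetical EdgeModel above; decrementing the counts before resampling is exactly what the exchangeability argument licenses:

```python
import random

def update_counts(g, d, tags, tag_model, dist_model, delta):
    """Add (delta=+1) or remove (delta=-1) the counts of edge g -> d."""
    tag_model.add(tags[g], tags[d], delta)
    dist_model.add(g - d, tags[d], delta)

def gibbs_iteration(heads, tags, tag_model, dist_model):
    """One Gibbs sweep: resample each node's parent in random order.

    Assumes word indices 1..n with 0 = ROOT; treeness repair
    (next slides) is left out here.
    """
    nodes = list(range(1, len(tags)))
    random.shuffle(nodes)                   # random sampling order
    for d in nodes:
        # Exchangeability: remove this edge's counts first, so the edge
        # can be treated as the last one added to the corpus.
        update_counts(heads[d], d, tags, tag_model, dist_model, -1)
        candidates = [g for g in range(len(tags)) if g != d]
        weights = [edge_prob(g, d, tags, tag_model, dist_model)
                   for g in candidates]
        heads[d] = random.choices(candidates, weights=weights)[0]
        update_counts(heads[d], d, tags, tag_model, dist_model, +1)
```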

Basic sampling
- For each node, sample its parent with respect to the probability distribution
- The sampling order of the nodes is random
- Problem: it may create cycles and discontinuous graphs

[Figure: the example Czech sentence "Její dcera byla včera v zoologické zahradě." ("Her daughter was at the zoo yesterday.") below the artificial ROOT, with a candidate-parent probability displayed at each node.]

Treeness constraint
In case a cycle is created:
- choose one edge in the cycle (by sampling) and delete it
- take the formed subtree and attach it to one of the remaining nodes (by sampling)

[Figure: the same example sentence, showing a cycle being broken and the detached subtree reattached elsewhere.]
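The repair step might look like the following sketch. The slide only says both choices are made "by sampling", so weighting the candidates by edge probability, and the subtree computation, are our guesses; count bookkeeping is omitted for brevity:

```python
def subtree_nodes(heads, root):
    """All nodes whose head chain leads to `root`, including `root`."""
    nodes = {root}
    changed = True
    while changed:
        changed = False
        for d in range(1, len(heads)):
            if heads[d] in nodes and d not in nodes:
                nodes.add(d)
                changed = True
    return nodes

def repair_cycle(heads, cycle, tags, tag_model, dist_model):
    """Treeness repair following the slide's two steps (a sketch).

    `cycle` is the list of node indices forming the detected cycle.
    """
    # Step 1: sample one edge (heads[d], d) inside the cycle and delete it
    weights = [edge_prob(heads[d], d, tags, tag_model, dist_model)
               for d in cycle]
    d = random.choices(cycle, weights=weights)[0]

    # Step 2: reattach the subtree now rooted at d to a remaining node
    # (anything inside the subtree would just re-create a cycle)
    forbidden = subtree_nodes(heads, d)
    candidates = [g for g in range(len(tags)) if g not in forbidden]
    weights = [edge_prob(g, d, tags, tag_model, dist_model)
               for g in candidates]
    heads[d] = random.choices(candidates, weights=weights)[0]
```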

Root fertility constraint
- Individual phrases tend to be attached to the technical root
- A sentence usually has only one word (the main verb) that dominates the others
- We constrain the root fertility to be one
- If the root has more than one child, we resample:
  - sample one child that will stay under the root
  - resample the parents of the other children

[Figure: the same example sentence, showing the root's surplus children being reattached.]
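A sketch of the resampling under this constraint, again with node 0 as the technical ROOT; count bookkeeping and repeated cycle checks are omitted, and weighting the kept child by edge probability is our assumption:

```python
def enforce_root_fertility(heads, tags, tag_model, dist_model):
    """Keep exactly one child under the technical ROOT (node 0)."""
    children = [d for d in range(1, len(heads)) if heads[d] == 0]
    if len(children) <= 1:
        return
    # Sample the one child that stays under the root
    weights = [edge_prob(0, d, tags, tag_model, dist_model)
               for d in children]
    keep = random.choices(children, weights=weights)[0]
    # Resample non-root parents for all the other children
    for d in children:
        if d == keep:
            continue
        candidates = [g for g in range(1, len(tags)) if g != d]
        weights = [edge_prob(g, d, tags, tag_model, dist_model)
                   for g in candidates]
        heads[d] = random.choices(candidates, weights=weights)[0]
```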

Noun-ROOT dependency repression
- Nouns (especially subjects) often substitute for verbs in governing positions
- The majority of grammars are verbocentric
- Nouns can easily be recognized as the most frequent coarse-grained tag category in the corpus
- We add a third model that represses noun-ROOT dependencies [formula shown on the slide]
- This model is useless when unsupervised POS tagging is used

Evaluation measures
- Evaluating an unsupervised parser against gold data is problematic
  - many linguistic decisions must be made before each corpus is annotated
  - how to deal with coordination structures, auxiliary verbs, prepositions, subordinating conjunctions?
- We use the three following measures:
  - UAS (unlabeled attachment score): the standard metric for evaluating dependency parsers
  - UUAS (undirected unlabeled attachment score): edge direction is disregarded (it is not a mistake if governor and dependent are switched)
  - NED (neutral edge direction; Schwartz et al., 2011): treats not only a node's gold parent and gold child as the correct answer, but also its gold grandparent
- By construction, UAS ≤ UUAS ≤ NED
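The three measures can be computed as in this sketch; the list-based head representation and the treatment of ROOT (index 0, with gold_heads[0] = None) are our assumptions:

```python
def attachment_scores(gold_heads, pred_heads):
    """UAS, UUAS, and NED over one sentence or a whole corpus.

    Both arguments map node index -> head index; index 0 is the
    artificial ROOT and gold_heads[0] is None. NED follows Schwartz
    et al. (2011) as described on the slide.
    """
    n = uas = uuas = ned = 0
    for d in range(1, len(gold_heads)):
        g_gold, g_pred = gold_heads[d], pred_heads[d]
        n += 1
        correct = g_pred == g_gold
        # UUAS ignores direction: the predicted head may also be a gold child
        undirected = correct or gold_heads[g_pred] == d
        # NED additionally accepts the gold grandparent
        neutral = undirected or g_pred == gold_heads[g_gold]
        uas += correct
        uuas += undirected
        ned += neutral
    return uas / n, uuas / n, ned / n
```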

Evaluation on Czech
- Czech dependency treebank from the CoNLL 2007 shared task
- punctuation removed
- sentences of at most 15 words

Configuration                                 UAS   UUAS    NED
Random baseline                              12.0   19.9   27.5
LeftChain baseline                           30.2   53.6   67.2
RightChain baseline                          25.5   52.0   60.6
Base                                         36.7   50.1   55.1
Base+Treeness                                36.2   46.6   50.0
Base+Treeness+RootFert                       41.2   58.6   70.8
Base+Treeness+RootFert+NounRootRepression    49.8   62.6   73.0

Error analysis for Czech
Many errors are caused by reversed dependencies:
- preposition – noun
- subordinating conjunction – verb

Evaluation on 19 CoNLL languages
- We took the dependency treebanks from the CoNLL 2006 and 2007 shared tasks
- POS tags from the fifth column were used
- The parser was run on the concatenated training and development sets
- Punctuation was removed
- Evaluation was done on the development sets only
- We compare our results with the state-of-the-art system, which is based on DMV (Spitkovsky et al., 2011)

Evaluation on 19 CoNLL languages
[Table: per-language results compared with Spitkovsky et al. (2011), shown on the slide.]

Conclusions
- We introduced a new approach to unsupervised dependency parsing
- Even though only a few experiments have been done so far, and only POS tags with no lexicalization are used, the results seem competitive with the state-of-the-art unsupervised parsers (DMV)
  - We obtain a better UAS for 12 languages out of 19
  - If we do not use noun-root dependency repression, which is useful only with supervised POS tags, we have better scores for 7 languages out of 19

Future work
We would like to add:
- a word fertility model, to model the number of children of each node
- lexicalization: the word forms themselves must be useful
- unsupervised POS tagging: some recent experiments show that using word classes instead of supervised POS tags can improve parsing accuracy

Thank you for your attention.