[A Contrastive Study of Syntacto-Semantic Dependencies]

Who Did What to Whom? A Contrastive Study of Syntacto-Semantic Dependencies
Angelina Ivanova, Stephan Oepen, Lilja Øvrelid (University of Oslo), and Dan Flickinger (Stanford University)
The 6th Linguistic Annotation Workshop (The LAW VI), ACL 2012

Introduction
Dependency representations are useful for diverse tasks and can be obtained automatically.
Example applications: semantic search, sentiment analysis, machine translation, question answering, ontology learning.

Motivation
A variety of incompatible representation formats complicates the task of parser evaluation.
Example: A similar technique is almost impossible to apply to other crops, such as cotton, soybeans and rice.

Goals
Theoretical: commonalities and differences between a broad range of dependency formats.
Practical: making the LinGO Redwoods Treebank accessible to a broader range of users.

Dependency formats overview
PEST corpus (language: English; two sets: 10 sentences and 15 sentences from Wall Street Journal):
  CoNLL Syntactic Dependencies (CD)
  CoNLL PropBank Semantics (CP)
  Stanford Basic Dependencies (SB)
  Stanford Collapsed Dependencies (SD)
  Enju Predicate-Argument Structures (EP)
Conversion from LinGO ERG:
  DELPH-IN Syntactic Derivation Tree (DT)
  DELPH-IN MRS-derived Dependencies (DM)

Summary of dependency formats
Columns: Abbr.; Format; Head status; Is the structure an acyclic tree?; Are the tokens connected?
  CD; CoNLL Syntactic Dependencies; Functional; +
  CP; CoNLL PropBank Semantics; Substantive; -
  SB; Stanford Basic Dependencies
  SD; Stanford Collapsed Dependencies
  EP; Enju Predicate-Argument Structures
  DT; DELPH-IN Syntactic Derivation Tree
  DM; DELPH-IN MRS-derived Dependencies
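The two structural questions in this summary can be read as graph properties of a dependency analysis. The following is a minimal sketch, assuming an analysis is given as a set of (head, dependent) token-index pairs over tokens 1..n with no artificial root node. The function names and these exact definitions of "connected" and "tree" are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict

def is_connected(n_tokens, edges):
    """True if all tokens 1..n_tokens are linked, ignoring edge direction."""
    if n_tokens == 0:
        return True
    neighbours = defaultdict(set)
    for head, dep in edges:
        neighbours[head].add(dep)
        neighbours[dep].add(head)
    # Depth-first search from token 1 over the undirected graph.
    seen, stack = {1}, [1]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n_tokens

def is_tree(n_tokens, edges):
    """True if the edges form a single-rooted tree spanning all tokens."""
    edges = set(edges)
    dependents = [dep for _, dep in edges]
    single_headed = len(dependents) == len(set(dependents))
    # Connected + n-1 edges + at most one head per token implies a rooted tree.
    return (single_headed
            and len(edges) == n_tokens - 1
            and is_connected(n_tokens, edges))

# Hypothetical three-token analyses.
tree = {(2, 1), (2, 3)}               # token 2 heads tokens 1 and 3
reentrant = {(2, 1), (3, 1), (2, 3)}  # token 1 has two heads
partial = {(2, 1)}                    # token 3 is left unattached
```

With these toy inputs, `tree` is both connected and a tree, `reentrant` is connected but not a tree, and `partial` is neither.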

Root choice
Example: A similar technique is almost impossible to apply to other crops, such as cotton, soybeans and rice.
  CD (CoNLL Syntactic Dependencies): is
  CP (CoNLL PropBank Semantics): -
  SB (Stanford Basic Dependencies): impossible
  SD (Stanford Collapsed Dependencies)
  EP (Enju Predicate-Argument Structures)
  DT (DELPH-IN Syntactic Derivation Tree)
  DM (DELPH-IN MRS-derived Dependencies): almost

Conjunction
Example: cotton, soybeans and rice
CoNLL Syntactic Dependencies (CD); CoNLL PropBank Semantics (CP); Stanford Basic Dependencies (SB); Stanford Collapsed Dependencies (SD); Enju Predicate-Argument Structures (EP); DELPH-IN Syntactic Derivation Tree (DT); DELPH-IN MRS-derived Dependencies (DM)

Infinitive
CoNLL Syntactic Dependencies (CD); Enju Predicate-Argument Structures (EP); DELPH-IN Syntactic Derivation Tree (DT); Stanford Basic Dependencies (SB); Stanford Collapsed Dependencies (SD); CoNLL PropBank Semantics (CP); DELPH-IN MRS-derived Dependencies (DM)

Article
CoNLL Syntactic Dependencies (CD); Stanford Basic Dependencies (SB); Stanford Collapsed Dependencies (SD); DELPH-IN Syntactic Derivation Tree (DT); Enju Predicate-Argument Structures (EP); DELPH-IN MRS-derived Dependencies (DM); CoNLL PropBank Semantics (CP)

Adjective
CoNLL Syntactic Dependencies (CD); Stanford Basic Dependencies (SB); Stanford Collapsed Dependencies (SD); DELPH-IN Syntactic Derivation Tree (DT); Enju Predicate-Argument Structures (EP); DELPH-IN MRS-derived Dependencies (DM); CoNLL PropBank Semantics (CP)

Preposition
CoNLL Syntactic Dependencies (CD); Stanford Basic Dependencies (SB); DELPH-IN Syntactic Derivation Tree (DT); Enju Predicate-Argument Structures (EP); DELPH-IN MRS-derived Dependencies (DM); CoNLL PropBank Semantics (CP); Stanford Collapsed Dependencies (SD)

Tough adjective
The long-distance dependency is detected only in CoNLL PropBank Semantics (CP), Enju Predicate-Argument Structures (EP), and DELPH-IN MRS-derived Dependencies (DM).

Pairwise Jaccard similarity on PEST
Formats: CD, CP, SB, SD, EP, DT, DM
Scores: .171 .427 .248 .187 .488 .115 .177 .122 .158 .173 .541 .123 .319 .147 .14 .264 .144 .192 .462 .13
The DELPH-IN Syntactic Derivation Tree (DT) format is closer to CoNLL Syntactic Dependencies (CD).
DELPH-IN MRS-derived Dependencies (DM) are closer to Enju Predicate-Argument Structures (EP).
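Jaccard similarity measures how much two sets overlap: the size of their intersection divided by the size of their union. A minimal sketch of a pairwise comparison follows, assuming each format's analysis of a sentence is reduced to a set of unlabeled, directed (head, dependent) token-index pairs. That reduction, the function names, and the toy edge sets are assumptions for illustration; they are not the PEST annotations or the authors' exact procedure.

```python
from itertools import combinations

def jaccard(edges_a, edges_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two edge sets."""
    a, b = set(edges_a), set(edges_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty analyses are identical
    return len(a & b) / len(a | b)

def pairwise_jaccard(analyses):
    """Map each unordered pair of format names to its Jaccard score."""
    return {
        (f, g): jaccard(analyses[f], analyses[g])
        for f, g in combinations(sorted(analyses), 2)
    }

# Toy (head, dependent) edges for one four-token sentence; hypothetical data.
toy = {
    "X": {(2, 1), (2, 3), (3, 4)},
    "Y": {(2, 1), (3, 2), (3, 4)},
    "Z": {(2, 1), (2, 3), (2, 4)},
}
scores = pairwise_jaccard(toy)
```

Here formats X and Y share two of four distinct edges, so `scores[("X", "Y")]` is 0.5, while Y and Z share one of five, giving 0.2.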

The LinGO Redwoods Treebank
Language: English
Size: 45,000 utterances
Linguistic approach: HPSG
Grammar: LinGO ERG
Data: Verbmobil and e-commerce corpora; LOGON Norwegian-English MT corpus; English Wikipedia (from WeScience); Brown corpus (SemCor); other

DELPH-IN Syntactic Derivation Tree
DELPH-IN Syntactic Derivation Tree representation; conversion to bilexical dependencies.
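One common way to turn a phrase-structure derivation into bilexical dependencies is head percolation: every phrase inherits the lexical head of its head daughter, and each non-head daughter's lexical head becomes a dependent of it. The sketch below assumes the head daughter of each node is already known and stored as an index. The `Node` class, the `head` field, and the use of the rule name as the dependency label are illustrative assumptions, and the rule names in the example are used only as sample labels.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    label: str                       # rule name, or a category for a leaf
    children: List["Node"] = field(default_factory=list)
    head: int = 0                    # index of the head daughter (assumed given)
    token: Optional[str] = None      # surface form, set on leaves only

def to_dependencies(root: Node) -> Tuple[str, List[Tuple[str, str, str]]]:
    """Return (lexical head of the root, [(head, label, dependent), ...])."""
    deps: List[Tuple[str, str, str]] = []

    def lexical_head(node: Node) -> str:
        if not node.children:
            return node.token
        heads = [lexical_head(child) for child in node.children]
        for i, dependent in enumerate(heads):
            if i != node.head:
                # Non-head daughters attach to the head daughter's lexical head.
                deps.append((heads[node.head], node.label, dependent))
        return heads[node.head]

    return lexical_head(root), deps

# Hypothetical derivation for "the dog barks".
tree = Node("sb-hd_mc_c", [
    Node("sp-hd_n_c", [Node("d", token="the"), Node("n", token="dog")], head=1),
    Node("v", token="barks"),
], head=1)
top, deps = to_dependencies(tree)
```

For this toy tree, `top` is "barks" and the two dependencies are ("dog", "sp-hd_n_c", "the") and ("barks", "sb-hd_mc_c", "dog"). A real converter would identify tokens by position rather than by surface string, so that repeated words stay distinct.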

Special cases during the conversion
Contracted negation: doesn’t → does n’t
Punctuation: bark. → bark .
Multiword expressions: such as → such as
Hyphenated words: end- state → end-state
Labels: NEG, PUNCT, CMPN
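The first two special cases amount to splitting a single token into several. A minimal sketch of that splitting follows, assuming whitespace-delimited input, a contracted negation written as n't or n’t, and at most one trailing punctuation mark per token. The regular expression and function names are illustrative assumptions; the multiword and hyphenation cases are not handled here.

```python
import re

# Stem, optional contracted negation, optional single trailing punctuation mark.
_PIECES = re.compile(r"(.+?)(n['’]t)?([.,;:!?])?")

def split_token(token):
    """Split one whitespace-delimited token into its pieces."""
    match = _PIECES.fullmatch(token)
    if match is None:  # only happens for the empty string
        return [token]
    return [piece for piece in match.groups() if piece]

def retokenize(text):
    """Apply split_token to every whitespace-delimited token in text."""
    return [piece for tok in text.split() for piece in split_token(tok)]

pieces = retokenize("The dog doesn’t bark.")
```

The lazy stem group makes the pattern prefer the shortest stem that still lets the whole token match, so "doesn’t" yields "does" and "n’t" while a plain word such as "dog" is returned unchanged.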

Elementary Dependency Structure
{ e12
  _1:_a_q(BV x6)
  e9:_similar_a_to(ARG1 x6)
  x6:_technique_n_1
  e12:_almost_a_1(ARG1 e3)
  e3:_impossible_a_for(ARG1 e18)
  e18:_apply_v_to(ARG2 x6, ARG3 x19)
  _2:udef_q(BV x19)
  e25:_other_a_1(ARG1 x19)
  x19:_crop_n_1
  … }

Elementary Dependency Structure (continued)
{ …
  x33:_cotton_n_1
  _5:udef_q(BV i38)
  x27:implicit_conj(L-INDEX x33, R-INDEX i38)
  _6:udef_q(BV x43)
  x43:_soybeans/nns_u_unknown
  i38:_and_c(L-INDEX x43, R-INDEX x47)
  _7:udef_q(BV x47)
  x47:_rice_n_1
}
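Each node in the structure above has the shape `id:predicate(ROLE target, ...)`, so the role-labeled arcs can be read off directly. The following is a minimal sketch of a reader for that notation as it appears on these slides. The regular expression and function names are assumptions, and the result is only the raw set of arcs, not the full DELPH-IN MRS-derived Dependencies conversion.

```python
import re

# id:predicate with an optional parenthesised argument list.
NODE = re.compile(r"([^:\s]+):([^\s(]+)(?:\(([^)]*)\))?")

def parse_eds(lines):
    """Return ({node id: predicate}, [(source id, role, target id), ...])."""
    preds, edges = {}, []
    for line in lines:
        match = NODE.fullmatch(line.strip())
        if not match:
            continue  # skip braces, ellipses and other non-node lines
        node, pred, args = match.groups()
        preds[node] = pred
        for arg in (args or "").split(","):
            if arg.strip():
                role, target = arg.split()
                edges.append((node, role, target))
    return preds, edges

def named_edges(preds, edges):
    """Replace node ids by predicate names where the target node is known."""
    return [(preds[s], role, preds[t]) for s, role, t in edges if t in preds]

# A few nodes copied from the structure shown on the slide.
sample = [
    "e12:_almost_a_1(ARG1 e3)",
    "e3:_impossible_a_for(ARG1 e18)",
    "e18:_apply_v_to(ARG2 x6, ARG3 x19)",
    "x6:_technique_n_1",
    "x19:_crop_n_1",
]
preds, edges = parse_eds(sample)
triples = named_edges(preds, edges)
```

On this sample, `triples` contains four arcs, starting with ("_almost_a_1", "ARG1", "_impossible_a_for"), which mirrors the chain from "almost" through "impossible" to "apply" and its two arguments.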

Conclusions
Qualitative and quantitative comparison of various dependency formats.
Automatic mapping from HPSG representations to syntactic and semantic dependencies.

Future work
Release of the converted Redwoods treebanks and conversion software.
Modification of DELPH-IN MRS-derived Dependencies conversion into a dependency tree.
Training parsers on DELPH-IN Syntactic Derivation Tree and DELPH-IN MRS-derived Dependencies formats.
Experimentation in domain adaptation for parsing on Redwoods treebanks.

Thank you for your attention! Questions?