A Method for Automatically Constructing Case Frames for English Daisuke Kawahara and Kiyotaka Uchimoto (LREC2008, 2008/05/29) National Institute of Information and Communications Technology

Presentation transcript:

A Method for Automatically Constructing Case Frames for English Daisuke Kawahara and Kiyotaka Uchimoto (LREC2008, 2008/05/29) National Institute of Information and Communications Technology

2 Background
NLP analyzers so far
– (Mainly) supervised, (relatively) knowledge-poor
  e.g., PP-attachment or parsing
    Mary ate the salad with a fork
    Mary ate the salad with mushrooms
– Only 1.5% of bilexical dependencies were learned [Bikel, 04]
→ Toward knowledge-oriented NLP
– Automatically compile case frames and integrate them into NLP analyzers/applications

3 Related work
Subcategorization frames
– [Brent, 93] [Ushioda et al., 93] [Manning, 93] [Briscoe and Carroll, 97] [Korhonen, 02] …
  e.g., She greeted me.         NP(sbj) greet NP(obj)
  e.g., She gave him a book.    NP(sbj) give NP(obj) NP(obj)

                             # of SCFs   # of verbs   corpus size   Acc
[Brent, 1993]                                          M            85%
[Ushioda et al., 1993]                                 M            86%
[Manning, 1993]                                        M            82%
[Ersan & Charniak, 1996]                               M            70%
[Carroll & Rooth, 1998]                                M            77%
[Briscoe & Carroll, 1997]                              M            81%
[Sarkar & Zeman, 2000]                                 M            88%

4 Related work
Subcategorization frames
– [Brent, 93] [Ushioda et al., 93] [Manning, 93] [Briscoe and Carroll, 97] [Korhonen, 02] …
(Handmade) frames
– FrameNet [Baker et al., 98], PropBank [Palmer et al., 05]
Japanese case frames
– Semantics-based: [Haruno, 95] [Utsuro et al., 96]
– Example-based: [Kawahara and Kurohashi, 06]

5 Construction of case frames for Japanese [Kawahara and Kurohashi, LREC2006]

                             CS   examples (in English)
yaku (1) (bake)              ga   I:18, person:15, craftsman:10, …
                             wo   bread:2484, meat:1521, cake:1283, …
                             de   oven:1630, frying pan:1311, …
yaku (2) (have difficulty)   ga   teacher:3, government:3, person:3, …
                             wo   hand:2950
                             ni   attack:18, action:15, son:15, …
yaku (3) (burn)              ga   company:1, distributor:1, …
                             wo   data:178, file:107, copy:9, …
                             ni   R:1583, CD:664, CDR:3, …
…

ga: nominative, wo: accusative, ni: dative, de: instrument

6 Construction of case frames for English
100M sentences (English Gigaword)
→ Filtering and Parsing (MSTParser): 47M sents.
→ Predicate-argument structures
→ Clustering (WordNet)
→ Case frames for 10K predicates

Predicate-argument structures:
  sbj:you pred:borrow obj:idea pp:from:artist
  sbj:she pred:borrow obj:idea pp:over:year
  sbj:i pred:borrow obj:dollar pp:from:friend
  sbj:farmer pred:borrow obj:money pp:for:supply
  sbj:he pred:borrow obj:money pp:from:company

After grouping:
  sbj:{you,she} pred:borrow obj:idea pp:from:artist pp:over:year
  sbj:i pred:borrow obj:dollar pp:from:friend
  sbj:{farmer,he} pred:borrow obj:money pp:for:supply pp:from:company

After clustering:
  sbj:{you,she} pred:borrow obj:idea pp:from:artist pp:over:year
  sbj:{farmer,he} pred:borrow obj:{money,dollar} pp:for:supply pp:from:{company,friend}
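The flat notation used for predicate-argument structures on this slide can be read into a small data structure. The following is only a minimal sketch: the function name `parse_pa` and the choice of a `(predicate, {slot: word})` pair are illustrative assumptions, not part of the authors' system.

```python
def parse_pa(line):
    """Parse 'sbj:you pred:borrow obj:idea pp:from:artist'
    into (predicate, {case slot: word}).

    Hypothetical helper for illustration; the slides do not show
    how the structures are stored internally.
    """
    predicate = None
    args = {}
    for token in line.split():
        # Split at the LAST colon so 'pp:from:artist' -> slot 'pp:from', word 'artist'.
        slot, _, word = token.rpartition(":")
        if slot == "pred":
            predicate = word
        else:
            args[slot] = word
    return predicate, args


# The five example structures from the slide.
lines = [
    "sbj:you pred:borrow obj:idea pp:from:artist",
    "sbj:she pred:borrow obj:idea pp:over:year",
    "sbj:i pred:borrow obj:dollar pp:from:friend",
    "sbj:farmer pred:borrow obj:money pp:for:supply",
    "sbj:he pred:borrow obj:money pp:from:company",
]
structures = [parse_pa(line) for line in lines]
```

Splitting at the last colon keeps the preposition inside the slot name, which matches the slot inventory (pp:for, pp:in, …) given in the specification slide.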

7 Specification of our case frames
Case slots
– surface cases (dependency labels) and prepositions
  sbj, obj, obj2, pp:for, pp:in, …
Instances
– words
– several semantic markers

8 Details of case frame construction
Use only reliable parses
– Sentence length <= 20 words
– MSTParser [McDonald et al., 06]
Extract predicate-argument structures
– From labeled dependency parses
Group and cluster p-a structures
– Grouping by a dominant case slot
  pre-defined order: obj, sbj, pp:*
– Clustering based on WordNet

Labeled dependency acc.: 89.9% → 91.5%
Complete rate: 36.3% → 56.4%
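The grouping step above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the names `dominant_slot` and `group_structures` are hypothetical, the alphabetical tie-break among pp:* slots is an assumption (the slide only gives the order obj, sbj, pp:*), and the WordNet-based clustering step is not covered here.

```python
from collections import Counter, defaultdict

# Pre-defined order from the slide: obj, then sbj, then any pp:* slot.
SLOT_ORDER = ["obj", "sbj"]


def dominant_slot(args):
    """Return the dominant case slot of one predicate-argument structure."""
    for slot in SLOT_ORDER:
        if slot in args:
            return slot
    # Assumption: if several pp:* slots are present, take the alphabetically first.
    pp_slots = sorted(s for s in args if s.startswith("pp:"))
    return pp_slots[0] if pp_slots else None


def group_structures(structures):
    """Merge structures that share predicate, dominant slot, and dominant word.

    structures: iterable of (predicate, {slot: word}) pairs.
    Returns {(predicate, slot, word): {slot: Counter of words}}.
    """
    frames = defaultdict(lambda: defaultdict(Counter))
    for predicate, args in structures:
        slot = dominant_slot(args)
        if slot is None:
            continue  # no usable argument, skip
        key = (predicate, slot, args[slot])
        for s, w in args.items():
            frames[key][s][w] += 1
    return frames


# The 'borrow' examples from the construction slide.
structures = [
    ("borrow", {"sbj": "you", "obj": "idea", "pp:from": "artist"}),
    ("borrow", {"sbj": "she", "obj": "idea", "pp:over": "year"}),
    ("borrow", {"sbj": "i", "obj": "dollar", "pp:from": "friend"}),
    ("borrow", {"sbj": "farmer", "obj": "money", "pp:for": "supply"}),
    ("borrow", {"sbj": "he", "obj": "money", "pp:from": "company"}),
]
frames = group_structures(structures)
```

On the five borrow examples this yields three groups keyed by the objects idea, dollar, and money, which corresponds to the intermediate stage shown on the construction slide.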

9 Clustering of case frames
CF 1:  sbj: { i }            obj: { dollar }   pp:from: { friend }
CF 2:  sbj: { farmer, he }   obj: { money }    pp:from: { company }   pp:for: { supply }

similarity between case frames
– ratio of common cases
– similarity between instances (words): 0.73
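The slide names two ingredients of the similarity between case frames: the ratio of common cases and the similarity between instances. How they are computed and combined did not survive in the transcript, so the sketch below rests on stated assumptions: the ratio is taken as shared slots over all slots, instance similarity is the average best word match per shared slot, and the two are multiplied. The `word_sim` argument is a placeholder for a WordNet-based word similarity.

```python
def common_case_ratio(cf1, cf2):
    """Shared case slots divided by all case slots (assumed definition)."""
    slots1, slots2 = set(cf1), set(cf2)
    return len(slots1 & slots2) / len(slots1 | slots2)


def instance_similarity(cf1, cf2, word_sim):
    """Average, over shared slots, of the best word-pair similarity (assumed).

    word_sim(a, b) -> float in [0, 1]; a stand-in for a WordNet-based measure.
    Every slot is assumed to hold at least one word.
    """
    shared = set(cf1) & set(cf2)
    if not shared:
        return 0.0
    per_slot = [
        max(word_sim(a, b) for a in cf1[slot] for b in cf2[slot])
        for slot in shared
    ]
    return sum(per_slot) / len(per_slot)


def frame_similarity(cf1, cf2, word_sim):
    """Combine both ingredients by multiplication (an assumption)."""
    return common_case_ratio(cf1, cf2) * instance_similarity(cf1, cf2, word_sim)


# The two case frames shown on the slide.
cf1 = {"sbj": {"i"}, "obj": {"dollar"}, "pp:from": {"friend"}}
cf2 = {
    "sbj": {"farmer", "he"},
    "obj": {"money"},
    "pp:from": {"company"},
    "pp:for": {"supply"},
}

ratio = common_case_ratio(cf1, cf2)  # 3 shared slots out of 4
```

Pairs of frames whose similarity exceeds some threshold would then be merged; the threshold itself is not given on the slides.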

10 Results
Obtained case frames for 9,300 verbs
Evaluated case frames of 20 verbs
– Criteria:
    Verb usage is disambiguated by dominant arguments
    Case frames must have obligatory case slots
    Case slots, except a dominant one, may contain an ineligible example
– Accuracy: 88.4%

11 Examples of obtained case frames

           CS       examples
burn (1)   sbj      they:262, it:113, protester:99, …
           obj      flag:247, effigy:81, house:67, …
           pp:in    :29, ramallah:14, brisbane:11, …
           pp:for   week:15, hour:6, month:5, …
burn (2)   sbj      candle:26, lamp:5
           pp:on    motor-scooter:7, altar:3, platform:1, …
           pp:for   day:2, steinhaeuser:1
…
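A table like this maps naturally onto nested dictionaries of counts. The sketch below stores only the examples legible on the slide and picks a frame for a new set of arguments by summing matching counts. That scoring rule and the name `select_frame` are illustrative assumptions; the slides do not describe how case frames are applied at analysis time.

```python
# Counts copied from the slide; entries elided with "…" there are simply absent here.
BURN_FRAMES = {
    "burn (1)": {
        "sbj": {"they": 262, "it": 113, "protester": 99},
        "obj": {"flag": 247, "effigy": 81, "house": 67},
        "pp:for": {"week": 15, "hour": 6, "month": 5},
    },
    "burn (2)": {
        "sbj": {"candle": 26, "lamp": 5},
        "pp:on": {"motor-scooter": 7, "altar": 3, "platform": 1},
        "pp:for": {"day": 2, "steinhaeuser": 1},
    },
}


def select_frame(frames, args):
    """Return the name of the frame whose examples best match the arguments.

    Hypothetical scoring: sum the stored counts of each (slot, word) pair.
    Returns None when nothing matches.
    """
    best_name, best_score = None, 0
    for name, slots in frames.items():
        score = sum(slots.get(slot, {}).get(word, 0) for slot, word in args.items())
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

For example, `select_frame(BURN_FRAMES, {"sbj": "candle"})` picks the intransitive usage, while arguments such as `{"sbj": "protester", "obj": "flag"}` match the transitive one.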

12 Conclusion and future work
Constructed broad-coverage case frames for English
– Described real use of English verbs
Future work
– Use more sophisticated methods for extracting reliable parses [Kawahara and Uchimoto, 08]
– Integrate case frames into parsing (and other applications)
  cf. [Zeman, 02] for subcategorization frames
      [Kawahara and Kurohashi, 06] for case frames