Published by Norma Jacobs. Modified over 9 years ago.
1
A Method for Automatically Constructing Case Frames for English
Daisuke Kawahara and Kiyotaka Uchimoto
National Institute of Information and Communications Technology
(LREC2008, 2008/05/29)
2
Background
NLP analyzers so far
– (Mainly) supervised, (relatively) knowledge-poor
  e.g., PP-attachment or parsing
    Mary ate the salad with a fork
    Mary ate the salad with mushrooms
– Only 1.5% of bilexical dependency was learned [Bikel, 04]
Toward knowledge-oriented NLP
– Automatically compile case frames and integrate them into NLP analyzers/applications
3
Related work
Subcategorization frames (SCFs)
– [Brent, 93] [Ushioda et al., 93] [Manning, 93] [Briscoe and Carroll, 97] [Korhonen, 02] …
  e.g., She greeted me.       NP(sbj) greet NP(obj)
  e.g., She gave him a book.  NP(sbj) give NP(obj) NP(obj)

                            # of SCFs   # of verbs   corpus size   Acc
[Brent, 1993]                       6           63          1.2M   85%
[Ushioda et al., 1993]              6           33          0.3M   86%
[Manning, 1993]                    19          200          4.1M   82%
[Ersan & Charniak, 1996]           16           30           36M   70%
[Carroll & Rooth, 1998]            15          100           30M   77%
[Briscoe & Carroll, 1997]         161            7          1.2M   81%
[Sarkar & Zeman, 2000]            137          914          0.3M   88%
4
Related work
Subcategorization frames
– [Brent, 93] [Ushioda et al., 93] [Manning, 93] [Briscoe and Carroll, 97] [Korhonen, 02] …
(Handmade) frames
– FrameNet [Baker et al., 98], PropBank [Palmer et al., 05]
Japanese case frames
– Semantics-based: [Haruno, 95] [Utsuro et al., 96]
– Example-based: [Kawahara and Kurohashi, 06]
5
Construction of case frames for Japanese [Kawahara and Kurohashi, LREC2006]

                              CS   examples (in English)
yaku (1) (bake)               ga   I:18, person:15, craftsman:10, …
                              wo   bread:2484, meat:1521, cake:1283, …
                              de   oven:1630, frying pan:1311, …
yaku (2) (have difficulty)    ga   teacher:3, government:3, person:3, …
                              wo   hand:2950
                              ni   attack:18, action:15, son:15, …
yaku (3) (burn)               ga   company:1, distributor:1, …
                              wo   data:178, file:107, copy:9, …
                              ni   R:1583, CD:664, CDR:3, …
…

CS: case slot. ga: nominative, wo: accusative, ni: dative, de: instrument
6
Construction of case frames for English
100M sentences (English Gigaword)
→ Filtering and Parsing (MSTParser): 47M sents.
→ Predicate-argument structures
→ Clustering (WordNet)
→ Case frames for 10K predicates

Predicate-argument structures:
  sbj:you pred:borrow obj:idea pp:from:artist
  sbj:she pred:borrow obj:idea pp:over:year
  sbj:i pred:borrow obj:dollar pp:from:friend
  sbj:farmer pred:borrow obj:money pp:for:supply
  sbj:he pred:borrow obj:money pp:from:company
After grouping:
  sbj:{you,she} pred:borrow obj:idea pp:from:artist pp:over:year
  sbj:i pred:borrow obj:dollar pp:from:friend
  sbj:{farmer,he} pred:borrow obj:money pp:for:supply pp:from:company
After clustering:
  sbj:{you,she} pred:borrow obj:idea pp:from:artist pp:over:year
  sbj:{farmer,he} pred:borrow obj:{money,dollar} pp:for:supply pp:from:{company,friend}
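The slot:word notation used for the predicate-argument structures above can be read into a simple data structure. The following is a minimal sketch, not the authors' code; the function name `parse_pa` and the dictionary layout are assumptions made for illustration.

```python
def parse_pa(line):
    """Parse one predicate-argument line, e.g.
    'sbj:you pred:borrow obj:idea pp:from:artist',
    into (predicate, {case_slot: word}).

    Hypothetical helper: the notation comes from the slide, the
    function and its return type are illustrative assumptions.
    """
    predicate = None
    slots = {}
    for token in line.split():
        # Split on the LAST colon so that a slot such as 'pp:from' stays intact.
        slot, _, word = token.rpartition(":")
        if slot == "pred":
            predicate = word
        else:
            slots[slot] = word
    return predicate, slots


if __name__ == "__main__":
    print(parse_pa("sbj:you pred:borrow obj:idea pp:from:artist"))
```

Splitting on the last colon is the one design choice worth noting: prepositional slots carry a colon inside the slot name, so a plain split on the first colon would break them.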
7
Specification of our case frames
Case slots
– surface cases (dependency labels) and prepositions
  sbj, obj, obj2, pp:for, pp:in, …
Instances
– words
– several semantic markers
8
Details of case frame construction
Use only reliable parses
– Sentence length <= 20 words
– MSTParser [McDonald et al., 06]
Extract predicate-argument structures
– From labeled dependency parses
Group and cluster p-a structures
– Grouping by a dominant case slot
  pre-defined order: obj, sbj, pp:*
– Clustering based on WordNet

Labeled dependency acc.: 89.9% → 91.5%
Complete rate: 36.3% → 56.4%
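The grouping step (dominant case slot chosen in the pre-defined order obj, sbj, pp:*) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the names `dominant_slot` and `group_by_dominant_slot` are hypothetical, and picking the alphabetically first pp:* slot when several are present is an assumption, since the slides do not say how ties among prepositions are resolved.

```python
from collections import Counter, defaultdict

# Pre-defined order from the slide: obj, then sbj, then any pp:* slot.
SLOT_ORDER = ["obj", "sbj"]


def dominant_slot(slots):
    """Return the dominant case slot of one predicate-argument structure."""
    for name in SLOT_ORDER:
        if name in slots:
            return name
    # Assumption: among several pp:* slots, take the alphabetically first.
    pp_slots = sorted(s for s in slots if s.startswith("pp:"))
    return pp_slots[0] if pp_slots else None


def group_by_dominant_slot(structures):
    """Merge structures sharing predicate, dominant slot and its word.

    `structures` is an iterable of (predicate, {slot: word}) pairs.
    Returns {(predicate, slot, word): {slot: Counter of words}}.
    """
    groups = defaultdict(lambda: defaultdict(Counter))
    for predicate, slots in structures:
        dom = dominant_slot(slots)
        if dom is None:
            continue  # no usable argument, skip
        key = (predicate, dom, slots[dom])
        for slot, word in slots.items():
            groups[key][slot][word] += 1
    return groups


# The five 'borrow' structures from the construction slide.
STRUCTURES = [
    ("borrow", {"sbj": "you", "obj": "idea", "pp:from": "artist"}),
    ("borrow", {"sbj": "she", "obj": "idea", "pp:over": "year"}),
    ("borrow", {"sbj": "i", "obj": "dollar", "pp:from": "friend"}),
    ("borrow", {"sbj": "farmer", "obj": "money", "pp:for": "supply"}),
    ("borrow", {"sbj": "he", "obj": "money", "pp:from": "company"}),
]

if __name__ == "__main__":
    for key, frame in group_by_dominant_slot(STRUCTURES).items():
        print(key, {slot: dict(words) for slot, words in frame.items()})
```

On the five borrow examples this yields three groups, keyed by the objects idea, dollar and money. Counters are used instead of sets so that instance frequencies, which the resulting case frames report, are kept.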
9
Clustering of case frames
CF 1: sbj: { i }  obj: { dollar }  pp:from: { friend }
CF 2: sbj: { farmer, he }  obj: { money }  pp:from: { company }  pp:for:supply
[Figure: slot-by-slot comparison of CF 1 and CF 2 with numeric annotations 5, 3, 10, 8, 1, 11, 3 and 0.82, 0.73, 1.0. Labels: ratio of common cases; similarity between instances (words): 0.73; similarity between case frames.]
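The slide names three quantities (similarity between instances, ratio of common cases, similarity between case frames) without giving formulas. The sketch below shows one way they could fit together and is an assumption throughout: the product form, the frequency-weighted averaging, the geometric mean in `common_case_ratio`, the toy similarity table standing in for a WordNet-based measure, and the instance counts are all illustrative and not taken from the presentation.

```python
from math import sqrt

# Toy word similarities (made-up values) standing in for a WordNet-based measure.
TOY_WORD_SIM = {
    frozenset(("dollar", "money")): 0.8,
    frozenset(("friend", "company")): 0.6,
    frozenset(("i", "he")): 0.9,
    frozenset(("i", "farmer")): 0.7,
}


def word_sim(a, b):
    """Similarity of two words: 1.0 if identical, else table lookup (default 0.0)."""
    if a == b:
        return 1.0
    return TOY_WORD_SIM.get(frozenset((a, b)), 0.0)


def instance_sim(inst1, inst2):
    """Frequency-weighted average word similarity between two slots' instances."""
    num = sum(f1 * f2 * word_sim(w1, w2)
              for w1, f1 in inst1.items()
              for w2, f2 in inst2.items())
    den = sum(inst1.values()) * sum(inst2.values())
    return num / den


def common_case_ratio(cf1, cf2):
    """Assumed definition: geometric mean of each frame's share of
    instance counts that fall in slots present in both frames."""
    common = set(cf1) & set(cf2)

    def share(cf):
        total = sum(sum(inst.values()) for inst in cf.values())
        shared = sum(sum(cf[s].values()) for s in common)
        return shared / total

    return sqrt(share(cf1) * share(cf2))


def case_frame_sim(cf1, cf2):
    """Assumed combination: mean instance similarity over common slots
    multiplied by the ratio of common cases."""
    common = set(cf1) & set(cf2)
    if not common:
        return 0.0
    inst = sum(instance_sim(cf1[s], cf2[s]) for s in common) / len(common)
    return inst * common_case_ratio(cf1, cf2)


# CF 1 and CF 2 from the slide; the counts are illustrative only.
CF1 = {"sbj": {"i": 1}, "obj": {"dollar": 1}, "pp:from": {"friend": 1}}
CF2 = {"sbj": {"farmer": 1, "he": 1}, "obj": {"money": 2},
       "pp:from": {"company": 1}, "pp:for": {"supply": 1}}

if __name__ == "__main__":
    print(round(case_frame_sim(CF1, CF2), 3))
```

In such a scheme, two case frames would be merged when their similarity exceeds some threshold; the slot pp:for, present only in CF 2, lowers the ratio of common cases and therefore the overall score.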
10
Results
Obtained case frames for 9,300 verbs
Evaluated case frames of 20 verbs
– Criteria:
    Verb usage is disambiguated by dominant arguments
    Case frames must have obligatory case slots
    Case slots, except a dominant one, may contain an ineligible example
– Accuracy: 88.4%
11
Examples of obtained case frames

            CS       examples
burn (1)    sbj      they:262, it:113, protester:99, …
            obj      flag:247, effigy:81, house:67, …
            pp:in    :29, ramallah:14, brisbane:11, …
            pp:for   week:15, hour:6, month:5, …
burn (2)    sbj      candle:26, lamp:5
            pp:on    motor-scooter:7, altar:3, platform:1, …
            pp:for   day:2, steinhaeuser:1
…
12
Conclusion and future work
Constructed broad-coverage case frames for English
– Described real use of English verbs
Future work
– Use more sophisticated methods for extracting reliable parses [Kawahara and Uchimoto, 08]
– Integrate case frames into parsing (and other applications)
  cf. [Zeman, 02] for subcategorization frames
      [Kawahara and Kurohashi, 06] for case frames