Published by Nathan Nicholson. Modified over 8 years ago.
1
Chinese Proposition Bank
Nianwen Xue, Chingyi Chia, Scott Cotton, Seth Kulick, Fu-Dong Chiou, Martha Palmer, Mitch Marcus
2
Outline: Motivation, Overview, Guidelines and Frame Files, Annotation Procedure, Project Status
3
Machine Translation
他 /he 在 /at 这 /this 个 /CL 文件 /document 上 /on 签 /sign 了 /ASP 自己 /self 的 /DE 名字 /name
SYSTRAN: He has signed own name in this document
Correct: He signed his own name on this document.
他 /he 在 /at 这 /this 个 /CL 文件 /document 上 /on 签字 /sign
SYSTRAN: He signs in this document
Correct: He signed this document.
Problem: The prepositional phrase is NOT a semantic adjunct.
4
MT: Further Examples
俄罗斯 /Russia 撤回 /withdraw 军队 /army.
SYSTRAN: Russia withdraws the army.
Correct: Russia withdrew the army.
俄罗斯 /Russia 军队 /army 撤回 /withdraw 莫斯科 /Moscow.
SYSTRAN: The Russian army withdraws Moscow
Correct: The Russian army withdrew to Moscow.
Problem: The postverbal argument 莫斯科 /Moscow is the goal (arg2), not the theme (arg1)!
5
Where We Are: Motivation, Overview, Guidelines and Frame Files, Annotation Procedure, Project Status
6
An Example
Frameset 1:
国会 /Congress 最近 /recently 通过 /pass 了 /ASP 银行法 /banking law
“The Congress passed the banking law recently.”
银行法 /banking law 最近 /recently 通过 /pass 了 /ASP
“The banking law passed recently.”
Frameset 2:
火车 /train 正在 /now 通过 /pass 隧道 /tunnel
“The train is passing through the tunnel.”
火车 /train 正在 /now 通过 /pass
“The train is passing.”
7
Annotation Model
VERB -> framesets: FS0, FS1, FS2, ..., FSi
Frameset -> frames: F0, F1, F2, ..., Fj
Frame -> arguments: Arg0, Arg1, Arg2, ..., Argk
Example: 国会 /Congress 通过 /pass 了 /ASP 银行法 /banking law
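The verb, frameset, frame, argument hierarchy can be pictured as nested records. The following is a minimal sketch, not the project's actual file format: the class names, the field names and the informal role descriptions are assumptions made for illustration, and only the first frameset of 通过 /pass is filled in, with the frames suggested by the two "banking law" sentences.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    """One subcat frame: the argument labels realised in that frame."""
    args: List[str]


@dataclass
class Frameset:
    """A frameset: one set of roles plus the subcat frames that realise it."""
    name: str
    roles: Dict[str, str]  # label -> informal description (assumed wording)
    frames: List[Frame] = field(default_factory=list)


@dataclass
class Verb:
    """A verb entry holding all of its framesets."""
    lemma: str
    framesets: List[Frameset] = field(default_factory=list)


# Hypothetical entry for 通过 /pass, frameset 1 only.
tongguo = Verb(
    lemma="通过",
    framesets=[
        Frameset(
            name="FS1",
            roles={"Arg0": "entity passing something", "Arg1": "thing passed"},
            frames=[Frame(["Arg0", "Arg1"]), Frame(["Arg1"])],
        )
    ],
)

for fs in tongguo.framesets:
    for frame in fs.frames:
        print(tongguo.lemma, fs.name, frame.args)
```

Nesting frames under framesets keeps the role inventory in one place, so each frame only has to say which of those roles it realises.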
8
Where We Are: Motivation, Overview, Guidelines and Frame Files, Annotation Procedure, Project Status
9
Annotation Approach
Guidelines: specify how to create frame files and address general annotation issues, e.g. the annotation of semantic adjuncts.
Frame files: specify how each verb is annotated.
10
Frame Files
Description of the framesets and of the subcat frames belonging to each frameset
Description of the set of roles associated with each frameset
Mapping between the syntactic entities and the argument labels, e.g. 法案 /bill 通过 /pass 了 /ASP: SUBJECT -> arg1, VERB -> REL
Annotated example for each subcat frame
11
Defining Framesets (1)
Defining framesets involves characterizing the arguments of a verb in terms of (a) their syntactic realizations (subcat frames) and (b) their “semantic” properties.
Two subcat frames are the same if they have the same type and number of arguments; otherwise they are different.
One subcat frame subsumes another if the arguments of the latter are a subset of the arguments of the former.
All subcat frames that belong to a frameset should either be identical to or subsume one another.
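The identity and subsumption tests above can be written down directly. This is a minimal sketch under one assumption: a subcat frame is modelled as a multiset of argument labels, so that "type and number of arguments" becomes a comparison of counts. The function names are hypothetical.

```python
from collections import Counter


def same_frame(a, b):
    """Two frames are the same if they have the same type and number of arguments."""
    return Counter(a) == Counter(b)


def subsumes(a, b):
    """Frame a subsumes frame b if the arguments of b are a subset of those of a."""
    count_a, count_b = Counter(a), Counter(b)
    return all(count_a[arg] >= n for arg, n in count_b.items())


def is_consistent_frameset(frames):
    """Every pair of frames must be identical or stand in a subsumption relation."""
    return all(
        subsumes(a, b) or subsumes(b, a)
        for i, a in enumerate(frames)
        for b in frames[i + 1:]
    )


transitive = ["Arg0", "Arg1"]
intransitive = ["Arg1"]
other = ["Arg1", "Arg2"]

print(subsumes(transitive, intransitive))                   # True
print(is_consistent_frameset([transitive, intransitive]))   # True
print(is_consistent_frameset([transitive, other]))          # False
```

Identical frames subsume each other, so the pairwise check does not need a separate equality case.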
12
Defining Framesets (2)
Syntactic realizations and semantic properties are expected to coincide most of the time: a difference (or similarity) in meaning is reflected in a difference (or similarity) in syntactic realization (cf. Levin 1993), e.g. 通过 /pass
13
Defining Framesets (3)
Framesets are NOT distinguished if a verb has different “senses” that are realized in the same subcat frame or set of subcat frames, e.g. 统一
Sense 1: standardize
分词 /segmentation 标准 /standard 要 /should 统一 /standardize
我们 /we 要 /will 统一 /standardize 分词 /segmentation 标准 /standard
Sense 2: reunite
韩国 /Korea 要 /should 统一 /reunite
他们 /they 要 /should 统一 /reunite 韩国 /Korea
14
Don’t Forget the Adjuncts
Adjuncts are more global, i.e., not specific to individual verbs or to a class of verbs.
Adjuncts are tagged as ArgM plus a functional tag indicating their type.
The annotation of adjuncts is specified in the guidelines.
15
Functional Tags for Adjuncts
ADV: adverbial, default tag
BNF: beneficiary
CND: condition
DIR: direction
DGR: degree
FRQ: frequency
LOC: locative
MNR: manner
PRP: purpose or reason
TMP: temporal
TPC: topic
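Because the adjunct tags form a closed list, a label such as ArgM-TMP can be checked mechanically. The sketch below copies the tag inventory from the slide; the function name and the error handling are assumptions for illustration and are not part of the project's tools.

```python
# Tag inventory copied from the slide above.
ADJUNCT_TAGS = {
    "ADV": "adverbial, default tag",
    "BNF": "beneficiary",
    "CND": "condition",
    "DIR": "direction",
    "DGR": "degree",
    "FRQ": "frequency",
    "LOC": "locative",
    "MNR": "manner",
    "PRP": "purpose or reason",
    "TMP": "temporal",
    "TPC": "topic",
}


def parse_adjunct_label(label):
    """Split a label like 'ArgM-TMP' into ('ArgM', 'TMP'), rejecting unknown tags."""
    prefix, sep, tag = label.partition("-")
    if prefix != "ArgM" or not sep:
        raise ValueError(f"not an adjunct label: {label!r}")
    if tag not in ADJUNCT_TAGS:
        raise ValueError(f"unknown functional tag: {tag!r}")
    return prefix, tag


prefix, tag = parse_adjunct_label("ArgM-TMP")
print(prefix, tag, ADJUNCT_TAGS[tag])
```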
16
Functional Tags for Arguments and Phrasal Verbs
PRD: predicate
AS: 为, 是, 作, 做
AT: 在, 于
INTO: 成, 入, 进
ONTO: 上
TO: 到, 至
TOWARDS: 向, 往
17
An Actual Example
商检 /commercial inspection 部门 /department 最近 /recently 将 /ba 检验 /inspection 时间 /time 由 /from 七 /seven 至 /to 十 /ten 天 /day 缩短 /shorten 到 /to 一 /one 至 /to 三 /three 天 /day.
“The commercial inspection department recently shortened the inspection time from 7 ~ 10 days to 1 ~ 3 days.”
REL: 缩短 /shorten
Arg0 (agent): 商检 /commercial inspection 部门 /department
Arg1 (theme): 检验 /inspection 时间 /time
Arg2 (range):
Arg3 (starting point): 由 /from 七 /seven 至 /to 十 /ten 天 /day
Arg4 (end point): 到 /to 一 /one 至 /to 三 /three 天 /day
ArgM-TMP: 最近 /recently
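One way to hold this annotation in a program is as labelled token spans over the segmented sentence. This is a sketch only: the span encoding (start index, end index exclusive) and the overlap check are assumptions, not the project's storage format. Arg2 is left out because it is not realised in this sentence.

```python
# The segmented sentence from the example, one token per word.
TOKENS = "商检 部门 最近 将 检验 时间 由 七 至 十 天 缩短 到 一 至 三 天".split()

# label -> (start, end) token offsets, end exclusive (assumed encoding).
SPANS = {
    "Arg0": (0, 2),       # 商检 部门
    "ArgM-TMP": (2, 3),   # 最近
    "Arg1": (4, 6),       # 检验 时间
    "Arg3": (6, 11),      # 由 七 至 十 天
    "REL": (11, 12),      # 缩短
    "Arg4": (12, 17),     # 到 一 至 三 天
}


def span_text(label):
    """Return the tokens covered by a label, joined with spaces."""
    start, end = SPANS[label]
    return " ".join(TOKENS[start:end])


def spans_overlap():
    """Sanity check: report whether any two labelled spans share a token."""
    ordered = sorted(SPANS.values())
    return any(prev_end > start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))


for label in SPANS:
    print(f"{label}: {span_text(label)}")
print("overlap:", spans_overlap())
```

The ba marker 将 at index 3 carries no label of its own, which is why the spans skip it.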
18
Where We Are: Motivation, Overview, Guidelines and Frame Files, Annotation Procedure, Project Status
19
Annotation Procedure
Automatic preprocessing. Preliminary results: 30 predicates, 95% (verbs only), 79% (with nominalizations) (Xue and Kulick, HLT’03)
Manual checking
Double-blind annotation and adjudication
20
Extracting Subcat Frames
Traversing a parse tree, picking up constituents of interest, i.e., potential arguments, and generating a template representing the subcat frame.
“Normalizing” special constructions such as ba- and bei-constructions, and verb compounds.
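The traversal described above might look roughly like the following. Everything here is a simplified assumption: the tree encoding (nested tuples), the node labels and the treatment of ba are illustrative and do not reproduce the actual treebank bracketing or the project's extractor. Bei-constructions and verb compounds are not handled.

```python
# A node is (label, children) for a phrase or (label, word) for a leaf.
CANONICAL_ORDER = ["NP-SBJ", "REL", "NP-OBJ"]
OF_INTEREST = {"NP-SBJ", "NP-OBJ"}


def is_leaf(node):
    return isinstance(node[1], str)


def extract_template(tree):
    """Collect potential arguments and return a normalized subcat template."""
    slots = []

    def walk(node, after_ba=False):
        label = node[0]
        if label == "VV":
            slots.append("REL")          # the verb itself
        elif after_ba and label.startswith("NP"):
            slots.append("NP-OBJ")       # normalize: the ba-marked NP acts as object
        elif label in OF_INTEREST:
            slots.append(label)
        elif not is_leaf(node):
            seen_ba = False
            for child in node[1]:
                walk(child, after_ba=seen_ba)
                seen_ba = child[0] == "BA"

    walk(tree)
    # Normalizing also means emitting slots in one canonical order.
    return sorted(slots, key=CANONICAL_ORDER.index)


plain = ("IP", [("NP-SBJ", "国会"),
                ("VP", [("VV", "通过"), ("AS", "了"), ("NP-OBJ", "银行法")])])
ba = ("IP", [("NP-SBJ", "部门"),
             ("VP", [("BA", "将"), ("NP", "时间"), ("VP", [("VV", "缩短")])])])

print(extract_template(plain))
print(extract_template(ba))
```

Both trees yield the same template, which is the point of normalization: a ba-construction and its plain transitive counterpart end up under one subcat frame.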
21
Using the Subcat Frames
Tagging the arguments
Input: subcat frames, mappings
Output: argument labels
Sorting the verbs by subcat frames
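The two uses listed above can be sketched as a table lookup and a grouping step. The mapping table is hypothetical: its intransitive entry follows the SUBJECT -> arg1, VERB -> REL mapping given for 通过 /pass in the frame-file slide, while the transitive entry and the slot names are assumptions.

```python
from collections import defaultdict

# verb -> subcat frame -> mapping from syntactic slot to argument label.
MAPPINGS = {
    "通过": {
        ("NP-SBJ", "REL", "NP-OBJ"): {"NP-SBJ": "Arg0", "REL": "REL", "NP-OBJ": "Arg1"},
        ("NP-SBJ", "REL"): {"NP-SBJ": "Arg1", "REL": "REL"},
    },
}


def tag_arguments(verb, frame):
    """Input: a verb and its subcat frame. Output: one argument label per slot."""
    mapping = MAPPINGS[verb][frame]
    return [mapping[slot] for slot in frame]


def sort_by_frame(instances):
    """Group verb instances by subcat frame so similar cases are reviewed together."""
    groups = defaultdict(list)
    for verb, frame in instances:
        groups[frame].append(verb)
    return dict(groups)


print(tag_arguments("通过", ("NP-SBJ", "REL", "NP-OBJ")))
print(tag_arguments("通过", ("NP-SBJ", "REL")))
print(sort_by_frame([("通过", ("NP-SBJ", "REL")),
                     ("统一", ("NP-SBJ", "REL")),
                     ("通过", ("NP-SBJ", "REL", "NP-OBJ"))]))
```

Keying the mapping on the whole frame, rather than on single slots, is what lets the same subject position receive Arg0 in the transitive frame and Arg1 in the intransitive one.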
22
Where We Are: Motivation, Overview, Guidelines and Frame Files, Annotation Procedure, Project Status
23
Project Status
Guidelines ready. No major revision expected.
About 300 frame files created, at a rate of 40 to 50 verbs per week.
Automatic tagger ready.
Annotation interface ready.
24
What’s Coming
Continued creation of frame files
Double-blind hand-correction
Adjudication