The SALSA experience: semantic role annotation Katrin Erk University of Texas at Austin.

1 The SALSA experience: semantic role annotation Katrin Erk University of Texas at Austin

2 Semantic role annotation in SALSA
- SALSA: the Saarbrücken Lexical Semantics Annotation and Analysis project
- Manual annotation of the German TIGER corpus with lexical semantic information
  - Basis: the Berkeley FrameNet database
  - Verbs annotated with their frame (~ sense), plus semantic roles
- TIGER corpus:
  - 1.5 million words / 80 K sentences of German newspaper text (Frankfurter Rundschau)
  - Stuttgart/Potsdam/Saarbrücken
  - Phrase types and grammatical functions

3 Annotation Scheme
(They didn't want to pay the move back because the employee had quit.)
Semantics:
- Independent frames
- Trees of depth one
- One edge points to the target, the others to frame elements
- Semantic roles point to syntactic constituents
TIGER syntax:
- Node labels: constituents
- Edge labels: grammatical functions
- Crossing edges
- POS
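
The semantic side of the scheme above (independent frames, each a tree of depth one whose edges point to syntactic constituents) can be sketched as a small data structure. This is only an illustration: the class names, the node IDs, and the frame and role labels chosen for the example sentence are assumptions and are not taken from the SALSA data.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class RoleEdge:
    """One edge of the depth-one tree: a role label plus the constituents it points to."""
    role: str
    node_ids: Tuple[str, ...]


@dataclass
class FrameInstance:
    """A frame as a tree of depth one: one target edge and any number of role edges."""
    frame: str
    target_ids: Tuple[str, ...]
    roles: List[RoleEdge] = field(default_factory=list)

    def add_role(self, role: str, *node_ids: str) -> None:
        # A role may point to several constituents, hence the tuple of IDs.
        self.roles.append(RoleEdge(role, tuple(node_ids)))

    def nodes_for(self, role: str) -> List[str]:
        # Collect every constituent ID reached by edges carrying this role label.
        return [nid for edge in self.roles if edge.role == role for nid in edge.node_ids]


# Two independent frames over the same sentence.
# All IDs and labels below are hypothetical, chosen only for illustration.
pay = FrameInstance("Commerce_pay", target_ids=("s1_7",))
pay.add_role("Buyer", "s1_500")
pay.add_role("Goods", "s1_501")

quit_frame = FrameInstance("Quitting", target_ids=("s1_12",))
quit_frame.add_role("Employee", "s1_503")

print(pay.nodes_for("Goods"))
```

Keeping each frame as its own flat object mirrors the "independent frames" point: neither frame refers to the other, and both refer only to syntactic node IDs.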

4 Experiences with the semantic role annotation in SALSA
- Frame (~ sense) assignment is more difficult than role assignment
- Multiple tags possible, at frame level and at role level
- Limited-compositionality phenomena, each with a separate annotation format in SALSA:
  - Light verbs, metaphor, idioms
  - Distinction often difficult: metaphor vs. idiom, bleaching
  - If I did this again: one format, multiple tags possible
- Annotation beyond the sentence boundary
  - Message role in Communication frames
- Annotation below the word boundary: German noun compounds
  - Mietrechtsdiskussion: discussion of tenant law

5 Encoding sem. role annotation: TIGER XML as a great basis
- TIGER XML: each constituent is an XML element with a globally unique ID
  - Syntactic edges are explicitly encoded: edge elements link two nodes, referring to their IDs
  - Models discontinuous constituents
- Salsa/Tiger XML:
  - Semantic annotation by adding a modular block to the XML structure of a sentence
  - Semantics points to syntactic constituents using their IDs
  - Annotation beyond the sentence boundary is possible: globally unique syntactic IDs
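
To make the ID-based linking concrete, here is a minimal sketch that reads a toy sentence with a syntax part and a separate semantics block and resolves each role to the words it covers. The element and attribute names are modeled on TIGER XML and Salsa/Tiger XML but simplified from memory, so treat them as assumptions; the sentence, IDs, and frame annotation are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Toy sentence: syntax in <graph>, semantics in a separate <sem> block.
# Simplified and invented; not an excerpt from the SALSA release.
SENTENCE = """
<s id="s1">
  <graph root="s1_501">
    <terminals>
      <t id="s1_1" word="Der" pos="ART"/>
      <t id="s1_2" word="Angestellte" pos="NN"/>
      <t id="s1_3" word="kündigte" pos="VVFIN"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="NP">
        <edge label="NK" idref="s1_1"/>
        <edge label="NK" idref="s1_2"/>
      </nt>
      <nt id="s1_501" cat="S">
        <edge label="SB" idref="s1_500"/>
        <edge label="HD" idref="s1_3"/>
      </nt>
    </nonterminals>
  </graph>
  <sem>
    <frames>
      <frame name="Quitting" id="s1_f1">
        <target><fenode idref="s1_3"/></target>
        <fe name="Employee" id="s1_f1_e1"><fenode idref="s1_500"/></fe>
      </frame>
    </frames>
  </sem>
</s>
"""


def terminal_yield(node_id, children):
    """Return the terminal IDs dominated by node_id (the node itself if it is a terminal)."""
    if node_id not in children:
        return [node_id]
    ids = []
    for child in children[node_id]:
        ids.extend(terminal_yield(child, children))
    return ids


def read_frames(xml_text):
    """Resolve every frame's target and roles to surface words via syntactic IDs."""
    s = ET.fromstring(xml_text)
    terminals = list(s.iter("t"))
    order = {t.get("id"): i for i, t in enumerate(terminals)}
    words = {t.get("id"): t.get("word") for t in terminals}
    children = {nt.get("id"): [e.get("idref") for e in nt.findall("edge")]
                for nt in s.iter("nt")}

    def covered_text(parent):
        ids = []
        for fenode in parent.findall("fenode"):
            ids.extend(terminal_yield(fenode.get("idref"), children))
        # Sorting by surface position also copes with discontinuous constituents.
        ids.sort(key=order.__getitem__)
        return " ".join(words[i] for i in ids)

    result = []
    for frame in s.iter("frame"):
        roles = {fe.get("name"): covered_text(fe) for fe in frame.findall("fe")}
        result.append((frame.get("name"), covered_text(frame.find("target")), roles))
    return result


print(read_frames(SENTENCE))
```

Because the semantics block only stores references, the syntax part stays untouched, and a reference could in principle name a node from another sentence as long as IDs are globally unique.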

6 Extracting a lexicon: need for a deeper, richer syntax
- Extracting the syntax/semantics mapping requires identifying the grammatical functions filled by semantic roles
- Problems:
  - Constituent structure rather than dependencies: subjects are hard to retrieve
  - TIGER does not mark voice
  - Shallow format for PPs: determining heads is hard
  - Coordination is a pain
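
A naive version of the extraction step can be sketched as follows: for each role, look up the label of the syntactic edge that enters the constituent the role points to, and count (frame, role, function) triples. The edge list, IDs, and function names are hypothetical, and this is not the procedure SALSA used. It reads only the edge directly above the role filler, which is exactly where the problems listed above (unmarked voice, flat PPs, coordination) make the result unreliable.

```python
from collections import Counter

# Syntactic edges as (parent_id, edge_label, child_id); a hypothetical toy tree.
EDGES = [
    ("s1_501", "SB", "s1_500"),  # subject NP
    ("s1_501", "HD", "s1_3"),    # head verb
    ("s1_500", "NK", "s1_1"),
    ("s1_500", "NK", "s1_2"),
]


def incoming_labels(edges):
    """Map each node ID to the label of the edge that enters it."""
    return {child: label for _, label, child in edges}


def role_functions(frame, roles, edges):
    """Count (frame, role, grammatical function) triples for one frame instance.

    roles maps a role name to the ID of the constituent it points to.
    Nodes without an incoming edge are reported as "ROOT".
    """
    labels = incoming_labels(edges)
    return Counter((frame, role, labels.get(node_id, "ROOT"))
                   for role, node_id in roles.items())


counts = role_functions("Quitting", {"Employee": "s1_500"}, EDGES)
print(counts)
```

Summing such counters over a corpus would give a first approximation of a lexicon entry per frame, with the caveat that a passive sentence would be counted under the same labels as an active one because voice is not marked.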

