Published by Thomas Fontes Gameiro. Modified over 7 years ago.
1
Development of a lexical resource annotated with semantic roles for Portuguese
Leonardo Zilio
Supervisors: Prof. Dr. Maria José Bocorny Finatto and Prof. Dr. Aline Villavicencio
2
Goals
To make available a lexical resource for Portuguese with manually annotated semantic roles
To compare the use of verbs in specialized and non-specialized language contexts
3
Semantic Roles
John went to the park.
    John = Agent; the park = Destination
John opened the door with his key.
    John = Agent; the door = Patient; his key = Instrument
The door opens with a key.
    The door = Patient; a key = Instrument
4
Related Work
PropBank [1, 2]: numbered roles, Arg0 to Arg5, plus adjuncts
VerbNet [3, 4]: 36 descriptive roles (Agent, Patient, Theme, etc.)
5
Workflow
Corpora: Cardiology and Newspaper
PALAVRAS parser [5]: dependencies
SCF extraction tool [6, 7]
Selection of verbs to be annotated
Database
Manual annotation of the arguments
Transformation of the results from the database into XML
Release of the results
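The step that turns database results into XML can be pictured with a minimal sketch. It assumes a hypothetical two-table layout (`sentence`, `argument`) and hypothetical element names (`resource`, `sentence`, `argument`); the slides do not describe the project's actual database schema or XML format. The annotated sentence is the English example from the Semantic Roles slide, used here only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: the real database layout is not given in the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sentence (id INTEGER PRIMARY KEY, verb TEXT, text TEXT);
CREATE TABLE argument (sentence_id INTEGER, position INTEGER, span TEXT, role TEXT);
""")
conn.execute(
    "INSERT INTO sentence VALUES (1, 'open', 'John opened the door with his key.')"
)
conn.executemany(
    "INSERT INTO argument VALUES (?, ?, ?, ?)",
    [
        (1, 1, "John", "Agent"),
        (1, 2, "the door", "Patient"),
        (1, 3, "his key", "Instrument"),
    ],
)

def export_xml(connection):
    """Build one <sentence> element per row, with its arguments as children."""
    root = ET.Element("resource")  # hypothetical root element name
    sentences = connection.execute(
        "SELECT id, verb, text FROM sentence ORDER BY id"
    ).fetchall()
    for sid, verb, text in sentences:
        node = ET.SubElement(root, "sentence", id=str(sid), verb=verb)
        ET.SubElement(node, "text").text = text
        arguments = connection.execute(
            "SELECT span, role FROM argument WHERE sentence_id = ? ORDER BY position",
            (sid,),
        )
        for span, role in arguments:
            ET.SubElement(node, "argument", role=role).text = span
    return root

xml_root = export_xml(conn)
print(ET.tostring(xml_root, encoding="unicode"))
```

Keeping the role as an attribute and the annotated span as element text is only one possible design; the released files may be organized differently.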
6
SCF Extractor [6, 7]
1 - Input: corpora analysed with the PALAVRAS parser [5] (dependency trees)
7
Dependency tree
João viu o cachorro. (John saw the dog.)

    João [João] <hum> PROP M #1->2
    viu [ver] <vH> <fmc> <mv> V PS 3S IND #2->0
    o [o] <artd> DET M #3->4
    cachorro [cachorro] <Azo> N M #4->2
    $. #5->0
    </s>

Labels on the slide: Lemma, Syntax, Extras, Dependency Grammar
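The parser output above can be read mechanically. The following is a minimal sketch, assuming each token line has the shape shown on the slide (word form, lemma in square brackets, secondary tags in angle brackets, remaining tags, then a `#index->head` link). The `Token` type and `parse_line` function are hypothetical names introduced for illustration; they are not part of PALAVRAS or of the SCF extractor.

```python
import re
from typing import List, NamedTuple, Optional

class Token(NamedTuple):
    form: str
    lemma: Optional[str]
    tags: List[str]    # secondary tags written in angle brackets, e.g. "hum"
    morph: List[str]   # remaining tags, e.g. part of speech and inflection
    index: int         # position of the token in the sentence
    head: int          # index of the governing token (0 = root)

# Trailing dependency link such as "#4->2".
DEP = re.compile(r"#(\d+)->(\d+)$")

def parse_line(line: str) -> Optional[Token]:
    """Parse one token line; return None for lines without a dependency link."""
    line = line.strip()
    match = DEP.search(line)
    if match is None:  # e.g. the sentence delimiter "</s>"
        return None
    index, head = int(match.group(1)), int(match.group(2))
    fields = line[: match.start()].split()
    lemma, tags, morph = None, [], []
    for field in fields[1:]:
        if field.startswith("[") and field.endswith("]"):
            lemma = field[1:-1]
        elif field.startswith("<") and field.endswith(">"):
            tags.append(field[1:-1])
        else:
            morph.append(field)
    return Token(fields[0], lemma, tags, morph, index, head)

SAMPLE = """\
João [João] <hum> PROP M #1->2
viu [ver] <vH> <fmc> <mv> V PS 3S IND #2->0
o [o] <artd> DET M #3->4
cachorro [cachorro] <Azo> N M #4->2
$. #5->0
</s>"""

tokens = [t for t in map(parse_line, SAMPLE.splitlines()) if t is not None]
for token in tokens:
    print(token.index, token.form, token.lemma, "->", token.head)
```

Real parser output contains more line types than this sample, so a production reader would need additional cases.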
8
Dependency tree
Root (0)
    Ver (2)
        João (1)
        cachorro (4)
            o (3)
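The tree on this slide follows directly from the `#index->head` links in the parser output. As a minimal sketch, the hypothetical helpers below rebuild and print it from (index, label, head) triples; the sentence-final punctuation token is left out, as on the slide.

```python
from collections import defaultdict

# (index, label, head) triples for "João viu o cachorro."; head 0 is the root.
NODES = [(1, "João", 2), (2, "ver", 0), (4, "cachorro", 2), (3, "o", 4)]

def build_children(nodes):
    """Map each head index to the sorted list of its dependents."""
    children = defaultdict(list)
    for index, _label, head in nodes:
        children[head].append(index)
    for dependents in children.values():
        dependents.sort()
    return children

def render(nodes, root=0):
    """Return the tree as indented lines, one node per line."""
    labels = {index: label for index, label, _head in nodes}
    children = build_children(nodes)
    lines = ["Root (0)"]

    def walk(node, depth):
        for child in children.get(node, []):
            lines.append("  " * depth + f"{labels[child]} ({child})")
            walk(child, depth + 1)

    walk(root, 1)
    return lines

print("\n".join(render(NODES)))
```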
9
SCF Extractor [6, 7]
2 - Processing of all sentences in the corpora
3 - Extraction of all dependencies of main verbs
4 - Analysis of the relevant dependencies (exclusion of adverbs)
4.1 - Classification of syntactic elements
5 - Output: database of SCFs (SQL file)
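Steps 3 to 5 can be illustrated with a rough sketch. It is not the extractor described in [6, 7]: the token tuples, the `POS_TO_CLASS` mapping and the `extract_frames` function are hypothetical, and the adverb "ontem" (yesterday) is added to the slide's example sentence only to show the adverb filter. The sketch treats a verb carrying the `mv` tag as a main verb, collects its direct dependents, drops adverbs, and counts the resulting frames in memory instead of writing an SQL file.

```python
from collections import Counter

# Tokens as (index, form, lemma, pos, secondary_tags, head).
# "ontem" is NOT in the original example; it is added to exercise the filter.
SENTENCE = [
    (1, "João", "João", "PROP", {"hum"}, 2),
    (2, "viu", "ver", "V", {"vH", "fmc", "mv"}, 0),
    (3, "o", "o", "DET", {"artd"}, 4),
    (4, "cachorro", "cachorro", "N", {"Azo"}, 2),
    (5, "ontem", "ontem", "ADV", set(), 2),
]

# Hypothetical coarse classification of syntactic elements by part of speech.
POS_TO_CLASS = {"PROP": "NP", "N": "NP", "PRP": "PP"}

def extract_frames(sentence):
    """Return (verb lemma, frame) pairs for every main verb in the sentence."""
    frames = []
    for index, _form, lemma, pos, tags, _head in sentence:
        if pos != "V" or "mv" not in tags:
            continue  # only main verbs are considered
        # Direct dependents of the verb, excluding adverbs.
        dependents = [t for t in sentence if t[5] == index and t[3] != "ADV"]
        dependents.sort(key=lambda t: t[0])
        frame = tuple(POS_TO_CLASS.get(t[3], t[3]) for t in dependents)
        frames.append((lemma, frame))
    return frames

counts = Counter(extract_frames(SENTENCE))
for (lemma, frame), frequency in counts.items():
    print(lemma, frame, frequency)
```

The determiner "o" does not appear in the frame because it depends on "cachorro", not on the verb.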
10
Interface
11
PHP-Interface – List of verbs
[Screenshot of the verb list; labelled elements: Frequency, Show frames]
12
PHP-Interface – List of examples
[Screenshot of the example list; labelled elements: Syntactic classification, Sentence, Arguments]
13
Semantic Role Labeling
Drop-down box with all available semantic roles
14
The Resource

                Newspaper    Cardiology
    Verbs       191          77
    Instances   5,301        1,931
    Arguments   11,089       4,192

Availability: up-to-date XML files can be downloaded at the CAMELEON Project website [8] under Resources > Semantic Role Labeling
15
Current Stage
Analysis of semantic roles for cross-genre comparison
Comparison with other resources, such as VerbNet.Br [4] and PropBank.Br [2]
16
Thank you!
17
Bibliography
1 = Palmer, Martha, Daniel Gildea and Paul Kingsbury. The Proposition Bank: A Corpus Annotated with Semantic Roles. In: Computational Linguistics, 31:1.
2 = Duran, Magali Sanches and Sandra Maria Aluísio. Propbank-Br: a Brazilian treebank annotated with semantic role labels. In: Proceedings of LREC 2012, May 21-27, Istanbul, Turkey.
3 = Kipper-Schuler, Karin. VerbNet: a broad-coverage, comprehensive verb lexicon. University of Pennsylvania.
4 = Scarton, Carolina. VerbNet.Br: construção semiautomática de um léxico verbal online e independente de domínio para o português do Brasil. NILC/USP.
5 = Bick, E. The Parsing System "Palavras": Automatic Grammatical Analysis of Portuguese in a Constraint Grammar Framework, Volume 202. Aarhus University Press, Aarhus.
6 = Zanette, Adriano (2010). Aquisição de Subcategorization Frames para Verbos da Língua Portuguesa. Undergraduate final project (Projeto de Diplomação). UFRGS. Advisor: Aline Villavicencio.
7 = Zilio, Leonardo, Adriano Zanette and Carolina Scarton. Automatic extraction of subcategorization frames from Portuguese corpora. In: Aluísio, S. M. and Tagnin, S. E. O. (eds.) New Language Technologies and Linguistic Research: a Two-Way Road. Cambridge Scholars Publishing, pp. 78-96.
8 =