Aina Peris and Mariona Taulé CBA 2010: Corpus-Based Approaches to Paraphrasing and Nominalization Barcelona, 1-2 December 2010.

• Introduction
• Related Work
• Methodology
• AnCora-Nom
• Conclusions and Future Work

Deverbal nominalizations contain rich semantic information
• La solución pasaba por [abaratar el despido].
  'The solution was to make dismissal cheaper.'
• La solución consistía en [el abaratamiento del despido].
  'The solution consisted of the cost reduction of dismissal.'

• New lexical resource: AnCora-Nom
  ◦ Deverbal nominalizations from AnCora-Es: 1,655 lemmas
  ◦ Semantic information
    • Argument structure
    • Denotation type: event, result, and underspecified
  ◦ Morphosyntactic information
    • Specifiers
    • Plurality
    • Constituents

• AnCora-Nom: 1,655 lexical entries linked to those of the corresponding verbs.
• NLP approach: helpful for information extraction tasks.
• Linguistic approach: an excellent resource for studying the argument realization of both nouns and verbs.

Manually built vs. automatically obtained
• NOMLEX (Macleod et al., 1998)
• NOMLEX-PLUS (Meyers et al., 2004)
• Berkeley FrameNet Project (Ruppenhofer et al., 2006)
  ◦ Spanish FrameNet (Subirats, 2009)
• The Essex Database of Russian Verbs and their Nominalizations (Spencer and Zaretskaya, 1999)
• NOMAGE lexicon (Balvet et al., 2010)

ANCORA-ES, ANCORA-NOM
• AnCora-Nom represents in the lexical entry the information coded in AnCora-Es.
• […],000-word Spanish corpus
• 23,000 deverbal nominalization tokens
• 1,655 nominalization types and lexical entries

Annotation of deverbal nominalizations in AnCora-Es

• Extraction process
  ◦ Consult all the occurrences of each lemma
  ◦ Establish the different nominal senses
  ◦ Extract the features associated with each sense
  ◦ Lexicalized constructions
  ◦ Same denotation and same verb sense
  ◦ AnCora-Nom lexical entry
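The extraction steps above could be sketched roughly as follows. This is only an illustration, not the authors' tool: the record layout, the function name `build_entries`, and the sample occurrences (including the "result" reading of aceptación) are hypothetical. It assumes that occurrences of a lemma sharing the same denotation and the same verb sense are grouped into one nominal sense, and that features are counted per sense.

```python
from collections import Counter, defaultdict

# Hypothetical annotated occurrences: (lemma, denotation, verb sense, specifier type).
# The values are made up for illustration and are not taken from AnCora-Es.
occurrences = [
    ("aceptación", "event", "verb.aceptar.1", "article"),
    ("aceptación", "event", "verb.aceptar.1", "void"),
    ("aceptación", "result", "verb.aceptar.1", "article"),
]

def build_entries(occurrences):
    """Group occurrences by lemma, then by (denotation, verb sense),
    counting how often each specifier type appears in each sense."""
    entries = defaultdict(lambda: defaultdict(Counter))
    for lemma, denotation, verb_sense, specifier in occurrences:
        entries[lemma][(denotation, verb_sense)][specifier] += 1
    return entries

entries = build_entries(occurrences)
for sense_key, specifiers in entries["aceptación"].items():
    print(sense_key, dict(specifiers))
```

Each key of the inner dictionary plays the role of one nominal sense, and the counter stands in for the "constituents and frequency" information stored with it.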

AnCora-Nom
Lexical Entries   1,655
Nominal Senses    3,094
Nominal Frames    3,204

Para el realizador y guionista, el protagonista masculino, Stéphane, "es muy interesante porque Ø encarna la tolerancia, aceptación de los demás".
'For the director and screenwriter, the male protagonist, Stéphane, "is very interesting because he embodies tolerance, acceptance of others".'

AnCora-Nom
<sense cousin="no" denotation="event" id="2" lexicalized="no" originlemma="aceptar" originlink="verb.aceptar.1" wordnetsynset="16: "> …
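As a rough illustration of how such a sense element might be read programmatically, the sketch below parses it with Python's standard `xml.etree.ElementTree`. The element is closed by hand and the `wordnetsynset` attribute is left out because its value is truncated in the transcript; the full entry format is not shown on the slide, so this is an assumption-laden sketch rather than a reader for the real files.

```python
import xml.etree.ElementTree as ET

# Self-closed copy of the <sense> element from the slide.
# wordnetsynset is omitted because its value is truncated in the transcript.
SENSE_XML = (
    '<sense cousin="no" denotation="event" id="2" lexicalized="no" '
    'originlemma="aceptar" originlink="verb.aceptar.1"/>'
)

sense = ET.fromstring(SENSE_XML)

# Attributes are exposed as a plain dict of strings.
attrs = dict(sense.attrib)
print(attrs["denotation"], attrs["originlemma"])

# Assumption: originlink has the shape "verb.<lemma>.<sense number>".
category, verb_lemma, verb_sense = attrs["originlink"].split(".")
print(category, verb_lemma, verb_sense)
```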

• Cousin attribute
  ◦ "no": morphological relationship from verb to noun
    aceptar > aceptación 'accept' > 'acceptance'
  ◦ "yes": morphological relationship from noun to verb
    revolución > revolucionar 'revolution' > 'revolutionize'
  ◦ "yes": semantic relationship between noun and verb
    escarnio – mofarse 'mocking' – 'make fun'

• Denotation attribute
  ◦ Event
    [Su Poss-arg1-pat persecución] es uno de esos atractivos marginales que aún le quedan a la Vuelta.
    '[His persecution] is one of those marginal appeals that la Vuelta still has.'
  ◦ Result
    El tema de conversación era [la actuación policial AP-arg0-agt].
    'The topic of discussion was [the police acting].'
  ◦ Underspecified
    Se espera [la llegada de más de 450 observadores extranjeros] NP.
    '[The arrival of more than 450 foreign observers] is expected.'

• Lexicalized attribute
  ◦ "no": the nominalization does not take part in a lexicalized construction
  ◦ "yes": the nominalization does take part in a lexicalized construction
    • Alternative_lemma attribute: golpe_de_estado 'coup d'état'
    • Lexicalizationtype attribute: "nominal", "verbal", "adjectival", "adverbial", "prepositional" and "conjunctive"
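The attribute values listed on the slides could be checked with a small validator like the one below. It is a sketch under assumptions: the lowercase attribute names, the dict representation of a sense, and the function name `invalid_attributes` are hypothetical, and only the values named on the slides are treated as allowed. Absent attributes are not flagged, since the slides do not say which ones are mandatory.

```python
# Allowed values as listed on the slides; names and layout are hypothetical.
ALLOWED = {
    "cousin": {"yes", "no"},
    "denotation": {"event", "result", "underspecified"},
    "lexicalized": {"yes", "no"},
    "lexicalizationtype": {
        "nominal", "verbal", "adjectival",
        "adverbial", "prepositional", "conjunctive",
    },
}

def invalid_attributes(sense):
    """Return the sorted names of attributes whose value is not allowed.
    Attributes that are absent, or not listed in ALLOWED, are not checked."""
    return sorted(
        name for name, value in sense.items()
        if name in ALLOWED and value not in ALLOWED[name]
    )

lexicalized_sense = {
    "lexicalized": "yes",
    "lexicalizationtype": "nominal",
    "alternative_lemma": "golpe_de_estado",
}
print(invalid_attributes(lexicalized_sense))
print(invalid_attributes({"cousin": "maybe", "denotation": "state"}))
```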

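El empresario fiyiano George_Speight encabeza este golpe de Estado en_favor_de la comunidad de nativos fiyianos, como ha definido su acción.
'The Fijian businessman George Speight is leading this coup d'état on behalf of the native Fijian community, as he has described his action.'

El ex presidente de Costa_de_Marfil Henri_Konan_Bédié, destituido el pasado 24_de_diciembre por un violento golpe de Estado militar, reclamó hoy en París la celebración de elecciones "libres y transparentes" en su país antes de junio próximo.
'The former president of Côte d'Ivoire Henri Konan Bédié, removed from office last 24 December by a violent military coup d'état, called today in Paris for "free and transparent" elections to be held in his country before next June.'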

"passive", "causative", "locative", "benefactive"
Argument position
Thematic role
Constituents and frequency
"article", "indefinite", "demonstrative", "exclamative", "numeral", "interrogative", "possessive", "ordinal", "void"

Table 1: Distribution of nominal senses

Denotation       Lexicalized   Non-lexicalized   Total
Event            0             631               631
Result           115           1,771             1,886
Underspecified   …             …                 …
None             85            0                 …
Total            202           2,892             3,094

Table 2: Distribution of senses taking into account argument realization

Arguments        Event         Result            Underspecified
More than …      …             …                 …
Total            631           1,…               …
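The figures of Table 1 that survive in the transcript can be cross-checked with a few lines of arithmetic. The sketch below uses only the Event, Result and Total rows and computes what the remaining rows (Underspecified and None) must add up to jointly; it does not attempt to split that remainder between the two rows. The variable and function names are hypothetical.

```python
# (lexicalized, non-lexicalized, total) for the rows of Table 1 used here.
event = (0, 631, 631)
result = (115, 1771, 1886)
total = (202, 2892, 3094)

def row_is_consistent(row):
    """A row is consistent when lexicalized + non-lexicalized equals its total."""
    lexicalized, non_lexicalized, row_total = row
    return lexicalized + non_lexicalized == row_total

# What the Underspecified and None rows must sum to, column by column.
remainder = tuple(t - e - r for t, e, r in zip(total, event, result))

print(row_is_consistent(result), row_is_consistent(total))
print(remainder)
```

Under these figures the two remaining rows together account for 87 lexicalized and 490 non-lexicalized senses, 577 in all.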

• AnCora-Nom: new lexical resource
• 1,655 lexical entries of Spanish deverbal nominalizations
• Developed from the information encoded in the AnCora-Es corpus
• Linked to the AnCora-Es corpus and to the AnCora-Verb Spanish lexicon

• AnCora-Nom as input of the ADN-classifier
• Enlarge this lexicon with deadjectival nominalizations and relational nouns
• Build a Catalan deverbal nominalization lexicon