The interface between model-theoretic and corpus-based semantics


The interface between model-theoretic and corpus-based semantics Sebastian Pado

Natural language semantics
- Model-theoretic semantics
  - Compositional calculation of sentence meaning
  - Formal descriptions of ambiguities
  - Inference
- Corpus-based semantics
  - Distributional, graded meaning representation
  - Probabilistic knowledge acquisition from corpora
  - Prediction of linguistic behaviour based on context
Links to other fields: model-theoretic semantics connects to formal linguistics (esp. grammar formalisms), logic, and theoretical computer science; corpus-based semantics connects to cognitive psychology and machine learning.

Complementary benefits
- Corpus-based semantics
  - Good for lexical level (open word classes)
  - High coverage, robustness
  - Approximate
- Model-theoretic semantics
  - Good for sentence level (closed word classes)
  - Limited coverage
  - Correct
Criteria for success: I personally have an application-oriented (NLP) evaluation in mind, but am very open to further suggestions.
How to divide work between the approaches?

Strategies
- More expressive representations for corpus-based models of meaning: Compositionality in vector spaces
  - Ongoing collaboration with Katrin Erk (Dept. of Linguistics, U. Texas at Austin)
- Corpus-based methods for enrichment of formal meaning representations
  - Core of SFB project proposal
Specification through context

Strategy 1: More expressive representations for corpus-based models of meaning

Compositionality in Vector Spaces
- Vector space: Representation of word meaning by context co-occurrences
- What is the representation of a phrase?
  - Centroid of two vectors?
  - No: Must take mode of combination into account
    - “a horse draws…”: pull
    - “draw a horse”: sketch
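To illustrate why a plain centroid cannot be the whole answer, the following minimal sketch builds co-occurrence vectors from a made-up two-sentence corpus and averages them. The corpus, the window size, and the function names are assumptions for illustration only, not part of the presentation. The point it shows is that the centroid is symmetric, so the subject reading and the object reading of "horse" + "draw" receive the same representation.

```python
from collections import Counter

def cooccurrence_vector(target, sentences, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def centroid(u, v):
    """Average two sparse vectors dimension by dimension."""
    dims = set(u) | set(v)
    return {d: (u.get(d, 0) + v.get(d, 0)) / 2 for d in dims}

# Toy corpus (hypothetical data, chosen only to make the example concrete).
toy_corpus = [
    "the horse will draw the cart",
    "children draw a horse with crayons",
]

horse = cooccurrence_vector("horse", toy_corpus)
draw = cooccurrence_vector("draw", toy_corpus)

# The centroid ignores which word is the governor and which the dependent,
# so both combinations end up with one and the same vector.
subject_reading = centroid(horse, draw)  # intended: "a horse draws ..."
object_reading = centroid(draw, horse)   # intended: "draw a horse"
print(subject_reading == object_reading)
```

Any combination function that is symmetric in its two arguments has this limitation, which is why the mode of combination has to enter the model explicitly.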

A first step
- Structured vector space model [Erk & Pado 2008]
- Covers Verb+Object, Verb+Subject combinations
- Word meaning consists of lexical vector plus selectional preferences (=experiences) for dependents/governors
- Note: selectional preference representation is exemplar-based

A first step
- Structured vector space model [Erk & Pado 2008]
- Covers Verb+Object, Verb+Subject combinations
- Phrase meaning consists of two vectors:
  - Verb meaning modified by nominal expectations about governor
  - Noun meaning modified by verbal expectations about dependent
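The two-vector idea above can be sketched roughly as follows. This is an illustrative toy, not the published model: the component-wise multiplication used as the combination operation, the relation labels (`subj`, `obj`, `subj_of`, `obj_of`), the dimensions, and all numbers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vector = Dict[str, float]

@dataclass
class Word:
    lexical: Vector                                            # context co-occurrence vector
    expects: Dict[str, Vector] = field(default_factory=dict)   # expectations, keyed by relation

def reweight(vec: Vector, expectation: Vector) -> Vector:
    """Component-wise product: keep the parts of `vec` that the expectation supports."""
    return {dim: w * expectation.get(dim, 0.0) for dim, w in vec.items()}

def combine(verb: Word, noun: Word, relation: str) -> Tuple[Vector, Vector]:
    """Return (verb meaning in context, noun meaning in context)."""
    # Verb meaning modified by the noun's expectations about its governor.
    verb_in_context = reweight(verb.lexical, noun.expects[relation + "_of"])
    # Noun meaning modified by the verb's expectations about its dependent.
    noun_in_context = reweight(noun.lexical, verb.expects[relation])
    return verb_in_context, noun_in_context

# Hypothetical toy entries.
draw = Word(
    lexical={"pull": 1.0, "sketch": 1.0},
    expects={"subj": {"animal": 0.8, "person": 0.6},
             "obj": {"animal": 0.3, "picture": 0.9, "cart": 0.7}},
)
horse = Word(
    lexical={"animal": 1.0, "cart": 0.4, "picture": 0.2},
    expects={"subj_of": {"pull": 0.9, "sketch": 0.1},
             "obj_of": {"pull": 0.2, "sketch": 0.8}},
)

draw_with_horse_subject, _ = combine(draw, horse, "subj")  # "a horse draws ..."
draw_with_horse_object, _ = combine(draw, horse, "obj")    # "draw a horse"
print(draw_with_horse_subject, draw_with_horse_object)
```

Unlike the centroid, the result depends on the relation: with these toy numbers the "pull" dimension dominates when the horse is the subject and the "sketch" dimension dominates when it is the object.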

Current state
- Evaluation: Better distinction between contextually appropriate and inappropriate paraphrases (WSD-style task)
- Further research questions
  - Generalisation to longer phrases
  - More expressive model of expectations
  - Modelling of phrases involving closed word classes
    - E.g. Negation
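A paraphrase-based evaluation of this kind can be pictured as ranking candidate paraphrases by their similarity to the in-context vector of the target word. The sketch below assumes cosine similarity as the measure and uses hypothetical vectors; the slides do not specify either, so treat both as placeholders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(dim, 0.0) for dim, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_paraphrases(in_context, candidates):
    """Order candidate paraphrases from most to least similar to the in-context vector."""
    return sorted(candidates,
                  key=lambda name: cosine(in_context, candidates[name]),
                  reverse=True)

# Hypothetical in-context vectors for "draw" and candidate paraphrase vectors.
draw_subject_horse = {"pull": 0.9, "sketch": 0.1}   # "a horse draws ..."
draw_object_horse = {"pull": 0.2, "sketch": 0.8}    # "draw a horse"
candidates = {
    "pull": {"pull": 1.0},
    "sketch": {"sketch": 1.0},
}

print(rank_paraphrases(draw_subject_horse, candidates))
print(rank_paraphrases(draw_object_horse, candidates))
```

A model does well on such a task when the contextually appropriate paraphrase ends up ranked above the inappropriate one for each context.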

Strategy 2: Corpus-based methods for enrichment of formal meaning representations

Formal models of meaning in context
- Lexicon entries cannot provide the full range of readings for words/phrases
- Readings often productively negotiated in text
  - Type/sort conflict
- Examples:
  - Metonymy/Metaphor
  - Telic adjectives (“fast typist”)
  - Coercion/Reinterpretation

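Example: Coercion
Wegen einer 15-jährigen kam es zu einem Streit, in dessen Verlauf sie verletzt wurde. […] Sie hatte sich mit einem 21-jährigen unterhalten.
(English: "Because of a 15-year-old girl, a fight broke out in the course of which she was injured. […] She had been talking to a 21-year-old man.")
- Red and blue expressions are coreferring, but the red expression (the complement of wegen, "einer 15-jährigen") has the wrong type (wegen takes <e,t>; expression is <e>).
- Here, context overtly provides the missing event
- Often, this is not the case: Operator must be recovered from general knowledge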
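The type conflict in this example can be made concrete with a very small type checker. Everything below is a hypothetical sketch: the tuple encoding of types, the placeholder result type `"pp"` for wegen, and the operator name `event_involving` are invented for illustration; only the argument types <e,t> and <e> come from the slide.

```python
E, T = "e", "t"

def fn(arg, res):
    """Build a function type <arg, res> as a tuple."""
    return (arg, res)

def find_conflict(functor_type, arg_type):
    """Return (expected, found) if the argument does not fit, else None."""
    expected = functor_type[0]
    return None if expected == arg_type else (expected, arg_type)

def recover_operator(conflict, operators):
    """Look up a reinterpretation operator that maps the found type to the expected one."""
    expected, found = conflict
    for name, (source, target) in operators.items():
        if source == found and target == expected:
            return name
    return None

# wegen takes an argument of type <e,t>; "pp" is a placeholder for its result type.
WEGEN = fn(fn(E, T), "pp")
# "einer 15-jährigen" denotes an entity.
NP_TYPE = E

# Hypothetical inventory of reinterpretation operators: name -> (from type, to type).
operators = {"event_involving": (E, fn(E, T))}

conflict = find_conflict(WEGEN, NP_TYPE)
print(conflict, recover_operator(conflict, operators))
```

In this toy setting the operator is found by type alone; which concrete event fills it (here, the conversation mentioned later in the text) is exactly what has to come from context or general knowledge.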

The role of corpus methods
- Acquisition of general reinterpretation operators from corpora
- Recovery/prediction of operators for instances with type/sort conflict
- Making implicit meaning explicit: can be seen as context-driven semantic specification
- Interest primarily empirical

Project Steps
- Creation of multilingual corpus of type/sort conflict cases with human annotations
  - Informed by formal considerations
- Development of CL methods to predict operators for conflict resolution
  - Ideally, task-based evaluation (to be determined)
- Consequences/insights for formal descriptions
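As a rough picture of what predicting operators from an annotated corpus could look like, here is a most-frequent-operator baseline. It is only a sketch of one simple possibility, not the method proposed in the project; the annotation format (trigger, semantic class of the conflicting argument, annotated operator) and all example data are hypothetical.

```python
from collections import Counter, defaultdict

def train_baseline(annotated):
    """Count how often each operator was annotated for a (trigger, semantic class) pair."""
    counts = defaultdict(Counter)
    for trigger, semantic_class, operator in annotated:
        counts[(trigger, semantic_class)][operator] += 1
    return counts

def predict(counts, trigger, semantic_class):
    """Predict the most frequently annotated operator, or None for unseen pairs."""
    seen = counts.get((trigger, semantic_class))
    return seen.most_common(1)[0][0] if seen else None

# Hypothetical annotated conflict cases.
annotated = [
    ("wegen", "person", "talk_to"),
    ("wegen", "person", "talk_to"),
    ("wegen", "person", "argue_with"),
    ("begin", "book", "read"),
    ("begin", "book", "read"),
    ("begin", "book", "write"),
]

model = train_baseline(annotated)
print(predict(model, "wegen", "person"))
print(predict(model, "begin", "book"))
print(predict(model, "wegen", "book"))
```

A baseline of this kind uses only semantic classes; the research questions about semantic roles, dependency relations, and local discourse ask which richer features would improve on it.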

Research Questions
- When can operators be found overtly in context; when must general operators be recovered?
  - Influence of local discourse?
- CL methods for efficient and accurate prediction of operators
  - What linguistic levels are helpful? Semantic classes, semantic roles, dependency relations, …?
  - Focus on more than one language: Can bilingual processing help?
- What is the level of generality of acquired operators?
  - What shape do people’s expectations have?
  - Do people’s judgments of recovered operators agree?
- Can empirical results have impact on formal descriptions?
  - E.g. do sort and type conflicts behave differently or similarly?
- Relation to work on textual entailment?

Collaborations
- D1 (Representation of ambiguities)
  - Formal descriptions as information source for corpus development
  - Attempt to transfer empirical results back into theory
- B5 (Polysemy in a conceptual system)
  - Ontological information as knowledge source for CL operator models
  - Entailment as shared evaluation task
- Open for other ideas