Published by Mae Cole. Modified over 9 years ago.
1
The interface between model-theoretic and corpus-based semantics
Sebastian Pado
2
Natural language semantics
Model-theoretic semantics
  Compositional calculation of sentence meaning
  Formal descriptions of ambiguities
  Inference
Corpus-based semantics
  Distributional, graded meaning representation
  Probabilistic knowledge acquisition from corpora
  Prediction of linguistic behaviour based on context
Links to other fields:
  Model-theoretic semantics: formal linguistics (esp. grammar formalisms), logic, theoretical computer science
  Corpus-based semantics: cognitive psychology, machine learning
3
Complementary benefits
Corpus-based semantics
  Good for lexical level (open word classes)
  High coverage, robustness
  Approximative
Model-theoretic semantics
  Good for sentence level (closed word classes)
  Limited coverage
  Correct
Criteria for success: I personally have in mind an application-oriented (NLP) evaluation, but am very open to further suggestions
How to divide work between the approaches?
4
Strategies
More expressive representations for corpus-based models of meaning:
  Compositionality in vector spaces
  Ongoing collaboration with Katrin Erk (Dept. of Linguistics, U. Texas at Austin)
Corpus-based methods for enrichment of formal meaning representations
  Core of SFB project proposal
  Specification through context
5
Strategy 1: More expressive representations for corpus-based models of meaning
6
Compositionality in Vector Spaces
Vector space: representation of word meaning by context co-occurrences
What is the representation of a phrase?
  Centroid of two vectors?
  No: must take mode of combination into account
    “a horse draws…”: pull
    “draw a horse”: sketch
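To make the objection to the centroid concrete, here is a minimal sketch in Python. The three context dimensions and all counts are invented for illustration; they are not taken from any corpus. It only shows that averaging two word vectors ignores which word is the subject and which the object.

```python
# Toy co-occurrence counts over three context dimensions (invented values).
# Dimensions: ("cart", "pencil", "stable")
VECTORS = {
    "horse": [4.0, 1.0, 6.0],
    "draw": [3.0, 5.0, 1.0],
}


def centroid(u, v):
    """Component-wise average of two vectors."""
    return [(a + b) / 2 for a, b in zip(u, v)]


# "a horse draws ..." (horse as subject) vs. "draw a horse" (horse as object)
subj_phrase = centroid(VECTORS["horse"], VECTORS["draw"])
obj_phrase = centroid(VECTORS["draw"], VECTORS["horse"])

# The centroid is symmetric, so both phrases receive the same representation
# and the pull/sketch distinction is lost.
print(subj_phrase == obj_phrase)
```

Any symmetric combination function has the same limitation, which is why the mode of combination has to enter the model.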
7
A first step: Structured vector space model [Erk & Pado 2008]
Covers Verb+Object, Verb+Subject combinations
Word meaning consists of lexical vector plus selectional preferences (=experiences) for dependents/governors
Note: selectional preference representation is exemplar-based
8
A first step: Structured vector space model [Erk & Pado 2008]
Covers Verb+Object, Verb+Subject combinations
Phrase meaning consists of two vectors:
  Verb meaning modified by nominal expectations about governor
  Noun meaning modified by verbal expectations about dependent
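The two-vector idea can be sketched as follows. This is a toy illustration and not the published implementation. It assumes that expectations are the centroid of exemplar vectors and that modification is component-wise multiplication. All vectors, the two dimensions, and the names `NOUN_EXPECTS_GOVERNOR`, `VERB_EXPECTS_DEPENDENT` and `phrase_meaning` are invented for this example.

```python
# Two toy context dimensions: ("cart", "pencil"). All numbers are invented.
LEXICAL = {"draw": [5.0, 5.0], "horse": [4.0, 2.0]}

# Exemplar-based expectations: vectors of words previously seen in the role.
NOUN_EXPECTS_GOVERNOR = {
    ("horse", "subj"): [[6.0, 0.0], [4.0, 1.0]],  # e.g. verbs like "pull", "carry"
    ("horse", "obj"): [[0.0, 6.0], [1.0, 5.0]],   # e.g. verbs like "sketch", "paint"
}
VERB_EXPECTS_DEPENDENT = {
    ("draw", "subj"): [[6.0, 1.0], [1.0, 6.0]],   # e.g. nouns like "ox", "artist"
    ("draw", "obj"): [[5.0, 1.0], [1.0, 5.0]],    # e.g. nouns like "cart", "portrait"
}


def centroid(vectors):
    """Average a list of exemplar vectors into one expectation vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combine(u, v):
    """Component-wise multiplication (assumed combination operation)."""
    return [a * b for a, b in zip(u, v)]


def phrase_meaning(verb, noun, relation):
    """Return (verb in context, noun in context) for a verb-noun pair."""
    verb_in_context = combine(
        LEXICAL[verb], centroid(NOUN_EXPECTS_GOVERNOR[(noun, relation)])
    )
    noun_in_context = combine(
        LEXICAL[noun], centroid(VERB_EXPECTS_DEPENDENT[(verb, relation)])
    )
    return verb_in_context, noun_in_context


verb_subj, noun_subj = phrase_meaning("draw", "horse", "subj")
verb_obj, noun_obj = phrase_meaning("draw", "horse", "obj")
print(verb_subj)  # "cart" dimension dominates: pull-like reading
print(verb_obj)   # "pencil" dimension dominates: sketch-like reading
```

Unlike a plain centroid, the result depends on the grammatical relation, so the same two words yield different verb vectors in subject and object position.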
9
Current state
Evaluation: better distinction between contextually appropriate and inappropriate paraphrases (WSD-style task)
Further research questions
  Generalisation to longer phrases
  More expressive model of expectations
  Modelling of phrases involving closed word classes, e.g. negation
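A paraphrase-based evaluation of this kind could look roughly like the sketch below. It assumes paraphrase candidates are ranked by cosine similarity to the in-context vector of the target word. The candidate vectors and the two in-context vectors for "draw" are hypothetical toy values, not experimental data.

```python
from math import sqrt

# Hypothetical vectors over two toy dimensions ("cart", "pencil").
PARAPHRASES = {"pull": [6.0, 0.0], "sketch": [0.0, 6.0]}
draw_with_horse_as_subject = [25.0, 2.5]
draw_with_horse_as_object = [2.5, 27.5]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm


def rank_paraphrases(target_in_context):
    """Order paraphrase candidates from most to least similar."""
    return sorted(
        PARAPHRASES,
        key=lambda word: cosine(target_in_context, PARAPHRASES[word]),
        reverse=True,
    )


print(rank_paraphrases(draw_with_horse_as_subject))
print(rank_paraphrases(draw_with_horse_as_object))
```

A model does well on this task if the contextually appropriate paraphrase is ranked above the inappropriate one for each context.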
10
Strategy 2: Corpus-based methods for enrichment of formal meaning representations
11
Formal models of meaning in context
Lexicon entries cannot provide the full range of readings for words/phrases
Readings often productively negotiated in text
Type/sort conflict
Examples:
  Metonymy/Metaphor
  Telic adjectives (“fast typist”)
  Coercion/Reinterpretation
12
Example: Coercion
Wegen einer 15-jährigen kam es zu einem Streit, in dessen Verlauf sie verletzt wurde. […] Sie hatte sich mit einem 21-jährigen unterhalten.
(English: "A dispute arose because of a 15-year-old girl, in the course of which she was injured. […] She had been talking to a 21-year-old man.")
Red and blue expressions are coreferring, but the red expression has the wrong type (wegen takes <e,t>; the expression is <e>).
Here, the context overtly provides the missing event.
Often, this is not the case: the operator must be recovered from general knowledge.
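The type conflict and its repair from context can be mimicked in a few lines of Python. This sketch simplifies the slide's semantic types to a two-way distinction between entities and events. The classes `Entity` and `Event` and the functions `wegen` and `coerce` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    description: str


@dataclass(frozen=True)
class Event:
    description: str
    participant: Entity


def wegen(cause):
    """Toy version of 'wegen' (because of) that accepts only events."""
    if not isinstance(cause, Event):
        raise TypeError(
            f"type conflict: expected Event, got {type(cause).__name__}"
        )
    return f"because of [{cause.description}]"


def coerce(entity, context_events):
    """Recover an event from the discourse context that involves the entity."""
    for event in context_events:
        if event.participant == entity:
            return event
    return None  # nothing overt: a general operator would be needed instead


girl = Entity("15-year-old")
context = [Event("her conversation with a 21-year-old", girl)]

try:
    wegen(girl)  # entity where an event is required
except TypeError as err:
    print(err)

recovered = coerce(girl, context)
print(wegen(recovered))
```

When `coerce` returns `None`, the context does not supply the event overtly, which corresponds to the case where the operator has to come from general knowledge.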
13
The role of corpus methods
Acquisition of general reinterpretation operators from corpora
Recovery/prediction of operators for instances with type/sort conflict
Making implicit meaning explicit: can be seen as context-driven semantic specification
Interest primarily empirical
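One simple way to picture acquisition and prediction is a relative-frequency baseline over cases where the operator is overt. The sketch below is only that: a baseline under the assumption that such cases can be extracted as (verb, noun, event) triples. The triples and the helper names `acquire` and `predict` are invented for illustration and do not describe the project's actual method or data.

```python
from collections import Counter, defaultdict

# Invented observations in which the event is spelled out overtly,
# e.g. "begin to read the book" -> ("begin", "book", "read").
OBSERVATIONS = [
    ("begin", "book", "read"),
    ("begin", "book", "read"),
    ("begin", "book", "write"),
    ("enjoy", "book", "read"),
    ("begin", "cigarette", "smoke"),
]


def acquire(observations):
    """Count overt events for each (verb, noun) pair."""
    counts = defaultdict(Counter)
    for verb, noun, event in observations:
        counts[(verb, noun)][event] += 1
    return counts


def predict(counts, verb, noun):
    """Return the most frequent event and its relative frequency, or None."""
    events = counts.get((verb, noun))
    if not events:
        return None
    event, freq = events.most_common(1)[0]
    return event, freq / sum(events.values())


counts = acquire(OBSERVATIONS)
print(predict(counts, "begin", "book"))
print(predict(counts, "finish", "book"))  # unseen pair: no prediction
```

A realistic model would need to generalise beyond seen pairs, for instance via semantic classes or dependency relations.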
14
Project Steps
Creation of multilingual corpus of type/sort conflict cases with human annotations
  Informed by formal considerations
Development of CL methods to predict operators for conflict resolution
  Ideally, task-based evaluation (to be determined)
Consequences/insights for formal descriptions
15
Research Questions
When can operators be found overtly in context; when must general operators be recovered?
  Influence of local discourse?
CL methods for efficient and accurate prediction of operators
  What linguistic levels are helpful? Semantic classes, semantic roles, dependency relations, …?
  Focus on more than one language: can bilingual processing help?
What is the level of generality of acquired operators?
  What shape do people’s expectations have?
  Do people’s judgments of recovered operators agree?
Can empirical results have an impact on formal descriptions?
  E.g. do sort and type conflicts behave differently or similarly?
Relation to work on textual entailment?
16
Collaborations
D1 (Representation of ambiguities)
  Formal descriptions as information source for corpus development
  Attempt to transfer empirical results back into theory
B5 (Polysemy in a conceptual system)
  Ontological information as knowledge source for CL operator models
  Entailment as shared evaluation task
Open for other ideas