Components for a semantic textual similarity system
Focus on word and sentence similarity
Formal side: define similarity in principle

Characterizing word meaning in context
Given a word in a particular sentence context: can we characterize its meaning without reference to dictionary senses?
Why?
  – For many lemmas, it is hard to draw sense boundaries (see Kilgarriff, Hanks in lexicography; Kintsch in cognition; Cruse, Tuggy in cognitive linguistics)

Characterizing word meaning in context
How?
  – Compute a vector space representation for the word in its particular sentence context
  – Read off: contextually appropriate paraphrases
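
The two steps above can be sketched in a few lines. This is a minimal illustration, not any cited author's model: it assumes component-wise multiplication as the way to combine a word vector with its context, and the three-dimensional vectors and the names `contextualize` and `rank_paraphrases` are made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def contextualize(target, context):
    """Assumed combination: component-wise product of target and context vectors."""
    return [t * c for t, c in zip(target, context)]

def rank_paraphrases(target, context, candidates):
    """Sort candidate paraphrases by similarity to the in-context target vector."""
    in_context = contextualize(target, context)
    scored = [(word, cosine(in_context, vec)) for word, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy vectors; dimensions: [light, intellect, colour]
BRIGHT = [0.7, 0.6, 0.3]
STUDENT = [0.1, 0.9, 0.1]
LAMP = [0.9, 0.1, 0.3]
CANDIDATES = {"clever": [0.1, 0.9, 0.1], "shining": [0.9, 0.1, 0.4]}

print(rank_paraphrases(BRIGHT, STUDENT, CANDIDATES)[0][0])  # clever
print(rank_paraphrases(BRIGHT, LAMP, CANDIDATES)[0][0])     # shining
```

The same target vector yields a different top paraphrase depending on the context word, which is the behaviour the slide asks for.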

Approaches
Make clusters that correspond to senses. In a given context, compute a weighting over clusters / choose a cluster (Reisinger & Mooney, Dinu & Lapata)
One vector per word: mixes senses
  – Use context to “bend” the word vector, adapt it to the given context (Mitchell & Lapata, Erk & Padó, Thater et al.)
Language modeling (Washtell, Moon & Erk)
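
A rough sketch of the first, cluster-based option follows. It assumes each sense cluster is summarized by a centroid vector and that cluster weights are proportional to the (non-negative) cosine between context and centroid. That weighting scheme, the two-dimensional vectors, and the function names are assumptions for illustration, not the models of the cited papers.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster_weights(context, centroids):
    """Weight each sense cluster by its similarity to the context; weights sum to 1."""
    sims = {label: max(cosine(context, c), 0.0) for label, c in centroids.items()}
    total = sum(sims.values())
    if total == 0.0:
        # No cluster matches the context: fall back to a uniform weighting.
        return {label: 1.0 / len(centroids) for label in centroids}
    return {label: s / total for label, s in sims.items()}

def choose_cluster(context, centroids):
    """Hard decision: pick the single best-matching cluster."""
    weights = cluster_weights(context, centroids)
    return max(weights, key=weights.get)

def contextual_vector(context, centroids):
    """Soft decision: weighted mixture of the cluster centroids."""
    weights = cluster_weights(context, centroids)
    dim = len(context)
    return [sum(weights[l] * centroids[l][i] for l in centroids) for i in range(dim)]

# Hypothetical centroids for "bank"; dimensions: [finance, nature]
BANK_CLUSTERS = {"finance": [0.9, 0.1], "river": [0.1, 0.9]}
MONEY_CONTEXT = [0.8, 0.2]  # e.g. a context like "deposit money at the bank"

print(choose_cluster(MONEY_CONTEXT, BANK_CLUSTERS))  # finance
print(cluster_weights(MONEY_CONTEXT, BANK_CLUSTERS))
```

`choose_cluster` corresponds to "choose a cluster" on the slide, and `contextual_vector` to "compute a weighting over clusters".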

Using contextualized word vectors
Part of sentence similarity approach (Reddy et al.)
Paraphrases
Determine inference rule applicability

Viewpoint from vector space approaches to sentence similarity
Mitchell & Lapata; Clark, Coecke, Sadrzadeh, Grefenstette; Baroni & Zamparelli; Socher et al.
Mostly applied to phrase pairs / sentence pairs with the same structure
Even Socher et al. seem to focus on cases with mostly parallel sentence structure
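
To make the "same structure" setting concrete, here is a small sketch that composes word vectors into phrase vectors and compares two verb-object phrases. It covers only the simplest additive and multiplicative composition functions. The tensor-based and neural models named above are not reproduced, and the lexicon is invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def compose(vectors, mode="add"):
    """Combine word vectors into one phrase vector, component by component."""
    result = list(vectors[0])
    for vec in vectors[1:]:
        if mode == "add":
            result = [a + b for a, b in zip(result, vec)]
        elif mode == "mult":
            result = [a * b for a, b in zip(result, vec)]
        else:
            raise ValueError(f"unknown composition mode: {mode}")
    return result

def phrase_similarity(phrase_a, phrase_b, lexicon, mode="add"):
    """Cosine between the composed vectors of two phrases (lists of words)."""
    vec_a = compose([lexicon[w] for w in phrase_a], mode)
    vec_b = compose([lexicon[w] for w in phrase_b], mode)
    return cosine(vec_a, vec_b)

# Hypothetical lexicon; dimensions: [commerce, real estate, physical action]
LEXICON = {
    "buy": [0.9, 0.1, 0.2], "purchase": [0.8, 0.2, 0.2],
    "land": [0.2, 0.9, 0.1], "property": [0.3, 0.8, 0.2],
    "close": [0.1, 0.2, 0.9], "door": [0.2, 0.1, 0.8],
}

near = phrase_similarity(["buy", "land"], ["purchase", "property"], LEXICON)
far = phrase_similarity(["buy", "land"], ["close", "door"], LEXICON)
print(near > far)  # True
```

Both composition modes ignore word order and syntax, which is one reason such comparisons work best when the two phrases already share a structure.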

Similarity between sentences of dissimilar structure
Central: MWE and alternation detection
Lemma-specific paraphrases and MWEs: covered by automatically induced inference rules
Alternations:
  – passivization
  – John broke the vase / the vase broke
Principled approach: graph rewriting system to transform sentence structure (Bar-Haim et al.)
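
As a toy illustration of rewriting sentence structure, the sketch below represents a parse as a set of (head, relation, dependent) triples and applies two hand-written rules, one per alternation listed above. It is not Bar-Haim et al.'s system. The relation labels (`nsubj`, `dobj`, `nsubjpass`, `agent`) follow a Stanford-dependencies style chosen for the example, and `ALTERNATING_VERBS` is a hypothetical hand-made list.

```python
# Hypothetical list of verbs that take part in the causative/inchoative alternation.
ALTERNATING_VERBS = {"break", "open", "melt"}

def passive_to_active(triples):
    """Rewrite passive relations into their active counterparts."""
    out = set()
    for head, rel, dep in triples:
        if rel == "nsubjpass":      # passive subject is the underlying object
            out.add((head, "dobj", dep))
        elif rel == "agent":        # by-phrase is the underlying subject
            out.add((head, "nsubj", dep))
        else:
            out.add((head, rel, dep))
    return out

def causative_to_inchoative(triples):
    """'John broke the vase' -> 'the vase broke' for alternating verbs."""
    triples = set(triples)
    out = set(triples)
    for head, rel, dep in triples:
        if rel == "dobj" and head in ALTERNATING_VERBS:
            # Drop the causer and the object edge, then promote the patient to subject.
            out = {t for t in out if not (t[0] == head and t[1] in ("nsubj", "dobj"))}
            out.add((head, "nsubj", dep))
    return out

# "The vase was broken by John"
passive = {("break", "nsubjpass", "vase"), ("break", "agent", "John")}
# "John broke the vase"
active = {("break", "nsubj", "John"), ("break", "dobj", "vase")}

print(passive_to_active(passive) == active)                       # True
print(causative_to_inchoative(active) == {("break", "nsubj", "vase")})  # True
```

After normalization, two sentences with different surface structure can be compared triple by triple.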

A plug for events and SRL
Central: identifying the events & participants about which the sentences speak
Semantically equivalent sentences talk about the same events
Hence: SRL, coreference

A plug for events and SRL
Once events have been identified:
  – time and date expressions
  – modals
  – negation
  – embedded propositions
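
One way to picture this two-stage comparison: first check whether two sentences describe the same event with the same participants, then check the listed modifiers. The `Event` record below is a hypothetical data structure, the PropBank-style role labels (`A0`, `A1`) are an assumption, and embedded propositions are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Event:
    """Hypothetical event record, as it might come out of SRL plus coreference."""
    predicate: str
    roles: Dict[str, str] = field(default_factory=dict)  # role label -> participant
    negated: bool = False
    modal: Optional[str] = None   # e.g. "might", "must"
    time: Optional[str] = None    # normalized time/date expression, if any

def same_event(a, b):
    """Stage 1: same predicate and same participants in the same roles."""
    return a.predicate == b.predicate and a.roles == b.roles

def compatible(a, b):
    """Stage 2: the same event, and negation, modality and time do not conflict."""
    if not same_event(a, b):
        return False
    if a.negated != b.negated or a.modal != b.modal:
        return False
    # A missing time expression is treated as compatible with any time.
    if a.time is not None and b.time is not None and a.time != b.time:
        return False
    return True

broke = Event("break", {"A0": "John", "A1": "vase"}, time="2012-03-01")
broke_untimed = Event("break", {"A0": "John", "A1": "vase"})
did_not_break = Event("break", {"A0": "John", "A1": "vase"}, negated=True)

print(compatible(broke, broke_untimed))   # True
print(same_event(broke, did_not_break))   # True
print(compatible(broke, did_not_break))   # False
```

The last two lines show why the modifiers matter: the sentences talk about the same event, yet one negates it.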