Lecture 21: Computational Lexical Semantics

Topics:
- Features in NLTK III
- Computational Lexical Semantics
- Semantic Web

Readings: NLTK book, Chapter 10; text, Chapter 20

April 3, 2013
CSCE 771 Natural Language Processing
– 2 – CSCE 771 Spring 2013 Overview

Last Time:
- (Programming) Features in NLTK
- NL queries to SQL
- NLTK support for interpretations and models
- Propositional and predicate logic support
- Prover9

Today:
- Last lecture's slides
- Features in NLTK
- Computational Lexical Semantics

Readings: Text, Chapters 19-20; NLTK Book, Chapter 10
Next Time: Computational Lexical Semantics II
– 3 – CSCE 771 Spring 2013 Model Building in NLTK - Chapter 10 continued

Mace model builder (the Mace4 binary must be installed):

lp = nltk.LogicParser()
# point NLTK at the installed Prover9/Mace4 binaries
config_mace4(r'c:\Python26\Lib\site-packages\prover9')
a3 = lp.parse('exists x.(man(x) & walks(x))')
c1 = lp.parse('mortal(socrates)')
c2 = lp.parse('-mortal(socrates)')
mb = nltk.Mace(5)
print mb.build_model(None, [a3, c1])   # True  -- a model exists
print mb.build_model(None, [a3, c2])   # True  -- a model exists
print mb.build_model(None, [c1, c2])   # False -- c1 and c2 are contradictory
– 4 – CSCE 771 Spring 2013 Model Building continued

>>> a4 = lp.parse('exists y.(woman(y) & all x.(man(x) -> love(x,y)))')
>>> a5 = lp.parse('man(adam)')
>>> a6 = lp.parse('woman(eve)')
>>> g = lp.parse('love(adam,eve)')
>>> mc = nltk.MaceCommand(g, assumptions=[a4, a5, a6])
>>> mc.build_model()
True
– 5 – CSCE 771 Spring 2013 The Semantics of English Sentences

Principle of compositionality -- the meaning of a complex expression is a function of the meanings of its parts and of the way they are syntactically combined.
– 6 – CSCE 771 Spring 2013 Representing the λ-Calculus in NLTK

(33) a. (walk(x) ∧ chew_gum(x))
     b. λx.(walk(x) ∧ chew_gum(x))
     c. \x.(walk(x) & chew_gum(x))   -- the NLTK way!
– 7 – CSCE 771 Spring 2013 Lambda0.py

import nltk

lp = nltk.LogicParser()
e = lp.parse(r'\x.(walk(x) & chew_gum(x))')
print e          # \x.(walk(x) & chew_gum(x))
print e.free()   # set([]) -- no free variables; x is bound
print lp.parse(r'\x.(walk(x) & chew_gum(y))')
                 # \x.(walk(x) & chew_gum(y)) -- here y is free
– 8 – CSCE 771 Spring 2013 Simple β-reductions

>>> e = lp.parse(r'\x.(walk(x) & chew_gum(x))(gerald)')
>>> print e
\x.(walk(x) & chew_gum(x))(gerald)
>>> print e.simplify()
(walk(gerald) & chew_gum(gerald))
– 9 – CSCE 771 Spring 2013 Predicate reductions

>>> e3 = lp.parse(r'(\P.exists x.P(x))(\y.see(y,x))')
>>> print e3
(\P.exists x.P(x))(\y.see(y,x))
>>> print e3.simplify()
exists z1.see(z1,x)
(the bound x is α-renamed to z1 so it does not capture the free x in the argument)
– 10 – CSCE 771 Spring 2013 Figure 19.7 Inheritance of Properties

∃ e,x,y  Eating(e) ∧ Agent(e,x) ∧ Theme(e,y)

"hamburger edible?" -- inferred from WordNet

(Figure from Jurafsky & Martin, Speech and Language Processing, 2nd ed., © 2009 Pearson Education)
– 11 – CSCE 771 Spring 2013 Figure 20.1 Possible sense tags for bass

Chapter 20 – Word sense disambiguation (WSD)
- Machine translation
- Supervised vs. unsupervised learning
- Semantic concordance – a corpus in which words are tagged with their sense tags
– 12 – CSCE 771 Spring 2013 Feature Extraction for WSD

Feature vectors
- Collocation: [w_{i-2}, POS_{i-2}, w_{i-1}, POS_{i-1}, w_i, POS_i, w_{i+1}, POS_{i+1}, w_{i+2}, POS_{i+2}]
- Bag-of-words – unordered set of neighboring words; represent the most frequent content words with a membership vector, e.g. [0,0,1,0,0,0,1] – the 3rd and 7th most frequent content words occur in the window
- Window of nearby words/features (a sketch of both feature types follows)
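A minimal sketch of both feature types, assuming a pre-tokenized, POS-tagged sentence; the top_content_words list is hypothetical:

# Sketch: collocation and bag-of-words features for WSD.
# `tagged` is a list of (word, POS) pairs; `i` indexes the target word.
def collocation_features(tagged, i, k=2):
    """Words and POS tags in a +/-k window around position i, in order."""
    feats = []
    for j in range(i - k, i + k + 1):
        if 0 <= j < len(tagged):
            feats.extend(tagged[j])            # w_j, POS_j
        else:
            feats.extend(('<pad>', '<pad>'))   # sentence boundary
    return feats

def bag_of_words_vector(tagged, i, top_content_words, k=10):
    """0/1 membership vector over frequent content words near position i."""
    window = {w.lower() for w, _ in tagged[max(0, i - k): i + k + 1]}
    return [1 if w in window else 0 for w in top_content_words]

tagged = [('An', 'DT'), ('electric', 'JJ'), ('guitar', 'NN'), ('and', 'CC'),
          ('bass', 'NN'), ('player', 'NN'), ('stand', 'VB'), ('off', 'RP')]
print(collocation_features(tagged, 4))
print(bag_of_words_vector(tagged, 4, ['fishing', 'big', 'sound', 'player',
                                      'fly', 'rod', 'pound', 'double']))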
– 13 – CSCE 771 Spring 2013 Naïve Bayes Classifier

w – word vector; s – sense tag; f – feature vector [w_i, POS_i] for i = 1, …, n

Choose the most probable sense given the observed features:
ŝ = argmax_{s ∈ S} P(s | f)

Approximate P(s | f) by frequency counts -- but how practical? The full vector f almost never recurs in training data.
– 14 – CSCE 771 Spring 2013 Looking for a practical formula

Apply Bayes' rule, then drop the denominator P(f), which is constant across senses:
ŝ = argmax_{s ∈ S} P(f | s) P(s) / P(f) = argmax_{s ∈ S} P(f | s) P(s)

Still not practical: P(f | s) for a whole feature vector is just as sparse as P(s | f).
– 15 – CSCE 771 Spring 2013 Naïve == assume independence

Assume the features are conditionally independent given the sense:
P(f | s) ≈ ∏_{j=1}^{n} P(f_j | s),  so  ŝ = argmax_{s ∈ S} P(s) ∏_{j=1}^{n} P(f_j | s)

Now practical, but realistic? Neighboring-context features are clearly not independent, yet the approximation works well in practice.
– 16 – CSCE 771 Spring 2013 Training = count frequencies

Maximum likelihood estimates (20.8):
P(s_i) = count(s_i, w_j) / count(w_j)
P(f_j | s) = count(f_j, s) / count(s)
(sketch below)
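A minimal sketch of training and classification under these MLE counts, on hypothetical sense-tagged examples; add-one smoothing (an addition to the slide's plain MLE) keeps unseen features from zeroing out a sense:

import math
from collections import Counter, defaultdict

# Sketch: naive Bayes WSD from (feature-list, sense) training pairs.
# The training data here is hypothetical; real data would come from a
# semantic concordance such as SemCor.
train = [(['fishing', 'boat', 'river'], 'bass-fish'),
         (['fishing', 'catch', 'pound'], 'bass-fish'),
         (['guitar', 'player', 'sound'], 'bass-music'),
         (['band', 'player', 'guitar'], 'bass-music')]

sense_counts = Counter()
feat_counts = defaultdict(Counter)
vocab = set()
for feats, sense in train:
    sense_counts[sense] += 1
    for f in feats:
        feat_counts[sense][f] += 1
        vocab.add(f)

def classify(feats):
    best, best_lp = None, float('-inf')
    total = sum(sense_counts.values())
    for s in sense_counts:
        # log P(s) + sum_j log P(f_j | s), add-one smoothed
        lp = math.log(sense_counts[s] / float(total))
        denom = sum(feat_counts[s].values()) + len(vocab)
        for f in feats:
            lp += math.log((feat_counts[s][f] + 1) / float(denom))
        if lp > best_lp:
            best, best_lp = s, lp
    return best

print(classify(['guitar', 'sound']))   # -> bass-music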
– 17 – CSCE 771 Spring 2013 Decision List Classifiers

Naïve Bayes is hard for humans to inspect: it is difficult to examine its decisions and understand why a sense was chosen.
Decision list classifiers – like a "case" statement: a sequence of (test, returned-sense-tag) pairs; the first test that succeeds determines the sense (sketch after the figure below).
– 18 – CSCE 771 Spring 2013 Figure 20.2 Decision List Classifier Rules
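A minimal sketch of applying an ordered decision list; the rules below only paraphrase the flavor of Figure 20.2's collocation tests for bass and are purely illustrative:

# Sketch: an ordered decision list in the spirit of Fig. 20.2.
# Each rule is (test, sense); the first test that fires wins.
rules = [
    (lambda ctx: 'fish' in ctx,    'bass-fish'),
    (lambda ctx: 'guitar' in ctx,  'bass-music'),
    (lambda ctx: 'play' in ctx,    'bass-music'),
    (lambda ctx: 'striped' in ctx, 'bass-fish'),
]

def decision_list(context_words, default='bass-music'):
    ctx = set(context_words)
    for test, sense in rules:
        if test(ctx):
            return sense
    return default            # fall-through default sense

print(decision_list(['he', 'caught', 'a', 'striped', 'bass']))  # -> bass-fish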
– 19 – CSCE 771 Spring 2013 WSD Evaluation, baselines, ceilings

- Extrinsic evaluation – evaluate the embedded WSD component in an end-to-end application (in vivo)
- Intrinsic evaluation – evaluate WSD by itself (in vitro): sense accuracy
- Corpora – SemCor, SENSEVAL, SEMEVAL
- Baseline – most frequent sense (WordNet sense 1; sketch below)
- Ceiling – gold standard: human experts with discussion and agreement
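The most-frequent-sense baseline falls out of NLTK directly, since WordNet lists a word's senses roughly in frequency order; a minimal sketch:

from nltk.corpus import wordnet as wn

def mfs_baseline(word, pos=None):
    """Most-frequent-sense baseline: WordNet's first-listed synset."""
    synsets = wn.synsets(word, pos=pos)
    return synsets[0] if synsets else None

print(mfs_baseline('bass'))   # first-listed (most frequent) sense of 'bass'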
– 20 – CSCE 771 Spring 2013 Figure 20.3 The Simplified Lesk algorithm – choose the sense whose gloss (and example sentences) share the most words with the target word's sentence context (gloss/sentence overlap).
– 21 – CSCE 771 Spring 2013 Simplified Lesk example The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable rate mortgage securities.
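A minimal sketch of Simplified Lesk on this sentence, assuming NLTK 3's WordNet API (definition() and examples() as methods) and an illustrative stopword list:

from nltk.corpus import wordnet as wn

STOP = {'the', 'a', 'an', 'of', 'in', 'it', 'will', 'to', 'can', 'because'}

def simplified_lesk(word, sentence):
    """Pick the synset whose gloss+examples overlap the sentence most."""
    context = {w.lower() for w in sentence.split()} - STOP
    best, best_overlap = None, -1
    for ss in wn.synsets(word):
        signature = set(ss.definition().lower().split())
        for ex in ss.examples():
            signature |= set(ex.lower().split())
        overlap = len(context & (signature - STOP))
        if overlap > best_overlap:
            best, best_overlap = ss, overlap
    return best

sent = ('The bank can guarantee deposits will eventually cover future '
        'tuition costs because it invests in adjustable rate mortgage securities')
print(simplified_lesk('bank', sent))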
– 22 – CSCE 771 Spring 2013 SENSEVAL competitions

Check the Senseval-3 website.
– 23 – CSCE 771 Spring 2013 Corpus Lesk

Weights applied to the overlap words: inverse document frequency
idf_i = log(N_docs / nd_i), where nd_i is the number of documents containing w_i
(sketch below)
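A minimal sketch of the idf weighting over a hypothetical toy document collection; Corpus Lesk scores a sense by summing idf_i over the overlap words rather than just counting them:

import math

# Hypothetical toy "document" collection, each a set of words.
docs = [{'bank', 'deposit', 'rate'},
        {'river', 'bank', 'water'},
        {'mortgage', 'rate', 'securities'}]

def idf(word, docs):
    nd = sum(1 for d in docs if word in d)
    return math.log(len(docs) / float(nd)) if nd else 0.0

def weighted_overlap(context, signature, docs):
    # Corpus Lesk: sum idf weights over the overlap instead of raw counts
    return sum(idf(w, docs) for w in context & signature)

print(idf('bank', docs))      # common word -> low weight
print(idf('mortgage', docs))  # rarer word -> higher weight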
– 24 – CSCE 771 Spring 2013 Selectional Restrictions and Preferences
– 25 – CSCE 771 Spring 2013 WordNet semantic classes of objects
– 26 – CSCE 771 Spring 2013 Minimally Supervised WSD: Bootstrapping

Yarowsky algorithm
Heuristics:
1. one sense per collocation
2. one sense per discourse
(sketch below)
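A one-iteration sketch of the bootstrapping idea on the plant example, with hypothetical seed collocations and contexts: seeds label a few instances, new single-sense collocates are harvested, and the labeled set grows on the next round:

from collections import defaultdict

# Sketch: one bootstrapping step in the Yarowsky style.
seeds = {'life': 'plant-living', 'manufacturing': 'plant-factory'}
unlabeled = [['plant', 'life', 'in', 'the', 'pond'],
             ['automated', 'manufacturing', 'plant', 'closed'],
             ['the', 'plant', 'closed', 'its', 'assembly', 'line']]

# Step 1: label instances that contain a seed collocation.
labeled = []
for ctx in unlabeled:
    for cue, sense in seeds.items():
        if cue in ctx:
            labeled.append((ctx, sense))
            break

# Step 2: harvest new cues -- words that co-occur with only one sense
# ("one sense per collocation"); repeating steps 1-2 grows the labeled set.
cue_senses = defaultdict(set)
for ctx, sense in labeled:
    for w in ctx:
        cue_senses[w].add(sense)
new_cues = {w: s.pop() for w, s in cue_senses.items()
            if len(s) == 1 and w not in ('plant', 'the', 'in')}
print(new_cues.get('closed'))   # now a (weak) cue for plant-factory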
– 27 – CSCE 771 Spring 2013 Figure 20.4 Two senses of plant
– 28 – CSCE 771 Spring 2013 Figure 20.5
– 29 – CSCE 771 Spring 2013 Figure 20.6 Path Based Similarity
– 30 – CSCE 771 Spring 2013 Figure 20.6 Path Based Similarity

sim_path(c_1, c_2) = 1 / pathlen(c_1, c_2)

where pathlen counts the edges on the shortest path between the two concepts in the hierarchy (NLTK's path_similarity uses 1 / (shortest path length + 1)).
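NLTK's WordNet interface provides this measure directly; a small example (the synset names are assumed to pick out the coin senses):

from nltk.corpus import wordnet as wn

nickel = wn.synset('nickel.n.02')   # the coin sense
coin = wn.synset('coin.n.01')
money = wn.synset('money.n.01')

# path_similarity = 1 / (shortest hypernym-path length + 1)
print(nickel.path_similarity(coin))    # close in the hierarchy -> higher
print(nickel.path_similarity(money))   # farther apart -> lower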
– 31 – CSCE 771 Spring 2013 Information Content word similarity

P(c) – the probability that a randomly selected word in a corpus is an instance of concept c
Information content: IC(c) = -log P(c)
Resnik similarity: sim_resnik(c_1, c_2) = -log P(LCS(c_1, c_2)), where LCS is the lowest common subsumer of c_1 and c_2
– 32 – CSCE 771 Spring 2013 Figure 20.7 WordNet with P(c) values
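NLTK also ships pre-computed information-content counts (the wordnet_ic data package); a small example of Resnik similarity, with Lin's normalized variant for comparison:

from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')   # P(c) counts from the Brown corpus
nickel = wn.synset('nickel.n.02')
coin = wn.synset('coin.n.01')

# Resnik similarity: information content of the LCS, -log P(LCS)
print(nickel.res_similarity(coin, brown_ic))
print(nickel.lin_similarity(coin, brown_ic))   # Lin's normalized variant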
– 33 – CSCE 771 Spring 2013 Figures 20.8 – 20.16 (Jurafsky & Martin, Chapter 20)