Statistical NLP Spring 2010 Lecture 20: Compositional Semantics Dan Klein – UC Berkeley

HW3: Part-of-Speech Tagging
- Overall: great job – very well-written, good exploration of ideas, solid research style
- Top / interesting results:
  - Nick: 97.2, k-MIRA, M3Ns
  - Xingyan: 96.6 / 85.4: Maxent/TnT hybrid, next word's tagset, hyphenation models
  - Jason: 96.9 / 84.8: basic MEMM features with rare-word limits, some targeted to grammatical confusions
  - Eugene: 96.8 / 78.3: voting for classifier combination
  - James Ide: 96.6 / 83.7 / 51.1: "HMM" with future-looking Maxent features
  - Anuj: 96.7 / 85.1: Maxent/HMM hybrid, n-gram features, hyphenated components, 150 iterations
  - Isabelle: 96.7 / 84.2: Maxent/HMM hybrid, long training

Mid-Term Evals

Truth-Conditional Semantics
- Linguistic expressions: "Bob sings"
- Logical translations: sings(bob) (could just as well be p_1218(e_397))
- Denotation:
  - [[bob]] = some specific person (in some context)
  - [[sings(bob)]] = ???
- Types on translations:
  - bob : e (for entity)
  - sings(bob) : t (for truth value)
[Parse tree: S → NP (Bob : bob) VP (sings : λy.sings(y)); S : sings(bob)]

Truth-Conditional Semantics
- Proper names:
  - Refer directly to some entity in the world
  - Bob : bob        [[bob]]^W = ???
- Sentences:
  - Are either true or false (given how the world actually is)
  - Bob sings : sings(bob)
- So what about verbs (and verb phrases)?
  - sings must combine with bob to produce sings(bob)
  - The λ-calculus is a notation for functions whose arguments are not yet filled.
  - sings : λx.sings(x)
  - This is a predicate – a function which takes an entity (type e) and produces a truth value (type t). We can write its type as e → t.
  - Adjectives?
[Parse tree: S → NP (Bob : bob) VP (sings : λy.sings(y)); S : sings(bob)]

Compositional Semantics
- So now we have meanings for the words
- How do we know how to combine words?
- Associate a combination rule with each grammar rule:
  - S : β(α) ← NP : α  VP : β   (function application)
  - VP : λx.α(x) ∧ β(x) ← VP : α  and  VP : β   (intersection)
- Example: "Bob sings and dances"
[Parse tree: NP (Bob : bob); VP (sings : λy.sings(y)); VP (dances : λz.dances(z)); VP (sings and dances : λx.sings(x) ∧ dances(x)); S : [λx.sings(x) ∧ dances(x)](bob) = sings(bob) ∧ dances(bob)]
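
As a concrete illustration of the rule-to-rule setup, here is a minimal sketch (not from the slides) that mimics the two combination rules with Python functions, using tuples as stand-in logical forms:

```python
# A minimal sketch of rule-to-rule composition, using Python functions
# as stand-ins for lambda-calculus terms and tuples for logical formulas.

# Lexical entries: proper names denote entities, verbs denote predicates.
bob = "bob"                                   # type e
sings = lambda x: ("sings", x)                # type e -> t
dances = lambda x: ("dances", x)              # type e -> t

# S : beta(alpha)  <-  NP : alpha  VP : beta   (function application)
def apply_S(np_sem, vp_sem):
    return vp_sem(np_sem)

# VP : lambda x. alpha(x) AND beta(x)  <-  VP : alpha  "and"  VP : beta
def conjoin_VP(vp1_sem, vp2_sem):
    return lambda x: ("and", vp1_sem(x), vp2_sem(x))

# "Bob sings and dances"
sem = apply_S(bob, conjoin_VP(sings, dances))
print(sem)   # ('and', ('sings', 'bob'), ('dances', 'bob'))
```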

Denotation
- What do we do with logical translations?
  - The translation language (logical form) has fewer ambiguities
- Can check truth value against a database
  - Denotation ("evaluation") calculated using the database
- More usefully: assert truth and modify a database
- Questions: check whether a statement in a corpus entails the (question, answer) pair:
  - "Bob sings and dances" → "Who sings?" + "Bob"
- Chain together facts and use them for comprehension

Other Cases
- Transitive verbs:
  - likes : λx.λy.likes(y,x)
  - Two-place predicates of type e → (e → t)
  - likes Amy : λy.likes(y,amy) is just like a one-place predicate
- Quantifiers:
  - What does "Everyone" mean here?
  - Everyone : λf.∀x.f(x)
  - Mostly works, but some problems
    - Have to change our NP/VP rule
    - Won't work for "Amy likes everyone"
  - "Everyone likes someone"
  - This gets tricky quickly!
[Parse tree: S → NP (Everyone : λf.∀x.f(x)) VP → VBP (likes : λx.λy.likes(y,x)) NP (Amy : amy); VP : λy.likes(y,amy); S : [λf.∀x.f(x)](λy.likes(y,amy)) = ∀x.likes(x,amy)]
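
The curried transitive verb and the quantifier are both ordinary higher-order functions; here is a small sketch (my own, over a made-up domain) to make the types concrete:

```python
# Minimal sketch (assumed, not from the slides): a curried transitive verb
# and a generalized-quantifier "everyone" over a tiny domain of entities.

DOMAIN = ["bob", "amy", "carol"]            # the entities of type e
LIKES = {("bob", "amy"), ("carol", "amy")}  # who likes whom: (liker, likee)

# likes : e -> (e -> t), curried so that "likes Amy" is itself a predicate
likes = lambda x: (lambda y: (y, x) in LIKES)

likes_amy = likes("amy")          # λy.likes(y, amy)
print(likes_amy("bob"))           # True

# Everyone : (e -> t) -> t, a function from predicates to truth values
everyone = lambda f: all(f(x) for x in DOMAIN)

print(everyone(likes_amy))        # "Everyone likes Amy" -> False here
```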

Indefinites
- First try:
  - "Bob ate a waffle" : ate(bob, waffle)
  - "Amy ate a waffle" : ate(amy, waffle)
- Can't be right!
  - ∃x : waffle(x) ∧ ate(bob, x)
- What does the translation of "a" have to be?
- What about "the"?
- What about "every"?
[Parse tree: S → NP (Bob) VP → VBD (ate) NP (a waffle)]

Grounding
- So why does the translation likes : λx.λy.likes(y,x) have anything to do with actual liking?
  - It doesn't (unless the denotation model says so)
  - Sometimes that's enough: wire up "bought" to the appropriate entry in a database
- Meaning postulates
  - Insist on, e.g., ∀x,y. likes(y,x) → knows(y,x)
  - This gets into lexical semantics issues
- Statistical version?

Tense and Events
- In general, you don't get far with verbs as predicates
- Better to have event variables e
  - "Alice danced" : danced(alice) becomes
    ∃e : dance(e) ∧ agent(e, alice) ∧ (time(e) < now)
- Event variables let you talk about non-trivial tense / aspect structures
  - "Alice had been dancing when Bob sneezed" :
    ∃e, e' : dance(e) ∧ agent(e, alice) ∧ sneeze(e') ∧ agent(e', bob) ∧ (start(e) < start(e') ∧ end(e) = end(e')) ∧ (time(e') < now)

Adverbs
- What about adverbs?
  - "Bob sings terribly"
  - terribly(sings(bob))?
  - (terribly(sings))(bob)?
  - ∃e. present(e) ∧ type(e, singing) ∧ agent(e, bob) ∧ manner(e, terrible)?
- It's really not this simple...
[Parse tree: S → NP (Bob) VP → VBP (sings) ADVP (terribly)]

Propositional Attitudes
- "Bob thinks that I am a gummi bear"
  - thinks(bob, gummi(me))?
  - thinks(bob, "I am a gummi bear")?
  - thinks(bob, ^gummi(me))?
- Usual solution involves intensions (^X), which are, roughly, the set of possible worlds (or conditions) in which X is true
- Hard to deal with computationally
  - Modeling other agents' models, etc.
  - Can come up in simple dialog scenarios, e.g., if you want to talk about what your bill claims you bought vs. what you actually bought

Trickier Stuff
- Non-intersective adjectives:
  - green ball : λx.[green(x) ∧ ball(x)]
  - fake diamond : λx.[fake(x) ∧ diamond(x)]?  Better: λx.[fake(diamond)(x)]
- Generalized quantifiers:
  - the : λf.[unique-member(f)]
  - all : λf.λg.[∀x. f(x) → g(x)]
  - most?
  - Could do with more general second-order predicates, too (why worse?)
    - the(cat, meows), all(cat, meows)
- Generics:
  - "Cats like naps"
  - "The players scored a goal"
- Pronouns (and bound anaphora):
  - "If you have a dime, put it in the meter."
- ... the list goes on and on!
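
The second-order treatment of "the" and "all" can also be made concrete. Here is a minimal sketch (assumed, with a made-up domain) of generalized quantifiers as functions of two predicates:

```python
# Minimal sketch (not from the slides): generalized quantifiers as
# second-order predicates over a small domain.

DOMAIN = ["felix", "tom", "rex"]
CAT = {"felix", "tom"}
MEOWS = {"felix", "tom"}

cat = lambda x: x in CAT
meows = lambda x: x in MEOWS

# all : takes a restrictor f and a scope g, true iff every f is a g
def all_q(f, g):
    return all(g(x) for x in DOMAIN if f(x))

# the : requires a unique f-member, then checks the scope
def the_q(f, g):
    members = [x for x in DOMAIN if f(x)]
    return len(members) == 1 and g(members[0])

print(all_q(cat, meows))   # all(cat, meows) -> True
print(the_q(cat, meows))   # the(cat, meows) -> False (two cats, not unique)
```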

Multiple Quantifiers
- Quantifier scope
  - Groucho Marx celebrates quantifier order ambiguity: "In this country a woman gives birth every 15 min. Our job is to find that woman and stop her."
- Deciding between readings:
  - "Bob bought a pumpkin every Halloween"
  - "Bob put a warning in every window"
- Multiple ways to work this out:
  - Make it syntactic (movement)
  - Make it lexical (type-shifting)

Implementation, TAG, Idioms
- Add a "sem" feature to each context-free rule:
  - S → NP loves NP
  - S[sem=loves(x,y)] → NP[sem=x] loves NP[sem=y]
  - Meaning of S depends on the meanings of the NPs
- TAG version:
  [Tree pairs: "loves" tree S(NP↓ VP(V loves NP↓)) : loves(x,y); idiom tree "kicked the bucket" S(NP↓ VP(V kicked NP(the bucket))) : died(x)]
- Template filling:
  - S[sem=showflights(x,y)] → I want a flight from NP[sem=x] to NP[sem=y]

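A tiny sketch (not from the slides) of what attaching semantic templates to rules can look like in code; the rules, lexicon, and "showflights" predicate simply mirror the examples above:

```python
# Tiny sketch (assumed): semantic templates attached to flat rule patterns,
# filled by matching the pattern against an input token sequence.

LEXICON = {"Bob": "bob", "Amy": "amy", "Boston": "boston", "Denver": "denver"}

# (pattern, semantic template); "NP" slots are filled left to right
RULES = [
    (["NP", "loves", "NP"], lambda x, y: ("loves", x, y)),
    (["I", "want", "a", "flight", "from", "NP", "to", "NP"],
     lambda x, y: ("showflights", x, y)),
]

def interpret(tokens):
    for pattern, template in RULES:
        if len(pattern) != len(tokens):
            continue
        fillers, ok = [], True
        for p, t in zip(pattern, tokens):
            if p == "NP" and t in LEXICON:
                fillers.append(LEXICON[t])        # fill an NP slot
            elif p != t:
                ok = False                        # literal word mismatch
                break
        if ok:
            return template(*fillers)
    return None

print(interpret("Bob loves Amy".split()))
# ('loves', 'bob', 'amy')
print(interpret("I want a flight from Boston to Denver".split()))
# ('showflights', 'boston', 'denver')
```
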
Modeling Uncertainty
- Gaping hole warning!
- Big difference between statistical disambiguation and statistical reasoning:
  - With probabilistic parsers, we can say things like "72% belief that the PP attaches to the NP"
  - That means that probably the enemy has night-vision goggles
  - However, you can't throw a logical assertion into a theorem prover with 72% confidence
  - Not clear humans really extract and process logical statements symbolically anyway
  - Use this to decide the expected utility of calling reinforcements?
- In short, we need probabilistic reasoning, not just probabilistic disambiguation followed by symbolic reasoning!
Example: "The scout saw the enemy soldiers with night goggles."

CCG Parsing
- Combinatory Categorial Grammar
  - Fully (mono-)lexicalized grammar
  - Categories encode argument sequences
  - Very closely related to the lambda calculus
  - Can have spurious ambiguities (why?)
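
A minimal sketch (my own, not from the slides) of the core CCG mechanism: categories record the arguments they are still looking for, and (backward) application both cancels a category and applies the paired λ-term:

```python
# Minimal sketch (assumed): CCG categories paired with semantic functions;
# backward application is the only combinator shown.
# The category S\NP is encoded as ("\\", "S", "NP"): result S, an NP argument
# expected on the left.

LEXICON = {
    "Bob":   ("NP", "bob"),
    "sings": (("\\", "S", "NP"), lambda x: ("sings", x)),
}

def backward_apply(left, right):
    """Backward application: Y  X\\Y  =>  X
    (the functor on the right consumes the constituent on its left)."""
    arg_cat, arg_sem = left
    (slash, result_cat, wanted_cat), fn_sem = right
    assert slash == "\\" and wanted_cat == arg_cat, "categories do not combine"
    return (result_cat, fn_sem(arg_sem))

# "Bob sings"
print(backward_apply(LEXICON["Bob"], LEXICON["sings"]))
# -> ('S', ('sings', 'bob'))
```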

Syntax-Based MT

Learning MT Grammars

Extracting Syntactic Rules
Extract rules (Galley et al. '04, '06)

Rules can...
- capture phrasal translation
- reorder parts of the tree
- traverse the tree without reordering
- insert (and delete) words

Bad Alignments Make Bad Rules
This isn't very good, but let's look at a worse example...

Sometimes They're Really Bad
One bad link makes a totally unusable rule!

Alignment: Words, Blocks, Phrases
[Example: 在 (at) 办公室 (office) 里 (in) 读了 (read) 书 (book) ↔ "read the book in the office"]

Discriminative Block ITG [Haghighi, Blitzer, DeNero, and Klein, ACL '09]
[Figure: alignment blocks b0, b1, b2 covering 全国 "the entire country", 近年来 "in recent years", 提醒 "to warn", each scored with features φ(b0, s, s'), φ(b1, s, s'), φ(b2, s, s')]

Syntactic Correspondence
[Figure: build a model of the correspondence between an English (EN) tree and a Chinese (中文) tree]

Synchronous Grammars?

Adding Syntax: Weak Synchronization
[Figure: block ITG alignment]

Adding Syntax: Weak Synchronization
[Figure: separate monolingual PCFGs]

Adding Syntax: Weak Synchronization
Get points for synchronization; synchronization is not required.

Weakly Synchronous Features
[Figure: an English tree (S, NP, VP, AP), a Chinese tree (IP, NP, VP), and alignment blocks b0, b1, b2, with three kinds of features:
- Parsing features: (IP, s), (NP, s), (VP, s), (S, s'), (NP, s'), (AP, s'), (VP, s')
- Alignment features: (b0, s, s'), (b1, s, s'), (b2, s, s')
- Agreement features: (IP, b0), (b0, S), (b1, NP), (IP, b0, S)]

Weakly Synchronous Model [HBDK09]
[Figure: three feature types over an EN / 中文 sentence pair:
- Feature type 1: word alignment (e.g., 办公室 ↔ office)
- Feature type 2: monolingual parser (e.g., EN PP "in the office")
- Feature type 3: agreement between the parses and the alignment]

Inference: Structured Mean Field
- Problem: summing over weakly aligned hypotheses is intractable
- Factored approximation: approximate the joint over the two parses and the alignment with independent factors
- Algorithm:
  1) Initialize the factors
  2) Iterate: set each factor in turn to minimize the divergence from the full model, holding the others fixed
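
The coordinate-wise update can be illustrated on a much smaller model. Below is a toy sketch (my own illustration, not the model from the slides) of naive mean field on a two-variable binary MRF; `theta1`, `theta2`, and `w` are made-up parameters:

```python
import math

# Toy illustration (not the bilingual model above): naive mean field for a
# two-variable binary model p(x1, x2) ∝ exp(theta1*x1 + theta2*x2 + w*x1*x2),
# x_i in {0, 1}. Each factor q_i is updated in turn using the other factor's
# mean -- the same "update one factor holding the others fixed" idea.

theta1, theta2, w = 0.5, -0.3, 1.2

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# q_i is represented by its mean mu_i = q_i(x_i = 1)
mu1, mu2 = 0.5, 0.5
for _ in range(50):
    mu1 = sigmoid(theta1 + w * mu2)   # update q1 holding q2 fixed
    mu2 = sigmoid(theta2 + w * mu1)   # update q2 holding q1 fixed

print(mu1, mu2)   # approximate marginals P(x1 = 1), P(x2 = 1)
```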

Results [Burkett, Blitzer, and Klein, NAACL 10]

Incorrect English PP Attachment

Corrected English PP Attachment

Improved Translations
- Source: 目前导致飞机相撞的原因尚不清楚, 当地民航部门将对此展开调查
  (gloss: currently / cause / plane / crash / DE / reason / yet / not clear / , / local / civil aeronautics / bureau / will / toward / open / investigations)
- Reference: At this point the cause of the plane collision is still unclear. The local caa will launch an investigation into this.
- Baseline (GIZA++): The cause of planes is still not clear yet, local civil aviation department will investigate this.
- Bilingual Adaptation Model: The cause of plane collision remained unclear, local civil aviation departments will launch an investigation.

Machine Translation Approach
[Figure: source text → target text, e.g., "nous acceptons votre opinion." → "we accept your view."]

Translations from Monotexts [Fung 95, Koehn and Knight 02, Haghighi and Klein 08]
- Translation without parallel text?
- Need (lots of) sentences
[Figure: separate source-text and target-text corpora]

Task: Lexicon Matching [Haghighi and Klein 08]
[Figure: source words s (state, world, name, nation) linked to target words t (estado, política, mundo, nombre) by a matching m between the source text and the target text]

Data Representation
What are we generating?
[Figure: the source word "state" represented by orthographic features (#st, tat, te#, each with weight 1.0) and context features (world, politics, society)]

Data Representation
What are we generating?
[Figure: the source word "state" with orthographic features (#st, tat, te#) and context features (world, politics, society); the target word "estado" with orthographic features (#es, sta, do#) and context features (mundo, política, sociedad)]
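
As a concrete illustration, here is a small sketch (my own, not from the slides) of how such orthographic and context feature vectors might be built; the toy corpus and window size are made up:

```python
from collections import Counter

# Minimal sketch (assumed): orthographic and context feature vectors of the
# kind pictured above.

def orthographic_features(word, n=3):
    """Character n-grams of the padded word, e.g. 'state' -> #st, sta, ..., te#."""
    padded = "#" + word + "#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def context_features(word, corpus_sentences, window=2):
    """Counts of words co-occurring with `word` within a small window."""
    feats = Counter()
    for sent in corpus_sentences:
        for i, tok in enumerate(sent):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                feats.update(t for t in sent[lo:hi] if t != word)
    return feats

corpus = [["the", "state", "of", "world", "politics"],
          ["state", "and", "society"]]
print(orthographic_features("state"))
print(context_features("state", corpus))
```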

Generative Model (CCA)
[Figure: "state" in the source space and "estado" in the target space, both generated from a shared point in a canonical space]

Generative Model (Matching)
[Figure: a matching m between source words s (state, world, name, nation) and target words t (estado, nombre, política, mundo)]

Inference: Hard EM
- E-step: find the best matching
- M-step: solve a CCA problem
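
A rough sketch of this loop, assuming generic feature matrices `X` and `Y` and a seed list of index pairs; it uses scipy's assignment solver and scikit-learn's CCA as stand-ins for the matching and CCA steps, and is not the original implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cross_decomposition import CCA

def hard_em(X, Y, seed, n_components=10, iterations=5):
    """Sketch of hard EM: alternate a bipartite matching (E-step) with
    fitting CCA on the currently matched pairs (M-step).
    X, Y: source / target feature matrices; seed: list of (src, tgt) index
    pairs; n_components must not exceed the number of matched pairs or the
    feature dimensionality."""
    matching = list(seed)
    for _ in range(iterations):
        # M-step: fit CCA on the currently matched word pairs
        src_idx, tgt_idx = zip(*matching)
        cca = CCA(n_components=n_components)
        cca.fit(X[list(src_idx)], Y[list(tgt_idx)])

        # E-step: score all pairs in the canonical space, then find the best
        # one-to-one matching (maximize similarity = minimize negated cost)
        Xc, Yc = cca.transform(X, Y)
        sim = Xc @ Yc.T
        rows, cols = linear_sum_assignment(-sim)
        matching = list(zip(rows, cols))
    return matching
```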

Experimental Setup
- Data: 2K most frequent nouns, texts from Wikipedia
- Seed: 100 translation pairs
- Evaluation: precision and recall against a lexicon obtained from Wiktionary
- Report p0.33, the precision at recall 0.33
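
For reference, a small sketch (assumed, not the authors' evaluation code) of how precision at recall 0.33 can be read off a confidence-ranked candidate list against a gold lexicon:

```python
# Minimal sketch: precision at a target recall level from a confidence-ranked
# list of proposed (source, target) translation pairs.

def precision_at_recall(ranked_pairs, gold_lexicon, target_recall=0.33):
    correct = 0
    for i, pair in enumerate(ranked_pairs, start=1):
        if pair in gold_lexicon:
            correct += 1
        recall = correct / len(gold_lexicon)
        if recall >= target_recall:
            return correct / i        # precision at the first point recall is reached
    return None                       # the recall level is never reached

gold = {("state", "estado"), ("world", "mundo"), ("name", "nombre")}
ranked = [("state", "estado"), ("name", "mundo"), ("world", "mundo")]
print(precision_at_recall(ranked, gold))   # 1.0: recall 1/3 reached at rank 1
```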

Lexicon Quality (EN-ES)
[Figure: precision-recall curve]

Seed Lexicon Source
- Automatic seed
- Edit-distance seed
[Figure: precision on 4k EN-ES Wikipedia articles, comparing the automatic seed and the edit-distance seed (92)]

Analysis

Interesting Matches / Interesting Mistakes

Language Variation