Formal Semantics Slides by Julia Hockenmaier, Laura McGarrity, Bill McCartney, Chris Manning, and Dan Klein.

Formal Semantics: It comes in two flavors:
- Lexical semantics: the meaning of words
- Compositional semantics: how the meanings of individual units combine to form the meanings of larger units

What is meaning? Meaning ≠ Dictionary entries. Dictionaries define words using words. Circularity!

Reference
- Referent: the thing/idea in the world that a word refers to
- Reference: the relationship between a word and its referent

Reference: “Barack Obama” and “the president” pick out the same referent, so: The president is the commander-in-chief. = Barack Obama is the commander-in-chief.

Reference: but substituting co-referring expressions does not always preserve meaning: I want to be the president. ≠ I want to be Barack Obama.

Reference: and some expressions lack a (current) referent altogether: The tooth fairy? The phoenix? The winner of the 2016 presidential election?

What is meaning? Meaning ≠ Dictionary entries Meaning ≠ Reference

Sense Sense: The mental representation of a word or phrase, independent of its referent.

Sense ≠ Mental Image A word may have different mental images for different people. E.g., “mother” A word may conjure a typical mental image (a prototype), but can signify atypical examples as well.

Sense v. Reference
- A word/phrase may have sense, but no reference: the king of the world; the camel in CIS 8538; the greatest integer
- A word may have reference, but no sense: proper names: Dan McCloy, Kristi Krein (who are they?!)

Sense v. Reference A word may have the same referent, but more than one sense: The morning star / the evening star (Venus) A word may have one sense, but multiple referents: Dog, bird

Some semantic relations between words
- Hyponymy (subclass): poodle < dog; crimson < red; red < color; dance < move
- Hypernymy (superclass)
- Synonymy: couch/sofa; manatee/sea cow
- Antonymy: dead/alive; married/single

Lexical Decomposition: Word sense can be represented with semantic features, e.g., “woman” = [+human, +adult, +female].

Compositional Semantics

Compositional Semantics: the study of how meanings of small units combine to form the meaning of larger units.
- The dog chased the cat ≠ The cat chased the dog, i.e., the whole does not equal the sum of the parts.
- The dog chased the cat = The cat was chased by the dog, i.e., syntax matters in determining meaning.

Principle of Compositionality The meaning of a sentence is determined by the meaning of its words in conjunction with the way they are syntactically combined.

Exceptions to Compositionality Anomaly: when phrases are well-formed syntactically, but not semantically Colorless green ideas sleep furiously. (Chomsky) That bachelor is pregnant.

Exceptions to Compositionality Metaphor: the use of an expression to refer to something that it does not literally denote in order to suggest a similarity Time is money. The walls have ears.

Exceptions to Compositionality: Idioms: phrases with fixed meanings not composed of the literal meanings of their words
- Kick the bucket = ‘die’ (*The bucket was kicked by John.)
- When pigs fly = ‘it will never happen’ (*She suspected pigs might fly tomorrow.)
- Bite off more than you can chew = ‘to take on too much’ (*He chewed just as much as he bit off.)

Idioms in other languages

Logical Foundations for Compositional Semantics: We need a language for expressing the meaning of words, phrases, and sentences. There are many possible choices; we will focus on:
- First-order predicate logic (FOPL) with types
- Lambda calculus

Truth-conditional Semantics
- Linguistic expression: “Bob sings.”
- Logical translation: sings(bob), but it could just as well be p_5789023(a_257890)
- Denotation: [[bob]] = some specific person (in some context); [[sings(bob)]] = true in situations where Bob is singing, false otherwise
- Types on translations: bob : e(ntity); sings(bob) : t(rue or false, a boolean type)

Truth-conditional Semantics: Some more complicated logical descriptions of language:
- “All girls like a video game.” : ∀x:e . ∃y:e . girl(x) → [video-game(y) ∧ likes(x,y)]
- “Alice is a former teacher.” : (former(teacher))(Alice)
- “Alice saw the cat before Bob did.” : ∃x:e, y:e, z:e, t1:e, t2:e . cat(x) ∧ see(y) ∧ see(z) ∧ agent(y, Alice) ∧ patient(y, x) ∧ agent(z, Bob) ∧ patient(z, x) ∧ time(y, t1) ∧ time(z, t2) ∧ <(t1, t2)

FOPL Syntax Summary
- A set of types T = {t1, …}
- A set of constants C = {c1, …}, each associated with a type from T
- A set of relations R = {r1, …}, where each ri is a subset of C^n for some n
- A set of variables X = {x1, …}
- Connectives and quantifiers: ∧, ∨, ¬, →, ∀, ∃, plus the punctuation symbols . and :

Truth-conditional semantics
- Proper names refer directly to some entity in the world: Bob : bob
- Sentences are either t or f: Bob sings : sings(bob)
- So what about verbs and VPs? sings must combine with bob to produce sings(bob). The λ-calculus is a notation for functions whose arguments are not yet filled: sings : λx.sings(x). This is a predicate: a function that returns a truth value. In this case it takes a single entity as an argument, so we can write its type as e → t.
- Adjectives?

Lambda calculus: FOPL + λ (a new quantifier) will be our lambda calculus. Intuitively, λ is just a way of creating a function. E.g., girl() is a relation symbol, but λx . girl(x) is a function that takes one argument. New inference rule, function application: (λx . L1(x)) (L2) → L1(L2). E.g., (λx . x²) (3) → 3². E.g., (λx . sings(x)) (Bob) → sings(Bob). Lambda calculus lets us describe the meaning of words individually. Function application (and a few other rules) then lets us combine those meanings to come up with the meaning of larger phrases or sentences.
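A minimal sketch (mine, not the lecture's) of these pieces using Python lambdas as stand-ins; the SINGERS set is an assumed toy model of the world:

    # Assumed toy model of the world: who is singing.
    SINGERS = {"bob", "amy"}

    # sings : e -> t, i.e. lambda x . sings(x)
    sings = lambda x: x in SINGERS

    # Function application (beta reduction): (lambda x . sings(x))(bob) -> sings(bob)
    print(sings("bob"))     # True
    print(sings("carol"))   # False

    # (lambda x . x^2)(3) -> 3^2
    square = lambda x: x ** 2
    print(square(3))        # 9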

Compositional Semantics with the λ-calculus: So now we have meanings for the words. How do we know how to combine the words? Associate a combination rule with each grammar rule:
- S : β(α) → NP : α  VP : β   (function application)
- VP : λx. α(x) ∧ β(x) → VP : α  and : ∅  VP : β   (intersection)
Example: see the sketch below.
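A sketch (again mine, under the same toy-model assumptions) of the two combination rules, with word meanings as Python functions:

    # Assumed toy denotations: who sings and who dances.
    SINGERS, DANCERS = {"bob", "amy"}, {"amy"}
    sings  = lambda x: x in SINGERS
    dances = lambda x: x in DANCERS

    # S : beta(alpha) -> NP : alpha  VP : beta   (function application)
    def s_rule(alpha, beta):
        return beta(alpha)

    # VP : lambda x . alpha(x) and beta(x) -> VP : alpha  and  VP : beta   (intersection)
    def vp_and_rule(alpha, beta):
        return lambda x: alpha(x) and beta(x)

    # "Amy sings and dances"
    print(s_rule("amy", vp_and_rule(sings, dances)))   # True
    print(s_rule("bob", vp_and_rule(sings, dances)))   # False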

Composition: Some more examples
- Transitive verbs: likes : λx.λy.likes(y,x), a two-place predicate of type e → (e → t). The VP “likes Amy” : λy.likes(y,Amy) is then just a one-place predicate.
- Quantifiers: What does “everyone” mean? Everyone : λf.∀x.f(x). Some problems: we have to change our NP/VP rule, and it won’t work for “Amy likes everyone”. What about “Everyone likes someone”? It gets tricky quickly!

Composition: Some more examples. Indefinites: The wrong way: “Bob ate a waffle” : ate(bob,waffle); “Amy ate a waffle” : ate(amy,waffle). A better translation: ∃x.waffle(x) ∧ ate(bob, x). What does the translation of “a” have to be? What about “the”? What about “every”? (One standard answer is sketched below.)
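One textbook answer (Montague-style generalized quantifiers; an assumption on my part, since the slide leaves these as questions):
- a : λP.λQ.∃x. P(x) ∧ Q(x)
- every : λP.λQ.∀x. P(x) → Q(x)
- the : λP.λQ.∃x. P(x) ∧ ∀y.[P(y) → y = x] ∧ Q(x)   (a Russellian treatment of “the”)
Then “Bob ate a waffle” composes as (a(waffle))(λx.ate(bob,x)) → ∃x. waffle(x) ∧ ate(bob,x).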

Denotation: What do we do with the logical form?
- It has fewer (no?) ambiguities
- We can check its truth-value against a database
- More usefully: we can add new facts, expressed in language, to an existing relational database
- Question-answering: we can check whether a statement in a corpus entails a question-answer pair: “Bob sings and dances” ⊨ Q:“Who sings?” has answer A:“Bob”
- We can chain together facts for story comprehension
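A toy sketch (not from the lecture) of checking logical forms against a small relational database and answering a question; FACTS and ENTITIES are assumed:

    # Assumed toy database of atomic facts.
    FACTS = {("sings", "bob"), ("dances", "bob"), ("sings", "amy")}
    ENTITIES = {"bob", "amy"}

    def holds(pred, *args):
        """Truth of an atomic formula against the database."""
        return (pred, *args) in FACTS

    # "Bob sings and dances" : sings(bob) and dances(bob)
    print(holds("sings", "bob") and holds("dances", "bob"))   # True

    # Q: "Who sings?" : the set {x | sings(x)}, which contains "bob"
    print({x for x in ENTITIES if holds("sings", x)})         # {'bob', 'amy'}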

Grounding: What does the translation likes : λx.λy.likes(y,x) have to do with actual liking? Nothing! (unless the denotation model says it does)
- Grounding: relating linguistic symbols to perceptual referents. Sometimes a connection to a database entry is enough; other times, you might insist on connecting “blue” to the appropriate portion of the visual EM spectrum, or connecting “likes” to an emotional sensation.
- Alternative to grounding: meaning postulates. You could insist, e.g., that likes(y,x) ⇒ knows(y,x).

More representation issues: Tense and events. In general, you don’t get far with verbs as predicates; it is better to have event variables e: “Alice danced” : danced(Alice) vs. “Alice danced” : ∃e.dance(e) ∧ agent(e, Alice) ∧ (time(e) < now). Event variables let you talk about non-trivial tense/aspect structures: “Alice had been dancing when Bob sneezed.”

More representation issues: Propositional attitudes (modal logic). “Bob thinks that I am a gummi bear”: thinks(bob, gummi(me))? thinks(bob, “He is a gummi bear”)? Usually, the solution involves intensions (^p), which are, roughly, the set of possible worlds in which predicate p is true: thinks(bob, ^gummi(me)). This is computationally challenging: each agent has to model every other agent’s mental state. It comes up all the time in language, e.g., if you want to talk about what your bill claims that you bought, vs. what you think you bought, vs. what you actually bought.

More representation issues Multiple quantifiers: “In this country, a woman gives birth every 15 minutes. Our job is to find her, and stop her.” -- Groucho Marx Deciding between readings “Bob bought a pumpkin every Halloween.” “Bob put a warning in every window.”

More representation issues: Other tricky stuff
- Adverbs
- Non-intersective adjectives
- Generalized quantifiers
- Generics: “Cats like naps.” “The players scored a goal.”
- Pronouns and anaphora: “If you have a dime, put it in the meter.”
- … etc., etc.

Mapping Sentences to Logical Forms

CCG Parsing: Combinatory Categorial Grammar
- Lexicalized PCFG
- Categories encode argument sequences
- A/B is a category that can combine with a B to the right to form an A
- A\B is a category that can combine with a B to the left to form an A
- A syntactic parallel to the lambda calculus

Learning to map sentences to logical form Zettlemoyer and Collins (IJCAI 05, EMNLP 07)

Some Training Examples

CCG Lexicon

Parsing Rules (Combinators)
Application:
- Right: X : f(a) → X/Y : f   Y : a
- Left: X : f(a) → Y : a   X\Y : f
Additional rules: composition, type-raising. (A simplified code sketch of application follows.)
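A simplified sketch (mine, not the lecture's) of the two application combinators, treating categories as strings and semantics as curried Python functions; it handles only flat examples like the one below, not nested slash categories:

    def strip_parens(cat):
        # Drop one pair of outer parentheses, e.g. "(S\\NP)" -> "S\\NP".
        return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat

    def forward_apply(left, right):
        """Right application: X/Y : f  +  Y : a  =>  X : f(a)."""
        (lcat, f), (rcat, a) = left, right
        if lcat.endswith("/" + rcat):
            return (strip_parens(lcat[: -len(rcat) - 1]), f(a))

    def backward_apply(left, right):
        """Left application: Y : a  +  X\\Y : f  =>  X : f(a)."""
        (lcat, a), (rcat, f) = left, right
        if rcat.endswith("\\" + lcat):
            return (strip_parens(rcat[: -len(lcat) - 1]), f(a))

    # "Texas borders Kansas", borders : (S\NP)/NP : lam x. lam y. borders(y,x)
    texas   = ("NP", "texas")
    kansas  = ("NP", "kansas")
    borders = ("(S\\NP)/NP", lambda x: lambda y: ("borders", y, x))

    vp = forward_apply(borders, kansas)   # S\NP : lam y. borders(y, kansas)
    print(backward_apply(texas, vp))      # ('S', ('borders', 'texas', 'kansas'))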

CCG Parsing Example

Parsing a Question

Lexical Generation Input Training Example Sentence: Texas borders Kansas. Logical form: borders(Texas, Kansas)

GENLEX
- Input: a training example (Si, Li)
- Computation: create all substrings of consecutive words in Si; create categories from Li; create lexical entries that are the cross product of these two sets
- Output: lexicon Λ
(A sketch of the cross product follows.)
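A sketch of GENLEX's cross product under the definition above; the hard-coded category list stands in for the categories that would actually be derived from Li (a simplification on my part):

    def substrings(words):
        """All substrings of consecutive words in the sentence."""
        return [" ".join(words[i:j])
                for i in range(len(words))
                for j in range(i + 1, len(words) + 1)]

    def genlex(sentence, categories):
        """Lexical entries: cross product of substrings and categories."""
        return {(s, c) for s in substrings(sentence.split())
                       for c in categories}

    # Categories that would be derived from borders(texas, kansas):
    cats = ("NP : texas",
            "NP : kansas",
            "(S\\NP)/NP : lam x. lam y. borders(y,x)")
    lexicon = genlex("Texas borders Kansas", cats)
    print(len(lexicon))   # 6 substrings x 3 categories = 18 entries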

GENLEX Cross Product. Input training example: Sentence: “Texas borders Kansas.” Logical form: borders(Texas, Kansas).
- Output substrings: Texas; borders; Kansas; Texas borders; borders Kansas; Texas borders Kansas
- Output categories: NP : texas; NP : kansas; (S\NP)/NP : λx.λy.borders(y,x)
- Output lexicon: the cross product of the two sets

GENLEX Output Lexicon (words × categories):
Texas → NP : texas
Texas → NP : kansas
Texas → (S\NP)/NP : λx.λy.borders(y,x)
borders → NP : texas
…
Texas borders Kansas → (S\NP)/NP : λx.λy.borders(y,x)
Every substring is paired with every category; training later selects the useful entries.

Weighted CCG: Given a log-linear model with a CCG lexicon Λ, a feature vector f, and weights w, the best parse is y* = argmax_y w · f(x,y), where we consider all possible parses y for the sentence x given the lexicon Λ.

Parameter Estimation for Weighted CCG Parsing
Inputs: training set {(Si, Li) | i = 1, …, n}; initial lexicon Λ; initial weights w; number of iterations T
Computation: for t = 1 … T, i = 1 … n:
- Step 1 (check correctness): if y* = argmax_y w · f(Si, y) yields Li, skip to the next i
- Step 2 (lexical generation): set λ = Λ ∪ GENLEX(Si, Li); let y′ = argmax_{y s.t. L(y) = Li} w · f(Si, y); define λi to be the lexical entries in y′; set Λ = Λ ∪ λi
- Step 3 (update parameters): let y′′ = argmax_y w · f(Si, y); if y′′ does not yield Li, set w = w + f(Si, y′) − f(Si, y′′)
Output: lexicon Λ and parameters w
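A schematic sketch of this loop. parse, logical_form, lexical_entries, genlex, and features are hypothetical stand-ins for the real parser components (with, e.g., feature vectors as numpy arrays and the lexicon as a set); they are passed in as parameters so the skeleton stays valid Python rather than claiming to be the actual implementation:

    def train(data, lexicon, w, T, parse, logical_form,
              lexical_entries, genlex, features):
        """data: list of (sentence S_i, logical form L_i) pairs."""
        for _ in range(T):
            for S, L in data:
                # Step 1: skip if the current best parse already yields L_i.
                y_star = parse(S, lexicon, w)            # argmax_y w . f(S, y)
                if logical_form(y_star) == L:
                    continue
                # Step 2: best parse constrained to yield L_i, over the
                # GENLEX-expanded lexicon; keep only the entries it uses.
                y_good = parse(S, lexicon | genlex(S, L), w, must_yield=L)
                lexicon |= lexical_entries(y_good)
                # Step 3: perceptron update toward the correct parse, away
                # from the best overall parse under the grown lexicon.
                y_bad = parse(S, lexicon, w)
                if logical_form(y_bad) != L:
                    w = w + features(S, y_good) - features(S, y_bad)
        return lexicon, w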

Example Learned Lexical Entries

Challenge Revisited

Disharmonic Application

Missing Content Words

Missing content-free words

A complete parse

Geo880 Test Set
System                        Precision   Recall   F1
Zettlemoyer & Collins 2007    95.49       83.20    88.93
Zettlemoyer & Collins 2005    96.25       79.29    86.95
Wong & Mooney 2007            93.72       80.00    86.31

Summing Up
- Hypothesis (Principle of Compositionality): the semantics of NL sentences and phrases can be composed from the semantics of their subparts
- Rules can be derived that map syntactic analyses to semantic representations (the Rule-to-Rule Hypothesis)
- Lambda notation provides a way to extend FOPL to this end
- But coming up with rule-to-rule mappings is hard
- Idioms, metaphors, and other non-compositional aspects of language make things tricky (e.g., “fake gun”)