On the Proper Treatment of Quantifiers in Probabilistic Logic Semantics
Islam Beltagy and Katrin Erk
The University of Texas at Austin
IWCS 2015

Logic-based Semantics
First-order logic and theorem proving
Deep semantic representation:
– Negation, quantifiers, conjunction, disjunction, …

Probabilistic Logic Semantics
Logic + reasoning with uncertainty
– Confidence ratings of word sense disambiguation
– Weights of paraphrase rules
– Distributional similarity values [Beltagy et al., 2013]
  baby ⇒ toddler | w1
  eating doll ⇒ playing with a toy | w2
– …

Probabilistic Logic Semantics
Quantifiers and negations do not work as expected
Domain Closure Assumption: finite domain
– Problems with quantifiers
– “Tweety is a bird and it flies” ⊨ “All birds fly”
Closed-World Assumption: low prior probabilities
– Problems with negations
– “All birds fly” ⊨ “The sky is not blue”

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks (MLNs)
– Recognizing Textual Entailment (RTE)
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
Closed-World Assumption
Evaluation
Future work and conclusion

Probabilistic Logic
Frameworks that combine logical and statistical knowledge [Nilsson, 1986], [Getoor and Taskar, 2007]
Use weighted first-order logic rules
– Weighted rules are soft rules (compared to hard logical constraints)
Provide a mechanism for probabilistic inference: P(Q | E, KB)
Bayesian Logic Programs (BLP) [Kersting & De Raedt, 2001]
Markov Logic Networks (MLN) [Richardson and Domingos, 2006]
Probabilistic Soft Logic (PSL) [Kimmig et al., NIPS 2012]

Markov Logic Networks [Richardson and Domingos, 2006]
∀x. smoke(x) → cancer(x) | 1.5
∀x,y. friend(x,y) → (smoke(x) ↔ smoke(y)) | 1.1
Two constants: Anna (A) and Bob (B)
P(Cancer(Anna) | Friends(Anna,Bob), Smokes(Bob))
[Figure: ground Markov network over the atoms Smokes(A), Smokes(B), Cancer(A), Cancer(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)]

Markov Logic Networks [Richardson and Domingos, 2006]
Probability mass function (PMF) over truth assignments x to the set of all ground atoms:

  $P(X = x) = \frac{1}{Z} \exp\Big(\sum_i w_i \, n_i(x)\Big)$

where $w_i$ is the weight of formula i, $n_i(x)$ is the number of true groundings of formula i in x, and Z is the normalization constant.
Inference: calculate the probability of atoms given an evidence set
– P(Cancer(Anna) | Friends(Anna,Bob), Smokes(Bob))
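To make the PMF concrete, here is a minimal brute-force sketch in Python of this exact example: it enumerates all 2^8 truth assignments and computes conditional probabilities by summing exp(Σ_i w_i n_i(x)) over consistent worlds. The formulas and weights come from the slide; everything else (names, representation) is our own, and real MLN systems such as Alchemy use far more efficient inference.

```python
from itertools import product
from math import exp

CONSTANTS = ["A", "B"]  # Anna, Bob

# All ground atoms of the smoke/cancer/friend example under the DCA.
ATOMS = ([f"smoke({c})" for c in CONSTANTS]
         + [f"cancer({c})" for c in CONSTANTS]
         + [f"friend({c},{d})" for c in CONSTANTS for d in CONSTANTS])

def implies(p, q):
    return (not p) or q

def n(world):
    """n_i(x): number of true groundings of each weighted formula."""
    w = dict(zip(ATOMS, world))
    # forall x. smoke(x) -> cancer(x)                      | 1.5
    n1 = sum(implies(w[f"smoke({x})"], w[f"cancer({x})"]) for x in CONSTANTS)
    # forall x,y. friend(x,y) -> (smoke(x) <-> smoke(y))   | 1.1
    n2 = sum(implies(w[f"friend({x},{y})"],
                     w[f"smoke({x})"] == w[f"smoke({y})"])
             for x in CONSTANTS for y in CONSTANTS)
    return (n1, n2)

WEIGHTS = (1.5, 1.1)

def score(world):
    """Unnormalized probability exp(sum_i w_i * n_i(x))."""
    return exp(sum(w_i * n_i for w_i, n_i in zip(WEIGHTS, n(world))))

WORLDS = list(product([False, True], repeat=len(ATOMS)))

def prob(query, evidence):
    """P(query | evidence), summing scores over consistent worlds."""
    def holds(world, atoms):
        w = dict(zip(ATOMS, world))
        return all(w[a] for a in atoms)
    den = sum(score(w) for w in WORLDS if holds(w, evidence))
    num = sum(score(w) for w in WORLDS if holds(w, evidence + query))
    return num / den

print(prob(["cancer(A)"], ["friend(A,B)", "smoke(B)"]))
```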

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
Closed-World Assumption
Evaluation
Future work and conclusion

Recognizing Textual Entailment (RTE)
RTE requires deep semantic understanding [Dagan et al., 2013]
Given two sentences, Text (T) and Hypothesis (H), decide whether T entails H, contradicts H, or is unrelated to H (neutral)

Recognizing Textual Entailment (RTE)
Examples (from the SICK dataset) [Marelli et al., 2014]
– Entailment:
  T: “A man is walking through the woods.”
  H: “A man is walking through a wooded area.”
– Contradiction:
  T: “A man is jumping into an empty pool.”
  H: “A man is jumping into a full pool.”
– Neutral:
  T: “A young girl is dancing.”
  H: “A young girl is standing on one leg.”

Recognizing Textual Entailment (RTE)
Translate sentences to logic using Boxer [Bos, 2008]
T: John is driving a car
  ∃x,y,z. john(x) ∧ agent(y, x) ∧ drive(y) ∧ patient(y, z) ∧ car(z)
H: John is driving a vehicle
  ∃x,y,z. john(x) ∧ agent(y, x) ∧ drive(y) ∧ patient(y, z) ∧ vehicle(z)
KB (collected from different sources):
  ∀x. car(x) → vehicle(x) | w
Compute P(H | T, KB)
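As a rough illustration only: the paper's pipeline obtains these formulas automatically from Boxer, but one can write them out by hand and use nltk's logic parser to check that they are well-formed. The predicate names follow the slide; note that the weight w of the KB rule lives in the MLN, not in the formula itself.

```python
from nltk.sem.logic import Expression

read = Expression.fromstring
T = read(r'exists x.(exists y.(exists z.('
         r'john(x) & agent(y,x) & drive(y) & patient(y,z) & car(z))))')
H = read(r'exists x.(exists y.(exists z.('
         r'john(x) & agent(y,x) & drive(y) & patient(y,z) & vehicle(z))))')
# The weight w is attached to this rule inside the MLN engine.
KB = read(r'all x.(car(x) -> vehicle(x))')

for name, f in [("T", T), ("H", H), ("KB", KB)]:
    print(name, ":", f)
```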

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
Closed-World Assumption
Evaluation
Future work and conclusion

Domain Closure Assumption (DCA)
There are no objects in the world other than the named constants (finite domain)
e.g.
∀x. smoke(x) → cancer(x) | 1.5
∀x,y. friend(x,y) → (smoke(x) ↔ smoke(y)) | 1.1
Two constants: Anna (A) and Bob (B)
[Figure: the ground atoms Smokes(A), Smokes(B), Cancer(A), Cancer(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)]

Domain Closure Assumption (DCA)
There are no objects in the universe other than the named constants (finite domain)
– Constants need to be explicitly added
– Universal quantifiers do not behave as expected because of the finite domain
– e.g. “Tweety is a bird and it flies” ⊨ “All birds fly”

P(H|T,KB) | in T          | in H
∃         | Skolemization | No problems
∀         | Existence     | Universals in H

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
  – Skolemization: ∃ in T
  – Existence: ∀ in T
  – Universals in Hypothesis: ∀ in H
Closed-World Assumption
Evaluation
Future work and conclusion

Skolemization (∃ in T)
Explicitly introducing constants
T: ∃x,y. john(x) ∧ agent(y, x) ∧ eat(y)
Skolemized T: john(J) ∧ agent(T, J) ∧ eat(T)
Embedded existentials
– T: ∀x. bird(x) → ∃y. agent(y, x) ∧ fly(y)
– Skolemized T: ∀x. bird(x) → agent(f(x), x) ∧ fly(f(x))
– Simulate skolem functions with a predicate:
  ∀x. bird(x) → ∃y. skolem_f(x,y) ∧ agent(y, x) ∧ fly(y)
– Evidence: skolem_f(B1, C1), skolem_f(B2, C2), …
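A minimal sketch of the first case, replacing outermost existentials with fresh constants, over a small tuple-based formula representation of our own devising; the embedded case, which the paper handles with the skolem_f predicate and its evidence atoms, is indicated in the closing comments.

```python
import itertools

_counter = itertools.count(1)

def fresh_constant():
    return f"C{next(_counter)}"

def substitute(formula, var, const):
    """Replace every occurrence of variable var with constant const."""
    if isinstance(formula, str):
        return const if formula == var else formula
    return tuple(substitute(part, var, const) for part in formula)

def skolemize_top(formula):
    """Outermost existentials become fresh constants:
    exists x. phi(x)  ==>  phi(C1)."""
    while formula[0] == "exists":
        _, var, body = formula
        formula = substitute(body, var, fresh_constant())
    return formula

# T: exists x,y. john(x) & agent(y,x) & eat(y)
T = ("exists", "x",
     ("exists", "y",
      ("and", ("john", "x"), ("agent", "y", "x"), ("eat", "y"))))
print(skolemize_top(T))
# -> ('and', ('john', 'C1'), ('agent', 'C2', 'C1'), ('eat', 'C2'))

# Existentials embedded under a universal need a skolem *function* f(x);
# MLNs have no function symbols, so the paper replaces f with a predicate:
#   all x. bird(x) -> exists y. skolem_f(x, y) & agent(y, x) & fly(y)
# plus evidence skolem_f(B1, C1), skolem_f(B2, C2), ... pairing each
# bird constant with its own fresh flying-event constant.
```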

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
  – Skolemization: ∃ in T
  – Existence: ∀ in T
  – Universals in Hypothesis: ∀ in H
Closed-World Assumption
Evaluation
Future work and conclusion

Existence (∀ in T)
T: All birds fly
H: Some birds fly
Logically, T ⇏ H, but pragmatically it does
– “All birds fly” presupposes that “there exist birds”
Solution: simulate this existential presupposition
– From the parse tree, Q(restrictor, body)
– “All birds fly” becomes: all(bird, fly)
– Introduce additional evidence for the restrictor: bird(B)

Existence (∀ in T)
Negated existential
– T: No bird flies = no(bird, fly)
  ¬∃x,y. bird(x) ∧ agent(y, x) ∧ fly(y)
  ≡ ∀x. bird(x) → ¬∃y. agent(y, x) ∧ fly(y)
– Additional evidence: bird(B)
Exception
– T: There are no birds
  ¬∃x. bird(x)
– No additional evidence, because the existence presupposition is explicitly negated
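A small sketch of how the existence presupposition could be generated from the quantifier structure, covering both the default case and the exception above; the witness-naming scheme and the helper function are illustrative assumptions, not the paper's implementation.

```python
def existence_evidence(quantifier, restrictor, existence_negated=False):
    """Given Q(restrictor, body) from the parse tree, return evidence atoms
    presupposing the restrictor set is non-empty, unless the sentence itself
    negates that existence ("There are no birds")."""
    if existence_negated:
        return []
    witness = restrictor[0].upper()   # e.g. bird -> constant B
    return [f"{restrictor}({witness})"]

print(existence_evidence("all", "bird"))        # "All birds fly"      -> ['bird(B)']
print(existence_evidence("no", "bird"))         # "No bird flies"      -> ['bird(B)']
print(existence_evidence("no", "bird", True))   # "There are no birds" -> []
```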

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
  – Skolemization: ∃ in T
  – Existence: ∀ in T
  – Universals in Hypothesis: ∀ in H
Closed-World Assumption
Evaluation
Future work and conclusion

Universals in Hypothesis (∀ in H)
T: Tweety is a bird, and Tweety flies
  bird(Tweety) ∧ agent(F, Tweety) ∧ fly(F)
H: All birds fly
  ∀x. bird(x) → ∃y. agent(y, x) ∧ fly(y)
T ⊨ H, because universal quantifiers range only over the constants of the given finite domain
Solution:
– As in Existence, add evidence for the restrictor: bird(Woody)
– If the new bird can be shown to fly, then there is an explicit universal quantification in T
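The same restrictor-evidence idea, sketched for the hypothesis side; the constant name Woody comes from the slide, while the helper itself is a hypothetical illustration.

```python
def universal_test_evidence(restrictor, fresh="Woody"):
    """Add a restrictor individual that T says nothing about. Under the DCA,
    H's universal quantifier now also ranges over this individual, so H can
    only come out true if T genuinely licenses it for unseen individuals."""
    return [f"{restrictor}({fresh})"]

# T: bird(Tweety) & agent(F, Tweety) & fly(F)            -- silent about Woody
# H: all x.(bird(x) -> exists y.(agent(y,x) & fly(y)))
print(universal_test_evidence("bird"))   # ['bird(Woody)'] blocks the spurious T |= H
```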

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
Closed-World Assumption
Evaluation
Future work and conclusion

Closed-World Assumption (CWA)
The assumption that everything (all ground atoms) has a very low prior probability
The CWA fits the RTE task because:
– In the world, most things are false
– Inference results are less sensitive to the domain size
– It enables inference optimization [Beltagy and Mooney, 2014]

Closed-World Assumption (CWA)
Because of the CWA, a negated H comes out true regardless of T
H: ¬∃x,y. bird(x) ∧ agent(y, x) ∧ fly(y)
Solution
– Add positive evidence that contradicts the negated parts of H
– A set of ground atoms with high prior probability (in contrast with the low prior probability on all other ground atoms)
– R: bird(B) ∧ agent(F, B) ∧ fly(F) | w = 1.5
– P(H | CWA) ≈ 1
– P(H | R, CWA) ≈ 0
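A brute-force sketch of this effect over just the three ground atoms of H, treating the priors as independent per-atom weights (a one-grounding simplification; the weight values are our assumptions and the exact probabilities depend on them, but the direction matches the slide):

```python
from itertools import product
from math import exp

ATOMS = ["bird(B)", "agent(F,B)", "fly(F)"]

def prob_H(prior_weight):
    """P(H) for H = not(bird(B) & agent(F,B) & fly(F)), with the same
    prior weight on each atom being true."""
    num = den = 0.0
    for world in product([False, True], repeat=len(ATOMS)):
        w = dict(zip(ATOMS, world))
        s = exp(sum(prior_weight for a in ATOMS if w[a]))
        den += s
        if not all(w.values()):   # H: the negated existential holds
            num += s
    return num / den

print(prob_H(-5.0))  # ~1.0 : CWA low priors alone make H come out true
print(prob_H(+5.0))  # ~0.02: with high-prior R, H needs support from T
```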

Closed-World Assumption (CWA)
Entailing example:
T: No bird flies: ¬∃x,y. bird(x) ∧ agent(y, x) ∧ fly(y)
H: No penguin flies: ¬∃x,y. penguin(x) ∧ agent(y, x) ∧ fly(y)
R: penguin(P) ∧ agent(F, P) ∧ fly(F) | w = 1.5
KB: ∀x. penguin(x) → bird(x)
P(H | T, R, KB) = 1
T ∧ KB contradicts R, which lets H be true.

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
Closed-World Assumption
Evaluation
Future work and conclusion

Evaluation
Probabilistic logic framework: Markov Logic Networks
– The proposed handling of the DCA and CWA applies to other probabilistic logic frameworks that make similar assumptions, e.g. PSL (Probabilistic Soft Logic)
Evaluation task: RTE
– The proposed handling of the DCA and CWA applies to other tasks where the logical formulas have existential and universal quantifiers, e.g. STS (Semantic Textual Similarity) and question answering

Evaluation
1) Synthetic dataset
Template: Q1 NP1 V Q2 NP2 = Q1(NP1, Q2(NP2, V))
Example:
– T: No man eats all food
– H: Some hungry men eat not all delicious food
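A sketch of how such T/H pairs could be enumerated from the template; the word lists and quantifier inventory below are illustrative assumptions, not the paper's exact generator, so the counts differ from the dataset's.

```python
from itertools import product

QUANTIFIERS = ["some", "all", "no", "not all"]
NP1 = ["men", "hungry men"]
V = ["eat"]
NP2 = ["food", "delicious food"]

def sentences():
    # Q1 NP1 V Q2 NP2  =  Q1(NP1, Q2(NP2, V))
    for q1, n1, v, q2, n2 in product(QUANTIFIERS, NP1, V, QUANTIFIERS, NP2):
        yield f"{q1.capitalize()} {n1} {v} {q2} {n2}"

pairs = [(t, h) for t in sentences() for h in sentences()]
print(len(pairs), "pairs, e.g.", pairs[0])
```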

Evaluation
1) Synthetic dataset
Dataset size: 952 neutral + 72 entailing = 1024 pairs

Detection of Contradiction
Entailment: P(H | T, KB, W_{t,h})
Contradiction: P(¬H | T, KB, W_{t,¬h})
World configuration W:
– Domain size
– Prior probabilities
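These two probabilities suggest a simple three-way decision rule; the threshold here is our assumption, not a value from the paper:

```python
def rte_label(p_h, p_not_h, threshold=0.5):
    """p_h = P(H | T, KB, W_{t,h}); p_not_h = P(not-H | T, KB, W_{t,not-h})."""
    if p_h > threshold:
        return "Entailment"
    if p_not_h > threshold:
        return "Contradiction"
    return "Neutral"

print(rte_label(0.9, 0.1))   # Entailment
print(rte_label(0.1, 0.9))   # Contradiction
print(rte_label(0.2, 0.3))   # Neutral
```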

Evaluation
2) Sentences Involving Compositional Knowledge (SICK) [Marelli et al., SemEval 2014]
– 10,000 pairs of sentences annotated as entailment, contradiction, or neutral

Evaluation
3) FraCaS [Cooper et al., 1996]: hand-built entailment pairs
– We evaluate on the first section (out of 9 sections)
– Pairs using unsupported quantifiers (few, most, many, at least) are excluded (28/74 pairs)

Outline
Probabilistic Logic Semantics (overview of previous work)
– Markov Logic Networks
– Recognizing Textual Entailment
Domain Closure Assumption
– Definition
– Inference problems with quantifiers
Closed-World Assumption
Evaluation
Future work and conclusion

Future Work
Generalized quantifiers: how to extend this work to generalized quantifiers like “few” and “most”

Conclusion
The Domain Closure Assumption, its implications for probabilistic logic inference, and how to formulate the RTE problem so that we get the expected inferences
The Closed-World Assumption, why we make it, its effect on negation, and how to formulate the RTE problem to get correct inferences

Thank You