Montague Meets Markov: Deep Semantics with Probabilistic Logical Form
Islam Beltagy, Cuong Chau, Gemma Boleda, Dan Garrette, Katrin Erk, Raymond Mooney
The University of Texas at Austin
[Slide photos: Richard Montague, Andrey Markov]

Semantic Representations
Formal Semantics
– Uses first-order logic
– Deep
– Brittle
Distributional Semantics
– Statistical method
– Robust
– Shallow
Goal: combine the advantages of both logical and distributional semantics in one framework

Semantic Representations
Combining both logical and distributional semantics
– Represent meaning using a probabilistic logic (in contrast with standard first-order logic): Markov Logic Networks (MLNs)
– Generate soft inference rules from distributional semantics, e.g. ∀x hamster(x) → gerbil(x) | f(w)

Agenda
– Introduction
– Background: MLN
– RTE
– STS
– Future work and Conclusion

Agenda
– Introduction
– Background: MLN
– RTE
– STS
– Future work and Conclusion

Markov Logic Networks [Richardson & Domingos, 2006]
MLN: soft first-order logic
– Weighted rules: each FOL rule is paired with a rule weight

Markov Logic Networks [Richardson & Domingos, 2006]
MLN: template for constructing Markov networks
– Example ground network over two constants, Anna (A) and Bob (B), with ground atoms Smokes(A), Smokes(B), Cancer(A), Cancer(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)

Markov Logic Networks [Richardson & Domingos, 2006]
Probability Mass Function (PMF) over possible truth assignments x:
P(X = x) = (1/Z) exp( Σ_i w_i n_i(x) )
– w_i: weight of formula i
– n_i(x): number of true groundings of formula i in x
– Z: normalization constant
Inference: calculate the probability of atoms
– P(Cancer(Anna) | Friends(Anna,Bob), Smokes(Bob))
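To make the PMF concrete, here is a minimal Python sketch (not the Alchemy implementation used by the system) that brute-force enumerates the possible worlds of the Anna/Bob example and conditions on evidence. The two formulas and their weights (1.5 and 1.1) are illustrative assumptions, not the rules on the slide.

```python
from itertools import product
from math import exp

CONSTANTS = ["Anna", "Bob"]

# Ground atoms of the tiny example network.
ATOMS = ([f"Smokes({c})" for c in CONSTANTS]
         + [f"Cancer({c})" for c in CONSTANTS]
         + [f"Friends({a},{b})" for a in CONSTANTS for b in CONSTANTS])

def n_smoking_causes_cancer(world):
    # true groundings of: Smokes(x) => Cancer(x)
    return sum(1 for c in CONSTANTS
               if (not world[f"Smokes({c})"]) or world[f"Cancer({c})"])

def n_friends_smoke_alike(world):
    # true groundings of: Friends(x,y) => (Smokes(x) <=> Smokes(y))
    return sum(1 for a in CONSTANTS for b in CONSTANTS
               if (not world[f"Friends({a},{b})"])
               or (world[f"Smokes({a})"] == world[f"Smokes({b})"]))

FORMULAS = [(1.5, n_smoking_causes_cancer),   # assumed weight
            (1.1, n_friends_smoke_alike)]     # assumed weight

def unnormalized(world):
    # exp( sum_i w_i * n_i(x) )
    return exp(sum(w * n(world) for w, n in FORMULAS))

# Enumerate all 2^8 possible worlds (only feasible for tiny domains).
worlds = [dict(zip(ATOMS, values))
          for values in product([False, True], repeat=len(ATOMS))]
Z = sum(unnormalized(w) for w in worlds)      # normalization constant

def prob(condition):
    # probability that `condition` holds, summing P(X = x) over worlds
    return sum(unnormalized(w) for w in worlds if condition(w)) / Z

evidence = lambda w: w["Friends(Anna,Bob)"] and w["Smokes(Bob)"]
query = lambda w: evidence(w) and w["Cancer(Anna)"]

# P(Cancer(Anna) | Friends(Anna,Bob), Smokes(Bob))
print(prob(query) / prob(evidence))
```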

Agenda
– Introduction
– Background: MLN
– RTE
– STS
– Future work and Conclusion

Recognizing Textual Entailment (RTE)
Given two sentences, a premise and a hypothesis, does the first entail the second?
e.g.
– Premise: "A male gorilla escaped from his cage in Berlin zoo and sent terrified visitors running for cover, the zoo said yesterday."
– Hypothesis: "A gorilla escaped from his cage in a zoo in Germany."
– Entails: true

System Architecture
[Pipeline diagram: Sent1 and Sent2 → BOXER → LF1 and LF2; Vector Space → Dist. Rule Constructor → Rule Base; LFs + Rule Base → ALCHEMY MLN Inference → result]
– BOXER [Bos et al., 2004]: maps sentences to logical form
– Distributional Rule Constructor: generates relevant soft inference rules based on distributional similarity
– ALCHEMY: probabilistic MLN inference
– Result: degree of entailment

Sample Logical Forms
Premise: "A man is cutting pickles"
– ∃x,y,z ( man(x) ^ cut(y) ^ agent(y, x) ^ pickles(z) ^ patient(y, z) )
Hypothesis: "A guy is slicing cucumber"
– ∃x,y,z ( guy(x) ^ slice(y) ^ agent(y, x) ^ cucumber(z) ^ patient(y, z) )
Hypothesis in query form (analogous to the negated hypothesis in standard theorem proving):
– ∀x,y,z ( guy(x) ^ slice(y) ^ agent(y, x) ^ cucumber(z) ^ patient(y, z) → result() )
Query: result() [degree of entailment]

Distributional Lexical Rules
For every pair of words (a, b), where a is in S1 and b is in S2, add a soft rule relating the two:
– ∀x a(x) → b(x) | wt(a, b)
– wt(a, b) = f( cos(a, b) )
Premise: "A man is cutting pickles"
Hypothesis: "A guy is slicing cucumber"
– ∀x man(x) → guy(x) | wt(man, guy)
– ∀x cut(x) → slice(x) | wt(cut, slice)
– ∀x pickle(x) → cucumber(x) | wt(pickle, cucumber)
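A minimal sketch of how such lexical rules and their weights could be generated. The 3-dimensional vectors and the identity choice of f are assumptions for illustration; the actual system draws on a large distributional vector space.

```python
import numpy as np

VECTORS = {                      # hypothetical distributional vectors
    "man": np.array([0.9, 0.1, 0.2]),
    "guy": np.array([0.8, 0.2, 0.1]),
    "cut": np.array([0.1, 0.9, 0.3]),
    "slice": np.array([0.2, 0.8, 0.4]),
    "pickle": np.array([0.3, 0.2, 0.9]),
    "cucumber": np.array([0.4, 0.1, 0.8]),
}

def cos(a, b):
    # cosine similarity of two vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def wt(a, b, f=lambda s: s):     # f maps similarity to an MLN weight (assumed identity here)
    return f(cos(VECTORS[a], VECTORS[b]))

premise_words = ["man", "cut", "pickle"]
hypothesis_words = ["guy", "slice", "cucumber"]

# Emit one soft rule per (premise word, hypothesis word) pair.
for a in premise_words:
    for b in hypothesis_words:
        print(f"forall x. {a}(x) -> {b}(x)  |  weight = {wt(a, b):.3f}")
```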

Distributional Phrase Rules
Premise: "A boy is playing"
Hypothesis: "A little boy is playing"
Need rules for phrases:
– ∀x boy(x) → little(x) ^ boy(x) | wt(boy, "little boy")
Compute vectors for phrases using vector addition [Mitchell & Lapata, 2010]:
– "little boy" = little + boy
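A correspondingly small sketch of the additive phrase vectors used to weight such phrase rules; the toy vectors are again assumptions for illustration.

```python
import numpy as np

little = np.array([0.2, 0.7, 0.1])   # hypothetical word vectors
boy = np.array([0.9, 0.1, 0.3])

little_boy = little + boy            # phrase vector by vector addition

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Weight for the rule: forall x. boy(x) -> little(x) ^ boy(x)
print(cos(boy, little_boy))
```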

Preliminary Results: RTE-1 (2005)

System                               Accuracy
Logic only [Bos & Markert, 2005]     52%
Our System                           57%

Agenda
– Introduction
– Background: MLN
– RTE
– STS
– Future work and Conclusion

Semantic Textual Similarity (STS)
Rate the semantic similarity of two sentences on a 0 to 5 scale
Gold standards are averaged over multiple human judgments
Evaluate by measuring correlation to human ratings

S1                              S2                              Score
A man is slicing a cucumber     A guy is cutting a cucumber     5
A man is slicing a cucumber     A guy is cutting a zucchini     4
A man is slicing a cucumber     A woman is cooking a zucchini   3
A man is slicing a cucumber     A monkey is riding a bicycle    1

Softening Conjunction for STS
Logical conjunction requires satisfying all conjuncts to satisfy the clause, which is too strict for STS
Hypothesis:
– ∀x,y,z ( guy(x) ^ cut(y) ^ agent(y, x) ^ cucumber(z) ^ patient(y, z) → result() )
Break the sentence into "micro-clauses", then combine them using an "averaging combiner" [Natarajan et al., 2010]
Becomes:
– ∀x,y,z guy(x) ^ agent(y, x) → result()
– ∀x,y,z cut(y) ^ agent(y, x) → result()
– ∀x,y,z cut(y) ^ patient(y, z) → result()
– ∀x,y,z cucumber(z) ^ patient(y, z) → result()
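A tiny sketch contrasting strict conjunction with the averaging-combiner idea; the per-clause scores below are hypothetical stand-ins for the probabilities that MLN inference would assign to each micro-clause.

```python
# Micro-clauses obtained by splitting the hypothesis conjunction.
micro_clauses = [
    "guy(x) ^ agent(y,x) -> result()",
    "cut(y) ^ agent(y,x) -> result()",
    "cut(y) ^ patient(y,z) -> result()",
    "cucumber(z) ^ patient(y,z) -> result()",
]

# Hypothetical per-clause satisfaction scores
# (e.g. "cucumber" is only partially matched by the other sentence).
scores = [0.9, 0.8, 0.8, 0.4]

strict_conjunction = min(scores)          # all conjuncts must hold: too strict for STS
averaged = sum(scores) / len(scores)      # averaging combiner: graded similarity

print(f"strict: {strict_conjunction:.2f}, averaged: {averaged:.2f}")
```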

Preliminary Results: STS
Microsoft video description corpus
– Sentence pairs given human 0-5 ratings
– 1,500 pairs split equally into training/test

System                                                  Pearson r
Our System with no distributional rules [Logic only]    0.52
Our System with lexical rules                           0.60
Our System with lexical and phrase rules                0.73
Vector Addition [Distributional only]                   0.78
Ensemble of our best score with vector addition         0.85
Best system in STS 2012 (large ensemble)                0.87

Agenda
– Introduction
– Background: MLN
– RTE
– STS
– Future work and Conclusion

Future Work
– Scale MLN inference to longer and more complex sentences
– Use multiple parses to reduce the impact of parse errors
– Better rule base
  – Vector space methods for asymmetric weights: wt(cucumber → vegetable) > wt(vegetable → cucumber)
  – Inference rules from existing paraphrase collections
  – More sophisticated phrase vectors

Conclusion
Using MLNs to represent semantics, combining both logical and distributional approaches
– Deep semantics: represent sentences using logic
– Robust system: probabilistic logic and soft inference rules give the wide coverage of distributional semantics

Thank You