
1 Unsupervised Semantic Parsing Hoifung Poon and Pedro Domingos EMNLP 2009 Best Paper Award Speaker: Hao Xiong

2 Outline Motivation Unsupervised semantic parsing Learning and inference Conclusion

3 Semantic Parsing Natural language text → formal and detailed meaning representation (MR) Also called logical form Standard MR language: first-order logic E.g., Microsoft buys Powerset. → BUYS(MICROSOFT, POWERSET)

4 Shallow Semantic Processing Semantic role labeling Given a relation, identify arguments E.g., agent, theme, instrument Information extraction Identify fillers for a fixed relational template E.g., seminar (speaker, location, time) In contrast, semantic parsing is Formal: Supports reasoning and decision making Detailed: Obtains far more information

5 Supervised Learning User provides: Target predicates and objects Example sentences with meaning annotation System learns grammar and produces parser Examples: Zelle & Mooney [1993] Zettlemoyer & Collins [2005, 2007, 2009] Wong & Mooney [2007] Lu et al. [2008] Ge & Mooney [2009]

6 Limitations of Supervised Approaches Applicable to restricted domains only For general text Not clear what predicates and objects to use Hard to produce consistent meaning annotation Crucial to develop unsupervised methods Also, often learn both syntax and semantics Fail to leverage advanced syntactic parsers Make semantic parsing harder

7 Unsupervised Approaches For shallow semantic tasks, e.g.: Open IE: TextRunner [Banko et al. 2007] Paraphrases: DIRT [Lin & Pantel 2001] Semantic networks: SNE [Kok & Domingos 2008] Show promise of unsupervised methods But … none for semantic parsing

8 This Talk: USP First unsupervised approach for semantic parsing Based on Markov Logic [Richardson & Domingos, 2006] Sole input is dependency trees Can be used in general domains Applied it to extract knowledge from biomedical abstracts and answer questions Substantially outperforms TextRunner, DIRT

9 Outline Motivation Unsupervised semantic parsing Learning and inference Conclusion

10 USP: Key Idea # 1 Target predicates and objects can be learned Viewed as clusters of syntactic or lexical variations of the same meaning BUYS(-,-) ↔ { buys, acquires, 's purchase of, … }: cluster of various expressions for acquisition MICROSOFT ↔ { Microsoft, the Redmond software giant, … }: cluster of various mentions of Microsoft
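As a minimal sketch of this idea (cluster names and data structures are ours, not USP's actual representation), a cluster can be modeled as a set of expressions treated as the same meaning:

```python
# Toy illustration of Key Idea #1: a cluster is a set of expressions
# that all denote the same predicate or object.
clusters = {
    "C_BUYS": {"buys", "acquires", "'s purchase of"},
    "C_MICROSOFT": {"Microsoft", "the Redmond software giant"},
}

def cluster_of(expr):
    """Return the name of the cluster containing `expr`, or None."""
    for name, members in clusters.items():
        if expr in members:
            return name
    return None

cluster_of("acquires")  # 'C_BUYS'
```

USP's real clusters are learned, probabilistic, and defined over lambda forms rather than surface strings; this lookup table only shows the shape of the idea.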

11 USP: Key Idea # 2 Relational clustering → cluster relations with the same objects USP → recursively cluster arbitrary expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft's purchase of Powerset, …

12 USP: Key Idea # 2 Relational clustering → cluster relations with the same objects USP → recursively cluster expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft's purchase of Powerset, … Cluster the same forms at the atom level

13 USP: Key Idea # 2 Relational clustering → cluster relations with the same objects USP → recursively cluster expressions with similar subexpressions Microsoft buys Powerset Microsoft acquires semantic search engine Powerset Powerset is acquired by Microsoft Corporation The Redmond software giant buys Powerset Microsoft's purchase of Powerset, … Cluster forms in composition with the same forms


16 USP: Key Idea # 3 Start directly from syntactic analyses Focus on translating them to semantics Leverage rapid progress in syntactic parsing Much easier than learning both

17 USP: System Overview Input: Dependency trees for sentences Converts dependency trees into quasi-logical forms (QLFs) QLF subformulas have natural lambda forms Starts with lambda-form clusters at atom level Recursively builds up clusters of larger forms Output: Probability distribution over lambda-form clusters and their composition MAP semantic parses of sentences

18 Probabilistic Model for USP Joint probability distribution over a set of QLFs and their semantic parses Use Markov logic A Markov Logic Network (MLN) is a set of pairs (F_i, w_i), where F_i is a formula in first-order logic and w_i is a real number

19 Markov Logic Networks An MLN defines an undirected graphical model, with one node per ground atom (e.g., buys(n_1), Microsoft(n_2), nsubj(n_1,n_2)), and a log-linear distribution P(X = x) = (1/Z) exp(Σ_i w_i n_i(x)), where n_i(x) is the number of true groundings of F_i in x
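The log-linear model can be computed exactly for a tiny MLN by enumerating all possible worlds. A sketch with one invented formula and weight (not from the paper):

```python
import math
from itertools import product

# One weighted formula F_1: buys(n1) => nsubj(n1,n2), with weight w_1 = 1.5.
# A world x is a truth assignment to the two ground atoms (buys, nsubj).
w1 = 1.5

def n1(world):
    """n_1(x): number of true groundings of F_1 in world x (0 or 1 here)."""
    buys, nsubj = world
    return 1 if (not buys) or nsubj else 0  # material implication

worlds = list(product([False, True], repeat=2))
Z = sum(math.exp(w1 * n1(x)) for x in worlds)  # partition function

def prob(world):
    """P(X = x) = exp(sum_i w_i n_i(x)) / Z."""
    return math.exp(w1 * n1(world)) / Z
```

Worlds satisfying the formula get a factor of e^1.5 and so are more probable; the four probabilities sum to 1. Real MLN inference avoids this exponential enumeration.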

20 Generating Quasi-Logical Forms Dependency tree: buys, with nsubj child Microsoft and dobj child Powerset Convert each node into a unary atom

21 Generating Quasi-Logical Forms buys(n_1), Microsoft(n_2), Powerset(n_3) n_1, n_2, n_3 are Skolem constants

22 Generating Quasi-Logical Forms Convert each edge into a binary atom buys(n_1), Microsoft(n_2), Powerset(n_3)

23 Generating Quasi-Logical Forms Convert each edge into a binary atom buys(n_1), Microsoft(n_2), Powerset(n_3), nsubj(n_1,n_2), dobj(n_1,n_3)
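The conversion on slides 20–23 is mechanical; a string-level sketch (function and variable names are ours):

```python
def dep_tree_to_qlf(nodes, edges):
    """Convert a dependency tree into quasi-logical-form atoms.

    nodes: {node_id: word}; edges: [(head_id, label, child_id)].
    Each node becomes a unary atom word(n_i); each edge becomes a
    binary atom label(n_head, n_child). Node ids play the role of
    Skolem constants.
    """
    unary = [f"{word}({nid})" for nid, word in nodes.items()]
    binary = [f"{lab}({h},{c})" for h, lab, c in edges]
    return unary + binary

qlf = dep_tree_to_qlf(
    {"n1": "buys", "n2": "Microsoft", "n3": "Powerset"},
    [("n1", "nsubj", "n2"), ("n1", "dobj", "n3")],
)
# ['buys(n1)', 'Microsoft(n2)', 'Powerset(n3)', 'nsubj(n1,n2)', 'dobj(n1,n3)']
```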

24 A Semantic Parse buys(n_1), Microsoft(n_2), Powerset(n_3), nsubj(n_1,n_2), dobj(n_1,n_3) Partition the QLF into subformulas

25 A Semantic Parse buys(n_1), Microsoft(n_2), Powerset(n_3), nsubj(n_1,n_2), dobj(n_1,n_3) Subformula → lambda form: replace each Skolem constant not in a unary atom with a unique lambda variable

26 A Semantic Parse buys(n_1), Microsoft(n_2), Powerset(n_3), λx_2.nsubj(n_1,x_2), λx_3.dobj(n_1,x_3) Subformula → lambda form: replace each Skolem constant not in a unary atom with a unique lambda variable
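The Skolem-constant-to-lambda-variable step can be sketched at the string level (a toy, not the paper's implementation, which works on real logical formulas):

```python
import re

def to_lambda_form(atom, own_constant):
    """Abstract every Skolem constant in `atom` other than the
    subformula's own one into a fresh lambda variable.
    E.g. nsubj(n1,n2) with own constant n1 -> 'lambda x2. nsubj(n1,x2)'.
    """
    consts = re.findall(r"n\d+", atom)
    body, lam_vars = atom, []
    for c in consts:
        if c != own_constant:
            v = "x" + c[1:]          # n2 -> x2
            body = body.replace(c, v)
            lam_vars.append(v)
    prefix = "".join(f"lambda {v}. " for v in lam_vars)
    return prefix + body

to_lambda_form("nsubj(n1,n2)", "n1")  # 'lambda x2. nsubj(n1,x2)'
```

A unary atom such as buys(n1) contains only its own constant, so it comes back unchanged: that is the "core form" of the next slide.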

27 A Semantic Parse Follow Davidsonian semantics Core form: no lambda variable, e.g., buys(n_1) Argument form: one lambda variable, e.g., λx_2.nsubj(n_1,x_2), λx_3.dobj(n_1,x_3)

28 A Semantic Parse Assign each subformula to a lambda-form cluster: buys(n_1) → C_BUYS, Microsoft(n_2) → C_MICROSOFT, Powerset(n_3) → C_POWERSET, along with λx_2.nsubj(n_1,x_2) and λx_3.dobj(n_1,x_3)

29 Lambda-Form Cluster C_BUYS: distribution over core forms, e.g., buys(n_1) 0.1, acquires(n_1) 0.2, … One formula in the MLN: learn a weight for each pair of cluster and core form

30 Lambda-Form Cluster C_BUYS (core forms: buys(n_1) 0.1, acquires(n_1) 0.2, …) may contain a variable number of argument types: A_BUYER, A_BOUGHT, A_PRICE, …

31 Argument Type: A_BUYER Distributions over argument forms (λx_2.nsubj(n_1,x_2), λx_2.agent(n_1,x_2), …), argument clusters (C_MICROSOFT 0.2, C_GOOGLE 0.1, …), and argument number (None 0.1, One 0.8, …) Three MLN formulas

32 Abstract Lambda Form buys(n_1) ∧ λx_2.nsubj(n_1,x_2) ∧ λx_3.dobj(n_1,x_3) ⇒ C_BUYS(n_1) ∧ λx_2.A_BUYER(n_1,x_2) ∧ λx_3.A_BOUGHT(n_1,x_3) Final logical form is obtained via lambda reduction
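The final lambda-reduction step can be sketched as string substitution (the predicate names follow the slide; the function itself is ours):

```python
def lambda_reduce(core, arg_forms, fillers):
    """Plug each argument's filler constant in for its lambda variable,
    then conjoin the results with the core form.

    core: core form string; arg_forms: [(lambda_var, form)];
    fillers: constants bound to the lambda variables, in order.
    """
    parts = [core]
    for (var, form), filler in zip(arg_forms, fillers):
        parts.append(form.replace(var, filler))
    return " ^ ".join(parts)

lambda_reduce(
    "C_BUYS(n1)",
    [("x2", "A_BUYER(n1,x2)"), ("x3", "A_BOUGHT(n1,x3)")],
    ["n2", "n3"],
)
# 'C_BUYS(n1) ^ A_BUYER(n1,n2) ^ A_BOUGHT(n1,n3)'
```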

33 Outline Motivation Unsupervised semantic parsing Learning and inference Conclusion

34 Learning Observed: Q (QLFs) Hidden: S (semantic parses) Maximizes log-likelihood of observing the QLFs
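Spelled out in standard marginal-likelihood notation (the symbols follow the slide):

```latex
L(\Theta) \;=\; \log P_\Theta(Q) \;=\; \log \sum_{S} P_\Theta(Q, S)
```

The hidden parses S are summed out, so learning must search over both the parameters Θ and the clustering that defines the parse space.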

35 Search Operators MERGE(C_1, C_2): merge clusters C_1 and C_2 E.g.: { buys }, { acquires } → { buys, acquires } COMPOSE(C_1, C_2): create a new cluster by composing lambda forms in C_1 and C_2 E.g.: { Microsoft }, { Corporation } → { Microsoft Corporation }
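Both operators can be sketched on clusters represented as Python sets (string concatenation stands in for lambda-form composition; this is an illustration, not the system's code):

```python
def merge(c1, c2):
    """MERGE(C1, C2): union two lambda-form clusters into one."""
    return c1 | c2

def compose_clusters(c1, c2):
    """COMPOSE(C1, C2): cluster of forms built by composing a member
    of C1 with a member of C2."""
    return {a + " " + b for a in c1 for b in c2}

merge({"buys"}, {"acquires"})                     # {'buys', 'acquires'}
compose_clusters({"Microsoft"}, {"Corporation"})  # {'Microsoft Corporation'}
```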

36 USP-Learn Initialization: partition each QLF into atoms Greedy step: evaluate the search operations and execute the one with the highest gain in log-likelihood Efficient implementation: inverted index, etc.
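The greedy loop can be sketched as a generic hill-climbing skeleton (interfaces are our own toy abstractions; the real system only re-evaluates operations whose score can change, using inverted indices):

```python
def usp_learn(state, candidate_ops, log_likelihood):
    """Greedy search: repeatedly apply the candidate operation
    (MERGE or COMPOSE in USP) with the highest log-likelihood gain;
    stop when no operation improves the objective.

    state: current clustering; candidate_ops(state) yields callables
    mapping a state to a new state; log_likelihood scores a state.
    """
    while True:
        base = log_likelihood(state)
        scored = [(log_likelihood(op(state)) - base, i, op)
                  for i, op in enumerate(candidate_ops(state))]
        scored = [t for t in scored if t[0] > 0]  # keep improving ops only
        if not scored:
            return state
        _, _, best = max(scored)
        state = best(state)
```

With a toy objective that rewards fewer clusters but forbids mixing verbs and nouns, this loop merges { buys } and { acquires } and then stops.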

37 Search Operations (figure: examples of the search operations)

38 MAP Semantic Parse Goal: given QLF Q and learned Θ, find the semantic parse S that maximizes P_Θ(Q, S) Again, use greedy search

39 Experiments etc…