1 Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques Ruifang Ge Ph.D. Final Defense Supervisor: Raymond J. Mooney Machine Learning Group Department of Computer Science The University of Texas at Austin

2 Semantic Parsing
Semantic parsing: transforming natural language (NL) sentences into completely formal meaning representations (MRs)
Sample application domains where MRs are directly executable by another computer system to perform some task:
CLang: RoboCup Coach Language
Geoquery: a database query application

3 CLang (RoboCup Coach Language)
In the RoboCup Coach competition, teams compete to coach simulated players. The coaching instructions are given in a formal language called CLang.
Coach: If our player 2 has the ball, then position our player 5 in the midfield.
Semantic parsing ↓
CLang: ((bowner (player our {2})) (do (player our {5}) (pos (midfield))))

4 GeoQuery: A Database Query Application
Query application for a U.S. geography database [Zelle & Mooney, 1996]
User: What are the rivers in Texas?
Semantic parsing ↓
Query: answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Database answer: Angelina, Blanco, …

5 Motivation for Semantic Parsing
Theoretically, it addresses the question of how people interpret language
Practical applications: question answering, natural language interfaces, knowledge acquisition, reasoning

6 Motivating Example
Semantic parsing is a compositional process: sentence structures are needed for building meaning representations.
NL: If our player 2 has the ball, our player 4 should stay in our half
MR: ((bowner (player our {2})) (do (player our {4}) (pos (half our))))
(bowner: ball owner; pos: position)

7 Syntax-Based Approaches
Meaning composition follows the tree structure of a syntactic parse: the meaning of a constituent is composed from the meanings of its sub-constituents
Hand-built approaches (Woods, 1970; Warren and Pereira, 1982)
Learned approaches: Miller et al. (1996): conceptually simple sentences; Zettlemoyer & Collins (2005): hand-built Combinatory Categorial Grammar (CCG) template rules

8 Example
MR: bowner(player(our,2))
Use the structure of a syntactic parse:
(S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DT the) (NN ball))))

9 Example
MR: bowner(player(our,2))
Assign semantic concepts to words:
(S (NP (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP (VB-bowner(_) has) (NP (DT-null the) (NN-null ball))))

10 Example
MR: bowner(player(our,2))
Compose meaning for the internal nodes:
(S (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP (VB-bowner(_) has) (NP (DT-null the) (NN-null ball))))

11 Example
MR: bowner(player(our,2))
Compose meaning for the internal nodes:
(S (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP-bowner(_) (VB-bowner(_) has) (NP-null (DT-null the) (NN-null ball))))

12 Example
MR: bowner(player(our,2))
Compose meaning for the internal nodes:
(S-bowner(player(our,2)) (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP-bowner(_) (VB-bowner(_) has) (NP-null (DT-null the) (NN-null ball))))
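The composition on slides 8-12 can be made concrete with a small sketch. The Node class and the slot-filling convention (a concept's open arguments written as "_", filled left to right from sibling meanings) are illustrative assumptions for this sketch, not the thesis implementation:

    # A minimal sketch of bottom-up meaning composition over a syntactic parse.
    class Node:
        def __init__(self, syn, sem=None, children=()):
            self.syn = syn                  # syntactic label, e.g. "NP"
            self.sem = sem                  # concept with open slots, e.g. "player(_,_)"
            self.children = list(children)

    def compose(node):
        """Fill the head concept's '_' slots with sibling meanings, bottom-up."""
        if not node.children:
            return node.sem
        kids = [m for m in (compose(c) for c in node.children) if m is not None]
        head_i = next((i for i, m in enumerate(kids) if "_" in m), None)
        if head_i is None:                  # no open slots at this level
            return kids[0] if kids else None
        head = kids[head_i]
        for arg in kids[:head_i] + kids[head_i + 1:]:
            head = head.replace("_", arg, 1)   # fill leftmost open slot
        return head

    tree = Node("S", children=[
        Node("NP", children=[Node("PRP$", "our"), Node("NN", "player(_,_)"), Node("CD", "2")]),
        Node("VP", children=[Node("VB", "bowner(_)"),
                             Node("NP", children=[Node("DT"), Node("NN")])]),
    ])
    print(compose(tree))                    # bowner(player(our,2))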

13 Semantic Grammars
Non-terminals in a semantic grammar correspond to semantic concepts in application domains
Hand-built approaches (Hendrix et al., 1978)
Learned approaches: Tang & Mooney (2001), Kate & Mooney (2006), Wong & Mooney (2006)

14 Example
MR: bowner(player(our,2))
Semantic grammar parse: the non-terminal bowner spans the whole sentence, with child player spanning "our player 2" (concepts our and 2 at the leaves)
Production: bowner → player has the ball

15 Thesis Contributions
Introduce two novel syntax-based approaches to semantic parsing, theoretically well-founded in computational semantics (Blackburn and Bos, 2005)
Great opportunity: leverage the significant progress made in statistical syntactic parsing (Collins, 1997; Charniak and Johnson, 2005; Huang, 2008) for semantic parsing

16 Thesis Contributions
SCISSOR: a novel integrated syntactic-semantic parser
SYNSEM: exploits an existing syntactic parser to produce disambiguated parse trees that drive compositional meaning composition
Investigate when knowledge of syntax can help

17 Representing Semantic Knowledge in a Meaning Representation Language Grammar (MRLG)
Assumes the meaning representation language (MRL) is defined by an unambiguous context-free grammar: each production rule introduces a single predicate in the MRL, and the parse of an MR gives its predicate-argument structure.

Production                      Predicate
CONDITION → (bowner PLAYER)     P_BOWNER
PLAYER → (player TEAM {UNUM})   P_PLAYER
UNUM → 2                        P_UNUM
TEAM → our                      P_OUR
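As a hedged sketch, the MRLG table above might be represented as production/predicate pairs; the nested-tuple reading of the MR below is illustrative, not the thesis's data structure:

    # Each MRLG production introduces exactly one predicate.
    MRLG = [
        ("CONDITION -> (bowner PLAYER)",   "P_BOWNER"),
        ("PLAYER -> (player TEAM {UNUM})", "P_PLAYER"),
        ("UNUM -> 2",                      "P_UNUM"),
        ("TEAM -> our",                    "P_OUR"),
    ]

    # Parsing the MR "(bowner (player our {2}))" with this grammar gives its
    # predicate-argument structure:
    mr_structure = ("P_BOWNER", [("P_PLAYER", [("P_OUR", []), ("P_UNUM", [])])])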

18 Roadmap
SCISSOR
SYNSEM
Future Work
Conclusions

19 SCISSOR: Semantic Composition that Integrates Syntax and Semantics to get Optimal Representations
Integrated syntactic-semantic parsing: allows both syntax and semantics to be used simultaneously to obtain an accurate combined syntactic-semantic analysis
A statistical parser is used to generate a semantically augmented parse tree (SAPT)

20 Syntactic Parse
(S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DT the) (NN ball))))

21 SAPT
Non-terminals now have both syntactic and semantic labels; semantic labels dominate the predicates in their sub-trees.
(S-P_BOWNER (NP-P_PLAYER (PRP$-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2)) (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))
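A minimal sketch of an SAPT node, assuming a label is simply the pair of a syntactic and a semantic label (an illustrative representation, not the parser's internal one):

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class SAPTNode:
        syn: str                          # syntactic label, e.g. "NP"
        sem: str                          # semantic label, e.g. "P_PLAYER" or "NULL"
        word: Optional[str] = None        # set for leaves only
        children: List["SAPTNode"] = field(default_factory=list)

    leaf = SAPTNode(syn="PRP$", sem="P_OUR", word="our")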

22 SAPT
(S-P_BOWNER (NP-P_PLAYER (PRP$-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2)) (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))
MR: P_BOWNER(P_PLAYER(P_OUR,P_UNUM))

23 SCISSOR Overview
Training: SAPT training examples → learner → integrated semantic parser

24 SCISSOR Overview
Testing: NL sentence → integrated semantic parser → SAPT → compose MR → MR

25 Extending Collins' (1997) Syntactic Parsing Model
Find the SAPT with the maximum probability
A lexicalized head-driven syntactic parsing model
Extend the parsing model to generate semantic labels simultaneously with syntactic labels

26 Why Extend Collins' (1997) Syntactic Parsing Model?
Suitable for incorporating semantic knowledge:
Head dependencies correspond to predicate-argument relations
Syntactic subcategorization: the set of arguments that a predicate appears with
Bikel's (2004) implementation is easily extendable

27 Parser Implementation
Supervised training on annotated SAPTs is just frequency counting
Testing uses a variant of the standard CKY chart-parsing algorithm
Details in the thesis

28 Smoothing
Each label in an SAPT is the combination of a syntactic label and a semantic label, which increases data sparsity. Break the parameters down:
P_h(H | P, w) = P_h(H_syn, H_sem | P, w) = P_h(H_syn | P, w) × P_h(H_sem | P, w, H_syn)
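A sketch of this decomposition with plain relative-frequency estimates; the count tables are hypothetical, and the actual model additionally interpolates back-off levels:

    from collections import Counter

    count_syn = Counter()       # (H_syn, P, w) -> count from training SAPTs
    count_syn_ctx = Counter()   # (P, w) -> count
    count_sem = Counter()       # (H_sem, P, w, H_syn) -> count
    count_sem_ctx = Counter()   # (P, w, H_syn) -> count

    def p_head(h_syn, h_sem, parent, word):
        """P(H | P, w) = P(H_syn | P, w) * P(H_sem | P, w, H_syn)."""
        p1 = count_syn[(h_syn, parent, word)] / max(count_syn_ctx[(parent, word)], 1)
        p2 = count_sem[(h_sem, parent, word, h_syn)] / max(count_sem_ctx[(parent, word, h_syn)], 1)
        return p1 * p2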

29 Experimental Corpora
CLang (Kate, Wong & Mooney, 2005): 300 pieces of coaching advice; … words per sentence
Geoquery (Zelle & Mooney, 1996): 880 queries on a geography database; 7.48 words per sentence
MRL: Prolog and FunQL

30 Prolog vs. FunQL (Wong, 2007)
What are the rivers in Texas?
Prolog: answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas)))) (x1: river; x2: Texas)
FunQL: answer(river(loc_2(stateid(texas))))
Logical forms are widely used as MRLs in computational semantics and support reasoning

31 Prolog vs. FunQL (Wong, 2007)
What are the rivers in Texas?
Prolog: answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas)))) (flexible order)
FunQL: answer(river(loc_2(stateid(texas)))) (strict order)
Better generalization on Prolog

32 Experimental Methodology
Standard 10-fold cross validation
Correctness: CLang: exactly matches the correct MR; Geoquery: retrieves the same answers as the correct MR
Metrics:
Precision: % of returned MRs that are correct
Recall: % of NL sentences whose MRs are correctly returned
F-measure: harmonic mean of precision and recall
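The three metrics reduce to a few lines; this helper and its argument names are illustrative:

    def evaluate(n_correct, n_returned, n_sentences):
        """n_correct: returned MRs judged correct; n_returned: MRs the parser
        produced; n_sentences: all test sentences."""
        precision = n_correct / n_returned if n_returned else 0.0
        recall = n_correct / n_sentences if n_sentences else 0.0
        f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f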

33 Compared Systems
COCKTAIL (Tang & Mooney, 2001): deterministic, inductive logic programming
WASP (Wong & Mooney, 2006): semantic grammar, machine translation
KRISP (Kate & Mooney, 2006): semantic grammar, string kernels
Z&C (Zettlemoyer & Collins, 2007): syntax-based, combinatory categorial grammar (CCG)
LU (Lu et al., 2008): semantic grammar, generative parsing model

34 Compared Systems
COCKTAIL (Tang & Mooney, 2001): deterministic, inductive logic programming
WASP (Wong & Mooney, 2006): semantic grammar, machine translation
KRISP (Kate & Mooney, 2006): semantic grammar, string kernels
Z&C (Zettlemoyer & Collins, 2007): syntax-based, CCG; hand-built lexicon for Geoquery; manual CCG template rules
LU (Lu et al., 2008): semantic grammar, generative parsing model

35 Compared Systems
COCKTAIL (Tang & Mooney, 2001): deterministic, inductive logic programming
WASP (Wong & Mooney, 2006): semantic grammar, machine translation (λ-WASP: handles logical forms)
KRISP (Kate & Mooney, 2006): semantic grammar, string kernels
Z&C (Zettlemoyer & Collins, 2007): syntax-based, combinatory categorial grammar (CCG)
LU (Lu et al., 2008): semantic grammar, generative parsing model

36 Results on CLang
[Table: precision, recall, and F-measure for COCKTAIL, SCISSOR, WASP, KRISP, Z&C, and LU; numeric scores not preserved in this transcript. COCKTAIL: memory overflow; Z&C: not reported.]
(LU: F-measure after reranking is 74.4%)

37 Results on CLang
[Table: precision, recall, and F-measure for SCISSOR, WASP, KRISP, and LU; numeric scores not preserved in this transcript.]
(LU: F-measure after reranking is 74.4%)

38 Results on Geoquery
[Table: precision, recall, and F-measure. FunQL systems: SCISSOR, WASP, KRISP, LU; Prolog systems: COCKTAIL, λ-WASP, Z&C. Numeric scores not preserved in this transcript.]
(LU: F-measure after reranking is 85.2%)

39 Results on Geoquery (FunQL)
[Table: precision, recall, and F-measure for SCISSOR, WASP, KRISP, and LU; numeric scores not preserved in this transcript. The systems are competitive.]
(LU: F-measure after reranking is 85.2%)

40 Why Knowledge of Syntax Does Not Help
Geoquery: 7.48 words per sentence
Short sentences: sentence structure can be feasibly learned from NL sentences paired with MRs
Gain from knowledge of syntax vs. loss of flexibility

41 Limitation of Using Prior Knowledge of Syntax
MR: answer(smallest(state(all)))
Traditional syntactic analysis: [tree over "What state is the smallest" with internal nodes N1 and N2]

42 Limitation of Using Prior Knowledge of Syntax
MR: answer(smallest(state(all)))
Traditional syntactic analysis: [tree over "What state is the smallest" with internal nodes N1 and N2]
Semantic grammar: [tree whose nodes N1 and N2 align with the MR]; an isomorphic syntactic structure with the MR gives better generalization

43 Why Prior Knowledge of Syntax Does Not Help
Geoquery: 7.48 words per sentence
Short sentences: sentence structure can be feasibly learned from NL sentences paired with MRs
Gain from knowledge of syntax vs. loss of flexibility
LU vs. WASP and KRISP: decomposed model for semantic grammar

44 Detailed CLang Results on Sentence Length
[Chart: F-measure by sentence-length bin; the bins cover 7% (0-10 words), 33%, 46%, and 13% of the test sentences.]

45 SCISSOR Summary
Integrated syntactic-semantic parsing approach
Learns accurate semantic interpretations by utilizing the SAPT annotations
Knowledge of syntax improves performance on long sentences

46 Roadmap
SCISSOR
SYNSEM
Future Work
Conclusions

47 SYNSEM Motivation
SCISSOR requires extra SAPT annotation for training
Must learn both syntax and semantics from the same limited training corpus
High-performance syntactic parsers are available, trained on existing large corpora (Collins, 1997; Charniak & Johnson, 2005)

48 SCISSOR Requires SAPT Annotation
(S-P_BOWNER (NP-P_PLAYER (PRP$-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2)) (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))
Time consuming. Automate it!

49 Part I: Syntactic Parse
Use a statistical syntactic parser:
(S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DT the) (NN ball))))

50 Part II: Word Meanings
Use a word alignment model (Wong and Mooney, 2006):
our → P_OUR; player → P_PLAYER; 2 → P_UNUM; has → P_BOWNER; the, ball → NULL

51 Learning a Semantic Lexicon
IBM Model 5 word alignment (GIZA++)
Keep the top 5 word/predicate alignments for each training example
Assume each word alignment, together with the syntactic parse, defines a possible SAPT for composing the correct MR
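A sketch of the lexicon-building step, assuming the alignment model's output has already been read into (score, links) pairs; running GIZA++ itself is not shown:

    def semantic_lexicon(examples, k=5):
        """examples: for each training pair, a list of (score, links) alignments,
        where links is a list of (word, predicate) pairs."""
        lexicon = set()
        for alignments in examples:
            best = sorted(alignments, key=lambda a: a[0], reverse=True)[:k]
            for _, links in best:
                lexicon.update(links)      # keep word/predicate pairs from top-k
        return lexicon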

52 Introducing λ Variables
Introduce λ variables in semantic labels for missing arguments (a1: the first argument)
Leaf labels: our: P_OUR; player: λa1λa2 P_PLAYER; 2: P_UNUM; has: λa1 P_BOWNER; the, ball: NULL

53 Part III: Internal Semantic Labels
How do we choose the dominant predicates?
Leaf labels: our: P_OUR; player: λa1λa2 P_PLAYER; 2: P_UNUM; has: λa1 P_BOWNER; the, ball: NULL
MR parse: P_BOWNER(P_PLAYER(P_OUR, P_UNUM))

54 Learning Semantic Composition Rules
Combining "player" (λa1λa2 P_PLAYER) and "2" (P_UNUM): what label should dominate?
Learned rule: λa1λa2 P_PLAYER + P_UNUM → {λa1 P_PLAYER, a2 = c2} (c2: child 2)
MR parse: P_BOWNER(P_PLAYER(P_OUR, P_UNUM))

55 Learning Semantic Composition Rules
Applying λa1λa2 P_PLAYER + P_UNUM → {λa1 P_PLAYER, a2 = c2}: the node over "player 2" is labeled λa1 P_PLAYER
MR parse: P_BOWNER(P_PLAYER(P_OUR, P_UNUM))

56 Learning Semantic Composition Rules
Combining "our" (P_OUR) with λa1 P_PLAYER over "player 2":
Learned rule: P_OUR + λa1 P_PLAYER → {P_PLAYER, a1 = c1}
MR parse: P_BOWNER(P_PLAYER(P_OUR, P_UNUM))

57 Learning Semantic Composition Rules
The NP "our player 2" is now labeled P_PLAYER; "has" carries λa1 P_BOWNER and "the ball" is NULL, so the VP is labeled λa1 P_BOWNER. What label dominates at the root?
MR parse: P_BOWNER(P_PLAYER(P_OUR, P_UNUM))

58 Learning Semantic Composition Rules
Learned rule: P_PLAYER + λa1 P_BOWNER → {P_BOWNER, a1 = c1}
The root is labeled P_BOWNER, completing the MR parse P_BOWNER(P_PLAYER(P_OUR, P_UNUM))
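The learned rules on slides 54-58 can be pictured as a table keyed by the two child labels; modeling a label as (predicate, number of missing λ arguments) is an assumption made for this sketch:

    # (left_label, right_label) -> (parent_label, which child fills which slot)
    RULES = {
        (("P_PLAYER", 2), ("P_UNUM", 0)):   (("P_PLAYER", 1), {"a2": "c2"}),
        (("P_OUR", 0), ("P_PLAYER", 1)):    (("P_PLAYER", 0), {"a1": "c1"}),
        (("P_PLAYER", 0), ("P_BOWNER", 1)): (("P_BOWNER", 0), {"a1": "c1"}),
    }

    def apply_rule(left, right):
        """Look up the dominant parent label and argument bindings."""
        return RULES[(left, right)]

    # λa1λa2 P_PLAYER + P_UNUM -> {λa1 P_PLAYER, a2 = c2}
    print(apply_rule(("P_PLAYER", 2), ("P_UNUM", 0)))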

59 Ensuring Meaning Composition
Non-isomorphism: [tree over "What state is the smallest" with internal nodes N1 and N2] vs. MR answer(smallest(state(all)))

60 Ensuring Meaning Composition
Non-isomorphism between the NL parse and the MR parse arises from various linguistic phenomena, the machine translation between NL and MRL, and the use of automated syntactic parses
Introduce macro-predicates that combine multiple predicates
Ensures that the MR can be composed using a syntactic parse and word alignment

61 SYNSEM Overview
Before training & testing: training/test sentence S → syntactic parser → syntactic parse tree T
Training: training set {(S, T, MR)} + unambiguous CFG of the MRL → semantic knowledge acquisition → semantic lexicon & composition rules → parameter estimation → probabilistic parsing model
Testing: input sentence (with parse T) → semantic parsing → output MR

62 SYNSEM Overview
Before training & testing: training/test sentence S → syntactic parser → syntactic parse tree T
Training: training set {(S, T, MR)} + unambiguous CFG of the MRL → semantic knowledge acquisition → semantic lexicon & composition rules → parameter estimation → probabilistic parsing model
Testing: input sentence S → semantic parsing → output MR

63 Parameter Estimation
Apply the learned semantic knowledge to all training examples to generate possible SAPTs
Use a standard maximum-entropy model similar to those of Zettlemoyer & Collins (2005) and Wong & Mooney (2006)
Training finds parameters that (approximately) maximize the sum of the conditional log-likelihood of the training set, including syntactic parses
Incomplete data, since SAPTs are hidden variables
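In outline, with the SAPT derivations hidden, the objective has the usual latent-variable form (the notation here is an assumption; the exact formulation is in the thesis):

    L(θ) = Σ_i log Σ_{d ∈ D(s_i, t_i) : MR(d) = m_i} P_θ(d | s_i, t_i)

where D(s_i, t_i) is the set of SAPT derivations consistent with sentence s_i and its syntactic parse t_i, and the inner sum keeps only derivations whose composed MR equals the annotated m_i.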

64 Features
Lexical features:
Unigram features: # of times a word is assigned a predicate
Bigram features: # of times a word is assigned a predicate given its previous/subsequent word
Rule features: # of times a composition rule is applied in a derivation
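A sketch of these feature counts for one derivation; the interface (word/predicate assignments and the list of fired rules) is assumed for illustration:

    from collections import Counter

    def extract_features(words, predicates, rules_used):
        """words: tokens; predicates: predicate assigned to each word (or None);
        rules_used: composition rules fired in the derivation."""
        feats = Counter()
        for i, (w, p) in enumerate(zip(words, predicates)):
            feats[("uni", w, p)] += 1                          # unigram feature
            prev = words[i - 1] if i > 0 else "<s>"
            nxt = words[i + 1] if i + 1 < len(words) else "</s>"
            feats[("bi_prev", prev, w, p)] += 1                # bigram features
            feats[("bi_next", w, nxt, p)] += 1
        for r in rules_used:
            feats[("rule", r)] += 1                            # rule feature
        return feats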

65 Handling Logical Forms
What are the rivers in Texas?
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Predicates: λv1 P_ANSWER(x1), λv1 P_RIVER(x1), λv1λv2 P_LOC(x1,x2), λv1 P_EQUAL(x2)
Handle shared logical variables using lambda calculus (v: variable)

66 Prolog Example
What are the rivers in Texas?
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
λv1 P_ANSWER(x1) (λv1 P_RIVER(x1), λv1λv2 P_LOC(x1,x2), λv1 P_EQUAL(x2))
Handle shared logical variables using lambda calculus (v: variable)

67 Prolog Example
What are the rivers in Texas?
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
λv1 P_ANSWER(x1) (λv1 P_RIVER(x1), λv1λv2 P_LOC(x1,x2), λv1 P_EQUAL(x2))
Handle shared logical variables using lambda calculus (v: variable)

68 Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Start from a syntactic parse:
(SBARQ (WHNP What) (SQ (VBP are) (NP (NP the rivers) (PP (IN in) (NP Texas)))))

69 Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Add predicates to words: What: λv1λa1 P_ANSWER; are, the: NULL; rivers: λv1 P_RIVER; in: λv1λv2 P_LOC; Texas: λv1 P_EQUAL

70 Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Learn a rule with variable unification: λv1λv2 P_LOC(x1,x2) + λv1 P_EQUAL(x2) → λv1 P_LOC
The PP "in Texas" is labeled λv1 P_LOC
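A toy sketch of variable unification during composition, modeling a semantic label as (predicate, λ-bound logical variables); the representation is an assumption for illustration:

    def unify(head, arg):
        """Bind the argument's λ variable(s) to the head's shared logical
        variables, discharging them from the head's λ list."""
        h_pred, h_lams = head
        a_pred, a_lams = arg                          # a_pred folds into the head
        shared = [v for v in a_lams if v in h_lams]   # e.g. ["x2"]
        remaining = [v for v in h_lams if v not in shared]
        return (h_pred, remaining)

    loc = ("P_LOC", ["x1", "x2"])    # λv1 λv2 P_LOC(x1, x2)
    equal = ("P_EQUAL", ["x2"])      # λv1 P_EQUAL(x2), its v1 bound to x2
    print(unify(loc, equal))         # ('P_LOC', ['x1']), i.e. λv1 P_LOC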

71 Experimental Results CLang Geoquery (Prolog)

72 Syntactic Parsers (Bikel, 2004)
WSJ only: CLang (SYN0): F-measure = 82.15%; Geoquery (SYN0): F-measure = 76.44%
WSJ + in-domain sentences: CLang (SYN20): 20 sentences, F-measure = 88.21%; Geoquery (SYN40): 40 sentences, F-measure = 91.46%
Gold-standard syntactic parses (GOLDSYN)

73 Questions
Q1. Can SYNSEM produce accurate semantic interpretations?
Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers?
Q3. Does it also improve on long sentences?
Q4. Does it improve on limited training data, due to the prior knowledge from large treebanks?
Q5. Can it handle syntactic errors?

74 Results on CLang
[Table: precision, recall, and F-measure for the SYNSEM parsers (GOLDSYN, SYN20, SYN0), SCISSOR (trained on SAPTs), WASP, KRISP, and LU; numeric scores not preserved in this transcript.]
GOLDSYN > SYN20 > SYN0
(LU: F-measure after reranking is 74.4%)

75 Questions
Q1. Can SYNSEM produce accurate semantic interpretations? [yes]
Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? [yes]
Q3. Does it also improve on long sentences?

76 Detailed CLang Results on Sentence Length
[Chart: F-measure by sentence-length bin; the bins cover 7% (0-10 words), 33%, 46%, and 13% of the test sentences.]
Prior knowledge + flexibility + syntactic errors = ?

77 Questions
Q1. Can SYNSEM produce accurate semantic interpretations? [yes]
Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? [yes]
Q3. Does it also improve on long sentences? [yes]
Q4. Does it improve on limited training data, due to the prior knowledge from large treebanks?

78 Results on CLang (training size = 40)
[Table: precision, recall, and F-measure for the SYNSEM parsers (GOLDSYN, SYN20, SYN0), SCISSOR (trained on SAPTs), WASP, and KRISP; numeric scores not preserved in this transcript.]
The quality of the syntactic parser is critically important!

79 Questions
Q1. Can SYNSEM produce accurate semantic interpretations? [yes]
Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? [yes]
Q3. Does it also improve on long sentences? [yes]
Q4. Does it improve on limited training data, due to the prior knowledge from large treebanks? [yes]
Q5. Can it handle syntactic errors?

80 Handling Syntactic Errors
Training ensures meaning composition even from syntactic parses with errors
For test sentences that generate correct MRs, measure the F-measure of their syntactic parses: SYN0: 85.5%; SYN20: 91.2%
Example: If DR2C7 is true then players 2, 3, 7 and 8 should pass to player 4

81 Questions
Q1. Can SYNSEM produce accurate semantic interpretations? [yes]
Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? [yes]
Q3. Does it also improve on long sentences? [yes]
Q4. Does it improve on limited training data, due to the prior knowledge of large treebanks? [yes]
Q5. Is it robust to syntactic errors? [yes]

82 Results on Geoquery (Prolog)
[Table: precision, recall, and F-measure for the SYNSEM parsers (GOLDSYN, SYN40, SYN0), COCKTAIL, λ-WASP, and Z&C; numeric scores not preserved in this transcript.]
SYN0 does not perform well; all other recent systems perform competitively

83 SYNSEM Summary
Exploits an existing syntactic parser to drive the meaning composition process
Prior knowledge of syntax improves performance on long sentences
Prior knowledge of syntax improves performance on limited training data
Handles syntactic errors

84 Discriminative Reranking for Semantic Parsing
Adapt global features used for reranking syntactic parses to semantic parsing
Improvement on CLang
No improvement on Geoquery, where sentences are short and global features are less likely to help

85 Roadmap
SCISSOR
SYNSEM
Future Work
Conclusions

86 Future Work
Improve SCISSOR: discriminative SCISSOR (Finkel et al., 2008); handling logical forms; SCISSOR without extra annotation (Klein and Manning, 2002, 2004)
Improve SYNSEM: utilizing syntactic parsers with improved accuracy and in other syntactic formalisms

87 Future Work
Utilizing wide-coverage semantic representations (Curran et al., 2007): better generalization across syntactic variations
Utilizing semantic role labeling (Gildea and Palmer, 2002): provides a layer of correlated semantic information

88 Roadmap
SCISSOR
SYNSEM
Future Work
Conclusions

89 Conclusions
SCISSOR: a novel integrated syntactic-semantic parser.
SYNSEM: exploits an existing syntactic parser to produce disambiguated parse trees that drive compositional meaning composition.
Both produce accurate semantic interpretations.
Using knowledge of syntax improves performance on long sentences.
SYNSEM also improves performance on limited training data.

90 Thank you! Questions?