An event-based denotational semantics for natural language queries of data represented in triple stores Richard Frost, Randy Fortier and Bryan St. Amour.


An event-based denotational semantics for natural language queries of data represented in triple stores Richard Frost, Randy Fortier and Bryan St. Amour School of Computer Science University of Windsor ICSC 2013

Objectives of our research
To create an efficient, modular natural-language (NL) speech interface to graphical data that enables answers to questions to be computed directly from the data.
[Diagram: the words of a query "xxx xx xxxxx xxx xxxxxx x xx x xx?" are linked by arrows to the triples of the data store, Data = {(a,r1,c), (a,r2,f), (c,r3,g), ...}]
Efficient: polynomial time and space complexity.
Modular: new language constructs can be added without affecting any existing code.
Graphical data: binary-relational triple stores, converted relational data, semantic web RDF data.

Why do we need a compositional semantics?
"How many states which are members of the United Nations have capitals in the southern hemisphere?"
Information retrieval systems can answer this only if a similar statement, together with the answer, is already in the data store. Even then, the statement would need to be updated whenever a new member is admitted to the U.N. or a change of capital affects the result.

Progress so far
X-SAIGA: an environment for constructing language processors as modular executable specifications of attribute grammars. It is based on a top-down parser with polynomial space and time complexity for arbitrary (ambiguous, left-recursive) CFGs.
SpeechWeb: an architecture for creating speech interfaces to hyperlinked applications on the Web.
NL semantics for conventional relational databases.
Demo: on YouTube, search for "SpeechWeb".
NEXT STEP: SEMANTICS FOR GRAPHICAL DATA

A breakthrough: Montague's (1970s) approach to natural-language semantics (simplified)
English: "Mars spins"
Higher-order Intensional Logic (IL): [[ Mars ]] [[ spins ]] = λp(p e_mars) spins_pred => spins_pred e_mars
Data model: => True
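
The reduction above can be mimicked with ordinary higher-order functions. The following is a minimal Python sketch, not the authors' implementation: the set SPINNING and the entity names are assumptions standing in for the data model.

```python
# Hypothetical data model: the entities that spin (assumed for illustration).
SPINNING = {"e_mars", "e_venus"}

def spins_pred(entity):
    """Characteristic function of the predicate 'spins'."""
    return entity in SPINNING

def mars(p):
    """[[ Mars ]] = λp(p e_mars): apply the predicate p to the entity e_mars."""
    return p("e_mars")

# [[ Mars ]] [[ spins ]] => spins_pred e_mars => True
mars_spins = mars(spins_pred)
print(mars_spins)
```

The proper noun is the function and the verb is its argument, which is what later lets quantified phrases such as "every moon" take the same position as "Mars".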

Montague Semantics (MS)
"every moon spins"
( [[ every ]] [[ moon ]] ) [[ spins ]]
= (λpλq ∀x(p x → q x) moon_pred) spins_pred
=> λq ∀x(moon_pred x → q x) spins_pred
=> ∀x(moon_pred x → spins_pred x)
=> True (if all things that are moons spin)
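
As an illustration only, the quantifier can be written as a curried Python function over a finite universe. DOMAIN, MOONS and SPINNING are hypothetical data, and the universal quantifier is evaluated by brute force over DOMAIN, which is exactly the cost a predicate-based reading incurs.

```python
# Hypothetical finite universe of discourse and predicate extensions.
DOMAIN = {"e_mars", "e_venus", "e_phobos", "e_deimos"}
MOONS = {"e_phobos", "e_deimos"}
SPINNING = {"e_mars", "e_venus", "e_phobos", "e_deimos"}

def moon_pred(x):
    return x in MOONS

def spins_pred(x):
    return x in SPINNING

def every(p):
    """[[ every ]] = λpλq ∀x(p x → q x), checked for each x in DOMAIN."""
    return lambda q: all((not p(x)) or q(x) for x in DOMAIN)

# ( [[ every ]] [[ moon ]] ) [[ spins ]]
every_moon_spins = every(moon_pred)(spins_pred)
print(every_moon_spins)
```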

MS is polymorphic
"Mars and Venus spin"
=> ( [[ and ]] [[ Mars ]] [[ Venus ]] ) [[ spin ]]
=> (λsλt (λr(s r & t r)) λp(p e_mars) λp(p e_venus)) spins_pred
=>> λr(λp(p e_mars) r & λp(p e_venus) r) spins_pred
=> λp(p e_mars) spins_pred & λp(p e_venus) spins_pred
=> spins_pred e_mars & spins_pred e_venus
=> True & True
=> True
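
A small Python sketch of the conjunction, under assumed data (DOMAIN, MOONS, SPINNING are hypothetical). Because proper nouns and quantified phrases are both functions from predicates to truth values, the same term_and can combine either kind; the second example is our own illustration of that point rather than one taken from the slides.

```python
# Hypothetical universe and predicate extensions.
DOMAIN = {"e_mars", "e_venus", "e_phobos", "e_deimos"}
MOONS = {"e_phobos", "e_deimos"}
SPINNING = {"e_mars", "e_venus", "e_phobos", "e_deimos"}

moon_pred = lambda x: x in MOONS
spins_pred = lambda x: x in SPINNING

mars = lambda p: p("e_mars")     # [[ Mars ]]  = λp(p e_mars)
venus = lambda p: p("e_venus")   # [[ Venus ]] = λp(p e_venus)
every = lambda p: lambda q: all((not p(x)) or q(x) for x in DOMAIN)

# [[ and ]] = λsλt(λr(s r & t r))
term_and = lambda s: lambda t: lambda r: s(r) and t(r)

# "Mars and Venus spin"
mars_and_venus_spin = term_and(mars)(venus)(spins_pred)

# The same operator also accepts a quantified term: "every moon and Mars spin"
every_moon_and_mars_spin = term_and(every(moon_pred))(mars)(spins_pred)

print(mars_and_venus_spin, every_moon_and_mars_spin)
```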

MS is very powerful
The semantics covers a large subset of classical first-order English:
- does (((every moon) $and (every planet)) spin)
- how_many (moons $that (orbit (a (red planet)))) (were (discovered_by (the (person $who (discovered Nereid)))))
- which planet (is (orbited_by (no moon)))
It also covers intensions and modal expressions (although our work does not).
The meaning of words can be defined in terms of other words:
[[ discoverer ]] = [[ person $who (discovered (a thing)) ]]

Montague Semantics is ideally suited as a basis for computerized query processors
Denotational: every word and phrase has a well-defined mathematical meaning (denotation).
Compositional: the meaning of a phrase is obtained from the meanings of its parts through simple function application.
Referentially transparent: the meaning of a phrase, after syntactic disambiguation, is always the same.
There is a one-to-one correspondence between syntactic and semantic rules.
BUT

Shortcomings of MS for query processing
Computationally intractable: evaluating ∀x(moon_pred x → spins_pred x) directly means testing every entity x in the universe of discourse.
No explicit denotation for transitive verbs: they are left uninterpreted until the end, and a syntactic rewrite is then used to give the IL expression.
Prepositional phrases are not easy to accommodate in MS's entity-based semantics.
Needs an intermediate language: IL must be mapped to the triple store/binary-relational/RDF data model or to another intermediate language (although Montague said that IL was dispensable).

Our semantics
Has the 4 Montagovian properties: denotational, modular, etc.
Computationally tractable: based on sets rather than predicates.
Event based: able to easily accommodate prepositional phrases.
Has an explicit denotation for transitive verbs, enabling accommodation of phrases such as "wrote or interpreted".
No intermediate language: NL denotations are defined directly in terms of basic triple store operations. This differs from many other NL query approaches, which map NL to SQL or SPARQL.

An example datastore – 5 events
{(EV 1000, REL "type", TYPE "born_ev"),
 (EV 1000, REL "subject", ENT "capone"),
 (EV 1000, REL "date", ENTNUM 1899),
 (EV 1001, REL "type", TYPE "join_ev"),
 (EV 1001, REL "subject", ENT "capone"),
 (EV 1001, REL "object", ENT "fpg"),
 (EV 1002, REL "type", TYPE "membership"),
 (EV 1002, REL "subject", ENT "capone"),
 (EV 1002, REL "object", ENT "thief_set"),
 (EV 1002, REL "date", ENTNUM 1918),
 (EV 1004, REL "type", TYPE "steal_ev"),
 (EV 1004, REL "subject", ENT "capone"),
 (EV 1004, REL "object", ENT "car_1"),
 (EV 1005, REL "type", TYPE "smoke_ev"),
 (EV 1005, REL "subject", ENT "capone"),
 ...}
New facts are easy to add, e.g. (EV 1000, REL "location", ENT "brooklyn").

Basic retrieval operators
getts (ANY, REL "subject", ENT "capone") => {(EV 1000, REL "subject", ENT "capone"), (EV 1001, REL "subject", ENT "capone"), ...}
getts can be used to define the other basic operators (definitions are in the paper).
Example uses:
get_subjs_for_events {EV 1000, EV 1009} => {ENT "capone", ENT "torrio"}
get_members "thief_set" => {ENT "capone"}
get_subjs_of_event_type "born_ev" => {ENT "capone"}
We can now define the semantics using these basic operators.
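
The slides defer the operator definitions to the paper, so the following Python sketch is only one plausible reading, not the authors' code. It assumes tagged values are modelled as (tag, value) pairs, ANY is a wildcard, and the helpers firsts and thirds (hypothetical names, apart from thirds which appears later in the slides) project triple components. The store holds only the five example events.

```python
ANY = None  # wildcard marker (an assumption of this sketch)

# Tagged values modelled as (tag, value) pairs.
def EV(n): return ("EV", n)
def REL(s): return ("REL", s)
def ENT(s): return ("ENT", s)
def TYPE(s): return ("TYPE", s)
def ENTNUM(n): return ("ENTNUM", n)

STORE = {
    (EV(1000), REL("type"), TYPE("born_ev")),
    (EV(1000), REL("subject"), ENT("capone")),
    (EV(1000), REL("date"), ENTNUM(1899)),
    (EV(1001), REL("type"), TYPE("join_ev")),
    (EV(1001), REL("subject"), ENT("capone")),
    (EV(1001), REL("object"), ENT("fpg")),
    (EV(1002), REL("type"), TYPE("membership")),
    (EV(1002), REL("subject"), ENT("capone")),
    (EV(1002), REL("object"), ENT("thief_set")),
    (EV(1002), REL("date"), ENTNUM(1918)),
    (EV(1004), REL("type"), TYPE("steal_ev")),
    (EV(1004), REL("subject"), ENT("capone")),
    (EV(1004), REL("object"), ENT("car_1")),
    (EV(1005), REL("type"), TYPE("smoke_ev")),
    (EV(1005), REL("subject"), ENT("capone")),
}

def getts(pattern):
    """Return every stored triple that matches the pattern; ANY matches anything."""
    return {t for t in STORE
            if all(p is ANY or p == field for p, field in zip(pattern, t))}

def firsts(triples):
    return {t[0] for t in triples}

def thirds(triples):
    return {t[2] for t in triples}

def get_subjs_for_events(evs):
    """Subjects of the given events."""
    return set().union(*(thirds(getts((ev, REL("subject"), ANY))) for ev in evs))

def get_subjs_of_event_type(event_type):
    """Subjects of all events of the given type."""
    return get_subjs_for_events(firsts(getts((ANY, REL("type"), TYPE(event_type)))))

def get_members(set_name):
    """Subjects of membership events whose object is the named set."""
    membership_evs = firsts(getts((ANY, REL("type"), TYPE("membership"))))
    evs_for_set = firsts(getts((ANY, REL("object"), ENT(set_name))))
    return get_subjs_for_events(membership_evs & evs_for_set)

print(get_members("thief_set"))
print(get_subjs_of_event_type("born_ev"))
```

A linear scan is used for getts to keep the sketch short; a real triple store would answer the same patterns through indexes.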

Our new semantics
Note: in the paper and from now on, a word in bold italics stands for its denotation, i.e. bold-italic thief = [[ thief ]].
thief = get_members "thief_set"
  e.g. thief => {ENT "capone"}
smokes = get_subjs_of_event_type "smoke_ev"
  e.g. smokes => {ENT "capone"}
capone setofents = (ENT "capone") ∈ setofents
  e.g. capone smokes => True
a nph vbph = #(nph ⋂ vbph) ~= 0
term_and tmph1 tmph2 = f where f setofevs = (tmph1 setofevs) & (tmph2 setofevs)
  e.g. ((a thief) $term_and capone) smokes => True
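
These definitions translate almost directly into Python sets and functions. In this sketch the sets thief and smokes are written out by hand, standing in for the results of get_members and get_subjs_of_event_type on the example datastore, and entities are plain strings rather than tagged values; both are simplifying assumptions.

```python
# Assumed results of the retrieval operators on the example datastore.
thief = {"capone"}    # get_members "thief_set"
smokes = {"capone"}   # get_subjs_of_event_type "smoke_ev"

def capone(setofents):
    """capone setofents = (ENT "capone") ∈ setofents"""
    return "capone" in setofents

def a(nph):
    """a nph vbph = #(nph ⋂ vbph) ~= 0, written in curried form."""
    return lambda vbph: len(nph & vbph) != 0

def term_and(tmph1, tmph2):
    """Conjunction of two term phrases: both must accept the same set."""
    return lambda setofevs: tmph1(setofevs) and tmph2(setofevs)

print(capone(smokes))                        # capone smokes
print(a(thief)(smokes))                      # a thief smokes
print(term_and(a(thief), capone)(smokes))    # ((a thief) $term_and capone) smokes
```

Each denotation costs one membership test or one set intersection, which is where the tractability of the set-based approach comes from.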

Our new semantics – major contribution 1
join = make_trans "join_ev"
  e.g. join (a gang) => {ENT "capone", ENT "torrio"}
Definition:
make_trans event_type = f
  where f tmph = { subj | (subj, evs) ∈ (make_image event_type) & tmph ( ⋃ {map thirds (getts (ev, REL "object", ANY)) | ev ∈ evs})}
where, for example:
make_image "join_ev" => {(ENT "capone", {EV 1001, EV 1003}), (ENT "torrio", {EV 1009})}
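
A hedged Python rendering of make_image and make_trans. The store is a cut-down version with untagged values; the object of event 1009 and the contents of gang are assumptions made so that the example runs, and make_image is our guess at a definition that the slides show only by example.

```python
ANY = None  # wildcard marker (an assumption of this sketch)

# Cut-down store with plain values; the object of event 1009 is hypothetical.
STORE = {
    (1001, "type", "join_ev"), (1001, "subject", "capone"), (1001, "object", "fpg"),
    (1009, "type", "join_ev"), (1009, "subject", "torrio"), (1009, "object", "fpg"),
    (1004, "type", "steal_ev"), (1004, "subject", "capone"), (1004, "object", "car_1"),
}

def getts(pattern):
    """Return every stored triple that matches the pattern; ANY matches anything."""
    return {t for t in STORE
            if all(p is ANY or p == field for p, field in zip(pattern, t))}

def thirds(triples):
    return {t[2] for t in triples}

def make_image(event_type):
    """Map each subject to the set of events of this type in which it takes part."""
    image = {}
    for ev, _, _ in getts((ANY, "type", event_type)):
        for subj in thirds(getts((ev, "subject", ANY))):
            image.setdefault(subj, set()).add(ev)
    return image

def make_trans(event_type):
    """Denotation of a transitive verb: a function from a term phrase to a set of subjects."""
    def f(tmph):
        return {subj
                for subj, evs in make_image(event_type).items()
                if tmph(set().union(*(thirds(getts((ev, "object", ANY))) for ev in evs)))}
    return f

def a(nph):
    return lambda vbph: len(nph & vbph) != 0

gang = {"fpg"}                    # assumed extension of "gang"
join = make_trans("join_ev")
print(join(a(gang)))              # join (a gang)
```

The term phrase is applied to the set of objects of each subject's events, so the verb itself has a denotation and can be combined with other verbs before it meets its object.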

Prepositional phrases – major contribution 2
steal_with_time tmph date = {subj | (subj, evs) ∈ image_steal & tmph ( ⋃ {thirds (getts (ev, REL "object", ANY)) | ev ∈ evs & date (thirds (getts (ev, REL "date", ANY)))})}
The date argument is used to "filter" the events.
  e.g. steal_with_time (a car) (date_1918) => {ENT "capone"}
Note: we need to generalize and create a more powerful version of the make_trans function (this should not be too difficult).
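
The filtering step can be sketched in Python as follows. The example datastore shown earlier has no date on the steal event, so the date triples here, the second steal event (1010) and the contents of car are all hypothetical, added only to make the filter observable. Dates are treated like term phrases over the set of date values attached to an event.

```python
ANY = None  # wildcard marker (an assumption of this sketch)

# Hypothetical store: the dates and event 1010 are not in the original example data.
STORE = {
    (1004, "type", "steal_ev"), (1004, "subject", "capone"),
    (1004, "object", "car_1"), (1004, "date", 1918),
    (1010, "type", "steal_ev"), (1010, "subject", "torrio"),
    (1010, "object", "car_2"), (1010, "date", 1920),
}

def getts(pattern):
    return {t for t in STORE
            if all(p is ANY or p == field for p, field in zip(pattern, t))}

def thirds(triples):
    return {t[2] for t in triples}

def make_image(event_type):
    image = {}
    for ev, _, _ in getts((ANY, "type", event_type)):
        for subj in thirds(getts((ev, "subject", ANY))):
            image.setdefault(subj, set()).add(ev)
    return image

def steal_with_time(tmph, date):
    """Subjects whose steal events, restricted to those accepted by date, satisfy tmph."""
    result = set()
    for subj, evs in make_image("steal_ev").items():
        objs = set()
        for ev in evs:
            # Keep only the events whose date values pass the filter.
            if date(thirds(getts((ev, "date", ANY)))):
                objs |= thirds(getts((ev, "object", ANY)))
        if tmph(objs):
            result.add(subj)
    return result

def a(nph):
    return lambda vbph: len(nph & vbph) != 0

date_1918 = lambda dates: 1918 in dates
date_1920 = lambda dates: 1920 in dates
car = {"car_1", "car_2"}          # assumed extension of "car"

print(steal_with_time(a(car), date_1918))
```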

The result: a wide range of English NL queries
e.g. "Which gangster who stole a car in 1915 or 1918 joined a gang that was joined by Torrio?"
⇩
which (gangster $that (steal_with_time (a car) (date_1915 $term_or date_1918))) (join (a (gang $that (joined_by torrio))))
⇩
{ENT "capone"}
The brackets are introduced by the parser, which produces more than one bracketed expression for ambiguous input.

Next steps
1. Generalize the method for accommodating prepositional phrases and create a more powerful version of the make_trans function to cover queries such as "who stole a car in Brooklyn in 1915" (our solution is briefly described in the paper).
2. Extend the parser of the existing NL speech query processor to accommodate prepositional phrases.
3. Replace the entity-based NL semantics of the existing query processor with the new event-based semantics.
4. Interface the new query processor with an RDF semantic web data source (this will require converting RDF triples to event-based triples).
5. Develop methods for optimising queries to semantic web data.
⇩
An NL speech query interface to semantic web data

References for previous work
PARSING:
Frost, R., Hafiz, R. and Callaghan, P. (2007) Modular and efficient top-down parsing for ambiguous left-recursive grammars. In: 10th ACL, IWPT, 109–120.
Hafiz, R. and Frost, R. (2010) Lazy combinators for executable specifications of general attribute grammars. Proceedings of the 12th International Symposium on Practical Aspects of Declarative Languages (PADL), LNCS 5937.
SPEECH RECOGNITION:
Frost, R. A. (2005) A call for a public-domain SpeechWeb. CACM 48 (11).
Frost, R. A., Ma, X. and Shi, Y. (2007) A browser for a public-domain SpeechWeb. WWW 2007.
SEMANTICS:
Frost, R. A. (2006) Realization of natural language interfaces using lazy functional programming. ACM Comp. Surv. 38 (4), Article 11.
Frost, R. A. and Fortier, R. (2007) An efficient denotational semantics for natural language database queries. NLDB 07, LNCS 4592.
YouTube: SpeechWeb

Acknowledgements
Rahmatullah Hafiz, Paul Callaghan, Nabil Abdullah, Ali Karaki, Paul Meyer, Jon Donais, Matthew Clifford, Shane Peelar, Stephen Karamatos, Walid Mnaymneh, Rob Mavrinac, Cai Filiault
NSERC – Natural Sciences and Engineering Research Council of Canada
Research Services, University of Windsor