Spatial tree logics to reason about Semistructured Data Speaker: Giovanni Conforti Joint work with: Giorgio Ghelli SEBD 2003 Dipartimento di Informatica.

Presentation transcript:

Spatial tree logics to reason about Semistructured Data Speaker: Giovanni Conforti Joint work with: Giorgio Ghelli SEBD 2003 Dipartimento di Informatica – Università di Pisa

What I’m going to talk about …
A gentle introduction to Spatial Tree Logics (STL)
STL and Semistructured Data (SSD)
–Properties of SSD (Constraints, Types, Queries) → Spatial Tree Logic (STL) Formulas
–Decision Problems for SSD → Validity/Satisfiability of STL Formulas
Presentation of a decidable fragment of the TQL logic

Background: Spatial Logics
Modal logics to describe properties of structured worlds
Many applications: Ambient Calculus, π-calculus, tree-structured data, shared data structures, …
Spatial (and temporal) modal operators describe structure (and behavior)
Equivalence, model checking, and validity problems have already been studied for many spatial logics
Many works involving Cardelli, Gordon, Caires, Ghelli, Gardner, …

A Simple Ground Spatial Tree Logic
Worlds = information trees (i.t.): unordered labeled trees (multisets of labeled edges)
F, F' ::= 0 (empty root)
        | n[F] (an edge labelled n leading to the i.t. F)
        | F | F' (the i.t. F “next to” the i.t. F')
Logic = propositional connectives + modal operators describing the structure
A, B ::= True | Not A | A And B | 0 | n[A] | A | B

Examples
An information tree: a tree labelled book with 3 subtrees
F = book[ title[Databases[ 0 ]] | author[Ghelli[ 0 ]] | author[Albano[ 0 ]] ]
Some formulas describing trees
A = book[ author[Ghelli[ 0 ]] ]             F ⊭ A (A allows exactly one edge under book)
B = book[ author[Ghelli[ 0 ]] | True ]      F |= B
C = book[ Not (editor[True] | True) ]       F |= C
D = book[ title[True] And author[True] ]    F ⊭ D (no single edge is labelled both title and author)
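The four judgements can be checked mechanically. The following is a minimal sketch of a model checker for the ground logic, assuming a hypothetical encoding that is not part of the slides: a tree is a tuple of (label, subtree) edges with the empty tuple for 0, and a formula is a tagged tuple. The composition case tries every split of the edges, so the cost is exponential in the width of the tree.

```python
from itertools import combinations

# Hypothetical encoding (an assumption, not from the slides):
#   tree    = tuple of (label, subtree) edges; () stands for 0
#   formula = tagged tuple built by the helpers below
TRUE, ZERO = ("true",), ("zero",)
def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Edge(n, a): return ("edge", n, a)      # n[A]
def Par(a, b): return ("par", a, b)        # A | B

def splits(tree):
    """Yield every way of dividing the edges of `tree` into two parts."""
    positions = range(len(tree))
    for size in range(len(tree) + 1):
        for chosen in combinations(positions, size):
            left = tuple(tree[i] for i in chosen)
            right = tuple(tree[i] for i in positions if i not in chosen)
            yield left, right

def holds(tree, f):
    """Decide tree |= f for the ground logic."""
    tag = f[0]
    if tag == "true":
        return True
    if tag == "not":
        return not holds(tree, f[1])
    if tag == "and":
        return holds(tree, f[1]) and holds(tree, f[2])
    if tag == "zero":
        return tree == ()
    if tag == "edge":   # exactly one edge, with the right label and body
        return len(tree) == 1 and tree[0][0] == f[1] and holds(tree[0][1], f[2])
    if tag == "par":    # some split satisfies both components
        return any(holds(l, f[1]) and holds(r, f[2]) for l, r in splits(tree))
    raise ValueError(f"unknown formula tag: {tag}")

def leaf(text):
    """text[0]: a single edge leading to the empty tree."""
    return ((text, ()),)

F = (("book", (("title", leaf("Databases")),
               ("author", leaf("Ghelli")),
               ("author", leaf("Albano")))),)
ghelli = Edge("author", Edge("Ghelli", ZERO))
A = Edge("book", ghelli)
B = Edge("book", Par(ghelli, TRUE))
C = Edge("book", Not(Par(Edge("editor", TRUE), TRUE)))
D = Edge("book", And(Edge("title", TRUE), Edge("author", TRUE)))

for name, formula in zip("ABCD", (A, B, C, D)):
    print(name, holds(F, formula))
```

Under this encoding F satisfies B and C but not A or D, because n[A] demands a tree with exactly one top-level edge.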

First order and modal recursion
The full TQL logic extends the ground fragment with:
–X: tree variables
–x[A]: locations with label variables
–Exists x. A: quantification over labels (and trees)
–μξ. A: fixpoint (ξ positive in A)

Decision Problems
Given a formula A and a model F:
Model checking: F |= A ?
Query answering: find the values of x such that F |= A(x)
Satisfiability sat(A): does there exist an F' such that F' |= A ?
Validity vld(A): is it true that F' |= A for every model F' ?
Negation in the logic: sat(A) ⇔ Not vld(Not A)
Implication: (∀F. F |= A implies F |= B) ⇔ vld(Not A Or B)
With the simple ground STL all these problems are decidable, but satisfiability and validity are no longer decidable once we introduce variables and quantification (or fixpoint)
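The duality between satisfiability and validity follows by unfolding the definitions. A short derivation, assuming the usual classical reading of negation (F |= Not A exactly when F does not satisfy A) and written with amsmath's align*:

```latex
\begin{align*}
\mathrm{sat}(A)
  &\iff \exists F.\ F \models A \\
  &\iff \neg\,\forall F.\ F \not\models A \\
  &\iff \neg\,\forall F.\ F \models \mathrm{Not}\ A \\
  &\iff \neg\,\mathrm{vld}(\mathrm{Not}\ A)
\end{align*}
```

The same unfolding gives the implication case: every F satisfying A also satisfies B exactly when every F satisfies Not A Or B.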

An SSD Data model: labeled trees
[Figure: the articles data drawn as a labeled tree, with article, author, title, date, month and year edges]
articles[
  article[ author[Cardelli] | author[Gordon] | title[Anywhere] | date[Apr, 2000] ]
| article[ author[Ghelli] | title[TQL] | conf[ETAPS] | date[ month[Feb] | year[2001] ] ]
]
information trees

SSD Schema and Types
Schemas and types constrain the structure of SSD:
–DTDs;
–XML Schema;
–Regular Expression Types
A schema: Article = article[ title[String], author[String]*, date[True]? ]
A recursive type: Section = section[ init[String], Section*, conc[String] ]

Types in STL
Regular type expressions and DTDs can be expressed (up to document order) in STL extended with modal recursion
A schema: article[ title[String], author[String]*, date[True]? ]
In STL: article[ title[True] | (μξ. 0 Or author[True] | ξ) | (date[True] Or 0) ]
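Because the STL translation ignores document order, membership in this particular type reduces to counting edge labels. The sketch below is a direct check for this one schema, not a general evaluator for modal recursion. It assumes a hypothetical encoding in which a tree is a tuple of (label, subtree) edges, and the function name is illustrative.

```python
from collections import Counter

def matches_article_type(tree):
    """Check tree |= article[ title[True] | author[True]* | (date[True] Or 0) ]."""
    # The outer n[A] requires exactly one top-level edge, labelled article.
    if len(tree) != 1 or tree[0][0] != "article":
        return False
    body = tree[0][1]
    counts = Counter(label for label, _ in body)
    allowed = {"title", "author", "date"}
    return (
        set(counts) <= allowed       # no other edges may appear
        and counts["title"] == 1     # exactly one title
        and counts["date"] <= 1      # optional date; any number of authors
    )

def leaf(text):
    """text[0]: a single edge leading to the empty tree."""
    return ((text, ()),)

good = (("article", (("author", leaf("Ghelli")),
                     ("title", leaf("TQL")),
                     ("author", leaf("Conforti")))),)
no_title = (("article", (("author", leaf("Ghelli")),)),)
extra = (("article", (("title", leaf("TQL")), ("conf", leaf("ETAPS")))),)

print(matches_article_type(good), matches_article_type(no_title),
      matches_article_type(extra))
```

The authors may appear before or after the title, which is what "up to document order" means here. Subtree contents are not inspected, matching the use of True in the STL formula.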

SSD Constraints
Integrity constraints on the values of SSD:
–Inclusion Constraints;
–Inverse Relationship Constraints;
–Key Constraints
Path expressions to navigate SSD: articles.article.title(x)   root.section*.init(x)
Integrity constraints as inclusion of paths: student.takes => course.cno   student.takes <=> course.taken_by
Key constraints (first-order logic with paths): ∀x,y. article.title(x) And article.title(y) And ¬(x = y) => ¬(x == y)

Constraints in STL
Integrity constraints over SSD are easily expressed using STL with variables and quantification.
Examples using the path abbreviation .a[A] = a[A] | True:
–An inclusion constraint: ∀$X. .student.taking[$X] => .course.cno[$X]
–A key constraint for SSD: ∀$X. Not (.article.title[$X] | .article.title[$X])
Combining quantification with recursion we can express complex types and constraints (e.g. binary trees)
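Unfolding the path abbreviation, the key constraint says that the data cannot be split into two parts that each contain an article edge with a title leading to the same tree $X. A minimal sketch of checking this on a concrete forest follows. It assumes a hypothetical encoding (a tree is a tuple of (label, subtree) edges), illustrative function names, and simplified sample data loosely based on the articles example.

```python
from itertools import combinations

def canon(tree):
    """Order-independent form of a tree: edges sorted recursively."""
    return tuple(sorted((label, canon(sub)) for label, sub in tree))

def title_values(article_body):
    """Canonical subtrees found under the title edges of one article."""
    return {canon(sub) for label, sub in article_body if label == "title"}

def key_constraint_holds(forest):
    """True iff no tree $X sits under the title of two distinct article edges."""
    bodies = [sub for label, sub in forest if label == "article"]
    return all(
        title_values(a).isdisjoint(title_values(b))
        for a, b in combinations(bodies, 2)
    )

def leaf(text):
    """text[0]: a single edge leading to the empty tree."""
    return ((text, ()),)

anywhere = (("author", leaf("Cardelli")), ("author", leaf("Gordon")),
            ("title", leaf("Anywhere")))
tql = (("author", leaf("Ghelli")), ("title", leaf("TQL")))

ok = (("article", anywhere), ("article", tql))
clash = ok + (("article", (("title", leaf("TQL")),)),)   # second article titled TQL

print(key_constraint_holds(ok), key_constraint_holds(clash))   # True False
```

Comparing canonical forms rather than raw tuples keeps the check faithful to the unordered tree model: two titles count as equal even if their subtrees list edges in a different order.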

SSD Queries
Many query languages exist (XQuery, Lorel, YATL, …); essentially, queries are expressions that select data reachable from paths and construct new results
TQL is a peculiar query language based on spatial tree logic: selection is done by pattern matching over STL formulas
The TQL logic expresses all regular path expressions
Query answering is implemented for the full TQL logic

SSD Decision Problems with STL
Given a data source F, a formula A representing a schema, and formulas B, B' representing sets of integrity constraints:
Validation: F |= A, F |= B, F |= A And B
Schema/constraint consistency: sat(A), sat(B), sat(A And B)
Constraint implication (inference): vld(B => B')
Constraint implication in presence of a schema: vld(A And B => B')

A decidable TQL sublogic
STL are well suited to expressing types, constraints and queries over SSD, but:
–Validity in the full TQL logic is undecidable
–The ground logic is decidable, but it is not enough to express all interesting types and constraints
We are looking for a decidable fragment of TQL expressive enough to reason about SSD
A first step in this direction is the following logic…

A decidable TQL sublogic
A, B ::= True | A And B | Not A | 0 | %[A] | n[A] | A | B
We can define useful operators to describe types and constraints in this decidable logic:
String =def %[ 0 ]
Tree =def %[True]
A Or B =def Not (Not A And Not B)
A => B =def Not A Or B
A exists =def A | True
A foreach =def Not (Not A | True)
A foreachTree =def (Tree => A) foreach
Note: if A => Tree we can use A foreachTree to express A*
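The derived operators can be tried out on small trees. The sketch below assumes a hypothetical encoding (a tree is a tuple of (label, subtree) edges, a formula is a tagged tuple) and reads %[A] as "a single edge with any label whose body satisfies A", which is an interpretation consistent with the definitions of String and Tree rather than something the slide states. It is a brute-force model checker, not a validity procedure.

```python
from itertools import combinations

TRUE, ZERO = ("true",), ("zero",)
def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Par(a, b): return ("par", a, b)        # A | B
def Edge(n, a): return ("edge", n, a)      # n[A]
def Any(a): return ("any", a)              # %[A], assumed: any label

def splits(tree):
    """Yield every way of dividing the edges of `tree` into two parts."""
    positions = range(len(tree))
    for size in range(len(tree) + 1):
        for chosen in combinations(positions, size):
            yield (tuple(tree[i] for i in chosen),
                   tuple(tree[i] for i in positions if i not in chosen))

def holds(tree, f):
    """Decide tree |= f for the sublogic."""
    tag = f[0]
    if tag == "true":
        return True
    if tag == "not":
        return not holds(tree, f[1])
    if tag == "and":
        return holds(tree, f[1]) and holds(tree, f[2])
    if tag == "zero":
        return tree == ()
    if tag == "any":
        return len(tree) == 1 and holds(tree[0][1], f[1])
    if tag == "edge":
        return len(tree) == 1 and tree[0][0] == f[1] and holds(tree[0][1], f[2])
    if tag == "par":
        return any(holds(l, f[1]) and holds(r, f[2]) for l, r in splits(tree))
    raise ValueError(f"unknown formula tag: {tag}")

# Derived operators, transcribed from the definitions on the slide.
STRING = Any(ZERO)
TREE = Any(TRUE)
def Or(a, b): return Not(And(Not(a), Not(b)))
def Implies(a, b): return Or(Not(a), b)
def exists(a): return Par(a, TRUE)
def foreach(a): return Not(Par(Not(a), TRUE))
def foreach_tree(a): return foreach(Implies(TREE, a))

def leaf(text):
    return ((text, ()),)

authors = (("author", leaf("Ghelli")), ("author", leaf("Albano")))
mixed = authors + (("title", leaf("Databases")),)
author_star = foreach_tree(Edge("author", STRING))   # author[String]*

print(holds(authors, author_star), holds(mixed, author_star))
```

Here author[String] implies Tree, so author_star behaves like author[String]*: it accepts the empty tree and any collection of author edges, and rejects the collection that also contains a title edge.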

Conclusions and Future Directions
STL provide a powerful unified framework for types, constraints, and queries over SSD and XML
This framework is worth studying; it may lead to:
–A good formalization of “SSD reasoning” in terms of model checking and validity
–Generalization of results on reasoning about types and constraints
–Query optimization strategies guided by types/constraints
(Some) future steps:
–Extend the decidable logic to express integrity constraints
–Model ordered trees

Spatial tree logics to reason about Constraints and Types Speaker: Giovanni Conforti Supervisor: Giorgio Ghelli Università di Pisa: Ph.D. Proposal

SSD Query Optimization
The TQL pattern clause uses STL formulas…
We can use validated constraints C and types T as information to optimize queries (e.g. statically declaring a result empty)
A query from Q |= A select Q' can be rewritten as from Q |= B select Q' for each B such that (C And T) => (A ⇔ B)

Research Plan: planning
The challenge is ambitious; it must be understood as a long-term direction of our work
We address some initial tasks we expect to accomplish:
–Compare STL with other formalisms for types and constraints
–Find a “satisfactory” decidable logic fragment to express types (and constraints)
–Write a preliminary formal system for constraint (and type) implication
We plan two stages:
1. (2nd year) In-depth study of the basic theories (tree automata, modal logics, description logics) and investigation of the initial tasks
2. (3rd year) Completion of the initial tasks and integration of the results in a unified formal framework

Research Plan: directions
Main directions to investigate:
–Expressivity of Spatial Tree Logics (in particular for standard type and constraint specifications)
–Decidability and complexity of model checking and validity for fragments (or extensions) of the TQL logic
–Reformulation (or generalization) of known results about reasoning and optimization over SSD
Other interesting directions:
–Implementation of a query rewriter guided by constraints and types
–Extensions of the logic to model order, data updates, private names

Background: Semi-structured data (SSD)
Semi-structured data (SSD) are used to:
–model and query the web (HTML, XML, …);
–store experimental data;
–integrate heterogeneous databases;
–…
SSD are:
–Self-describing (structure is implicit);
–Irregular;
–Always evolving