Download presentation
Presentation is loading. Please wait.
Published byNathan McCullough Modified over 11 years ago
1
Symmetrically Exploiting XML Shuohao Zhang and Curtis Dyreson School of E.E. and Computer Science Washington State University Pullman, Washington, USA The 15 th International World Wide Web Conference May 2006 Edinburgh, Scotland
2
Symmetrically Exploiting XML: Zhang, Dyreson Hierarchical model vs. relational model Codd: symmetric exploitation of data part/project works on some, but not all Path expressions are asymmetric Currently, all XML query languages use path expressions 1970s Database Controversy Part Project Part Commit ProjectPart
3
Symmetrically Exploiting XML: Zhang, Dyreson Querying Data with Path Expressions Task Find books by E. F. Codd XQuery return doc("author.xml")//author[name= 'E. F. Codd']/book name author book title publisher price Addison Wesley Academic Press DB 46.95 Automata 9.99 E. F. Codd
4
Symmetrically Exploiting XML: Zhang, Dyreson Same Data, Different Structure Same task Find books by E. F. Codd Need different XQuery return doc("book.xml")//book[author/name='E. F. Codd'] publisher book title author price Addison Wesley DB 46.95 Automata 9.99 name E. F. Codd publisher Academic Press name Codd name author book title publisher price Addison Wesley Academic Press DB 46.95 Automata 9.99 E. F. Codd
5
Symmetrically Exploiting XML: Zhang, Dyreson Goal Make same query work on different structures Useful when there is lack of schema knowledge heterogeneous data irregular data schema evolution Factor off problem of different label sets, others are working on it
6
Symmetrically Exploiting XML: Zhang, Dyreson Existing Axes are Directional preceding following descendent ancestor self
7
Symmetrically Exploiting XML: Zhang, Dyreson Proposal: A Non-directional Axis preceding following descendent ancestor self
8
Symmetrically Exploiting XML: Zhang, Dyreson Proposal: A Non-directional Axis preceding following descendent ancestor self
9
Symmetrically Exploiting XML: Zhang, Dyreson Proposal: A Non-directional Axis preceding following descendent ancestor self
10
Symmetrically Exploiting XML: Zhang, Dyreson The Closest Axis Syntax closest:: ->name is abbreviation for closest::name Semantics a function that takes a context node and returns a sequence of closest nodes
11
Symmetrically Exploiting XML: Zhang, Dyreson Closest Axis of the First Title closest::* Returns a list of five nodes closest::price Returns the first price node name author book title publisher price
12
Symmetrically Exploiting XML: Zhang, Dyreson Node selection restricted by minimal type distance The minimal distance between a title and a price is 2 closest::price Returns an empty list When the First Book Lacks a Price name author book title publisher price
13
Symmetrically Exploiting XML: Zhang, Dyreson closest::name for each book? Root-to-node path type author/name author/book/publisher/name Type Distance is Crucial name author book title publisher price name
14
Symmetrically Exploiting XML: Zhang, Dyreson Querying with the Closest Axes Closest axis-enabled XQuery evaluation engine Query Same query -- return doc("any.xml")->author[->name='E. F. Codd']->book Query Result#2 Result#3 Query Result#1
15
Symmetrically Exploiting XML: Zhang, Dyreson Querying with Directional Axes XQuery evaluation engine Query#1 -- return doc("author.xml")//author[name= 'E. F. Codd']/book Query#2 -- …… Query#3 -- return doc("book.xml")//book[author/name='E. F. Codd'] Result#2 Result#3 Result#1
16
Symmetrically Exploiting XML: Zhang, Dyreson Find the closest price for title Non-directional expression closest::price Directional (path) expression parent::*/child::price In-memory Implementation Naïve approach Compute Closest for every node Time complexity is O(sn 2 ) s: number of labels in the signature n: number of nodes Converting to a path expression name author book titlepublisherprice
17
Symmetrically Exploiting XML: Zhang, Dyreson Experiment Compare directional vs. nondirectional for $b in doc("bib.xml")//title/closest::publisher return $b for $b in doc("bib.xml")//title/..//publisher return $b Implemented closest in eXist (an XML DBMS)
18
Symmetrically Exploiting XML: Zhang, Dyreson Persistent Implementation Take advantage of type indexes LCA-join Every Closest pair related via an LCA Idea is to merge lists of types O(sn)
19
Symmetrically Exploiting XML: Zhang, Dyreson Related Work Data integration TSIMMIS Garcia-Molina et al. (Journal of Intelligent Information Systems 1997) YAT Christophides, Cluet, Simèon (SIGMOD Record June 2000) Silkroute Fernandez, Tan, Suciu (WWW 2000) LCA-related techniques Schmidt, Kersten, Windhouwer (ICDE 2001) Cohen, Mamou, Kanza, Sagiv (VLDB 2003) Li, Yu, Jagadish (VLDB 2004)
20
Symmetrically Exploiting XML: Zhang, Dyreson Related Research Projects XML Restructuring Zhang, Dyreson (IIWeb 2006) XML Compaction Zhang, Dyreson, Dang (DASFAA 2006) Common theme – symmetric exploitation!
21
Symmetrically Exploiting XML: Zhang, Dyreson Conclusion Current XQuery depends on path expressions A path expression is directional (asymmetric) May break down if structure changes The closest axis is non-directional (symmetric) Simple in syntax Can be easily integrated in XQuery Can be implemented efficiently In-memory Persistent
22
Thank You!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.