Query Processing with XML CSE 350 – Advanced Database Topics Jeffrey R. Ellis.

1 Query Processing with XML CSE 350 – Advanced Database Topics Jeffrey R. Ellis

2 Query Processing Topics Why? Java and Other Programming Languages XPath/XSLT XQuery (W3C-sponsored Query Language) Current Research – Other Query Languages – XISS (XML Indexing and Storage System)

3 FIRST – Distinction between XML and HTML/Web Technologies XML spotlight is analogous to Java – Immediate benefits applied to World Wide Web – Long-range, more exciting benefits in applications XML IS NOT AN HTML REPLACEMENT – HTML marks pages up for presentation on the web – XML marks text for semantic information purposes XML can encode HTML pages, but HTML works well on the Web

4 XML Data Storage XML Documents – Data is delineated semantically – Schemas/DTDs control contents of elements – Semi-structured attitude allows flexibility – Text is human-readable and machine-parsable – Open standards work with common tools – File data storage allows for easy sharing – Can queries control access to data?

5 Traditional Database Storage Databases – Data is delineated semantically – Schemas control contents of rows – No flexibility from semi-structured storage – Data is not human-readable, but only machine-parsable – Proprietary standards prevent interoperability – Proprietary storage prevents data sharing – Queries control access to data

6 XML for Query Processing If we can get efficient query processing, XML document storage provides many benefits over traditional database storage. Sample application – Employee database document – XML Schema assumed to exist – Employee information queried as per standard HR processing

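7 Sample employee records (XML tags lost in extraction): last = Bissell, first = Brian, position = IT Specialist, salary = 35,000, location = CT; last = Pham, first = Hung, mi = Q, position = Senior IT Specialist, salary = 45,000, location = CT; …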

8 Tree Structure of XML Document Remember that XML documents are trees: emp → gender, name, position, salary, location; name → last, first, mi
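As a rough sketch of the tree on slide 8, the Python snippet below parses a small document shaped like the slide 7 sample and prints the outline of one employee record. The markup is a reconstruction from the element names on slides 8 and 12, not the presentation's original file, and the `gender` value for the second employee is an assumption.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample: tag names follow slides 8 and 12; the gender
# value for the second employee is an assumption.
SAMPLE = """<employees>
  <emp gender="m">
    <name><last>Bissell</last><first>Brian</first></name>
    <position>IT Specialist</position>
    <salary>35,000</salary>
    <location>CT</location>
  </emp>
  <emp gender="m">
    <name><last>Pham</last><first>Hung</first><mi>Q</mi></name>
    <position>Senior IT Specialist</position>
    <salary>45,000</salary>
    <location>CT</location>
  </emp>
</employees>"""


def outline(elem, depth=0):
    """Return indented lines naming each element; attributes appear as @name."""
    lines = ["  " * depth + elem.tag]
    lines += ["  " * (depth + 1) + "@" + attr for attr in elem.attrib]
    for child in elem:
        lines += outline(child, depth + 1)
    return lines


root = ET.fromstring(SAMPLE)
tree_lines = outline(root[1])  # second <emp>, the one with a middle initial
print("\n".join(tree_lines))
```

The attribute is listed under `emp` alongside the child elements, which mirrors how the slide draws `gender` as a branch of the tree.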

9 Query Processing – Programming Languages XML Documents are flat files Any language with file I/O can read XML document Any language with string parsing capabilities can use XML data Query processing done through language syntax “Obvious” result different from traditional databases

10 Query Processing – Programming Languages Strategy – Basic File I/O through language – Basic String matching to identify elements – Processing possible, but not necessarily efficient Languages have gathered XML processing tools in libraries – xerces – Apache library for Java and C++ Two methods for parsing XML data – DOM – SAX
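The basic strategy on this slide (plain file I/O plus string matching) might look like the following Python sketch. The helper name `element_texts` and the inline record are illustrative assumptions; the record stands in for text that would normally be read from a file.

```python
import re

# Stand-in for text read with ordinary file I/O, e.g. open(path).read().
xml_text = (
    '<emp gender="m"><name><last>Bissell</last><first>Brian</first></name>'
    "<position>IT Specialist</position><salary>35,000</salary></emp>"
)


def element_texts(text, tag):
    """Return the text between each <tag>...</tag> pair.

    Deliberately naive: it ignores attributes, nesting of same-named
    elements, comments, and CDATA sections.
    """
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
    return re.findall(pattern, text, flags=re.DOTALL)


print(element_texts(xml_text, "last"))
print(element_texts(xml_text, "salary"))
print(element_texts(xml_text, "emp"))  # finds nothing: <emp> carries an attribute
```

The last call shows why this approach is possible but fragile: a single attribute is enough to defeat the pattern, which is the gap that parser libraries such as DOM and SAX implementations fill.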

11 DOM Document Object Model Defined by W3C for XML, HTML, and stylesheets Provides a hierarchical, object-based view of the document DOMParser parses through file, then provides access to nodes Key: Every item in XML document is a node

12 DOM Example Node (Element) name=“emp” attribute1 child1 Node (Attr) name=“gender” value=“m” parent Node (Element) name=“name” parent child1 Node (Element) name=“last” parent child1 Node (Text) value=“Bissell” parent
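The node chain on this slide can be walked with any DOM implementation. The sketch below uses Python's standard `xml.dom.minidom` rather than the Xerces `DOMParser` named earlier, and the one-record input string is an assumption built from the slide's node labels.

```python
from xml.dom.minidom import parseString

# No whitespace between tags, so firstChild is always the next node shown
# on the slide rather than a whitespace-only Text node.
doc = parseString('<emp gender="m"><name><last>Bissell</last></name></emp>')

emp = doc.documentElement                # Node (Element) name="emp"
gender = emp.getAttributeNode("gender")  # Node (Attr) name="gender" value="m"
name = emp.firstChild                    # Node (Element) name="name"
last = name.firstChild                   # Node (Element) name="last"
text_node = last.firstChild              # Node (Text) value="Bissell"

print(emp.nodeName, gender.nodeName, gender.value)
print(name.nodeName, last.nodeName, text_node.data)
print(text_node.parentNode is last, last.parentNode is name)
```

Every item, including the attribute and the text, is reached as a node object, which is the point the slide labels as the key idea of DOM.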

13 SAX Simple API for XML Defined by XML-DEV mailing list Provides event-driven processing of the document XMLReader parses through file and activates different methods and functions based on the elements retrieved Key: Methods are defined in interface, implemented in user code
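To illustrate the event-driven model, here is a small sketch using Python's standard `xml.sax` module. The handler class `SalaryHandler` and the two-record input are assumptions made for the example; the callback names come from the `ContentHandler` interface, and user code supplies their bodies.

```python
import xml.sax

DOC = (
    b"<employees>"
    b"<emp><salary>35,000</salary></emp>"
    b"<emp><salary>45,000</salary></emp>"
    b"</employees>"
)


class SalaryHandler(xml.sax.ContentHandler):
    """Collect the text of every <salary> element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.in_salary = False
        self.buffer = []
        self.salaries = []

    def startElement(self, name, attrs):
        if name == "salary":
            self.in_salary = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.in_salary:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "salary":
            self.salaries.append("".join(self.buffer))
            self.in_salary = False


handler = SalaryHandler()
xml.sax.parseString(DOC, handler)
print(handler.salaries)
```

The handler keeps only the state it needs, so the document is never held in memory as a whole.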

14 DOM versus SAX SAX is primarily Java-based; DOM defined for most languages DOM requires storage of entire document in memory; SAX processes as it reads DOM mirrors a document that can be revisited; suited for document processing SAX mirrors object lifecycles; suited for data processing

15 Query Processing - XPath/XSLT Standard XML technologies XPath and XSLT provide a ready-made querying infrastructure XPath identifies the location of various document elements XSL Stylesheets provide methods for transforming data from one format to another Combining XPath and XSLT provides easy generation of result sets based on queries

16 XPath Provides element, value, and attribute identification employees/emp/name/first = “Brian”, “Hung”, “Sara”, “Brian” //salary = “35,000”, “40,000”, “35,000”, “60,000” count(/employees/emp) = 4 //mi = “Q”
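The path expressions on this slide can be approximated with Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath: there is no `count()` function, so the count is taken with `len()`. The two-record sample below is reconstructed from slide 7, so its results are shorter than the four-employee results listed on the slide.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<employees>
  <emp>
    <name><last>Bissell</last><first>Brian</first></name>
    <position>IT Specialist</position><salary>35,000</salary><location>CT</location>
  </emp>
  <emp>
    <name><last>Pham</last><first>Hung</first><mi>Q</mi></name>
    <position>Senior IT Specialist</position><salary>45,000</salary><location>CT</location>
  </emp>
</employees>"""

root = ET.fromstring(SAMPLE)  # root is the <employees> element

# employees/emp/name/first (paths are relative to the root element here)
first_names = [e.text for e in root.findall("emp/name/first")]

# //salary
salaries = [e.text for e in root.findall(".//salary")]

# count(/employees/emp)
emp_count = len(root.findall("emp"))

# //mi
middle_initials = [e.text for e in root.findall(".//mi")]

print(first_names, salaries, emp_count, middle_initials)
```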

17 XSLT Stylesheet transforms data from one form into another (stylesheet markup lost in extraction) Result: Brian Bissell, Hung Pham, Sara Menillo, Brian Chicos

18 Combine XPath and XSLT for Queries Query: Find the last name and position of each employee named Brian (stylesheet markup lost in extraction; only the output separators “:” and “;” survive)
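The logic of this query, selecting by first name and projecting last name and position, can be sketched in Python as a stand-in for the XPath/XSLT combination. This is not the presentation's stylesheet. The names come from the slides, but the position given for Chicos is an assumption.

```python
import xml.etree.ElementTree as ET

# Names are from the slides; the position for Chicos is an assumption.
SAMPLE = """<employees>
  <emp><name><last>Bissell</last><first>Brian</first></name>
       <position>IT Specialist</position></emp>
  <emp><name><last>Pham</last><first>Hung</first><mi>Q</mi></name>
       <position>Senior IT Specialist</position></emp>
  <emp><name><last>Chicos</last><first>Brian</first></name>
       <position>Manager</position></emp>
</employees>"""

root = ET.fromstring(SAMPLE)

brians = [
    (emp.findtext("name/last"), emp.findtext("position"))  # SELECT part
    for emp in root.findall("emp")
    if emp.findtext("name/first") == "Brian"                # WHERE part
]

for last, position in brians:
    print(f"{last}: {position};")
```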

19 Combine XPath and XSLT for Queries Query: Find the average salary of all non-managers
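A rough Python equivalent of this aggregate query is shown below. Two things are assumed: a non-manager is an employee whose position is not exactly "Manager", and salaries are stored with thousands separators as on slide 7. The third record is hypothetical.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# First two records follow slide 7; the third is a hypothetical manager.
SAMPLE = """<employees>
  <emp><position>IT Specialist</position><salary>35,000</salary></emp>
  <emp><position>Senior IT Specialist</position><salary>45,000</salary></emp>
  <emp><position>Manager</position><salary>60,000</salary></emp>
</employees>"""


def to_number(text):
    """Convert a salary such as '35,000' to an integer."""
    return int(text.replace(",", ""))


root = ET.fromstring(SAMPLE)

non_manager_salaries = [
    to_number(emp.findtext("salary"))
    for emp in root.findall("emp")
    if emp.findtext("position") != "Manager"
]
average = mean(non_manager_salaries)
print(average)
```

The comma stripping matters: values such as `35,000` are strings, not numbers, so they need normalizing before any arithmetic is attempted.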

20 Results XSLT/XPath Many SQL queries can be accomplished – XPath provides element (data) access – XPath provides basic functions (e.g., sum() ) – XPath provides WHERE functionality – XSLT provides SELECT functionality – XSLT provides ORDER BY functionality (sort) – XSLT provides result set formatting – UNION functionality provided..?

21 Querying with XPath and XSLT Important questions – Is it sufficient? – Is it efficient? – Is there a better way? XML community has need to design a full query language XQuery – Working draft published 7 June 2001

22 Query Processing - XQuery XML provides flexibility in representing many kinds of information Good query language must be likewise flexible – Pre-XQuery languages are good for specific types of data Goal: “[S]mall, easily implementable language in which queries are concise and easily understood.”

23 XQuery Forms 1. Path expressions 2. Element constructors 3. FLWR expressions 4. Operator/Function expressions 5. Conditional expressions 6. Quantified expressions 7. Data Type expressions

24 XQuery – Path Expressions Contribution of XPath XQuery 1.0 and XPath 2.0 Data Model document(“sample1.xml”)//emp/salary /employees/emp/name[../@gender=‘f’] //emp[1 TO 3]/name/first

25 XQuery – Element Constructors Queries can generate new elements Similar to XSLT abilities {$name/last} {$position}

26 XQuery – FLWR Expressions For clause/Let clause/Where clause/Return Similar to SQL FOR $e IN document(“sample1.xml”)//emp WHERE $e/salary > 38000 AND $e/@gender = ‘f’ RETURN $e/name
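The FOR/WHERE/RETURN shape of this query maps closely onto a list comprehension. The sketch below mirrors it in Python over an entirely hypothetical document (the names and salaries are made up for the example); it illustrates the query's logic, not an XQuery engine.

```python
import xml.etree.ElementTree as ET

# Entirely hypothetical records, chosen so that one employee matches.
SAMPLE = """<employees>
  <emp gender="f"><name><last>Alvarez</last></name><salary>52,000</salary></emp>
  <emp gender="f"><name><last>Baker</last></name><salary>30,000</salary></emp>
  <emp gender="m"><name><last>Chen</last></name><salary>61,000</salary></emp>
</employees>"""

root = ET.fromstring(SAMPLE)

matches = [
    e.find("name")                                            # RETURN $e/name
    for e in root.iter("emp")                                 # FOR $e IN ...//emp
    if int(e.findtext("salary").replace(",", "")) > 38000     # WHERE $e/salary > 38000
    and e.get("gender") == "f"                                # AND $e/@gender = 'f'
]

last_names = [n.findtext("last") for n in matches]
print(last_names)
```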

27 XQuery – Operator/Function Expressions Pre-defined and user-defined operators and functions Still under development: Union, Intersect, Except FOR $e IN //employees/emp WHERE not(empty($e//mi)) RETURN $e/name

28 XQuery – Conditional Expressions If-then-else expressions are not yet limited to boolean (ongoing discussion) FOR $e IN /employees/emp RETURN {$name} IF ($e/position=“Manager”) THEN

29 Quantified Expressions Some/Every conditions Some/Every evaluates to True or False FOR $e IN //employees WHERE SOME $p IN $e//emp/position = “Manager” RETURN $e
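SOME and EVERY correspond to the built-in `any()` and `all()` in Python. The sketch below applies both to a hypothetical document containing two `employees` groups; the `company` wrapper, the `dept` attribute, and the records are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two <employees> groups under an assumed wrapper.
SAMPLE = """<company>
  <employees dept="IT">
    <emp><position>Manager</position></emp>
    <emp><position>IT Specialist</position></emp>
  </employees>
  <employees dept="HR">
    <emp><position>Recruiter</position></emp>
  </employees>
</company>"""

root = ET.fromstring(SAMPLE)


def positions(group):
    """Text of every emp/position below one <employees> group."""
    return [p.text for p in group.findall(".//emp/position")]


# SOME $p IN $e//emp/position satisfies $p = "Manager"
with_some_manager = [
    g.get("dept") for g in root.iter("employees")
    if any(p == "Manager" for p in positions(g))
]

# EVERY $p IN $e//emp/position satisfies $p = "Manager"
with_every_manager = [
    g.get("dept") for g in root.iter("employees")
    if all(p == "Manager" for p in positions(g))
]

print(with_some_manager, with_every_manager)
```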

30 Data Types Data Types based on those available from XML Schema Data types can be literal (“Brian”), from constructor functions (date(“2001-10-11”) ), or from casting ( CAST AS xsd:integer(24) ) User-defined data types are also allowable and parsable

31 XQuery More choices than XSLT/XPath combination Work in progress Current W3C efforts into query language Influencing the future design of the core XML technologies (XPath) Hopes to be fully flexible for all future XML applications

32 Query Processing – Research XQuery specification continues to undergo review and change – 6 of 7 specification documents released since June – All specifications released in 2001 Other avenues of research – Other Query languages – Indexing strategies – Implementation

33 Query Processing – Other Query Languages Many query languages exist – Quilt (basis for XQuery) – W3C early languages (XML-QL, XQL) – Adopted traditional languages (OQL, XSQL) – Research papers (XML-GL, YATL, Lorel) Other query languages often optimized for a particular subset of XML documents Query language field *MAY* be standardizing to XQuery

34 Query Processing – Indexing Strategy Query language less important; better indexing techniques lead to efficiency XISS (XML Indexing and Storage System) – Published September 19, 2001 – Builds sets of indexes on XML data elements and attributes on initial parse of XML document – Lookup becomes constant-time through the various built indexes – Demonstrated successes in test runs
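The general idea on this slide, building indexes once during the initial parse and then answering lookups from them, can be sketched with hash maps. This does not reproduce the actual XISS index structures; the function `build_indexes` and both index layouts are assumptions chosen only to show why lookups become constant-time on average.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Reconstructed from slide 7; the second gender value is an assumption.
SAMPLE = """<employees>
  <emp gender="m"><name><last>Bissell</last></name><salary>35,000</salary></emp>
  <emp gender="m"><name><last>Pham</last></name><salary>45,000</salary></emp>
</employees>"""


def build_indexes(root):
    """Walk the tree once and index elements by name and by attribute value."""
    by_element = defaultdict(list)     # element name -> elements
    by_attribute = defaultdict(list)   # (attribute name, value) -> elements
    for elem in root.iter():
        by_element[elem.tag].append(elem)
        for key, value in elem.attrib.items():
            by_attribute[(key, value)].append(elem)
    return by_element, by_attribute


root = ET.fromstring(SAMPLE)
by_element, by_attribute = build_indexes(root)

# Lookups are dictionary accesses; no further tree traversal is needed.
salary_texts = [e.text for e in by_element["salary"]]
male_count = len(by_attribute[("gender", "m")])
print(salary_texts, male_count)
```

The cost moves to the initial parse and to memory for the indexes, which is the usual trade-off behind index-based query processing.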

35 Query Processing - Implementation XML is currently in state of flux – Standards are still being revised – Industry cautious before embracing a new technology – Economic slowdown may prevent new research and development efforts XML still waiting for its “Killer App”, application that forces immediate acceptance

36 XML Query Processing XML is a functional database storage language Efficient query language needed to turn XML into a viable database Query language solutions are being developed – Java/C++ hooks first developed – OK – XSLT/XPath implemented – GOOD – XQuery being designed – GREAT? – Future additions – ????

