Integrating XQuery and Relational Database Systems.


XML and relational databases
XML advantages:
– Simple
– Unicode-based
– Platform-independent syntax
– Many parsers available
– Able to represent structured data, semi-structured data and markup data

Structured data
C1 Janine Smith 1 Broadway Way Seattle WA P P3

Semi-structured data
Janine Smith 1 Broadway Way Seattle WA Janine came in with a rash. We identified an antibiotics allergy and changed her cold prescription. P2 Nils Soerensen 23 NE 40th Street, New York

Markup data
Janine came in with a rash. We identified an antibiotics allergy and changed her cold prescription.

XML and relational databases
Relational databases manage structured data and are used by 80% of the market. Most relational databases have developed capabilities that fit the structured-data side of XML. More recently there have been attempts to handle data that does not fit that concept.

Overview
– Relational storage of XML: the XML type
– Integrating XQuery and SQL: querying XML datatypes
– Top-level XQuery

Relational Storage of XML: The XML type
All LOB-based storage mechanisms provide a built-in XML datatype. The SQL name for that built-in datatype is XML. There are different logical models and physical representations for the XML datatype.

Logical Models for the XML Datatype
The standard does not define the implementation of the XML datatype, only the requirements it must satisfy: it must be able to represent any element content, multiple top-level element nodes and top-level text nodes. An XML datatype value can be cast to a string (serialization) and a string can be cast to XML (parsing) using SQL.
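
The parse and serialize casts described above can be sketched in Python. This is a minimal illustration, not how any database implements the XML datatype: the function names `parse_xml_value` and `serialize_xml_value` are hypothetical, and the synthetic `wrapper` element is only a trick to let the standard-library parser accept several top-level elements and top-level text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parse_xml_value(text: str) -> list:
    """Parse a string into a list of top-level nodes (elements or text)."""
    # Hypothetical helper: wrap the content in a synthetic root so that
    # fragments with several top-level elements or top-level text parse.
    wrapper = ET.fromstring(f"<wrapper>{text}</wrapper>")
    nodes = []
    if wrapper.text:
        nodes.append(wrapper.text)
    for child in wrapper:
        tail = child.tail
        child.tail = None  # keep following text as its own top-level node
        nodes.append(child)
        if tail:
            nodes.append(tail)
    return nodes


def serialize_xml_value(nodes: list) -> str:
    """Serialize the list of top-level nodes back into a string."""
    return "".join(
        escape(n) if isinstance(n, str) else ET.tostring(n, encoding="unicode")
        for n in nodes
    )


nodes = parse_xml_value("hello <a>1</a><b>2</b> tail")
print(len(nodes), serialize_xml_value(nodes))
```

The value holds two text nodes and two element nodes at the top level, which a plain XML document parser would reject.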

Physical Models for the XML Datatype
All the storage mechanisms assume that the data has been verified to be a well-formed instance of the XML datatype.
XML storage fidelities:
– String-level fidelity: code point for code point.
– Infoset-level fidelity: the stored XML has the same information set as the original XML document.
– Relational fidelity: preserves the information from a relational point of view and discards XML-specific properties, such as document order.
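
The three fidelity levels can be made concrete with a small comparison sketch. It rests on assumptions that are not in the slides: canonical XML (via the standard library's `ET.canonicalize`, Python 3.8+) is used as a rough stand-in for "same information set", and `relational_view` is a hypothetical helper that keeps only unordered (name, value) pairs of the child elements.

```python
import xml.etree.ElementTree as ET


def reserialize(xml_text: str) -> str:
    """Parse and write back, as a store without string-level fidelity might."""
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")


def same_string(a: str, b: str) -> bool:
    """String-level fidelity: identical code point for code point."""
    return a == b


def same_infoset(a: str, b: str) -> bool:
    """Approximation of infoset-level fidelity using canonical XML."""
    return ET.canonicalize(a) == ET.canonicalize(b)


def relational_view(xml_text: str) -> list:
    """Hypothetical relational view: child name/value pairs, order discarded."""
    root = ET.fromstring(xml_text)
    return sorted((child.tag, (child.text or "").strip()) for child in root)


original = "<Customer id='C1'   state=\"WA\"><name>Janine Smith</name></Customer>"
stored = reserialize(original)
print(same_string(original, stored), same_infoset(original, stored))

a = "<c><name>Janine Smith</name><city>Seattle</city></c>"
b = "<c><city>Seattle</city><name>Janine Smith</name></c>"
print(same_infoset(a, b), relational_view(a) == relational_view(b))
```

Quote style and attribute spacing are lost at infoset level, while swapping two child elements is only invisible at the relational level.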

Character LOB (CLOB)
With a CLOB, the XML is stored in a character representation. The CLOB may preserve the original XML data exactly (string-level fidelity), or the data may have been transformed by changing the encoding of the content (infoset-level fidelity). This method is not efficient: it requires either extensive indexing or parsing the data before executing every query. It is also hard to update at the node level.

Binary LOB (BLOB)
A binary format, such as pre-processed XML in the form of a DOM tree, provides more efficient index processing and compression. A BLOB could provide string-level fidelity, but it usually provides only infoset-level fidelity. A BLOB provides an additional level of abstraction over CLOBs. Node-level updates are costly, since the BLOB support in relational systems is not designed for logical updates.

User-Defined Type
Relational database systems allow users to add user-defined physical representations. The format of the XML is defined by the user (binary vs. text, etc.). There are two major approaches for supporting user-defined types:
1. Deep integration approach: integrates the type into the actual query processor.
2. Virtual machine approach: provides an external co-engine to help process XML queries.

Relational table mapping
XML data is mapped to relational data and indices. This often provides only relational fidelity.
– Node tables: represent every node in the XML instance as a row in the table.
– Table shredding: can be used if additional structural information is available in the form of a schema.
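
A node table as described above can be sketched with SQLite and the standard-library XML parser. The table layout (`id`, `parent_id`, `ordinal`, `kind`, `name`, `value`) and the function name `shred_to_node_table` are assumptions chosen for illustration; real systems use their own encodings.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred_to_node_table(xml_text: str) -> sqlite3.Connection:
    """Store every element, attribute and text node as one row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "ordinal INTEGER, kind TEXT, name TEXT, value TEXT)"
    )
    next_id = 0

    def insert(parent_id, ordinal, kind, name, value):
        nonlocal next_id
        next_id += 1
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?, ?)",
            (next_id, parent_id, ordinal, kind, name, value),
        )
        return next_id

    def walk(elem, parent_id, ordinal):
        elem_id = insert(parent_id, ordinal, "element", elem.tag, None)
        pos = 0  # position among the children of this element
        for attr_name, attr_value in elem.attrib.items():
            pos += 1
            insert(elem_id, pos, "attribute", attr_name, attr_value)
        if elem.text and elem.text.strip():
            pos += 1
            insert(elem_id, pos, "text", None, elem.text)
        for child in elem:
            pos += 1
            walk(child, elem_id, pos)
            if child.tail and child.tail.strip():
                pos += 1
                insert(elem_id, pos, "text", None, child.tail)

    walk(ET.fromstring(xml_text), None, 1)
    return conn


conn = shred_to_node_table(
    "<Customer id='C1'><name>Janine Smith</name><city>Seattle</city></Customer>"
)
for row in conn.execute("SELECT * FROM node ORDER BY id"):
    print(row)
```

The `ordinal` column is what keeps document order recoverable; dropping it would reduce the mapping to relational fidelity.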

Node table example XML (untyped)

Shredded table example XML with a schema that describes Customer and Order

Typing an XML Datatype
XML values are typed according to XML schemas. To provide type information from an XML schema, the type-relevant information must be managed in the metadata component of the database. Relational databases enable mapping of XML namespace URIs to SQL identifiers. Example:
CREATE XML SCHEMA POSchema NAMESPACE N'<schema xmlns=" targetNamespace= '

Typing an XML Datatype
XML datatype instances are validated according to the schemata in the database repository. There are three modes of validation:
– Skip validation: no validation.
– Lax validation: validation of a subtree only if there is an applicable schema component.
– Strict validation: requires that all data conform to the schema.
isvalid(XML instance, SchemaComponent, validation mode) → boolean
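
The difference between the three validation modes can be sketched with a toy schema. Everything here is an assumption made for illustration: the schema is a plain dictionary mapping an element name to the set of child names it allows, and this `isvalid` only mirrors the shape of the function above, not real XML Schema validation.

```python
import xml.etree.ElementTree as ET


def isvalid(xml_text: str, schema: dict, mode: str) -> bool:
    """Toy validator: 'skip', 'lax' or 'strict' against a dict-based schema."""
    if mode == "skip":
        return True  # no validation at all

    def check(elem) -> bool:
        allowed = schema.get(elem.tag)
        if allowed is None:
            if mode == "strict":
                return False  # strict: every element needs a declaration
            # lax: nothing applicable here, but still look at the subtree
            return all(check(child) for child in elem)
        return all(child.tag in allowed for child in elem) and all(
            check(child) for child in elem
        )

    return check(ET.fromstring(xml_text))


schema = {"Customer": {"name", "note"}, "name": set()}
doc = "<Customer><name>Janine Smith</name><note><b>rash</b></note></Customer>"
for mode in ("skip", "lax", "strict"):
    print(mode, isvalid(doc, schema, mode))
```

The `note` element has no declaration in the toy schema, so lax validation lets it pass while strict validation rejects the document.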

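Typing an XML Datatype
Static association vs. dynamic association: in the table below, PurchaseOrders is statically associated with POSchema, while OrderForm is validated dynamically against the schema named in OrderFormType.
CREATE TABLE Customers (
  CustomerID int PRIMARY KEY,
  CustomerName nvarchar(100),
  PurchaseOrders XML TYPED AS POSchema,
  OrderForm XML,
  OrderFormType nvarchar(1000) CHECK (isvalid(OrderForm, OrderFormType, 'strict'))
)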

Integrating XQuery and SQL: Querying XML datatypes
SQL can retrieve an XML-typed column, but it cannot query into it. That is where XQuery comes in: XQuery allows us to query and transform the XML data. This section explains the integration of the two, so that we can invoke XQuery functionality from SQL and provide information from the relational environment to the XQuery context.

XQuery Functionality in SQL
Requirements:
– Transform an XML datatype instance into another XML datatype instance.
– Extract information from an XML instance into the SQL type system (scalar SQL types or the XML datatype).
– Test the XML structure for existence.
Corresponding functions:
– value(XML, XQueryString, SQLType) → SQLType
– exists(XML, XQueryString) → boolean
– query(XML, XQueryString) → XML
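
The roles of the three functions can be sketched in Python. This is a loose stand-in under clear assumptions: the limited XPath subset of `xml.etree.ElementTree` takes the place of XQuery, Python callables such as `str` or `int` take the place of SQL types, and the sample document is made up for the example.

```python
import copy
import xml.etree.ElementTree as ET


def _matches(xml_text: str, path: str) -> list:
    """Evaluate a path (ElementTree XPath subset) relative to the root."""
    return ET.fromstring(xml_text).findall(path)


def value(xml_text: str, path: str, sql_type):
    """Extract the first match as a scalar of the requested type."""
    nodes = _matches(xml_text, path)
    if not nodes:
        return None  # plays the role of SQL NULL
    return sql_type(nodes[0].text or "")


def exists(xml_text: str, path: str) -> bool:
    """Test whether the path selects at least one node."""
    return bool(_matches(xml_text, path))


def query(xml_text: str, path: str) -> str:
    """Return the selected nodes as a serialized XML fragment."""
    parts = []
    for node in _matches(xml_text, path):
        clone = copy.copy(node)
        clone.tail = None  # do not drag following text into the result
        parts.append(ET.tostring(clone, encoding="unicode"))
    return "".join(parts)


doc = "<Customer id='C1'><name>Janine Smith</name><Order id='P2'/><Order id='P3'/></Customer>"
print(value(doc, "name", str))
print(exists(doc, "Order[@id='P3']"))
print(query(doc, "Order"))
```

Note that `query` returns a forest of two `Order` elements, which is why the XML datatype must allow multiple top-level element nodes.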

XQuery Functionality in SQL
Compiling the XQuery expressions:
– The XQuery expressions could be given dynamically or as constant strings. If the XQuery expression is given as a constant string, then relational systems can compile the XQuery at the same time as the SQL statement.

Augmenting the XQuery Static Context
– The XQuery default collation is implicitly set to the relational collation of the XML datatype.
– The relational database system's built-in functions are added to the static function context, as are the SQL constructors for the built-in types.
– The SQL built-in types are added to the static type context, as are the types of the schema components in the case of a statically constrained XML datatype.

Providing Access to SQL Data inside XQuery
Sometimes an XQuery expression needs to access data from the relational realm. One way to access SQL variables in XQuery is to map the variables into the XQuery static variable context. Since SQL allows more characters in names than XML does, the names must be encoded. This may be done using the variable() built-in pseudo-function. Accessing an SQL column may be done using the column() built-in pseudo-function.
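
The name-encoding problem can be illustrated with a small sketch. It assumes the `_xHHHH_` escaping convention commonly associated with SQL/XML identifier mapping; the slides do not say which encoding is used, and the function name is hypothetical. As a simplification, only ASCII letters, digits, `_`, `-` and `.` are passed through, so non-ASCII characters that XML would actually allow in names are escaped as well.

```python
def _is_plain_name_char(ch: str, first: bool) -> bool:
    """Simplified XML name test restricted to ASCII (an assumption)."""
    if ch.isascii() and (ch.isalpha() or ch == "_"):
        return True
    if not first and ch.isascii() and (ch.isdigit() or ch in "-."):
        return True
    return False


def sql_identifier_to_xml_name(identifier: str) -> str:
    """Encode an SQL identifier so that it is usable as an XML name."""
    out = []
    for i, ch in enumerate(identifier):
        followed_by_x = identifier[i + 1:i + 2] == "x"
        if ch == "_" and followed_by_x:
            # escape a literal "_x" so it cannot be confused with an escape
            out.append("_x005F_")
        elif _is_plain_name_char(ch, first=(i == 0)):
            out.append(ch)
        else:
            out.append(f"_x{ord(ch):04X}_")
    return "".join(out)


for name in ("OrderForm", "Order Form", "2nd", "@total", "a_x"):
    print(name, "->", sql_identifier_to_xml_name(name))
```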

Mapping SQL Types to XML Schema Types
To import relational values into the XQuery context, the SQL types must be mapped to XML Schema types.
Example: Relational type INTEGER maps to:
Since many relational systems provide additional built-in SQL types, they provide mappings that describe the implementation types.

Mapping SQL Types to XML Schema Types
The user must only provide the mapping for implementation-specific namespaces.
Problems:
– Mapping of strings: one XML Schema type for each length.
– Certain SQL character values are invalid in XML (most of the low-range ASCII control characters).
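
The second problem above can be checked mechanically. The sketch below tests string values against the `Char` production of XML 1.0, which permits tab, line feed and carriage return but no other character below U+0020; the helper names are hypothetical.

```python
def is_xml_char(ch: str) -> bool:
    """True if the character is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def invalid_xml_chars(value: str) -> list:
    """List (position, code point) pairs that cannot appear in XML 1.0."""
    return [
        (i, f"U+{ord(ch):04X}") for i, ch in enumerate(value) if not is_xml_char(ch)
    ]


print(invalid_xml_chars("ok\ttab"))
print(invalid_xml_chars("bell\x07"))
```

A relational string containing such characters cannot be imported into the XQuery context unchanged, so a system has to reject, strip or encode them.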

Adding XQuery Function Libraries
Importing externally defined XQuery function libraries is similar to the XML Schema import model. A relational system can store XQuery function libraries by extending the metadata so that they can be stored under their namespace URI or an SQL identifier. Such libraries are stored once and then loaded and imported into the XQuery static context when referred to.

CREATE XML FUNCTION NAMESPACE N'module "
declare namespace myf="
define function myf:in-King-Country-WA ($zip as xs:integer) as xs:boolean {$zip = 90110}
define function myf:King-Country-WA-salestax($x as xs:decimal) as xs:decimal {$x * 0.88}
SELECT CustomerName, query(PurchaseOrder, 'import module namespace myf="
  declare namespace po = "
  for $p in /po:purchase-order, $d in $p/po:details
  let $net as xs:decimal := *
  where myf:in-King-Country-WA($p/po:shipTo/po:zip)
  return <po-detail total="{$net + myf:King-Country-WA-salestax($net)}"/>') as po-detail-price
FROM Customers

Physical mapping of XQuery
A general discussion of the different strategies for mapping XQuery into physical execution plans. We assume that the XQuery expressions are provided as constants, so they can be compiled at the same time as the SQL expressions.

Physical mapping of XQuery
There are two major methods for compiling and executing XQuery expressions:
– Decoupled approach: works with a standalone XQuery engine that is added to the relational system. The XQuery expression is passed to the XQuery processor, where it is processed, and the result is returned to the SQL environment.

Physical mapping of XQuery
– Integrated approach: maps the XQuery expressions into logical operator trees that are integrated with the logical operator tree of the SQL statement.
Complications:
– XQuery has a more complex, nested and sequence-based data model.
– Types and their operations do not map one to one, so relational systems must extend their expression services to deal with this.
– Input order must be preserved during execution.
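
The decoupled approach can be imitated with SQLite's user-defined functions: the SQL engine treats the stored XML as an opaque string and hands it, with the query string, to an external processor for each row. This is only an analogy under stated assumptions: `xml_exists` is a hypothetical function, the ElementTree XPath subset stands in for an XQuery engine, and the table contents are made up.

```python
import sqlite3
import xml.etree.ElementTree as ET


def xml_exists(xml_text: str, path: str) -> int:
    """External 'co-processor': parse the XML and evaluate the path."""
    return int(ET.fromstring(xml_text).find(path) is not None)


conn = sqlite3.connect(":memory:")
# Register the external processor so that SQL statements can call it.
conn.create_function("xml_exists", 2, xml_exists)

conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, PurchaseOrders TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        (1, "<purchase-order><shipTo><zip>98052</zip></shipTo></purchase-order>"),
        (2, "<purchase-order><billTo/></purchase-order>"),
    ],
)

rows = conn.execute(
    "SELECT CustomerID FROM Customers WHERE xml_exists(PurchaseOrders, 'shipTo/zip')"
).fetchall()
print(rows)
```

The SQL optimizer sees `xml_exists` as a black box and must parse every document, which is the kind of cost the integrated approach tries to avoid by compiling both languages into one operator tree.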

Physical mapping of XQuery
In the integrated approach, the cost-based optimizer has to choose execution plans that preserve the XQuery semantics. The optimizer will often choose a bottom-up evaluation strategy, while the naïve execution strategy of XQuery is described top-down. That may cause dynamic errors that would have been avoided with the top-down strategy.

Example of the previous situation:
for $i in //A where castable as xs:integer return for $j in $i/B where return $j
Optimization:
for $i in //A, $j in $i/B where castable as xs:integer and return $j
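
The hazard can be reproduced outside XQuery. The sketch below uses made-up predicates as stand-ins, since the expressions in the slide example are incomplete: a "castable" guard (`str.isdigit`) and a comparison that casts (`int(v) > 10`). Evaluating the guard first is safe; a plan that evaluates the casting predicate first raises a dynamic error on data the guard would have filtered out.

```python
def top_down(values: list) -> list:
    """Guard first, cast second: mirrors the nested, top-down reading."""
    return [v for v in values if v.isdigit() if int(v) > 10]


def reordered(values: list) -> list:
    """A reordered conjunction: the cast runs before the guard."""
    return [v for v in values if int(v) > 10 and v.isdigit()]


data = ["42", "abc", "7"]
print(top_down(data))
try:
    reordered(data)
except ValueError as err:
    print("dynamic error:", err)
```

An optimizer that merges nested where clauses into one conjunction therefore has to keep guard predicates ahead of the predicates they protect, or be prepared to suppress such errors.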

Issues of combining SQL, XML Datatype and XQuery
Users need to know two languages, SQL and XQuery:
– XQuery users interested in querying across multiple documents need to use SQL to iterate over the collection of documents.
– SQL users interested in querying into XML documents need to use XQuery.

Issues of combining SQL, XML Datatype and XQuery
Solutions, more problems and more solutions:
– The second problem can be avoided by shredding, but this mapping approach often provides only relational fidelity and thus does not address order preservation and markup scenarios.
– Thus either the SQL model should be extended, or the SQL model will be subsumed under the XML and XQuery model.

Top-Level XQuery
Top-level XQuery provides a way to query collections of XML documents and forests without the need to use SQL. One advantage is the ability to provide concurrency and locking. A mid-tier model can also do this, but if an XML datatype already exists, it seems natural to extend the programming model with top-level XQuery support: XQUERY {XQuery expression}.

XML Document (Fragment) Collections
There are two ways to provide XML collections:
– An XQuery function collection(), used to refer to a column that is typed as XML, or a function with the same role provided by the database.
– Extending the notion of a table to allow the creation of a table of the XML datatype instead of a row type.

CREATE TABLE PurchaseOrders OF TYPE XML TYPED AS POSchema
INSERT INTO PurchaseOrders SELECT PurchaseOrders FROM Customers
XQUERY-MODIFY { Insert 42 2nd Avenue Bellevue WA at last into sql:collection('PurchaseOrders') }
XQUERY { declare namespace po = sql:collection('PurchaseOrders')/po:purchase-order/po:details }

XML Views over Relational Data
There are two main mapping approaches from relational to XML:
– Table-to-doc: maps every table to an XML document, where the root's name is the table's name and every row is mapped to an element named "row".
– Table-to-forest: maps a table into an XML element forest.
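
The two mappings can be sketched as follows. The table-to-doc function follows the description above; for table-to-forest the sketch assumes that each row becomes one top-level element named after the table, which the slides do not spell out. Function names and sample rows are hypothetical.

```python
import xml.etree.ElementTree as ET


def _fill_row(row_elem, columns, row):
    """Add one child element per column value."""
    for column, cell in zip(columns, row):
        ET.SubElement(row_elem, column).text = str(cell)


def table_to_doc(table_name: str, columns: list, rows: list) -> str:
    """One document: root named after the table, one 'row' element per row."""
    root = ET.Element(table_name)
    for row in rows:
        _fill_row(ET.SubElement(root, "row"), columns, row)
    return ET.tostring(root, encoding="unicode")


def table_to_forest(table_name: str, columns: list, rows: list) -> str:
    """A forest: one top-level element per row (assumed naming, see above)."""
    parts = []
    for row in rows:
        row_elem = ET.Element(table_name)
        _fill_row(row_elem, columns, row)
        parts.append(ET.tostring(row_elem, encoding="unicode"))
    return "".join(parts)


columns = ["id", "city"]
rows = [(1, "Seattle"), (2, "Bellevue")]
print(table_to_doc("Order", columns, rows))
print(table_to_forest("Order", columns, rows))
```

The forest has several top-level elements, so it is a valid XML datatype value but cannot be parsed as a single well-formed document without adding a wrapper.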

XML Views over Relational Data
The top-level XQuery environment can access these views through doc(), collection() or other built-in functions. Any of these views can be represented as a single XML document by making the top-level element(s) the children of a document node. The result may not necessarily be well formed.

XML Views over Relational Data
Example of a possible built-in XQuery function provided by top-level XQuery:
sql:table($table as xs:string [, $table_map_option as xs:string [, $null_map_option as xs:string]]) as element*
– Use: sql:table('Order', 'table-to-forest', 'xsinil')

Conclusion and Issues
Overview of how to integrate relational database systems with XQuery:
– The XML datatype
– Mixed approach of using XQuery and SQL
– Top-level XQuery approach
– Insight into the impact of the actual physical data model of the XML datatype on the processing of XQuery in the context of relational systems

Open Issues
– Which of the physical mappings of the XML data best supports XML inside a relational context?
– How should the query-processing model be extended?
– Concurrency control

THE END!