Using XML in SQL Server and Azure SQL Database Pete Harris | Senior Content Developer, Microsoft Graeme Malcolm | Senior Content Developer, Microsoft
Meet Your Instructors
Pete Harris | @peteatmsft
  Senior Content Developer
  Various roles at Microsoft since 1995
Graeme Malcolm | @graeme_malcolm
  Senior Content Developer at Microsoft
  Consultant, trainer, and author since SQL Server 4.2
Course Topics Using XML in SQL Server and Azure SQL Database 01 | Introduction to XML Data 02 | Storing and Querying XML 03 | Implementing XML Indexes 04 | Working with Typed XML 05 | Creating Relational Data from XML 06 | Creating XML from Relational Data
Setting Expectations Target Audience SQL Server database developers Anyone pursuing Microsoft certification in SQL Server Suggested Prerequisites/Supporting Material Querying with Transact-SQL MVA course
01 | Introduction to XML Data
Module Overview What is XML? Representing Data with XML XML Namespaces XML Support in SQL Server and Azure SQL Database
What is XML?
Based on SGML
Describes data structures and values
Commonly used by applications for:
  Data interchange
  Configuration
  Data storage
Case-sensitive

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order id="123456" date="2015-01-01">
      <salesperson id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer id="921">
        <name>Dan Drayton</name>
      </customer>
      <items>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>
Representing Data with XML
Basic XML Structure

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order id="123456" date="2015-01-01">
      <salesperson id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer id="921">
        <name>Dan Drayton</name>
      </customer>
      <!-- an order may contain multiple items -->
      <items>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>

Labeled parts:
  processing instruction: <?xml version="1.0" encoding="ISO-8859-1"?>
  element: <salesperson> … </salesperson>
  attribute (order is not enforced): id="123456"
  text: Naomi Sharp
  comment: <!-- an order may contain multiple items -->
  shorthand syntax for closing empty elements: <item id="561" quantity="1"/>
An XML Document
Representing Data with XML
Node Trees
[Diagram: node tree rooted at /, with nodes order, id, date, salesperson, customer, name, items, item, quantity]

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order id="123456" date="2015-01-01">
      <salesperson id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer id="921">
        <name>Dan Drayton</name>
      </customer>
      <!-- an order may contain multiple items -->
      <items>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>
Representing Data with XML
Documents vs Fragments
Documents are well-formed
  They have a single root element

    <employee id="123">
      <first-name>Naomi</first-name>
      <last-name>Sharp</last-name>
    </employee>

Fragments have no root element
  But all elements in the fragment are well-formed

    <product id="561" price="1013.79">Mountain Bike</product>
    <product id="127" price="9.59">Water Bottle</product>
XML Namespaces

Without namespaces:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order id="123456" date="2015-01-01">
      <salesperson id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer id="921">
        <name>Dan Drayton</name>
      </customer>
      <!-- an order may contain multiple items -->
      <items>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>

With default namespaces:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order xmlns="http://aw/order" id="123456" date="2015-01-01">
      <salesperson xmlns="http://aw/sales" id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer xmlns="http://aw/customer" id="921">
        <name>Dan Drayton</name>
      </customer>
      <!-- an order may contain multiple items -->
      <items>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>

With namespace prefixes:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <o:order xmlns:o="http://aw/order"
             xmlns:s="http://aw/sales"
             xmlns:c="http://aw/customer"
             xmlns:p="http://aw/product"
             o:id="123456" o:date="2015-01-01">
      <s:salesperson s:id="123">
        <s:name>Naomi Sharp</s:name>
      </s:salesperson>
      <c:customer c:id="921">
        <c:name>Dan Drayton</c:name>
      </c:customer>
      <!-- an order may contain multiple items -->
      <o:items>
        <o:item p:id="561" o:quantity="1"/>
        <o:item p:id="127" o:quantity="2"/>
      </o:items>
    </o:order>
XML Support in SQL Server and Azure SQL Database

    Feature            Description
    xml data type      Store XML in variables and columns
    XQuery             Node-tree query language for XML
    XML Indexes        Optimize XQuery performance
    XSD Schemas        Enforce type validation for XML data
    FOR XML clause     Generate XML from relational data
    OPENXML function   Generate relational data from XML
XML Support
Scenarios for XML in a Database
Why work with XML data in SQL Server and Azure SQL Database?
  You need to share, query, and modify XML in an efficient and transacted way
  You have both relational and XML data and need interoperability between them
  You need to build cross-domain applications and need portability of data
  Your data is sparse or you do not know the structure of your data
  You want the server to guarantee that the XML is well-formed and optionally validate your data against a schema
  You want to index your XML data for high query performance
  You want to store and query relational data, but use XML as an interchange format
Introduction to XML Data What is XML? Representing Data with XML XML Namespaces XML Support in SQL Server and Azure SQL Database
02 | Storing and Querying XML
Module Overview The xml Data Type XQuery Querying the xml Data Type Modifying the xml Data Type
The xml Data Type Native data type for XML Enables you to store XML documents and fragments Used for columns, variables, or parameters Stores XML in an internal form When queried, semantically equivalent XML is returned
The xml Data Type
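A minimal sketch of what storing XML in a column and a variable might look like. The table name dbo.OrderDocs and its columns are hypothetical, chosen only for illustration; the XML declaration is left out so that the Unicode literal does not conflict with a declared encoding.

```sql
-- Hypothetical table: one xml column alongside a relational key.
CREATE TABLE dbo.OrderDocs
(
    OrderID  int IDENTITY(1,1) PRIMARY KEY,
    OrderXml xml NOT NULL
);

-- An xml variable holding a well-formed document (single root element).
DECLARE @x xml = N'
<order id="123456" date="2015-01-01">
  <customer id="921"><name>Dan Drayton</name></customer>
  <items>
    <item id="561" quantity="1"/>
    <item id="127" quantity="2"/>
  </items>
</order>';

INSERT INTO dbo.OrderDocs (OrderXml) VALUES (@x);

-- An untyped xml column also accepts a fragment (no single root element).
INSERT INTO dbo.OrderDocs (OrderXml)
VALUES (N'<product id="561">Mountain Bike</product><product id="127">Water Bottle</product>');

-- The XML comes back in a semantically equivalent form,
-- not necessarily character-for-character as it was inserted.
SELECT OrderID, OrderXml FROM dbo.OrderDocs;
```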
XQuery
Basic Syntax
Query syntax for XML
Based on XPath expressions
[Diagram: node tree rooted at /, with nodes order, id, date, salesperson, customer, name, items, item, quantity]

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order id="123456" date="2015-01-01">
      <salesperson id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer id="921">
        <name>Dan Drayton</name>
      </customer>
      <!-- an order may contain multiple items -->
      <items>
        <item id="268" quantity="2"/>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>

Example path expressions:

    /order/salesperson/name
    /order//name
    /order/customer/@id
    /order/items/item[1]
    /order/items/item[@quantity > 1]
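The path expressions above can be tried against an xml variable. This is a sketch only: the variable name @x is an assumption, and the methods used here (query and value) are the xml data type methods covered later in this module.

```sql
DECLARE @x xml = N'
<order id="123456" date="2015-01-01">
  <salesperson id="123"><name>Naomi Sharp</name></salesperson>
  <customer id="921"><name>Dan Drayton</name></customer>
  <items>
    <item id="268" quantity="2"/>
    <item id="561" quantity="1"/>
    <item id="127" quantity="2"/>
  </items>
</order>';

-- Each expression that selects elements can be returned as XML.
SELECT @x.query('/order/salesperson/name')          AS SalespersonName,
       @x.query('/order//name')                     AS AllNames,
       @x.query('/order/items/item[1]')             AS FirstItem,
       @x.query('/order/items/item[@quantity > 1]') AS ItemsWithQuantityOverOne;

-- An attribute node cannot be returned on its own as XML,
-- so /order/customer/@id is read as a scalar value instead.
SELECT @x.value('(/order/customer/@id)[1]', 'int') AS CustomerID;
```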
XQuery
FLWOR Queries
For, (Let), Where, Order By, Return

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <order id="123456" date="2015-01-01">
      <salesperson id="123">
        <name>Naomi Sharp</name>
      </salesperson>
      <customer id="921">
        <name>Dan Drayton</name>
      </customer>
      <!-- an order may contain multiple items -->
      <items>
        <item id="268" quantity="2"/>
        <item id="561" quantity="1"/>
        <item id="127" quantity="2"/>
      </items>
    </order>

Query:

    for $i in /order/items/item
    where $i/@quantity > 1
    order by $i/@id
    return $i

Result:

    <item id="127" quantity="2"/>
    <item id="268" quantity="2"/>
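As a sketch, the FLWOR query from this slide could be run through the query method of an xml variable. The variable name @x is an assumption for illustration.

```sql
DECLARE @x xml = N'
<order id="123456" date="2015-01-01">
  <items>
    <item id="268" quantity="2"/>
    <item id="561" quantity="1"/>
    <item id="127" quantity="2"/>
  </items>
</order>';

-- for:      iterate over each item element
-- where:    keep items whose quantity is greater than 1
-- order by: sort the remaining items by their id attribute
-- return:   emit the item element itself
SELECT @x.query('
    for $i in /order/items/item
    where $i/@quantity > 1
    order by $i/@id
    return $i
') AS MatchingItems;
```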
Querying the xml Data Type
The xml data type exposes methods that support XQuery expressions:
  query (xquery): Returns XML node(s)
  value (xquery, datatype): Returns a scalar value
  exist (xquery): Returns 1 (True) or 0 (False) to indicate whether the specified node exists
  nodes (xquery) AS table (column): Returns a rowset of XML nodes
    Often used with CROSS APPLY
Querying the xml Data Type
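A short sketch of the four query methods side by side. The variable @x and the table variable @Orders are hypothetical names used only for this illustration.

```sql
DECLARE @x xml = N'
<order id="123456" date="2015-01-01">
  <customer id="921"><name>Dan Drayton</name></customer>
  <items>
    <item id="561" quantity="1"/>
    <item id="127" quantity="2"/>
  </items>
</order>';

-- query returns XML, value returns a scalar, exist returns 1 or 0.
SELECT @x.query('/order/items')                               AS ItemsXml,
       @x.value('(/order/customer/name)[1]', 'nvarchar(25)')  AS Customer,
       @x.exist('/order/items/item[@id = 561]')               AS HasItem561;

-- nodes returns one row per matching node; value reads from each row's node.
SELECT i.n.value('@id', 'int')       AS ProductID,
       i.n.value('@quantity', 'int') AS Quantity
FROM @x.nodes('/order/items/item') AS i(n);

-- With XML stored in a table, nodes is typically combined with CROSS APPLY.
DECLARE @Orders TABLE (OrderXml xml);
INSERT INTO @Orders (OrderXml) VALUES (@x);

SELECT o.OrderXml.value('(/order/@id)[1]', 'int') AS OrderID,
       i.n.value('@id', 'int')                    AS ProductID
FROM @Orders AS o
CROSS APPLY o.OrderXml.nodes('/order/items/item') AS i(n);
```

Note that value requires an expression that returns at most one node, which is why the paths are wrapped as (…)[1].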
Modifying the xml Data Type
Use the modify method:
  modify (insert … into xquery)
  modify (replace value of xquery with …)
  modify (delete xquery)
Note: When modifying XML in a table, the modify method is called in the SET clause of an UPDATE statement, regardless of the type of operation!
Modifying the xml Data Type
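A minimal sketch of the three modify operations, first on a variable and then on a table column. The names @x and @Orders are assumptions made for this example.

```sql
DECLARE @x xml = N'
<order id="123456">
  <items>
    <item id="561" quantity="1"/>
  </items>
</order>';

-- insert: add a new item element under the (single) items element.
SET @x.modify('insert <item id="127" quantity="2"/> into (/order/items)[1]');

-- replace value of: change one attribute value.
SET @x.modify('replace value of (/order/items/item[@id = 561]/@quantity)[1] with "3"');

-- delete: remove matching nodes.
SET @x.modify('delete /order/items/item[@id = 127]');

SELECT @x AS ModifiedOrder;

-- In a table, every kind of modification goes through UPDATE ... SET.
DECLARE @Orders TABLE (OrderID int PRIMARY KEY, OrderXml xml);
INSERT INTO @Orders (OrderID, OrderXml) VALUES (1, @x);

UPDATE @Orders
SET OrderXml.modify('insert <item id="268" quantity="2"/> into (/order/items)[1]')
WHERE OrderID = 1;

SELECT OrderID, OrderXml FROM @Orders;
```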
Storing and Querying XML The xml Data Type XQuery Querying the xml Data Type Modifying the xml Data Type
03 | Implementing XML Indexes
Module Overview Introduction to XML Indexes Primary XML Index Secondary XML Indexes
Introduction to XML Indexes
XML data can be slow to access
  Node tree is created for each query
XML indexes can help with query performance
  Pre-defined node tree
Indexes contain details of:
  Nodes
  Values
  Paths
[Diagram: node tree rooted at /, with nodes order, id, date, salesperson, customer, name, items, item, quantity]

Indexes are used in SQL Server to improve the performance of queries. XML indexes are used to improve the performance of XQuery-based queries.

Many systems query XML data directly as text. This can be very slow, particularly if the XML data is large. You saw earlier that XML data is not stored directly in a text format in SQL Server. For ease of querying, it is broken into a form of object tree that is easier to navigate in memory.

Building these object trees on demand for each query is also a relatively slow process. Instead, you can define XML indexes. An XML index is rather like a copy of an XML object tree that is saved into the database for rapid reuse.

XML indexes can be quite large compared to the underlying XML data. Relational indexes are often much smaller than the tables on which they are built, but it is not uncommon to see XML indexes that are larger than the underlying data.

You should also consider alternatives to XML indexes. Promoting a value that is stored within the XML to a persisted computed column makes it possible to use a standard relational index to locate the value quickly.
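The property-promotion alternative mentioned in the notes might look like the following sketch. The table, function, and index names are all hypothetical. xml methods cannot be called directly in a computed column definition, so the value is extracted through a schema-bound scalar function.

```sql
-- Hypothetical table holding order documents.
CREATE TABLE dbo.OrderDocsPromoted
(
    OrderID  int NOT NULL PRIMARY KEY,
    OrderXml xml NOT NULL
);
GO

-- Schema-bound function that extracts the customer id from an order document.
CREATE FUNCTION dbo.ufn_OrderCustomerId (@x xml)
RETURNS int
WITH SCHEMABINDING
AS
BEGIN
    RETURN @x.value('(/order/customer/@id)[1]', 'int');
END;
GO

-- Promote the value to a persisted computed column...
ALTER TABLE dbo.OrderDocsPromoted
    ADD CustomerID AS dbo.ufn_OrderCustomerId(OrderXml) PERSISTED;
GO

-- ...and index it with an ordinary relational index.
CREATE INDEX IX_OrderDocsPromoted_CustomerID
    ON dbo.OrderDocsPromoted (CustomerID);
GO

-- Queries can now filter on the relational column instead of the XML.
SELECT OrderID FROM dbo.OrderDocsPromoted WHERE CustomerID = 921;
```

This approach suits a small number of frequently searched single values; it does not help with queries over repeating elements.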
Primary XML Index
Provides a persisted node tree in an internal format that is used to speed access to elements and attributes within the XML
Requires a clustered primary key on the table

    CREATE PRIMARY XML INDEX XML_Order_Items
    ON Sales.[Order] (Items);

The primary XML index provides a persisted object tree in an internal format. The tree is formed from the structure of the XML, is used to speed up access to elements and attributes within the XML, and avoids the need to read the entire XML document for every query.

Before you can create a primary XML index on a table, the table must have a clustered primary key. Because ORDER is a reserved keyword, a table named Order must be delimited as [Order].
Secondary XML Indexes
Can only be created after a primary XML index has been created
You can create three types of secondary indexes to help resolve specific XQuery expressions rapidly:
  PATH: Speeds up path expressions; typically used by the exist method
  VALUE: Optimized for value-based lookups, especially when the path is not fully specified
  PROPERTY: Optimized for retrieving one or more values from individual XML instances with the value method

    CREATE XML INDEX XML_Order_Items_Property
    ON Sales.[Order] (Items)
    USING XML INDEX XML_Order_Items
    FOR PROPERTY;

Most of the querying benefit comes from primary XML indexes, but SQL Server also enables the creation of three types of secondary XML index. Each is designed to speed up a particular type of query:

  A PATH index helps to decide whether a particular path to an element or attribute is valid. It is typically used with the exist() method.
  A VALUE index helps to locate an element or attribute by its value.
  A PROPERTY index is used when retrieving multiple values from the same XML instance through path expressions.

You can only create a secondary XML index after a primary XML index has been established. When you create the secondary XML index, you need to reference the primary XML index.
Using XML Indexes
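An end-to-end sketch of this demo might look like the following. The table dbo.OrderIndexDemo and the index names are hypothetical; they stand in for whatever table holds the XML column.

```sql
-- The table needs a clustered primary key before a primary XML index can be created.
CREATE TABLE dbo.OrderIndexDemo
(
    OrderID int NOT NULL
        CONSTRAINT PK_OrderIndexDemo PRIMARY KEY CLUSTERED,
    Items   xml NOT NULL
);

INSERT INTO dbo.OrderIndexDemo (OrderID, Items)
VALUES (1, N'<items><item id="561" quantity="1"/><item id="127" quantity="2"/></items>'),
       (2, N'<items><item id="268" quantity="2"/></items>');

-- Primary XML index: the persisted node tree.
CREATE PRIMARY XML INDEX PXML_OrderIndexDemo_Items
    ON dbo.OrderIndexDemo (Items);

-- Secondary PATH index, built on top of the primary XML index.
CREATE XML INDEX SXML_OrderIndexDemo_Items_Path
    ON dbo.OrderIndexDemo (Items)
    USING XML INDEX PXML_OrderIndexDemo_Items
    FOR PATH;

-- A query shape that a PATH index is intended to help: exist in the WHERE clause.
SELECT OrderID
FROM dbo.OrderIndexDemo
WHERE Items.exist('/items/item[@id = 561]') = 1;
```

Whether the optimizer actually uses a given secondary index depends on the data and the query, so it is worth checking the execution plan.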
Implementing XML Indexes Introduction to XML Indexes Primary XML Index Secondary XML Indexes
04 | Working with Typed XML
Module Overview XML Schemas XML Schema Collections Untyped XML vs Typed XML Storing Fragments or Documents
XML Schema XML schema describes the structure of an XML document XML schema language is also called XML Schema Definition (XSD) An XML schema provides the following: Validation constraints Data type information Information about types of attributes and elements
Validating XML with an XML Schema
XML Schema Collections You can optionally associate XSD schemas with an xml data type through an XML schema collection The XML schema collection: Stores the imported XML schemas Validates xml data type instances Types the XML data as it is stored in the database
Untyped XML vs. Typed XML Use the untyped xml data type when: You do not have a schema for your XML data You have schemas, but don’t want the server to validate data (there is significant impact on a server performing validation) Use the typed xml data type when: You have schemas and want the server to validate XML data You want to take advantage of storage and query optimizations based on type information You want to take advantage of type information during compilation of your queries
Storing Fragments or Documents The xml data type stores content by default (including fragments) You can specify the DOCUMENT keyword to prevent storage of fragments
Using XML Schema Collections
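A minimal sketch of typing XML with a schema collection. The collection name dbo.ProductSchemaDemo and the schema itself are assumptions written for this example; the schema describes the product elements shown earlier in the course.

```sql
-- Hypothetical schema collection: a product element with text content,
-- a required integer id attribute, and an optional decimal price attribute.
CREATE XML SCHEMA COLLECTION dbo.ProductSchemaDemo AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="product">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="id" type="xs:int" use="required"/>
            <xs:attribute name="price" type="xs:decimal"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
  </xs:schema>';
GO

-- CONTENT (the default) allows fragments with several top-level elements.
DECLARE @content xml(CONTENT dbo.ProductSchemaDemo) =
    N'<product id="561" price="1013.79">Mountain Bike</product>
      <product id="127" price="9.59">Water Bottle</product>';

-- DOCUMENT restricts the value to a single top-level element.
DECLARE @doc xml(DOCUMENT dbo.ProductSchemaDemo) =
    N'<product id="561" price="1013.79">Mountain Bike</product>';

SELECT @content AS TypedContent, @doc AS TypedDocument;

-- The following assignment would be rejected by validation,
-- because "abc" is not a valid xs:int:
-- DECLARE @bad xml(dbo.ProductSchemaDemo) = N'<product id="abc">Helmet</product>';
```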
Working with Typed XML XML Schemas XML Schema Collections Untyped XML vs Typed XML Storing Fragments or Documents
05 | Creating Relational Data from XML
Module Overview Options for Shredding XML The nodes Method sp_xml_preparedocument and sp_xml_removedocument OPENXML OPENXML with Namespaces XML Metadata Properties
Options for Shredding XML
The nodes method of the xml data type

    <?xml version="1.0" ?>
    <order id="123456"> … </order>

    SELECT t.n.value(…) FROM @x.nodes(…) AS t(n)

The OPENXML statement

    <?xml version="1.0" ?>
    <order id="123456"> … </order>

    sp_xml_preparedocument
    SELECT * FROM OPENXML
    sp_xml_removedocument
The nodes Method
Use nodes in the FROM clause to specify a context node
  Creates a row for each instance of that node
Use value in the SELECT clause to extract values from relative nodes (using XPath expressions)

    SELECT OrderTable.OrderXml.value('./@id', 'int') AS OrderID,                        -- relative path to value node
           OrderTable.OrderXml.value('./@date', 'date') AS OrderDate,
           OrderTable.OrderXml.value('./customer[1]/name[1]', 'nvarchar(25)') AS Customer
    FROM @x.nodes('/order') AS OrderTable(OrderXml);                                    -- context node
Shredding XML with the nodes Method
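A self-contained sketch of this demo, using the query from the previous slide. The sample XML wraps two orders in a hypothetical orders root element so that the rowset has more than one row.

```sql
DECLARE @x xml = N'
<orders>
  <order id="123456" date="2015-01-01">
    <customer id="921"><name>Dan Drayton</name></customer>
  </order>
  <order id="123457" date="2015-01-02">
    <customer id="123"><name>Naomi Sharp</name></customer>
  </order>
</orders>';

-- nodes produces one row per order element (the context node);
-- value reads attributes and child elements relative to that node.
SELECT OrderTable.OrderXml.value('./@id', 'int')                          AS OrderID,
       OrderTable.OrderXml.value('./@date', 'date')                       AS OrderDate,
       OrderTable.OrderXml.value('./customer[1]/name[1]', 'nvarchar(25)') AS Customer
FROM @x.nodes('/orders/order') AS OrderTable(OrderXml);
```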
sp_xml_preparedocument and sp_xml_removedocument
Use sp_xml_preparedocument to create a node tree
Use sp_xml_removedocument to release memory

    DECLARE @docHandle int;
    EXEC sp_xml_preparedocument @docHandle OUTPUT, @xml;

    EXEC sp_xml_removedocument @docHandle;
OPENXML
Specify document handle, row pattern to context node, and flags to indicate default centricity
Specify rowset schema and explicit column patterns in the WITH clause

    SELECT * FROM
    OPENXML(@docHandle, 'order', 1)              -- handle returned by sp_xml_preparedocument, context node, flags
    WITH (id int,
          date date,
          customerid varchar(25) 'customer/name');  -- explicit path to non-default node

Flags:
  0: default (attribute-centric)
  1: attribute-centric
  2: element-centric
  3 (1+2): attributes and elements

Note that the rowset schema in the WITH clause can be a table name if the flags enable all nodes to be located without an explicit column pattern.
Using OPENXML
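Putting the previous two slides together, a sketch of the full OPENXML pattern could look like this. The variable names and the absolute row pattern /order are choices made for this example.

```sql
DECLARE @xml nvarchar(max) = N'
<order id="123456" date="2015-01-01">
  <customer id="921"><name>Dan Drayton</name></customer>
</order>';

DECLARE @docHandle int;

-- Parse the XML text into an in-memory node tree and get a handle to it.
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xml;

-- Flag 1 maps the id and date columns to attributes of the order element;
-- the customer column uses an explicit column pattern to reach a child element.
SELECT *
FROM OPENXML(@docHandle, '/order', 1)
WITH (id       int,
      [date]   date,
      customer varchar(25) 'customer/name');

-- Always release the node tree when finished, or its memory stays allocated.
EXEC sp_xml_removedocument @docHandle;
```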
OPENXML with Namespaces
Specify namespace and prefix in sp_xml_preparedocument
  Additional parameter with <root> tag
Prefix namespace in row and column patterns

    EXEC sp_xml_preparedocument @docHandle OUTPUT, @xml,
         '<root xmlns:awo="http://aw/order"/>';

    SELECT * FROM
    OPENXML(@docHandle, 'awo:order', 1)
    WITH (id int,
          date date,
          customer varchar(25) 'awo:customer/awo:name');
Using OPENXML with a Namespace
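A self-contained sketch of this demo. The sample document is an assumption: its elements are in the http://aw/order namespace while its attributes are unqualified. The document uses the prefix o and the query uses awo, which illustrates that prefixes are matched by namespace URI rather than by prefix text.

```sql
DECLARE @xml nvarchar(max) = N'
<o:order xmlns:o="http://aw/order" id="123456" date="2015-01-01">
  <o:customer id="921"><o:name>Dan Drayton</o:name></o:customer>
</o:order>';

DECLARE @docHandle int;

-- The third argument declares the prefixes that the row and column patterns may use.
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xml,
     '<root xmlns:awo="http://aw/order"/>';

SELECT *
FROM OPENXML(@docHandle, '/awo:order', 1)
WITH (id       int,
      [date]   date,
      customer varchar(25) 'awo:customer/awo:name');

EXEC sp_xml_removedocument @docHandle;
```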
XML Metadata Properties
Retrieving XML metadata properties
  Specify @mp:property_name as a column pattern
Creating overflow columns
  Use Flag + 8, retrieve non-mapped XML in @mp:xmltext
Creating an edge table
  Omit flags and WITH clause
Using Metadata Columns
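A sketch covering the three points on the previous slide. The column names NodeID, ParentName, and Overflow are hypothetical; the metaproperties used (@mp:id, @mp:parentlocalname, @mp:xmltext) are the ones OPENXML exposes.

```sql
DECLARE @xml nvarchar(max) = N'
<order id="123456" date="2015-01-01">
  <items>
    <item id="561" quantity="1"/>
    <item id="127" quantity="2"/>
  </items>
</order>';

DECLARE @docHandle int;
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xml;

-- Flag 9 = 1 (attribute-centric) + 8 (do not copy consumed data to the overflow).
-- Metaproperties are requested with @mp:... column patterns.
SELECT *
FROM OPENXML(@docHandle, '/order/items/item', 9)
WITH (id         int,
      NodeID     int           '@mp:id',
      ParentName varchar(40)   '@mp:parentlocalname',
      Overflow   nvarchar(max) '@mp:xmltext');

-- Omitting the flags and the WITH clause returns the edge table,
-- one row per node in the parsed tree.
SELECT *
FROM OPENXML(@docHandle, '/order');

EXEC sp_xml_removedocument @docHandle;
```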
Creating Relational Data from XML Options for Shredding XML The nodes Method sp_xml_preparedocument and sp_xml_removedocument OPENXML OPENXML with Namespaces XML Metadata Properties
06 | Creating XML from Relational Data
Module Overview The FOR XML Clause RAW Mode AUTO Mode EXPLICIT Mode PATH Mode Using Namespaces with FOR XML
The FOR XML Clause Extends SELECT syntax Returns XML instead of rows and columns Is configurable to return attributes, elements, and the schema
RAW Mode
Returns an XML representation of a rowset
Results can be element- or attribute-centric
Specify an optional root element and row element name

    SELECT c.CustomerID AS CustID, c.CompanyName,
           soh.SalesOrderID, soh.OrderDate, soh.TotalDue
    FROM SalesLT.Customer AS c
    INNER JOIN SalesLT.SalesOrderHeader AS soh
        ON c.CustomerID = soh.CustomerID
    ORDER BY c.CustomerID
    FOR XML RAW('Order'), ROOT('Orders'), ELEMENTS;
Using RAW Mode
AUTO Mode
Creates nested child elements for joined tables
Elements are named to match table (or alias)
Results can be element- or attribute-centric
Specify an optional root element

    SELECT [Order].SalesOrderID, [Order].OrderDate,
           LineItem.ProductID, LineItem.OrderQty
    FROM Sales.SalesOrderHeader AS [Order]
    INNER JOIN Sales.SalesOrderDetail AS LineItem
        ON LineItem.SalesOrderID = [Order].SalesOrderID
    ORDER BY [Order].SalesOrderID
    FOR XML AUTO, ROOT('Orders'), ELEMENTS;
Using AUTO Mode
EXPLICIT Mode
Enables tabular representation of XML documents
Enables complete control of the XML structure

    SELECT 1 AS Tag,
           NULL AS Parent,
           SalesOrderID AS [Invoice!1!InvoiceNo],
           OrderDate AS [Invoice!1!Date],
           CustomerID AS [Invoice!1!CustomerID!Element],
           TotalDue AS [Invoice!1!TotalDue!Element]
    FROM SalesLT.SalesOrderHeader
    WHERE SalesOrderID = 71774
    FOR XML EXPLICIT;
Using EXPLICIT Mode
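The single-level query on the previous slide can be extended to nested elements. The sketch below is an assumption about how such a demo might continue: it uses two hypothetical table variables and the usual UNION ALL pattern, where each SELECT supplies one Tag level and the ORDER BY places each parent row ahead of its children.

```sql
DECLARE @Orders TABLE (OrderID int, OrderDate date);
DECLARE @Items  TABLE (OrderID int, ProductID int, Qty int);

INSERT INTO @Orders VALUES (71774, '2015-01-01'), (71776, '2015-01-02');
INSERT INTO @Items  VALUES (71774, 561, 1), (71774, 127, 2), (71776, 268, 2);

-- Tag 1 rows become Invoice elements (top level, Parent is NULL).
SELECT 1           AS Tag,
       NULL        AS Parent,
       o.OrderID   AS [Invoice!1!InvoiceNo],
       o.OrderDate AS [Invoice!1!Date],
       NULL        AS [LineItem!2!ProductID],
       NULL        AS [LineItem!2!Quantity!Element]
FROM @Orders AS o
UNION ALL
-- Tag 2 rows become LineItem elements nested inside the Invoice (Parent = 1).
SELECT 2, 1, i.OrderID, NULL, i.ProductID, i.Qty
FROM @Items AS i
-- NULL sorts first, so each Invoice row precedes its LineItem rows.
ORDER BY [Invoice!1!InvoiceNo], [LineItem!2!ProductID]
FOR XML EXPLICIT;
```

The amount of bookkeeping needed here is the main reason PATH mode is usually preferred when it can express the same shape.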
PATH Mode
Uses XML Path Language (XPath) to specify XML format
Enables the creation of nested data and specifies what should be exposed as an element or an attribute
Easier to use than EXPLICIT mode

    SELECT o.SalesOrderID AS '@invoiceno',
           o.OrderDate AS '@date',
           o.CustomerID AS 'customer/@id',
           c.CompanyName AS 'customer',
           o.TotalDue AS 'totaldue'
    FROM SalesLT.SalesOrderHeader AS o
    JOIN SalesLT.Customer AS c
        ON o.CustomerID = c.CustomerID
    WHERE o.SalesOrderID = 71774
    FOR XML PATH('invoice');
Using PATH Mode
Using Namespaces with FOR XML
Supported only for RAW, AUTO, and PATH modes
Specify a namespace and prefix
Explicitly name elements and attributes with prefix

    WITH XMLNAMESPACES ('http://aw/order' AS ord)
    SELECT SalesOrderID AS 'ord:SalesOrderID',
           OrderDate AS 'ord:OrderDate',
           CustomerID AS 'ord:CustomerID',
           TotalDue AS 'ord:TotalDue'
    FROM SalesLT.SalesOrderHeader
    WHERE SalesOrderID = 71774
    FOR XML RAW('ord:Order'), ELEMENTS;
Using Namespaces with FOR XML
Creating XML from Relational Data The FOR XML Clause RAW Mode AUTO Mode EXPLICIT Mode PATH Mode Using Namespaces with FOR XML
Using XML in SQL Server and Azure SQL Database 01 | Introduction to XML Data 02 | Storing and Querying XML 03 | Implementing XML Indexes 04 | Working with Typed XML 05 | Creating Relational Data from XML 06 | Creating XML from Relational Data