XML? What’s this doing in my database? Adam Koehler, SQL Saturday #901, Kansas City, MO
THANK YOU, SPONSORS!
Surveys and late-breaking information
Please fill out the session and event surveys: https://www.sqlsaturday.com/901/Sessions/SessionEvaluation.aspx?sid=94467
About Me
Adam Koehler, Senior Database Administrator at ScriptPro, LLC
- I have worked with SQL Server since 2000, on versions 7.0 to 2017
- E-Mail: ajkoehl@gmail.com
- Twitter: @sql_geek
- LinkedIn: https://www.linkedin.com/in/adam-j-koehler
- Blog: https://sqlgeekery.wordpress.com
Topics
- What is XML?
- History of SQL Server Support
- Storing XML in the database
- Querying XML data
- Manipulating XML in the database
- Performance considerations
- Demos
- Q&A
XML, what is it?
- Stands for eXtensible Markup Language
- Created in 1996, approved by the W3C in 1998
- Designed to store and transport data
- Designed to be both human and machine readable
- Is defined by W3C standards
- Used by different systems for communication (HL7, RSS, Atom, .NET)
- Also used to describe data
Components of an XML document: Elements
- Individual parts of the XML
- Come in pairs (a start tag and a matching end tag)
- The root element starts an XML document
- Can contain text, attributes, other elements, or any combination of these
- Can contain multiple values
Components of an XML document: Attributes
- Can describe an element
- Are always quoted
- Cannot be a tree structure
- Can only contain a single value
XML Document Definition

Generic structure:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child></child>
  <child>
    <subchild>…</subchild>
  </child>
</root>

Example:

<?xml version="1.0" encoding="UTF-8"?>
<car make="Ford">
  <model>Taurus
    <year>2013</year>
    <Features>
      <Engine>V6</Engine>
      <Power>
        <Seats>Driver</Seats>
        <Windows>Yes</Windows>
      </Power>
    </Features>
    <MSRP>25935</MSRP>
  </model>
</car>
History of XML in SQL Server
- SQL 2000
  - Support for generating XML result sets (FOR XML) and parsing/manipulating XML strings (OPENXML)
- SQL 2005
  - Introduced the XML data type
  - XQuery support
  - XML DML support (Microsoft-specific)
  - Indexes on XML columns
- SQL 2012
  - UTF-16 support
  - Selective XML indexes
Storing XML in the database: Why?
- You need the original source version of data that is inserted into relational tables, for auditing purposes
- The source data structure changes rapidly; storing the XML as-is lets you keep it in the database without having to change the schema constantly
- It allows for a single point of storage: the database itself contains the XML data instead of individual files on disk, which decreases the number of files that have to be backed up
- You also get the benefits of SQL Server and transactions. Back up the database, and you have the data.
How to store XML in the database
Can use the following data types:
- (n)varchar(max)
- XML
- varbinary(max)
XML Data Type
- Stores the XML data in a way that preserves the XML document structure and order
- Can hold either untyped or typed XML
- Has a 2 GB size limit per stored instance
- Can be nested up to 128 levels
- Cannot be compared or sorted
- Can only be used with the ISNULL, COALESCE, and DATALENGTH scalar built-in functions
- Cannot be a key column in an index
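As a minimal sketch of the storage options above, the following creates a table with an untyped XML column and loads one document. The table name dbo.XML_Source_Data and its columns follow the names used in the query slides; the constraint name and the sample document are assumptions made for illustration.

```sql
-- Hypothetical table; column names follow the query examples in this deck.
CREATE TABLE dbo.XML_Source_Data
(
    Source_data_id INT NOT NULL
        CONSTRAINT PK_XML_Source_Data PRIMARY KEY CLUSTERED,
    XML_Column     XML NOT NULL   -- untyped XML: no XML schema collection is bound
);

-- The string is parsed on insert, so only well-formed XML is accepted.
INSERT INTO dbo.XML_Source_Data (Source_data_id, XML_Column)
VALUES (1, N'<car make="Ford"><model>Taurus<year>2013</year><MSRP>25935</MSRP></model></car>');

-- DATALENGTH is one of the few built-in functions that accepts the XML type.
SELECT Source_data_id,
       DATALENGTH(XML_Column) AS xml_bytes
FROM dbo.XML_Source_Data;
```

The clustered primary key is included because the XML index types discussed later require one.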
Querying XML data

Standard T-SQL statement:

SELECT Source_data_id, XML_Column
FROM XML_Source_Data
WHERE XML_Column LIKE '%string%'

XQuery:

SELECT Source_data_id
FROM XML_Source_Data
WHERE XML_Column.exist('//Invalid_date') = 1
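The two approaches above can be tried side by side with a small, self-contained sketch. It uses a table variable so nothing is left behind; the sample documents and the element names inside them are assumptions for illustration only.

```sql
-- Table variable standing in for the XML_Source_Data table (sample data is made up).
DECLARE @XML_Source_Data TABLE
(
    Source_data_id INT PRIMARY KEY,
    XML_Column     XML NOT NULL
);

INSERT INTO @XML_Source_Data (Source_data_id, XML_Column)
VALUES
    (1, N'<car make="Ford"><model>Taurus<year>2013</year></model></car>'),
    (2, N'<car make="Ford"><model>Pinto<year>1975</year><Invalid_date>2019-04-01</Invalid_date></model></car>');

-- exist(): returns 1 when the path matches anywhere in the document.
-- For the sample rows this returns Source_data_id 2.
SELECT Source_data_id
FROM @XML_Source_Data
WHERE XML_Column.exist('//Invalid_date') = 1;

-- value(): extracts one scalar; the [1] guarantees a single node.
SELECT Source_data_id,
       XML_Column.value('(/car/model/year)[1]', 'int') AS model_year
FROM @XML_Source_Data;

-- LIKE does not accept the XML type directly, so the column is cast to a string first.
SELECT Source_data_id
FROM @XML_Source_Data
WHERE CAST(XML_Column AS NVARCHAR(MAX)) LIKE N'%Taurus%';
```

The cast in the last query is only needed when the column uses the XML data type; with (n)varchar(max) storage, LIKE works as shown on the slide.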
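Manipulating XML Data
Can use XQuery (XML DML) to modify specific sections of an XML column:

UPDATE XML_Source_Data
SET XML_Column.modify('delete //Make[Ford]/Model[Pinto]')
WHERE Source_data_id = 123

As written, the predicates [Ford] and [Pinto] match Make and Model elements that have child elements with those names.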
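The sketch below shows the three XML DML operations (delete, insert, replace value of) against a table variable. The document shape, with name attributes on Make and Model, is an assumption chosen so the predicates are explicit; it is not the schema used in the session demos.

```sql
-- Table variable standing in for XML_Source_Data; the document layout is hypothetical.
DECLARE @XML_Source_Data TABLE
(
    Source_data_id INT PRIMARY KEY,
    XML_Column     XML NOT NULL
);

INSERT INTO @XML_Source_Data (Source_data_id, XML_Column)
VALUES (123, N'<cars><Make name="Ford"><Model name="Pinto"><year>1975</year></Model><Model name="Taurus"><year>2013</year></Model></Make></cars>');

-- delete: remove the Pinto model from the Ford make.
UPDATE @XML_Source_Data
SET XML_Column.modify('delete /cars/Make[@name="Ford"]/Model[@name="Pinto"]')
WHERE Source_data_id = 123;

-- insert: add an MSRP element as the last child of the Taurus model.
UPDATE @XML_Source_Data
SET XML_Column.modify('insert <MSRP>25935</MSRP> as last into (/cars/Make[@name="Ford"]/Model[@name="Taurus"])[1]')
WHERE Source_data_id = 123;

-- replace value of: change the text of the year element.
UPDATE @XML_Source_Data
SET XML_Column.modify('replace value of (/cars/Make[@name="Ford"]/Model[@name="Taurus"]/year/text())[1] with "2014"')
WHERE Source_data_id = 123;

SELECT XML_Column FROM @XML_Source_Data WHERE Source_data_id = 123;
```

Each UPDATE carries a single modify() call, since a column can only be modified once per statement.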
Indexing XML Data
Depends on how the XML is stored in the database
- (max) data types
  - A (max) column cannot be a key column in an index (index keys are capped at 900 bytes for clustered indexes and 1,700 bytes for nonclustered indexes since SQL Server 2016), but it can be an included column on the index
- XML data type
  - XML indexes do exist
    - Primary
    - Secondary
    - Selective XML indexes
- Full-text indexes
  - Can be used on both character and XML data types
XML Indexes - Primary
- The table must have a clustered index on its primary key before a primary XML index can be created
- Is a shredded representation of the BLOB data in the XML column
- Each row in the index stores the following:
  - Tag name (element/attribute)
  - Value of the node
  - Node type
  - Document order
  - Path from each node to the root of the XML tree (stored in the index in reverse order)
  - Primary key of the table the XML column is in
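A minimal sketch of creating a primary XML index follows. The table, constraint, and index names are hypothetical; the table is only created if it does not already exist.

```sql
-- Hypothetical table; a clustered primary key is a prerequisite for the XML index.
IF OBJECT_ID(N'dbo.XML_Source_Data', N'U') IS NULL
    CREATE TABLE dbo.XML_Source_Data
    (
        Source_data_id INT NOT NULL
            CONSTRAINT PK_XML_Source_Data PRIMARY KEY CLUSTERED,
        XML_Column     XML NOT NULL
    );

-- Shreds every node of every document in XML_Column into the index.
CREATE PRIMARY XML INDEX PXML_XML_Source_Data
    ON dbo.XML_Source_Data (XML_Column);
```

Because every node is shredded, expect the index to be noticeably larger than the XML data it covers.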
XML Indexes - Secondary
- Must have a primary XML index first
- Three types of secondary indexes:
  - PATH
  - PROPERTY
  - VALUE
Secondary XML Indexes - Path
- Best for queries that use exist() in the WHERE clause
- Path and node values are key columns in this type of index, and querying for these will result in seeks against the index

SELECT Source_data_id
FROM XML_Source_Data
WHERE XML_Column.exist('//@Invalid_date[.="2019-04-01"]') = 1
Secondary XML Indexes - Property
- Works best for queries that retrieve values with the value() method and identify the row by the primary key column in the WHERE clause
Secondary XML Indexes - Value
- Based on the node values in the primary XML index
- Is a good option when you’re looking for a specific value, but don’t know the element or attribute the value is associated with
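The three secondary index types can be sketched as follows. All object names are hypothetical, and the script creates the table and the primary XML index only if they are missing, so it assumes a primary XML index named PXML_XML_Source_Data.

```sql
-- Hypothetical table and primary XML index, created only when absent.
IF OBJECT_ID(N'dbo.XML_Source_Data', N'U') IS NULL
    CREATE TABLE dbo.XML_Source_Data
    (
        Source_data_id INT NOT NULL
            CONSTRAINT PK_XML_Source_Data PRIMARY KEY CLUSTERED,
        XML_Column     XML NOT NULL
    );

IF NOT EXISTS (SELECT 1
               FROM sys.xml_indexes
               WHERE object_id = OBJECT_ID(N'dbo.XML_Source_Data')
                 AND name = N'PXML_XML_Source_Data')
    CREATE PRIMARY XML INDEX PXML_XML_Source_Data
        ON dbo.XML_Source_Data (XML_Column);

-- PATH: helps exist() predicates on known paths.
CREATE XML INDEX SXML_XML_Source_Data_Path
    ON dbo.XML_Source_Data (XML_Column)
    USING XML INDEX PXML_XML_Source_Data FOR PATH;

-- PROPERTY: helps value() lookups when the row's primary key is known.
CREATE XML INDEX SXML_XML_Source_Data_Property
    ON dbo.XML_Source_Data (XML_Column)
    USING XML INDEX PXML_XML_Source_Data FOR PROPERTY;

-- VALUE: helps searches for a value whose element or attribute is not known.
CREATE XML INDEX SXML_XML_Source_Data_Value
    ON dbo.XML_Source_Data (XML_Column)
    USING XML INDEX PXML_XML_Source_Data FOR VALUE;
```

In practice you would usually create only the secondary index types that match your query patterns, since each one adds storage and maintenance cost.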
Drawbacks to XML Indexes
- They index the entire XML document
- This leads to larger indexes, which in turn leads to larger database sizes
- Index creation and maintenance take longer
- XQuery statements can take longer because of the size of the indexes that have to be used
Selective XML Indexes
- Added in SQL Server 2012 (SP1)
- When a selective index is created, each indexed XML node path is shredded and stored in a normal table within SQL Server as a sparse column
- Improvement over normal XML indexes
- Allows for only a section of the XML document to be indexed, thus decreasing index size and maintenance times
- Existing applications supplying the XML data do not have to be modified to support these indexes
- Not recommended for large numbers of node paths or for queries that look for elements in unknown locations
Selective XML Indexes - Details
- The table must have a clustered index on its primary key
- Size of the PK cannot be more than 128 bytes
- Clustering key cannot be more than 15 columns
- Each selective index can only be created on a single XML column
- Limited to the XML data type
- A table can have up to 249 selective XML indexes
- Each XML column can have only one selective index
- Cannot be specified in query hints
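As a hedged sketch, a selective XML index that promotes only two paths, plus a secondary selective index on one of them, might look like this. The table, index, and path names are hypothetical, and the paths assume documents shaped like the car example earlier in the deck.

```sql
-- Hypothetical table, created only when absent.
IF OBJECT_ID(N'dbo.XML_Source_Data', N'U') IS NULL
    CREATE TABLE dbo.XML_Source_Data
    (
        Source_data_id INT NOT NULL
            CONSTRAINT PK_XML_Source_Data PRIMARY KEY CLUSTERED,
        XML_Column     XML NOT NULL
    );

-- Only the listed paths are shredded and indexed; the rest of the document is ignored.
CREATE SELECTIVE XML INDEX SXI_XML_Source_Data
    ON dbo.XML_Source_Data (XML_Column)
    FOR
    (
        pathModel = '/car/model'      AS XQUERY 'node()',   -- existence checks with exist()
        pathYear  = '/car/model/year' AS SQL INT SINGLETON  -- typed lookups with value()
    );

-- Secondary selective XML index built over one promoted path.
CREATE XML INDEX SXI_XML_Source_Data_Year
    ON dbo.XML_Source_Data (XML_Column)
    USING XML INDEX SXI_XML_Source_Data
    FOR (pathYear);
```

The type hints (XQUERY or SQL) are optimization choices: a SQL type tends to suit value() calls that return that type, while 'node()' is enough when a query only checks that a path exists.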
Full-text & XML indexes, so happy together Can use full-text indexes on XML data types alone, or in combination with XML indexes. A full-text index will index the values of the XML data, but not the XML markup.
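A sketch of combining the two: the full-text predicate narrows the candidate rows by value, and the XQuery predicate then checks where in the document the value occurs. The catalog, table, and key index names are hypothetical, and the Full-Text Search feature must be installed. GO is a batch separator understood by SSMS and sqlcmd.

```sql
-- Hypothetical table, created only when absent.
IF OBJECT_ID(N'dbo.XML_Source_Data', N'U') IS NULL
    CREATE TABLE dbo.XML_Source_Data
    (
        Source_data_id INT NOT NULL
            CONSTRAINT PK_XML_Source_Data PRIMARY KEY CLUSTERED,
        XML_Column     XML NOT NULL
    );

CREATE FULLTEXT CATALOG ftc_XML_Source_Data;

-- KEY INDEX must be a unique, single-column, non-nullable index (here the assumed PK name).
CREATE FULLTEXT INDEX ON dbo.XML_Source_Data (XML_Column)
    KEY INDEX PK_XML_Source_Data
    ON ftc_XML_Source_Data;
GO

-- CONTAINS filters on indexed values (markup is not indexed);
-- exist() then restricts the match to text directly under /car/model.
SELECT Source_data_id
FROM dbo.XML_Source_Data
WHERE CONTAINS(XML_Column, N'Taurus')
  AND XML_Column.exist('/car/model/text()[contains(., "Taurus")]') = 1;
```

Full-text population is asynchronous, so a freshly created index may not return rows until the initial population completes.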
Metadata on XML indexes
- Each XML index has a record in sys.indexes with index type 3
- Dedicated catalog view: sys.xml_indexes
  - using_xml_index_id: NULL for a primary XML index; references the primary XML index for any secondary indexes
  - secondary_type: P = Path, V = Value, R = Property
  - secondary_type_desc
  - xml_index_type: 0 = Primary, 1 = Secondary, 2 = Selective XML, 3 = Secondary Selective
  - xml_index_type_desc
  - path_id: NULL for all indexes except secondary selective; references sys.selective_xml_index_paths.path_id
Metadata on XML indexes - There’s more!
- sys.selective_xml_index_paths
  - path: promoted path for the index
  - path_type: 0 = XQuery, 1 = SQL
  - xml_component_id: unique ID of the XML schema component in the database
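The catalog views above can be combined into a single inventory query. This is a sketch only; the join for secondary selective indexes assumes that using_xml_index_id points at the parent selective index.

```sql
-- One row per XML index (and per promoted path for selective indexes).
SELECT OBJECT_NAME(xi.object_id) AS table_name,
       xi.name                   AS index_name,
       xi.xml_index_type_desc,
       xi.secondary_type_desc,
       parent.name               AS using_xml_index_name,
       p.path_id,
       p.[path]                  AS promoted_path,
       p.path_type_desc
FROM sys.xml_indexes AS xi
LEFT JOIN sys.xml_indexes AS parent
       ON parent.object_id = xi.object_id
      AND parent.index_id  = xi.using_xml_index_id
LEFT JOIN sys.selective_xml_index_paths AS p
       ON p.object_id = xi.object_id
      AND (
              -- selective index: list all of its promoted paths
              (xi.xml_index_type = 2 AND p.index_id = xi.index_id)
              -- secondary selective index: the single path it is built on
              -- (assumes using_xml_index_id is the parent selective index)
           OR (xi.xml_index_type = 3
               AND p.index_id = xi.using_xml_index_id
               AND p.path_id  = xi.path_id)
          )
ORDER BY table_name, xi.index_id, p.path_id;
```

Primary and regular secondary XML indexes show NULL in the path columns, since they have no promoted paths.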
DEMOS!!! AND Q&A
Links and References
- XML Data Type: http://bit.ly/2KrokJt
- Selective Indexes: http://bit.ly/2ONx1lG
- Optimization Hints for Selective Indexes: http://bit.ly/2OQJkOh
- Full-text index reference: http://bit.ly/2wwBLhR
- Using Full-text indexes with XML data: http://bit.ly/2kLpky2
- XML DMVs: http://bit.ly/2YVSfxu