Download presentation
Presentation is loading. Please wait.
Published byErika Hamilton Modified over 5 years ago
1
Comparing RDMS to XML Understanding the Differences David Plotkin
Data Quality Manager Wells Fargo Consumer Credit Group
2
What we will cover XML from a Data Practitioner’s standpoint
Relational databases and schemas A comparison of relational and XML architecture What each technology is good at – and when to use it. Relational metadata and XML metadata How to navigate a relational schema vs. an XML schema Gathering data (queries) in a database vs. an XML document Differences in flexibility between the two Cost considerations Data integrity considerations So…what do we do?
3
XML: A Data Practitioner’s View (1)
XML is a specification for designing tag-based languages. The specification allows for: Metadata (XML Schemas) that define Valid data structures Defining of user data types Valid lists of values, ranges, and patterns Optionality/cardinality Elements and Attributes Reusability of data, data types, and schemas Creation of instance documents based on an XML Schema
4
XML: A Data Practitioner’s View (2)
The potential exists for: Industries to agree on an XML-based language for data exchange. Exchange of XML instance documents between trading partners An entire industry has grown up around: Providing XML tools (and repositories) Ongoing development of standards XML has gained very wide acceptance!
5
XML Content Here is an easy-to-understand sample of XML:
<?xml version = "1.0" encoding = "UTF-8"?> <DataTransmission xmlns:xsi = " xsi:noNamespaceSchemaLocation = "PatientSearchResponse.xsd" Source = "CentralPatient" Target = "Store042" MsgTypeCode = "PSRS" MsgTypeDesc = "PatientSearchResponse" FoundFlag = "true"> <PatientList> <PatientIdFields> <FirstName>Maria</FirstName> <LastName>Montes</LastName> <Birthdate> </Birthdate> <Gender>F</Gender> <StreetAddress>1969 Ygnacio Valley Road</StreetAddress> <CityAddress>Walnut Creek</CityAddress> <StateCode>CA</StateCode> <ZipCode>94597</ZipCode> <PatientID> </PatientID> </PatientIdFields> </PatientList> </DataTransmission>
6
The XML Document(1) An XML document is:
A self-contained set of structured (“tagged”) data Decoupled from the source systems Tagged data in the document does not change when the source system data changes The document represents a snapshot of the source system data at a particular moment in time. Packaged with the metadata (tags) for a specific business transaction
7
The XML Document (2) An XML Document has: Semantic content Order
Values (tagged data) Metadata (the tags) Order Hierarchy Structure
8
The XML Schema An XML document is usually (but not always) validated by an XML Schema. The XML Schema provides the information on whether the XML document “followed the rules” set up in the XML Schema. An XML Schema is an agreement between the sender and the receiver of a document as to the structure of that document.
9
Elements vs. Attributes
Basic building blocks of XML Contain content which can be a structure Attributes Specify additional information about an element. Contain only simple type content Some data could be either an Element or an Attribute (so you need standards on how to decide which to use).
10
Element & Attribute in XML Schema
Element and Attribute declaration: <xsd:element name = "DataTransmission"> <xsd:complexType> <xsd:sequence> <xsd:element ref = "FirstName" minOccurs = "0"/> <xsd:element ref = "LastName" minOccurs = "0"/> <xsd:element ref = "Phone" minOccurs = "0"/> <xsd:element ref = "Birthdate" minOccurs = "0"/> <xsd:element ref = "Gender" minOccurs = "0"/> <xsd:element ref = "StreetAddress" minOccurs = "0"/> <xsd:element ref = "CityAddress" minOccurs = "0"/> <xsd:element ref = "StateCode" minOccurs = "0"/> <xsd:element ref = "ZipCode" minOccurs = "0"/> <xsd:element ref = "SSN" minOccurs = "0"/> <xsd:element name = "SafetyCapDate" type = "xsd:date"/> </xsd:sequence> <xsd:attribute name = "Source" use = "required" type = "xsd:string"/> <xsd:attribute name = "Target" use = "required" type = "xsd:string"/> <xsd:attribute name = "MsgTypeCode" use = "required" type = "MsgTypeCodeType"/> <xsd:attribute name = "MsgTypeDesc" use = "required" type = "xsd:string"/> </xsd:complexType> </xsd:element>
11
Element and Attribute XML Document
Element and Attribute content: <DataTransmission xmlns:xsi = " xsi:noNamespaceSchemaLocation = "PatientSearchRequest.xsd" Source = "Store599" Target = "CentralPatient" MsgTypeCode = "PSRQ" MsgTypeDesc = "PatientSearchRequest"> <FirstName>Maria</FirstName> <LastName>Montes</LastName> <Birthdate> </Birthdate> <Gender>F</Gender> <StreetAddress>1969 Ygnacio Valley Road</StreetAddress> <CityAddress>Walnut Creek</CityAddress> <StateCode>CA</StateCode> <ZipCode>94597</ZipCode> <SSN> </SSN> <SafetyCapDate> </SafetyCapDate> </DataTransmission>
12
Simple data types in an XML Schema
Comes with “atomic” simple data types Integer, boolean, date, decimal, string, etc. You can build user-defined simple data types Built on the included “atomic” data types Allows declaration of valid values, ranges, Patterns, Length, total digits And more… Attributes or Elements can be of a simple data type (either atomic or user-defined).
13
Simple data type examples
<xsd:simpleType name = "SevenPlaceInteger"> <xsd:restriction base = "xsd:integer"> builds on atomic simple data type <xsd:totalDigits value = "7"/> </xsd:restriction> </xsd:simpleType> <xsd:simpleType name = "GenderType"> <xsd:restriction base = "xsd:string"> <xsd:enumeration value = "M"/> <xsd:enumeration value = "F"/> <xsd:length value = "1"/> </xsd:restriction> </xsd:simpleType> <xsd:simpleType name = "RelationshipCodeType"> <xsd:restriction base = "xsd:string"> <xsd:enumeration value = "self"/> <xsd:enumeration value = "spouse"/> <xsd:enumeration value = "dependent"/> <xsd:enumeration value = "other"/> </xsd:restriction> </xsd:simpleType> <xsd:simpleType name = "SevenPlacePositiveInteger"> builds on custom simple data type <xsd:restriction base = "SevenPlaceInteger"> <xsd:minInclusive value = "0"/> </xsd:restriction> </xsd:simpleType>
14
Complex data types in XML Schema
Builds a structure of Elements. Each subelement is either a simple data type or another structure of Elements. Only Elements can be of a complex data type. Can be named and reusable or anonymous and used only by a single Element. Can be an extension or restriction of another complex type.
15
Complex data type examples
<xsd:complexType name = "AddressType"> declaration of named complex data type <xsd:sequence> <xsd:element ref = "StreetAddress"/> <xsd:element ref = "CityAddress"/> <xsd:element ref = "StateCode"/> </xsd:sequence> </xsd:complexType> <xsd:element name = "WorkAddress" type = "AddressType"/> association of Element with named complex data type <xsd:complexType name = "AddressWithCountryType"> new complex data type extends existing complex data type <xsd:complexContent> <xsd:extension base = "AddressType"> <xsd:sequence> <xsd:element name = "CountryCode" type = "xsd:string"/> </xsd:sequence> </xsd:extension> </xsd:complexContent> </xsd:complexType> <xsd:element name = "PatientInsurance"> element with anonymous complex data type <xsd:complexType> <xsd:sequence> <xsd:element ref = "Patient"/> <xsd:element ref = "TPMembership" minOccurs = "0" maxOccurs = "unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element>
16
Using the XML Schema Source database Target database Data Data Extract
program Parse program XML Document Network XML Schema XML Schema
17
Reusing XML Schemas Statecode.xsd XML Schemas can build on each other to provide reusability. Base Definitions.xsd <xsd:include schemaLocation = "StateCodes.xsd"/> Patient Search Request.xsd Patient Search Response.xsd Patient Update Request.xsd <xsd:include schemaLocation = "BaseDefinitions.xsd"/> <xsd:include schemaLocation = "BaseDefinitions.xsd"/> <xsd:include schemaLocation = "BaseDefinitions.xsd"/>
18
An XML Schema example <xsd:element name = "Patient"> <xsd:complexType> <xsd:sequence> <xsd:element ref = "PatientID"/> <xsd:element ref = "FirstName"/> <xsd:element ref = "LastName"/> <xsd:element ref = "Birthdate"/> <xsd:element ref = "Gender"/> <xsd:element name = "PrivacyInd" type = "xsd:boolean"/> <xsd:element ref = "StreetAddress"/> <xsd:element ref = "CityAddress"/> <xsd:element ref = "StateCode"/> <xsd:element ref = "ZipCode"/> <xsd:element ref = " Address" minOccurs = "0"/> <xsd:element ref = "SSN" minOccurs = "0"/> <xsd:element name = "HIPAANotifInd" type = "xsd:boolean"/> <xsd:element name = "SafetyCapInd" type = "xsd:boolean"/> <xsd:element name = "SafetyCapDate" type = "xsd:date"/> <xsd:element ref = "StatusCode"/> <xsd:element ref = "Doctor" minOccurs = "0"/> <xsd:element ref = "Phone" maxOccurs = "unbounded"/> <xsd:element ref = "PatDrugAllergy" minOccurs = "0" maxOccurs = "unbounded"/> <xsd:element ref = "OtherDrugTaken" minOccurs = "0" maxOccurs = "unbounded"/> </xsd:sequence> <xsd:attribute name = "PrivacyInd" type = "xsd:boolean"/> </xsd:complexType> </xsd:element>
19
The Structure of an XML Schema
Elements in an XML Schema are hierarchical. To expand the hierarchy with this tool (Tibco’s XML Authority), click here.
20
Expanding an Element This is the result you get – you can now see the elements that make up the structure of the the OtherDrugTaken element.
21
Managing XML Schemas Avoid chaos by managing XML metadata across the Enterprise: Create reusable base definition schemas Create and document: Widely used elements (with their attributes) Complex data types Simple data types Keep track of which schemas use other schemas Keep track of which documents are validated by which schemas (XML repository).
22
XML Architecture XML is hierarchical:
This hierarchy is useful for starting with a patient and finding all the information about them – such as a list of their prescriptions and when the prescriptions were filled. Patient Insurance Prescription Doctor Drug Fills Doctor Drug Claim
23
XML Hierarchy Revisited
The Hierarchy can change depending on what the XML document is used for. Prescription Each version of the hierarchy serves a different purpose Drug Fills Doctor Patient This version of the hierarchy is useful for starting with a Prescription and finding all the information about it, including the Patient and Drug. Drug Claim
24
Database Architecture
Database architecture is relational: Normalized to eliminate data redundancy Join on any two columns that have the same data type. Foreign keys can enforce data integrity
25
Relational Metadata – the Schema
Relational metadata is stored in the database Database control tables fully define the structure of the database. Without the DBMS metadata the contents of the database are worthless. Completely self-contained (not reusable) Tables are structured, each column is a “bucket” for a specific kind of data In most databases, the metadata does not include descriptions, so a Data Dictionary is necessary.
26
XML Metadata – the Document
Metadata built into the document Every element has a tag to tell you where the data is stored in the document. Descriptive tags give structure to the document and tell you what the data means (sort of). “Sort of” because it only tells the tag name, so this only has meaning to someone who already understands what the element or attribute means. Document cannot be parsed for storage on its own. What else is needed?…
27
XML Metadata – the Schema
An XML Schema (or DTD) is needed to: Provide standardization (basis of agreement) Allow meaningful parsing and data storage Specify agreement on document structure A data dictionary is still necessary to provide definition for Elements and Attributes Without an XML Schema, a document is essentially only good for transmitting blocks of data for humans to read.
28
Comparing XML to RDMS Metadata
An XML Schema establishes the valid structure of an XML document, like a database schema establishes the valid structure of a database. Database Architecture XML Architecture Data Model Cardinality Optionality Data Type Describes Relationships Has Metadata Generate DDL XML Schema Valid XML Sequence of Elements Final Result: Database Final Result: Valid XML Document
29
What are relational databases good at?
Data Storage General purpose data storage and retrieval Used for many purposes, such as queries and analysis Generalized view of data for shared use Ideally shared across business units or the Enterprise Works well to store the contents of an XML document.
30
What is XML good at? Data Exchange Exchange of data in a document
Usually designed for a specific communication Works well to move data between databases Important when source or target database is outside your firewall. Usually don’t have direct access to such databases. Works well (with style sheets or XSLT) to display data on the web because browsers inherently are designed to display documents.
31
When do I use each technology?
RDBMS Store data Query data Mine data Create generalized reports XML Transmit data Exchange data with outside agencies Replace flat files Create specific reports Which one? Use them together Each for its own purpose Build an infrastructure that: Creates XML document (using XML schemas) from database contents. Parses XML documents to store their contents in a database
32
Relationships: RDBMS RDBMS
Explicit table to table: if not declared by a foreign key, it doesn’t exist.
33
Relationships: XML XML Implied positioning:
<Patient> <PatientID> </PatientID> <FirstName>Maria</FirstName> <LastName>Montes</LastName> <Birthdate> </Birthdate> <Gender>F</Gender> <PrivacyInd>false</PrivacyInd> <StreetAddress>1969 Ygnacio Valley Road</StreetAddress> <CityAddress>Walnut Creek</CityAddress> <StateCode>CA</StateCode> <ZipCode>94597</ZipCode> <HIPAANotifInd>true</HIPAANotifInd> <SafetyCapInd>false</SafetyCapInd> <SafetyCapDate> </SafetyCapDate> <StatusCode>A</StatusCode> <Phone> <PhoneID> </PhoneID> <AreaCode>925</AreaCode> <PhoneNumber> </PhoneNumber> <PhoneTypeCode>RS</PhoneTypeCode> </Phone> <PatDrugAllergy> <PatDrugAllergyID> </PatDrugAllergyID> <InformedDate> </InformedDate> <DrugID> </DrugID> </PatDrugAllergy> </Patient> XML Implied positioning: Part-of relationship implied by positioning Parallel elements have same relationships as sibling elements
34
Querying: XML vs. RDBMS Relational XML Easy to build queries by:
Navigating the database joins … or … Creating joins on any pair of tables that share matching columns XML Difficult to build queries because: Must stick to the structure of the document Pointer-based navigation is restrictive
35
Navigating in XML for Queries
Must follow the structure of the document. Example: DOM Node navigation: getFirstChild getLastChild getPreviousSibling getNextSibling getParentNode
36
Structure Flexibility : Relational vs. XML
Relational: Rigid structure Lots of work to change the structure because of foreign keys, Views, Stored procs, and triggers Plus programs that assume the database looks a certain way! XML: Flexible structure Easy to change the structure of documents to account for new data, data type changes Everything is text so it is easy to implement Or is it?
37
XML Flexibility Changing the structure of an XML document is easy.
Changing the XML Schema is easy too, but have you considered that you must: Get agreement from business partners Change the programs that create documents from the database and parse documents for the database. Version your XML Schemas if you want to be able to open and parse historical documents (do you keep them?).
38
Data Integrity: Relational vs. XML
Both provide ways to control data types, ranges, patterns, valid values, and optionality. But XML does not provide data validation via look-up as relational does with foreign keys. However, this lack does make the XML Schema a less rigid structure than a database schema.
39
Cost Considerations People always ask – which is cheaper to implement?
XML is essentially all text, so it is easy to create, change and manage. XML doesn’t need the infrastructure that an RDBMS does. XML tools are cheap ($99!) But does this cost difference really matter? The two technologies do different things. You can’t use one as replacement of the other. If you need it, you need it. So, XML is cheaper – and it doesn’t really matter.
40
So what do we do? Do you remember the previous slide?
The two technologies do different things They complement each other: RDBMS stores data in a generally usable way XML exchanges data in a specific way is easy to display on the web. They both present challenges for metadata management. So…we use them both for the appropriate purpose.
41
Thank you!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.