Published by Frederick Webster. Modified over 8 years ago.
1
XML Validation III: Schemas + RELAX NG
  Robin Burke
  ECT 360
2
Outline
  Types
    Built-in
    Named
    Anonymous
  Type Derivation
  Schema Organization
  Break
  RELAX NG
3
Built-in types
  Part of the schema language
  Base types
    19 fundamental types
    Examples: string, decimal
  Derived types
    25 more types that use the base types
    Examples: ID, positiveInteger
4
Built-in types, cont'd
5
User-defined types
  Any use of complexType can be turned into a user-defined type
    usually called "standalone"
  Simple types can be derived from the built-in types
6
Standalone types
  A type can stand outside of an element definition
    must have a name
  An element definition then uses the type by name
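As a minimal sketch of a standalone type (the names personType, author and editor are hypothetical, not taken from the slides), a named complexType might be declared once and then used by more than one element:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Standalone type: declared outside any element, so it needs a name -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Element definitions use the type by name -->
  <xs:element name="author" type="personType"/>
  <xs:element name="editor" type="personType"/>

</xs:schema>
```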
7
Mixed content Can specify that an element has mixed content
8
Mixed content, cont'd
  Schema cannot control where the text appears
  If this is legal
    text here thud grunt
  So is this
    thud more text grunt still more
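A sketch of a mixed-content declaration, assuming hypothetical element names para, term and cite. The schema fixes the order of the child elements but says nothing about where text may fall between them:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="para">
    <!-- mixed="true" allows character data around the child elements -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="term" type="xs:string"/>
        <xs:element name="cite" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Both of these instances are valid against the declaration above:
       <para>See <term>facet</term> in <cite>the spec</cite>.</para>
       <para><term>facet</term> is defined in <cite>the spec</cite></para>
  -->
</xs:schema>
```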
9
Deriving types
  DTDs do not allow type restrictions beyond
    enumeration, CDATA, token for attributes
    PCDATA for content
  Schemas have built-in types
    and also the capability to create your own
10
Derivation operations
  list
    sequence of values
  union
    combine two types, allowing either
  restriction
    placing limits on the legal values
11
List
  PN334-04 PN223-89 PQ1112-03
  Must be separated by spaces
  probably more useful to do this with document structure
    partList -> partNo*
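A list type for values like these could be sketched as follows. The type names and the part-number pattern are assumptions inferred from the three sample values, not definitions from the slides:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumed shape of one part number: two letters, digits, hyphen, two digits -->
  <xs:simpleType name="partNoType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}[0-9]+-[0-9]{2}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A list is a whitespace-separated sequence of items of one type -->
  <xs:simpleType name="partListType">
    <xs:list itemType="partNoType"/>
  </xs:simpleType>

  <xs:element name="partList" type="partListType"/>

</xs:schema>
```

With this sketch, `<partList>PN334-04 PN223-89 PQ1112-03</partList>` would validate as a single element whose content is three list items.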
12
Union
  Allows data of either type to be used
  Example
    Database situation
    null is a possible value
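One way to sketch the database case is a union of a numeric type with a one-value enumeration. The names nullMarker, nullableDecimal and balance are hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A type whose only legal value is the literal string "null" -->
  <xs:simpleType name="nullMarker">
    <xs:restriction base="xs:string">
      <xs:enumeration value="null"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: a value is legal if it is legal for either member type -->
  <xs:simpleType name="nullableDecimal">
    <xs:union memberTypes="xs:decimal nullMarker"/>
  </xs:simpleType>

  <xs:element name="balance" type="nullableDecimal"/>

</xs:schema>
```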
13
Restriction
  Most useful
  Allows the designer to state exactly what values are legal
    prices must be non-negative
    SSN must follow a certain pattern
    in-stock must be yes or no
    etc.
14
Restriction, cont'd
  Restrict a base type according to "facets"
  Different facets available for different data types
15
Facets
16
Example: enumeration
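As an illustration of the enumeration facet (a sketch, not the slide's original example), the "in-stock must be yes or no" rule might be written like this. The type name yesNoType is hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="yesNoType">
    <xs:restriction base="xs:string">
      <!-- Only the listed values are legal -->
      <xs:enumeration value="yes"/>
      <xs:enumeration value="no"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="in-stock" type="yesNoType"/>
</xs:schema>
```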
17
Example: numeric
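A numeric restriction can be sketched with a range facet. This is an illustration of the "prices must be non-negative" rule rather than the slide's original example, and the names priceType and price are hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="priceType">
    <xs:restriction base="xs:decimal">
      <!-- Smallest legal value is 0, so negative prices are rejected -->
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="price" type="priceType"/>
</xs:schema>
```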
18
Example: pattern
  Regular expressions again
    derived from Perl
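A pattern facet might look like the sketch below, which assumes the common ddd-dd-dddd layout for a US Social Security number. The layout and the names ssnType and ssn are assumptions, not taken from the slide:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ssnType">
    <xs:restriction base="xs:string">
      <!-- The whole value must match; schema patterns are implicitly anchored -->
      <xs:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="ssn" type="ssnType"/>
</xs:schema>
```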
19
Inheritance
  facet restrictions are inherited
    new type derivations must honor them
    but can restrict them further
    and new derivations can also set other facets
  For example
    monetary type
      fractionDigits facet = 2
    loan amount type
      monetary type + maxInclusive = 100000
    car loan amount
      loan amount type + maxInclusive = 30000
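The three-level example can be sketched as a chain of restrictions. The type names are hypothetical, and the upper limit is written with the maxInclusive facet, which is the schema facet for an inclusive maximum:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="monetaryType">
    <xs:restriction base="xs:decimal">
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Inherits fractionDigits = 2 and adds an upper bound -->
  <xs:simpleType name="loanAmountType">
    <xs:restriction base="monetaryType">
      <xs:maxInclusive value="100000"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- May tighten the inherited bound, but could not raise it above 100000 -->
  <xs:simpleType name="carLoanAmountType">
    <xs:restriction base="loanAmountType">
      <xs:maxInclusive value="30000"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```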
20
Fixed Facets
  Possible to prevent users from changing a facet in any way
    fixed="true" in facet declaration
    similar to "final" keyword in Java
  Example
    minInclusive cannot be changed when inherited
      lowering it would be illegal anyway
      the "fixed" attribute means it cannot be raised either
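A sketch of a fixed facet, using hypothetical type names priceType and salePriceType:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="priceType">
    <xs:restriction base="xs:decimal">
      <!-- fixed="true": derived types may not give this facet a different value -->
      <xs:minInclusive value="0" fixed="true"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="salePriceType">
    <xs:restriction base="priceType">
      <!-- Other facets are still open to restriction -->
      <xs:maxInclusive value="500"/>
      <!-- Adding <xs:minInclusive value="1"/> here would be a schema error -->
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```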
21
Complex Types (not discussed in book) Possible to derive from complex types i.e. elements Use complexContent Possibilities extension restriction elements attributes
22
Complex Type Extension
  can add elements to an existing complex type
    only at the end
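Extension might be sketched as below, with hypothetical types personType and studentType. The content model of the derived type is the base sequence followed by the new sequence, which is why new elements land at the end:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- studentType content: first, last, then major -->
  <xs:complexType name="studentType">
    <xs:complexContent>
      <xs:extension base="personType">
        <xs:sequence>
          <xs:element name="major" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="student" type="studentType"/>

</xs:schema>
```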
23
Complex Type Restriction
  Adding additional attributes
  Odd syntax
    entire element definition must be repeated
  Not much benefit to inheritance
    validation checks for consistency with supertype
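The "odd syntax" can be seen in a small sketch with hypothetical type names. The derived type repeats the whole content model of its base and leaves out only the optional middle element:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Every instance of shortNameType is also a legal personType -->
  <xs:complexType name="shortNameType">
    <xs:complexContent>
      <xs:restriction base="personType">
        <xs:sequence>
          <xs:element name="first" type="xs:string"/>
          <xs:element name="last" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```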
24
Example grades schema
25
Schema design
  Questions to ask
    what kind of document?
      narrative
      data-centric
    what kind of processing?
      web page output
      complex queries
26
Document modeling
  Get examples
  Get style guides / rules
  For each data element
    ask how many
    ask what legal values
    ask about sub-parts
    ask about exceptions
27
Design decisions
  Attribute vs element
  Level of granularity
  Naming
  Schema structure
28
Attribute vs element
  Some specific rules
    ID must be attribute
  General principle
    data vs metadata
    Element for document content
    Attribute for information about content
  Not always easy to tell!
29
Element
  Consists of document content
  Will be shown to a human user
  Contains substructure
  Sequence may be important
  Could be very long
  Presence depends on other values
30
Attribute
  (Opposite of above)
  Must be from an enumeration of values
  Also consistency
31
Level of granularity
  How detailed to model the data?
  Very detailed
    more work to mark up
    more detail in expressing the schema
    exceptions must be handled
  Less detailed
    easier to mark up
    easier to schematize
    document contents less accessible
32
Element content granularity
  Fine grained model
    salutation, first name, middle name, last name, appellation
  Coarse grained model
    name
  Tradeoff
    search / sort / organized
    document creation
33
Levels vs recursion
  Named levels
  Recursion
  Tradeoff
    ability to rearrange
    transparency of markup
34
Naming
  Case convention
    uppercase is bad
    lowercase better
  Multiple words
    CapCase
    camelCase
    Underline_Convention
35
Structure
  Nested
    "russian doll"
    schema looks like the document
    for small schemas only
  Flat
    elements defined at global level
    references used in complex type definitions
  Type-based
    "venetian blind"
    all schema complexity in type definitions
    one global element
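As a sketch of the type-based ("venetian blind") layout, here is one possible schema for the grades document used in the RELAX NG slides. The type names, and the choice of xs:string for the leaf elements, are assumptions:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The single global element -->
  <xs:element name="grades" type="gradesType"/>

  <!-- All structure lives in named types; child elements are local -->
  <xs:complexType name="gradesType">
    <xs:sequence>
      <xs:element name="grade" type="gradeType"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="gradeType">
    <xs:sequence>
      <xs:element name="student" type="xs:string"/>
      <xs:element name="assigned-grade" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

A "russian doll" version would instead nest anonymous complexType definitions inside the grades element, and a flat version would declare grade, student and assigned-grade globally and point to them with ref.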
36
Break
37
RELAX NG
  XML Schemas are big
    a lot of the page consists of tag punctuation and repeated element names
  RELAX NG
    created as an alternate validation language
    compact, non-XML syntax
    also XML syntax
38
Example
  element grades {
    element grade {
      element student { text },
      element assigned-grade { text }
    }*
  }
  Equivalent to
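For comparison, the same pattern in RELAX NG's XML syntax would look roughly like this. It is a sketch of the standard translation, not necessarily what the original slide showed:

```xml
<element name="grades" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- The trailing * in the compact syntax becomes zeroOrMore -->
  <zeroOrMore>
    <element name="grade">
      <element name="student">
        <text/>
      </element>
      <element name="assigned-grade">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>
```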
39
Attributes
  element grades {
    element grade {
      element student { text, attribute id { text } },
      element assigned-grade { text }
    }*,
    attribute assignment { text }
  }
40
Types
  instead of { text }
    use appropriate built-in data type
    attribute age { xsd:positiveInteger }
  facets
    qualify with name / value pair
    attribute drinkingAge { xsd:positiveInteger { minInclusive = "21" } }
41
What does this one say?
  element grade {
    element student....,
    (
      element assigned-grade { xsd:string { pattern = "([A-D](\+|\-)?|F)" } }
      | (
          element assigned-grade { "I" },
          element reason { text }
        )
    )
  }
42
The point
  A schema language has two purposes
    lets the language designer state a design
    lets the system validate documents against that design
  Any language that serves these purposes can be used
43
Validation languages
  DTD
    SGML holdover
    ugly
    fairly simple to express
  Schema
    complete
    extensible
    baroque
    unreadable
  RELAX NG
    readable
      esp. compact syntax
    more expressive than Schema
    fewer tools
44
Next week
  Presentations