Published by Rodger Randall Gibbs. Modified over 9 years ago.
1
XML for E-commerce II Helena Ahonen-Myka
2
XML processing model
- An XML processor reads XML documents and provides access to their content and structure
- The XML processor works on behalf of an application
- The XML specification defines which information the processor should provide to the application
3
Parsing
- Input: an XML document
- Basic task: is the document well-formed?
- Validating parsers additionally check: is the document valid?
4
Parsing
- Parsers produce data structures that other tools and applications can use
- Two kinds of APIs: tree-based and event-based
5
Tree-based API
- Compiles an XML document into an internal tree structure
- Allows an application to navigate the tree
- The Document Object Model (DOM) is a tree-based API for XML and HTML documents
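A minimal sketch of the tree-based approach, assuming the DOM implementation bundled with the JDK (`javax.xml.parsers`). The class name `DomDemo` and the sample document are illustrative only and are not taken from the slides.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class: builds a DOM tree and navigates it.
class DomDemo {
    // Parses the XML text into an in-memory tree and returns the text of the first <para>.
    static String firstParaText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();            // the <doc> element
        NodeList paras = root.getElementsByTagName("para"); // all <para> descendants
        return paras.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // prints "Hello, world!"
        System.out.println(firstParaText("<doc><para>Hello, world!</para></doc>"));
    }
}
```

The whole document is held in memory before the application sees any of it, which is the resource cost the tree-based approach trades for easy navigation.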
6
Event-based API
- Reports parsing events (such as the start and end of elements) directly to the application through callbacks
- The application implements handlers to deal with the different events
- Simple API for XML (SAX)
7
Example
    <doc><para>Hello, world!</para></doc>
- Events:
    start document
    start element: doc
    start element: para
    characters: Hello, world!
    end element: para
    end element: doc
    end document
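The event sequence can be reproduced with a small handler. This is a sketch assuming the SAX parser bundled with the JDK (obtained through `SAXParserFactory`); the class name `SaxEventTrace` is hypothetical. A parser is free to deliver the character data in more than one `characters` call, so the exact number of events may differ between parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX events as readable strings.
class SaxEventTrace {
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("start document"); }
            @Override public void endDocument() { events.add("end document"); }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                events.add("start element: " + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("end element: " + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                events.add("characters: " + new String(ch, start, length));
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<doc><para>Hello, world!</para></doc>")) {
            System.out.println(event);
        }
    }
}
```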
8
Example (cont.)
- An application handles these events as they occur, just as it would handle events from a graphical user interface (mouse clicks, etc.)
- There is no need to cache the entire document in memory or secondary storage
9
Tree-based vs. event-based
- Tree-based APIs are useful for a wide range of applications, but they may need a lot of resources (if the document is large)
- Some applications need to build their own tree structures, and it is very inefficient to build a parse tree only to map it to another tree
10
Tree-based vs. event-based
- An event-based API provides simpler, lower-level access to an XML document
- Because the document is processed sequentially, one can parse documents much larger than the available system memory
- The application can construct its own data structures in its callback event handlers
11
We need a parser...
- Apache Xerces: http://xml.apache.org
- IBM XML4J: http://alphaworks.ibm.com
- XP: http://www.jclark.com/xml/xp
- … many others
12
… and the SAX classes
- http://www.megginson.com/SAX/
- The SAX classes often come bundled with the parser distribution
- Some parsers only support SAX 1.0; the latest version is 2.0
13
Starting a SAX parser
    import org.xml.sax.XMLReader;
    import org.apache.xerces.parsers.SAXParser;

    XMLReader parser = new SAXParser();
    // uri: the system identifier (URI) of the document to parse
    parser.parse(uri);
14
Content handlers
- To let the application do something useful with the XML data as it is being parsed, we must register handlers with the SAX parser
- A handler is a set of callbacks: application code that runs at important events during the parsing of a document
15
Core handler interfaces in SAX
- org.xml.sax.ContentHandler
- org.xml.sax.ErrorHandler
- org.xml.sax.DTDHandler
- org.xml.sax.EntityResolver
16
Custom application classes
- Custom application classes that perform specific actions within the parsing process can implement each of the core interfaces
- Implementation classes are registered with the parser through the methods setContentHandler(), etc.
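Registering handlers might look like the following sketch. It assumes an `XMLReader` obtained through the JDK's `SAXParserFactory` rather than the Xerces class shown earlier, and it extends `DefaultHandler` (a convenience class that implements all four core interfaces with empty methods) so that only the callbacks of interest are overridden. The class name `RegisterHandlers` is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: registers a content handler and an error handler.
class RegisterHandlers {
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        int[] count = {0};
        // Content handler: counts every start tag it is told about.
        reader.setContentHandler(new DefaultHandler() {
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                count[0]++;
            }
        });
        // Error handler: turns recoverable errors into exceptions.
        reader.setErrorHandler(new DefaultHandler() {
            @Override public void error(SAXParseException e) throws SAXException {
                throw e;
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // prints 3 (doc, para, para)
        System.out.println(countElements("<doc><para>a</para><para>b</para></doc>"));
    }
}
```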
17
Example: content handlers
    class MyContentHandler implements ContentHandler {

        public void startDocument() throws SAXException {
            System.out.println("Parsing begins…");
        }

        public void endDocument() throws SAXException {
            System.out.println("...Parsing ends.");
        }

        // the class continues with the handlers on the following slides
18
Element handlers
    public void startElement(String namespaceURI, String localName,
                             String rawName, Attributes atts)
            throws SAXException {
        System.out.print("startElement: " + localName);
        if (!namespaceURI.equals("")) {
            System.out.println(" in namespace " + namespaceURI
                    + " (" + rawName + ")");
        } else {
            System.out.println(" has no associated namespace");
        }
        // List every attribute of the element
        for (int i = 0; i < atts.getLength(); i++) {
            System.out.println(" Attribute: " + atts.getLocalName(i)
                    + "=" + atts.getValue(i));
        }
    }
19
endElement
    public void endElement(String namespaceURI, String localName,
                           String rawName) throws SAXException {
        System.out.println("endElement: " + localName + "\n");
    }
20
Character data
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // The third argument is a length (number of chars), not an end index
        String s = new String(ch, start, length);
        System.out.println("characters: " + s);
    }
- The parser may return all contiguous character data at once, or split the data across multiple method invocations
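Because the text of one element may arrive in several `characters` calls, a handler that needs the complete text usually buffers it. The following is one possible sketch, assuming the JDK's bundled SAX parser and elements without nested children; the class name `TextCollector` and the element name `para` are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of each <para> element.
class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> paraTexts = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called several times per text run
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("para".equals(qName)) {
            paraTexts.add(buffer.toString()); // the text is complete only here
        }
    }

    static List<String> collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.paraTexts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<doc><para>Hello, &amp; world!</para></doc>"));
    }
}
```

The entity reference in the sample text is a typical point at which parsers split character data, which is why the string is assembled in `endElement` rather than in `characters`.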
21
Processing instructions
- XML documents may contain processing instructions (PIs)
- A processing instruction tells an application to perform some specific task
- Form: <?target data?>
22
Handlers for PIs
    public void processingInstruction(String target, String data)
            throws SAXException {
        System.out.println("PI: Target:" + target + " and Data:" + data);
    }
- The application could receive instructions and set variables or execute methods to perform application-specific processing
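As a sketch of how such a handler behaves, the following example collects the processing instructions of a document. It assumes the JDK's bundled SAX parser; the class name `PiDemo` and the `xml-stylesheet` sample are illustrative. Note that the XML declaration (`<?xml version="1.0"?>`) is not reported as a processing instruction.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: lists the processing instructions in a document.
class PiDemo {
    static List<String> instructions(String xml) throws Exception {
        List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void processingInstruction(String target, String data) {
                found.add(target + " -> " + data);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<?xml-stylesheet href=\"style.xsl\" type=\"text/xsl\"?>"
                + "<doc/>";
        System.out.println(instructions(xml));
    }
}
```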
23
Validation
- Some parsers are validating, some non-validating
- Some parsers can do both
- SAX method to turn validation on:
    parser.setFeature("http://xml.org/sax/features/validation", true);
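A sketch of validation in practice, assuming the JDK's bundled parser (which supports the validation feature) and a DTD placed in the internal subset of the document. Validity errors are reported to the registered `ErrorHandler` through `error()`; the class name `ValidationDemo` and the sample DTD are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: counts validity errors reported by a validating parse.
class ValidationDemo {
    static final String DTD =
            "<!DOCTYPE doc [<!ELEMENT doc (para)><!ELEMENT para (#PCDATA)>]>";

    static int validationErrors(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature("http://xml.org/sax/features/validation", true);
        int[] errors = {0};
        reader.setErrorHandler(new DefaultHandler() {
            @Override public void error(SAXParseException e) {
                errors[0]++; // validity violations arrive here, parsing continues
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validationErrors(DTD + "<doc><para>Hello</para></doc>"));
        System.out.println(validationErrors(DTD + "<doc><note/></doc>"));
    }
}
```

The first document conforms to the DTD and produces no errors; the second contains an undeclared `note` element, so at least one error is reported.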
24
Ignorable whitespace
- A validating parser can decide which whitespace can be ignored
- For a non-validating parser, all whitespace is just characters
- Content handler:
    public void ignorableWhitespace(char[] ch, int start, int length) { … }
25
XML Schema
- DTDs have drawbacks:
  - They can only define the element structure and attributes
  - They cannot define any database-like constraints for elements:
    - Value (min, max, etc.)
    - Type (integer, string, etc.)
  - DTDs are not written in XML and thus cannot be processed with the same tools as XML documents, XSL(T), etc.
- XML Schema:
  - Is written in XML
  - Avoids most of the DTD drawbacks
26
XML Schema
- XML Schema Part 1: Structures:
  - Element structure definition as with DTDs: elements, attributes, and enhanced ways to control structures
- XML Schema Part 2: Datatypes:
  - Primitive datatypes (string, boolean, float, etc.)
  - Datatypes derived from primitive datatypes (time, recurringDate)
  - Constraining facets for each datatype (minLength, maxLength, pattern, precision, etc.)
- Information about Schemas:
  - http://www.w3c.org/XML/Schema/
27
Complex and simple types
- Complex types: allow elements in their content and may have attributes
- Simple types: cannot have element content and cannot have attributes
28
Reminder: DTD declarations n
29
Example: USAddress type
    <xsd:attribute name="country" type="xsd:NMTOKEN" use="fixed" value="US" />
30
Example: PurchaseOrderType
31
Notes
- The element declarations for shipTo and billTo associate different element names with the same complex type
- Attribute declarations must reference simple types
- The element comment is declared elsewhere in the schema (it is only referenced here)
32
… continues
- An element is optional if minOccurs = 0
- maxOccurs gives the maximum number of times an element may appear
- Attributes may appear once or not at all
- The use attribute in an attribute declaration indicates whether the attribute is required or optional and, if optional, whether the value is fixed or whether there is a default
33
More examples
    …
    Lawnmower 1 148.95 Confirm this is electric
    Baby Monitor 1 39.98 1999-05-21
    …
34
    <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
    <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
    </xsd:complexType>
    </xsd:simpleType>
35
Patterns
    </xsd:simpleType>
- "three digits followed by a hyphen followed by two upper-case ASCII letters"
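The quoted description corresponds to the regular expression `\d{3}-[A-Z]{2}`. The sketch below checks values against such a pattern facet using the JDK's `javax.xml.validation` API. It is written against the final XML Schema 1.0 Recommendation syntax (`xsd:restriction`, namespace `http://www.w3.org/2001/XMLSchema`), which may differ from the draft syntax used on the slides; the element name `sku` and the class name `PatternDemo` are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class: validates a value against a pattern facet.
class PatternDemo {
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
          + "<xsd:element name='sku'>"
          + "<xsd:simpleType>"
          + "<xsd:restriction base='xsd:string'>"
          + "<xsd:pattern value='\\d{3}-[A-Z]{2}'/>"
          + "</xsd:restriction>"
          + "</xsd:simpleType>"
          + "</xsd:element>"
          + "</xsd:schema>";

    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // Without a custom error handler, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<sku>926-AA</sku>")); // prints true
        System.out.println(isValid("<sku>926-aa</sku>")); // prints false
    }
}
```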
36
Building content models
- <xsd:sequence>: fixed order
- <xsd:choice>: (1) choice of alternatives
- <xsd:group>: grouping (also named)
- <xsd:all>: no order specified
37
Null values
- A missing element may mean many things: unknown, not applicable…
- An attribute can indicate that the element content is null
  in schema:
    <xsd:element name="shipDate" type="xsd:date" nullable="true" />
  in document:
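In the final XML Schema 1.0 Recommendation this mechanism is spelled `nillable="true"` in the schema and `xsi:nil="true"` in the instance document, whereas the slide shows the earlier draft spelling. The following sketch uses the final spelling, since that is what the JDK's validator accepts; the class name `NilDemo` is hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class: an element that may be explicitly nil.
class NilDemo {
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
          + "<xsd:element name='shipDate' type='xsd:date' nillable='true'/>"
          + "</xsd:schema>";

    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String nil = "<shipDate xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
                + " xsi:nil='true'/>";
        System.out.println(isValid(nil));                             // prints true
        System.out.println(isValid("<shipDate>1999-05-21</shipDate>")); // prints true
        System.out.println(isValid("<shipDate/>"));                   // prints false
    }
}
```

An empty `shipDate` without `xsi:nil` is rejected because the empty string is not a valid `xsd:date`, which is the distinction the null mechanism exists to express.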
38
Specifying uniqueness
- XML Schema makes it possible to specify that an attribute or element value must be unique within a certain scope
- The unique element: first "select" a set of elements, then identify the attribute or element ("field"), relative to each selected element, that has to be unique within the scope of the set of selected elements
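The select-then-field idea can be tried out with a small schema. This sketch uses the final XML Schema 1.0 syntax (`xsd:unique` with `xsd:selector` and `xsd:field`) and the JDK's `javax.xml.validation` API; the `parts`/`part`/`number` names and the class name `UniqueDemo` are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class: part numbers must be unique within <parts>.
class UniqueDemo {
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
          + "<xsd:element name='parts'>"
          + "<xsd:complexType><xsd:sequence>"
          + "<xsd:element name='part' maxOccurs='unbounded'>"
          + "<xsd:complexType>"
          + "<xsd:attribute name='number' type='xsd:string' use='required'/>"
          + "</xsd:complexType>"
          + "</xsd:element>"
          + "</xsd:sequence></xsd:complexType>"
          // Scope: this <parts> element. Select its <part> children,
          // then require the @number field to be unique among them.
          + "<xsd:unique name='uniquePartNumber'>"
          + "<xsd:selector xpath='part'/>"
          + "<xsd:field xpath='@number'/>"
          + "</xsd:unique>"
          + "</xsd:element>"
          + "</xsd:schema>";

    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // prints true, then false
        System.out.println(isValid("<parts><part number='1'/><part number='2'/></parts>"));
        System.out.println(isValid("<parts><part number='1'/><part number='1'/></parts>"));
    }
}
```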
39
Defining keys and their references
- Keys and key references can also be defined:
    key: selector parts/part, field @number
    key reference: selector regions/zip/part, field @number
40
XML Query Languages
- Currently:
  - There is no recommendation/standard available, only drafts
  - Different suggestions were given in 1998; work is in progress
- XML Query Requirements:
  - Requirements draft 16.8.2000
  - Query language expected by the end of 2000
- XML Query Data Model:
  - Draft 11.5.2000
- More on XML Query Languages:
  - http://www.w3.org/XML/Query/
41
XML Query Languages
- Required features of an XML query language:
  - Support operations (selection, projection, aggregation, sorting, etc.) on all data types:
    - Choose a part of the data based on content or structure
    - Also operations on the hierarchy and sequence of document structures
  - Structural preservation and transformation:
    - Preserve the relative hierarchy and sequence of input document structures in the query results
    - Transform XML structures and create new XML structures
  - Combination and joining:
    - Combine related information from different parts of a given document or from multiple documents
42
XML Query Languages
- Required features of an XML query language (cont'd):
  - Closure property:
    - The result of an XML document query is also an XML document (usually not valid but well-formed)
    - The results of a query can be used as input to another query
- Notions:
  - HTML is layout-oriented, so queries cannot be carried out efficiently
  - XML is not layout-oriented but is based on representing structure; DTDs and structure information can be used in queries
  - XML query languages are still under construction, but prototype languages exist (e.g., XML-QL, XQL, Lore…)
43
XML Query Languages
- We want our query to collect elements from manufacturer documents (in temp.database.xml) listing manufacturer's name, year, models, vendors, price, etc. to create new elements
  - The results should list their make, model, vendor, rank, and price (in this order)
- Lorel:
    Select xml(car: (select X.vehicle.make, X.vehicle.model, X.vehicle.vendor,
                            X.manufacturer.rank, X.vehicle.price
                     from temp.database.xml X))
44
XML Query Languages
- XML-QL:
    WHERE <manufacturer>
            $mn $mon $r $y $mn
          </manufacturer> IN www.nhcs\temp.database.xml
    CONSTRUCT <car>
            $mn $mon $v $r $y
          </car>