Published by Charlotte Spencer. Modified over 9 years ago.
1
XML Refresher Course
Bálint Joó, School of Physics, University of Edinburgh
May 02, 2003
2
Contents
- XML Documents
  - Basic Structure
  - Parsing via SAX
- Document Object Model (DOM)
  - Basic Tree Representation
  - DOM Node Types
  - DOM Notes
- Conclusions
3
XML Documents
- Begin with a prologue.
- A sequence of tags follows, enclosing the document's data.
4
Element Structure
- Elements have a name.
- They must occur as a pair of opening and closing tags, possibly containing data, or as a single empty tag (no data).
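The rules above can be tried out with a short sketch. It uses Python's standard-library `xml.etree.ElementTree`, which is not the toolkit used in the talk, and the document text, the tag names `params`, `beta` and `quenched`, and the helper `is_well_formed` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a prologue followed by a sequence of tags.
DOC = """<?xml version="1.0"?>
<params>
  <beta>5.2</beta>
  <quenched/>
</params>"""

root = ET.fromstring(DOC)
beta = root.find("beta")          # opening/closing pair containing data
quenched = root.find("quenched")  # empty tag, carries no data

print(beta.tag, beta.text)
print(quenched.tag, quenched.text)


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False
```

An opening tag closed by a different name, such as `<a>data</b>`, is rejected by `is_well_formed`, which is the "must occur as a pair" rule in action.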
5
Attributes
- Elements can have one or more attributes.
- Attributes are name/value pairs.
- Attributes are simple: they have no sub-tags.
- Attributes may have a special purpose (e.g. declaration): an xmlns:bj attribute declares the namespace bj.
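As a small illustration of attributes as flat name/value pairs, the sketch below reads them back with Python's standard-library `xml.etree.ElementTree`. The element name `lattice` and the attributes `nx` and `nt` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical empty element carrying two attributes.
elem = ET.fromstring('<lattice nx="16" nt="32"/>')

# Attributes are exposed as a plain mapping of names to string values;
# they have no internal structure (no sub-tags).
attrs = dict(elem.attrib)
print(attrs)

# Values are always strings; any conversion is up to the application.
volume_sites = int(elem.get("nx")) ** 3 * int(elem.get("nt"))
print(volume_sites)
```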
6
Namespaces
- Allow reuse of tag names for different purposes.
- Consist of a prefix and a URI.
- Are declared with an xmlns attribute.
- Tags and attributes from a namespace carry its prefix.
- In some cases, attribute values may be prefixed too.
7
Namespaces in QCDML
- Suppose the Metadata Working Group can't agree on a convention for a parameter: both UKQCD and SciDAC want to use the name beta, but with different meanings.
- Define two namespaces: sciDac and ukqcd.
- Can then have the tags sciDac:beta and ukqcd:beta side by side.
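A minimal sketch of the two-namespace idea, using Python's standard-library `xml.etree.ElementTree`. The namespace URIs, the enclosing `params` element and the numeric values are placeholders chosen for illustration; they are not taken from any QCDML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs: a real schema would define its own.
DOC = """<params xmlns:sciDac="http://example.org/sciDac"
        xmlns:ukqcd="http://example.org/ukqcd">
  <sciDac:beta>5.2</sciDac:beta>
  <ukqcd:beta>2.6</ukqcd:beta>
</params>"""

# Prefix-to-URI map used for lookups; the prefixes here are local to
# this script and only the URIs have to match the document.
NS = {
    "sciDac": "http://example.org/sciDac",
    "ukqcd": "http://example.org/ukqcd",
}

root = ET.fromstring(DOC)
scidac_beta = root.find("sciDac:beta", NS)
ukqcd_beta = root.find("ukqcd:beta", NS)

# ElementTree stores the expanded name as {URI}localname.
print(scidac_beta.tag, scidac_beta.text)
print(ukqcd_beta.tag, ukqcd_beta.text)
```

Both elements share the local name `beta`, yet they stay distinct because the parser identifies each by its namespace URI rather than by its prefix.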
8
Parsing XML via SAX
- SAX: Simple API for XML.
- Treats the XML document as a “program”.
- SAX parsers provide hooks that let the user write an “interpreter” for the “program”.
- Generally fast, with a small memory footprint.
- BUT: writing interpreters is potentially burdensome and problem-specific.
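To make the "interpreter" picture concrete, here is a minimal sketch using Python's standard-library `xml.sax`. The handler class `BetaHandler`, the tag name `beta` and the input document are assumptions for illustration; the point is only that the parser calls the hooks and the user's code decides what each event means.

```python
import xml.sax


class BetaHandler(xml.sax.ContentHandler):
    """Collects the numeric content of every <beta> element."""

    def __init__(self):
        super().__init__()
        self.in_beta = False
        self.chunks = []
        self.values = []

    def startElement(self, name, attrs):
        # Hook called at each opening tag.
        if name == "beta":
            self.in_beta = True
            self.chunks = []

    def characters(self, content):
        # Hook called for character data; it may arrive in several pieces.
        if self.in_beta:
            self.chunks.append(content)

    def endElement(self, name):
        # Hook called at each closing tag.
        if name == "beta":
            self.values.append(float("".join(self.chunks)))
            self.in_beta = False


handler = BetaHandler()
xml.sax.parseString(b"<run><beta>5.2</beta><beta>5.7</beta></run>", handler)
print(handler.values)
```

Nothing is kept in memory except what the handler chooses to store, which is where the small footprint comes from. The cost is that state tracking such as `in_beta` has to be written by hand for each problem.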
9
Document Object Model (DOM)
- DOM specifies a dynamic interface to XML documents.
- Tree-based representation.
- Various APIs for accessing the representation:
  - traversing
  - searching
  - creating/updating
- We consider here the tree representation only (as it is closely related to XPath).
10
DOM Trees
[Figure: the Document node links to the Root node. Each node has a parent link and child links, plus previous/next sibling links to its sibling (brother/sister) nodes.]
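The links in the tree picture can be followed directly in code. The sketch below uses Python's standard-library `xml.dom.minidom` as a stand-in DOM implementation; the document and its tag names are made up, and the markup is written without whitespace between tags so that no extra text nodes appear among the children.

```python
from xml.dom import minidom

# Hypothetical document with a root element and three children.
doc = minidom.parseString("<root><a>1</a><b>2</b><c>3</c></root>")

root = doc.documentElement     # Document node -> root node
a, b, c = root.childNodes      # children, in document order

print(root.parentNode is doc)          # the root's parent is the Document
print(b.parentNode.nodeName)           # child -> parent link
print(b.previousSibling.nodeName)      # previous sibling link
print(b.nextSibling.nodeName)          # next sibling link
print([n.nodeName for n in root.childNodes])
```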
11
DOM Nodes
There are several types of node. The most useful:
- Document
- Element: corresponds to an opening/closing tag pair or to an empty tag.
- Attribute: the attribute in a tag.
- Text: the data in an element, or the value in an attribute.
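The node types listed above can be inspected with Python's standard-library `xml.dom.minidom`, used here as one possible DOM implementation. The one-line document, with its names `tag` and `attribute`, is purely illustrative.

```python
from xml.dom import minidom, Node

# Hypothetical document containing one element, one attribute, one text node.
doc = minidom.parseString('<tag attribute="value">data</tag>')

elem = doc.documentElement
attr = elem.getAttributeNode("attribute")
text = elem.firstChild

# Each node reports its type through the nodeType constant.
print(doc.nodeType == Node.DOCUMENT_NODE)
print(elem.nodeType == Node.ELEMENT_NODE)
print(attr.nodeType == Node.ATTRIBUTE_NODE)
print(text.nodeType == Node.TEXT_NODE)

print(text.data)    # the data in the element
print(attr.value)   # the value in the attribute
```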
12
DOM Notes
- DOM preserves document order (parent/child and previous/next sibling links).
- Getting documents into DOM is easy. Using libxml: doc = xmlParseFile("foo.xml");
- Many free DOM parsers exist, even for C/C++: Apache Xerces, libxml.
- Difficulty shifts to extracting data from the DOM.
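The contrast between easy loading and laborious extraction can be seen in a short sketch. It uses Python's standard-library `xml.dom.minidom` rather than libxml, so `minidom.parse` only plays a role similar to the libxml call above. The file contents, the tag name `beta` and the helper `text_of` are assumptions made for the example.

```python
import os
import tempfile
from xml.dom import minidom


def text_of(node):
    """Concatenate the text-node children of an element (hypothetical helper)."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


with tempfile.TemporaryDirectory() as tmp:
    # Write a small illustrative file so the sketch is self-contained.
    path = os.path.join(tmp, "foo.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write("<run><beta>5.2</beta><beta>5.7</beta></run>")

    # Loading the whole document into a DOM tree is a single call.
    doc = minidom.parse(path)

    # Extracting typed data takes more work: find elements, gather
    # their text children, then convert by hand.
    betas = [float(text_of(e)) for e in doc.getElementsByTagName("beta")]

print(betas)
```

Even for this tiny file the extraction step is several times longer than the parse step, which is the shift in difficulty the slide points to.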
13
Conclusions
- This talk provided a basic introduction to XML document structure.
- Discussed the DOM representation of XML.
- Highlighted the need to define an Easy To Use API to query DOM objects.
  - What does Easy To Use mean?
  - What is Easy To Parse?
- Stay tuned for Part 2...