SAX: A Parser for XML Documents
XML Parsers
What is an XML parser?
– Software that reads and parses XML
– Passes data to the invoking application
– The application does something useful with the data
XML Parsers
Why is this a good thing?
– Since XML is a standard, we can write generic programs to parse XML data
– Frees the programmer from writing a new parser each time a new data format comes along
XML Parsers
Two types of parser
– SAX (Simple API for XML)
    Event-driven API
    Sends events to the application as the document is read
– DOM (Document Object Model)
    Reads the entire document into memory in a tree structure
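The difference between the two parser types can be sketched by counting elements both ways. This is a minimal illustration, assuming the JAXP factories bundled with the JDK rather than any particular parser named in these slides; the class name SaxVersusDom and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: counts elements with SAX and with DOM.
class SaxVersusDom {

    // SAX: the parser calls back once per start tag; nothing is retained.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    // DOM: the whole document is built as a tree first, then queried.
    static int countWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book/><book/></library>";
        System.out.println("SAX count: " + countWithSax(xml));
        System.out.println("DOM count: " + countWithDom(xml));
    }
}
```

Both methods report the same count. The SAX version never holds more than a counter in memory, while the DOM version keeps the full tree available for later queries or modification.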
Simple API for XML
SAX Parser
When should I use it?
– Large documents
– Memory constrained devices
When should I use something else?
– If you need to modify the document
– SAX doesn’t remember previous events unless you write explicit code to do so.
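The "explicit code" needed to remember previous events is often just a stack of open elements. The sketch below is one possible approach, not a prescribed pattern: it assumes the SAX 2 DefaultHandler and the JDK's JAXP SAXParserFactory (instead of the SAX 1 HandlerBase used later in these slides), and the class name PathTrackingHandler is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: keeps a stack of open elements so each piece of
// text can be reported together with the path of its enclosing elements.
class PathTrackingHandler extends DefaultHandler {
    private final Deque<String> openElements = new ArrayDeque<>();
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        openElements.addLast(qName);   // remember where we are
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast();     // leave the current element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Note: a parser may deliver one text run in several chunks;
        // this sketch treats each chunk separately.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            seen.add("/" + String.join("/", openElements) + " = " + text);
        }
    }

    static List<String> parse(String xml) throws Exception {
        PathTrackingHandler handler = new PathTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item><name>pen</name></item></order>";
        for (String line : parse(xml)) {
            System.out.println(line);
        }
    }
}
```

Without the stack, the characters() callback would see the text "pen" with no indication of which element contains it.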
SAX Parser
Which languages are supported?
– Java
– Perl
– C++
– Python
SAX Parser
Versions
– SAX 1.0 was introduced in May 1998
– SAX 2.0 was introduced in May 2000 and adds support for
    namespaces
    filter chains
    querying and setting properties in the parser
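As a rough illustration of the namespace support added in SAX 2.0, the sketch below records each element's namespace URI and local name. It assumes the JDK's JAXP SAXParserFactory with namespace awareness switched on; the class name NamespaceDemo and the urn:example:books namespace are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records elements as {namespaceURI}localName.
class NamespaceDemo extends DefaultHandler {
    final List<String> elements = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        // SAX 2 passes the namespace URI and local name separately
        // from the prefixed (qualified) name.
        elements.add("{" + uri + "}" + localName);
    }

    static List<String> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);   // off by default in JAXP
        XMLReader reader = factory.newSAXParser().getXMLReader();
        NamespaceDemo handler = new NamespaceDemo();
        reader.setContentHandler(handler); // SAX 2 replacement for setDocumentHandler
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<b:book xmlns:b='urn:example:books'><b:title/></b:book>";
        for (String element : parse(xml)) {
            System.out.println(element);
        }
    }
}
```

The handler sees the same namespace URI regardless of which prefix the document author chose, which is the main practical benefit over matching on prefixed names.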
SAX Parser
Some popular SAX parsers
– Apache XML Project Xerces Java Parser
– IBM’s XML for Java (XML4J)
– For a complete list, see
SAX Implementation in Java
Create a class which extends the SAX event handler

import org.xml.sax.*;
import org.xml.sax.helpers.ParserFactory;

public class SaxApplication extends HandlerBase {
    public static void main(String args[]) {
    }
}
SAX Implementation in Java
Create a SAX Parser

public static void main(String args[]) {
    String parserName = "org.apache.xerces.parsers.SAXParser";
    try {
        SaxApplication app = new SaxApplication();
        // Instantiate the parser class by name and register the handlers
        Parser parser = ParserFactory.makeParser(parserName);
        parser.setDocumentHandler(app);
        parser.setErrorHandler(app);
        // args[0] is the system identifier (URI) of the document to parse
        parser.parse(new InputSource(args[0]));
    } catch (Throwable t) {
        // Handle exceptions
    }
}
SAX Implementation in Java
Override event handlers of interest

public class SaxApplication extends HandlerBase {
    public static void main(String args[]) {
        // stuff missing
    }

    public void startElement(String name, AttributeList attrs) {
        // Process this element
    }
}
SAX Implementation in Java
Other events generated by the parser
– startDocument()
– endDocument()
– startElement()
– endElement()
– error()
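To see the order in which these events arrive, a handler can simply log each callback. The following is a sketch under stated assumptions: it uses the SAX 2 DefaultHandler and the JDK's JAXP SAXParserFactory rather than the SAX 1 HandlerBase and ParserFactory shown in the slides above, and the class name EventLogger is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records every event in the order it is received.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void error(SAXParseException e) {
        // Called for recoverable errors such as validity violations;
        // well-formedness problems are reported via fatalError() instead.
        events.add("error: " + e.getMessage());
    }

    static List<String> parse(String xml) throws Exception {
        EventLogger logger = new EventLogger();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), logger);
        return logger.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<a><b/></a>")) {
            System.out.println(event);
        }
    }
}
```

For a well-formed document, the start and end events nest exactly as the elements do, with startDocument first and endDocument last.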
For more information... java.sun.com/xml