XML at a Neighborhood University Near You. Innovation 2005, September 16, 2005. Kwok-Bun Yue, University of Houston-Clear Lake.
9/10/2005Bun Yue: 2 Content What is XML? XML Modeling XML Parsing XML Transformation XML and Databases XML at UHCL Conclusions
What is XML? XML stands for eXtensible Markup Language. XML is a system for defining, validating, and sharing document formats. Standards organization: World Wide Web Consortium (W3C).
XML Basic Constructs: XML uses tagged elements and attributes to describe document structure and properties. Unlike HTML, XML is extensible: authors can use XML to define a new language for a given application. In that sense, XML is a meta-language.
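A Simple XML Example: a short document whose text values are "Bun Yue", "Everybody", and "Hello, welcome!" (the markup was lost in extraction). Callouts: the XML declaration, which carries the version, must be on the first line; the root element encloses the XML contents.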
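Since the tags of the slide's example did not survive, the following is only a sketch of what such a document might look like. The element names `greeting`, `from`, `to`, and `message` are assumptions; only the three text values come from the slide. The Java wrapper parses the document with the JDK's built-in parser and reads values back.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class SimpleXmlSketch {
    // Hypothetical vocabulary: only the text values come from the slide.
    static final String GREETING =
        "<?xml version=\"1.0\"?>\n"          // XML declaration on the first line
        + "<greeting>\n"                      // root element
        + "  <from>Bun Yue</from>\n"
        + "  <to>Everybody</to>\n"
        + "  <message>Hello, welcome!</message>\n"
        + "</greeting>";

    // Parses the text and returns the root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the text of the first descendant element with the given name.
    static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent();
    }

    public static void main(String[] args) {
        Element root = parseRoot(GREETING);
        System.out.println(root.getTagName() + ": " + childText(root, "message"));
    }
}
```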
Why XML? XML captures semantics well. It is simple, text-based, and standardized, with wide support, validation, and an abundance of tools.
Some Disadvantages: Verbose. Text-based. The ordered tree model may not be the best fit for every application.
Some UHCL XML Applications: Web Services (SOAP, UDDI, WSDL, etc.); XHTML; VoiceXML.
Some UHCL XML Applications: Wireless Markup Language (WML).
Some UHCL XML Applications: Scalable Vector Graphics (SVG). Figure: a triangle fractal.
XML Modeling: Devise an XML vocabulary to capture an application. Available modeling tools and languages, such as UML, may be used. XML basically uses an ordered tree model.
XML Tree Model: an XML file with the text values "Hi", "There", and "Bye" (the markup was lost in extraction). Figure: an XML tree showing element nodes only, with the nodes Document Root, Prolog, a, b, b, and c.
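The ordered tree model can be illustrated with a small traversal. The sketch below assumes one plausible nesting of the slide's elements (`a` as the root with children `b`, `c`, `b`); the original markup is lost, so that nesting is a guess. It lists element nodes only, in document order, indented by depth.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XmlTreeSketch {
    // Assumed document: names and text values appear on the slide,
    // but their exact nesting is a guess.
    static final String SAMPLE = "<a><b>Hi</b><c>There</c><b>Bye</b></a>";

    // Returns element names in document order, indented two spaces per level.
    static List<String> elementOutline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            collect(doc.getDocumentElement(), 0, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    private static void collect(Node node, int depth, List<String> lines) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text, comments, and other non-element nodes
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        lines.add(line.append(node.getNodeName()).toString());
        // Children are visited in document order: the tree is ordered.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        for (String line : elementOutline(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```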
Syntax & Validation: All XML documents must be well-formed, that is, they must satisfy the basic syntax rules. XML documents may additionally be validated against various schemas. Validation: –Cost: time and effort. –Benefit: increased reliability.
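A well-formedness check needs no schema at all: any conforming parser rejects a document that breaks the basic syntax. The following minimal sketch uses the JDK's built-in parser; the class and method names are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedSketch {
    // Returns true if the text parses as well-formed XML (no schema involved).
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a fatal error means the basic syntax was violated
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));  // properly nested
        System.out.println(isWellFormed("<a><b></a>"));   // <b> is never closed
    }
}
```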
XML Validation Languages: –Document Type Definition (DTD) –XML Schema –Schematron –RELAX NG
Document Type Definition (DTD): A grammar that determines the validity of an XML document. An XML document that satisfies the rules of a DTD is said to be valid with respect to that DTD. DTD is part of the XML standard.
A Simple DTD (the element declarations were lost in extraction): <!ATTLIST person id ID #REQUIRED spouse IDREF #IMPLIED>
DTD Validation: An XML document that is valid under the DTD, with persons Adam, Eva, and John (the markup was lost in extraction).
Not Validated: A document that fails validation, with persons Eva, Adam, John, and Jack (the markup was lost in extraction).
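DTD validation can be sketched with the JDK's validating parser. Only the `ATTLIST` declaration below is from the slides; the `ELEMENT` declarations, the root name `family`, and the attribute values are assumptions, as are the reasons the second document fails (a missing required `id` and a `spouse` reference to an `id` that does not exist).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class DtdValidationSketch {
    // Only the ATTLIST line is from the slide; the rest is assumed.
    static final String DTD =
        "<!DOCTYPE family [\n"
        + "  <!ELEMENT family (person*)>\n"
        + "  <!ELEMENT person (#PCDATA)>\n"
        + "  <!ATTLIST person id ID #REQUIRED spouse IDREF #IMPLIED>\n"
        + "]>\n";

    // Returns true if the document body is valid against the DTD above.
    static boolean isValid(String body) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        final boolean[] valid = {true};
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored */ }
                // Validity violations are reported as recoverable errors.
                public void error(SAXParseException e) { valid[0] = false; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (SAXException e) {
            return false; // not even well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return valid[0];
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "<family><person id=\"a\" spouse=\"e\">Adam</person>"
            + "<person id=\"e\" spouse=\"a\">Eva</person>"
            + "<person id=\"j\">John</person></family>"));
        System.out.println(isValid(
            "<family><person id=\"a\" spouse=\"x\">Adam</person>"
            + "<person>Jack</person></family>"));
    }
}
```

Note that both documents are well-formed; only validation against the DTD tells them apart.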
Limitations of DTD: Every schema language has limited expressive power. DTD is simple but not very expressive. Other schema languages, such as XML Schema, are more expressive (and more complicated).
XML Parsing: A large collection of XML parsers is available in various languages: Java, Perl, C#, … Two popular classes: –DOM (Document Object Model): builds an XML tree. –SAX (Simple API for XML): event-driven (push).
SAX: XML input is converted to a sequence of events (e.g. startElement, endElement, characters, …). Programmers define event handlers to handle the events.
SAX Example (Java):
    // Invoked by the parser once for every start tag.
    public void startElement(String namespaceURI,
                             String lName,   // local name
                             String qName,   // qualified name
                             Attributes attrs) throws SAXException {
        numElements++;  // numElements is a data member.
    } // startElement
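For readers who want to run the handler, here is one way to embed it in a complete program. It is a sketch that assumes the handler extends `DefaultHandler` and that the input arrives as a string; the class names are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxCountSketch {
    // Event handler: counts startElement events and ignores all others.
    static final class ElementCounter extends DefaultHandler {
        int numElements = 0;

        @Override
        public void startElement(String namespaceURI, String lName,
                                 String qName, Attributes attrs) {
            numElements++;
        }
    }

    // Pushes the document through a SAX parser and returns the element count.
    static int countElements(String xml) {
        ElementCounter counter = new ElementCounter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), counter);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return counter.numElements;
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b>Hi</b><c/><b>Bye</b></a>"));
    }
}
```

No tree is built: the parser only calls back into the handler, which is why SAX suits large inputs where a single pass is enough.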
DOM: The Document Object Model (DOM) is a W3C standard: a "platform- and language-neutral interface" to represent documents. A DOM parser parses an XML document and builds an XML tree. DOM classes can then be used to access and manipulate the tree.
DOM Example (Java):
    try {
        // Parse input XML file.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);  // otherwise getLocalName() returns null
        DocumentBuilder builder = factory.newDocumentBuilder();
        document = builder.parse(new File(argv[0]));
        System.out.println("Name of root element of " + argv[0] + " = "
            + document.getDocumentElement().getLocalName());
    } …
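Beyond reading the root element, the in-memory tree can also be modified. The sketch below parses from a string rather than a file and appends a new element; the class name, method names, and sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DomEditSketch {
    // Builds a DOM tree from XML text.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Appends <tagName>text</tagName> under the root element and returns
    // the number of elements now below the root.
    static int appendChild(Document doc, String tagName, String text) {
        Element child = doc.createElement(tagName);
        child.setTextContent(text);
        doc.getDocumentElement().appendChild(child);
        return doc.getDocumentElement().getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>Hi</b></a>");
        System.out.println("root = " + doc.getDocumentElement().getNodeName());
        System.out.println("descendants = " + appendChild(doc, "c", "There"));
    }
}
```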
XML Transformation: Transformation from XML to XML and to other formats. It can be programmed directly with XML parsers, or written in XSLT (Extensible Stylesheet Language Transformations).
XSLT: A rule-based language for XML transformation. A stylesheet contains a set of templates (rules) that identify the components to be acted on.
XSLT Template (the template markup was lost in extraction).
XSLT Example: An XSLT template to replace one element by another, preserving its content (the element names and the template markup were lost in extraction).
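A rename of this kind might look like the following sketch, run through the JDK's built-in XSLT 1.0 processor. The element names are hypothetical (`b` is replaced by `strong`), and the identity template is an addition so that the rest of the document is copied unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltRenameSketch {
    // Hypothetical element names: <b> is rewritten to <strong>.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        // Identity rule: copies every node no other template matches.
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        // Rename rule: emits <strong> in place of <b>, keeping the content.
        + "<xsl:template match=\"b\">"
        + "<strong><xsl:apply-templates select=\"@*|node()\"/></strong>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and returns the result as text.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<a><b>Hi</b><c>There</c></a>"));
    }
}
```

The rename rule wins over the identity rule for `b` elements because a template matching a specific element name has a higher default priority than one matching `node()`.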
Storage of XML: XML can be stored as files or in a database. Leading databases support XML storage: –Native XML database –XML-enhanced database
XQuery: A W3C standard for querying and retrieving information from diverse XML sources. Its role is similar to that of SQL for relational databases.
XQuery Example: { for $f in doc("diagrams.xml")//figure return { $f/title } } (the enclosing element constructors were lost in extraction).
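The JDK does not ship an XQuery engine, so the sketch below swaps in XPath, the path language that XQuery builds on, to select the same nodes: the `title` of every `figure` in a document. It returns the title text instead of constructing new elements, and the sample content used in `main` is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FigureTitlesSketch {
    // Returns the text of every figure title, in document order.
    static List<String> figureTitles(String xml) {
        try {
            NodeList titles = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//figure/title",
                          new InputSource(new StringReader(xml)),
                          XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < titles.getLength(); i++) {
                result.add(titles.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("query failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the contents of diagrams.xml.
        String diagrams =
            "<diagrams><figure><title>Tree</title></figure>"
            + "<section><figure><title>Fractal</title></figure></section></diagrams>";
        System.out.println(figureTitles(diagrams));
    }
}
```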
XML-Related Courses: CSCI 4230 Internet Application Development: started covering XML in Spring –Example project: using the MS XML parser, parse XML weather information from an external site, extract its data, and present it in a specific HTML format.
XML-Related Courses: CSCI 5733 XML Application Development: started in Spring Covers the topics of this presentation in full detail, and much more. Programming assignments in XML parsers, XSLT, XPath, WML, SVG, etc.
Capstone Projects: Graduate capstone projects from external companies. Some XML project examples: –SVG –XML difference engine –XML-based workflow
XML Research at UHCL: Some examples: –Effective storage of XML in relational databases. –Mapping of DTDs to relational schemas. –Software metrics for XML Schema.
Conclusions: XML has wide potential for applications and research. UHCL is an early adopter. Many UHCL students are trained in XML.
Any questions? Thanks!