XML for Scientific Applications
Marlon Pierce
ERDC Tutorial, August
What is XML?
  –Standard rule set for defining custom tags.
    –Make your (meta)data human-readable.
    –Separate data content from presentation (XSL).
  –The rules for a particular dialect are defined in either a DTD or a Schema.
  –W3C: the standards-making body
    –Same people that produced HTML.
    –See
XML for E&M Input Data
Ex: XML for Electricity and Magnetism
  [XML listing: tags omitted for brevity; values shown include 2, balloon.dat, ASCII, P3D, none]
Ex: E&M DTD Fragment
  –Cut for brevity.
What the DTD Tells You
  –What tags can be included
  –Parent/child relationships
  –The number of allowed tags of a particular type
    –1 only, 0 or 1, 0 or more, 1 or more.
  –Names of attributes
  –Whether the tag takes parsed character data (#PCDATA)
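The points above can be sketched in Python, whose standard-library expat bindings report each DTD declaration as it is read. This is only an illustration: the DTD below is invented, and only the names ProjectDesc, GridData and MaterialList come from this tutorial; Material, Note and the format attribute are hypothetical. Note that expat reads the declarations but does not validate the document against them.

```python
import xml.parsers.expat as expat

# Hypothetical DTD: only ProjectDesc, GridData and MaterialList are names
# from the tutorial; the content models and the attribute are assumptions.
DOC = """<?xml version="1.0"?>
<!DOCTYPE ProjectDesc [
  <!ELEMENT ProjectDesc (GridData, MaterialList?, Note*)>
  <!ELEMENT GridData (#PCDATA)>
  <!ELEMENT MaterialList (Material+)>
  <!ELEMENT Material (#PCDATA)>
  <!ELEMENT Note (#PCDATA)>
  <!ATTLIST GridData format CDATA #REQUIRED>
]>
<ProjectDesc><GridData format="ASCII">balloon.dat</GridData></ProjectDesc>
"""

# Map expat's quantifier codes to the wording used on the slide.
QUANT = {
    expat.model.XML_CQUANT_NONE: "1 only",
    expat.model.XML_CQUANT_OPT: "0 or 1",
    expat.model.XML_CQUANT_REP: "0 or more",
    expat.model.XML_CQUANT_PLUS: "1 or more",
}

children = {}    # parent tag -> [(child tag, allowed count)]
attributes = {}  # tag -> [attribute names]
pcdata = []      # tags that may contain character data


def on_element_decl(name, content_model):
    # content_model is a tuple: (type, quantifier, name, children)
    ctype, _quant, _name, kids = content_model
    if ctype == expat.model.XML_CTYPE_MIXED:
        pcdata.append(name)
    children[name] = [(kid[2], QUANT[kid[1]]) for kid in kids]


def on_attlist_decl(elname, attname, type_, default, required):
    attributes.setdefault(elname, []).append(attname)


parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOC, True)

for parent, kids in children.items():
    for child, count in kids:
        print(f"{parent} may contain {child}: {count}")
print("attributes:", attributes)
print("character data:", pcdata)
```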
Ex: E&M Schema Fragment ….
Schema vs. DTD (a partial list)
  –Schemas are in XML; DTDs are not.
  –Schemas have several simple types (integers, strings, floats, …); DTDs treat everything as character data.
  –Schema complex types support inheritance
    –A Bee complex type can be extended by Drone, Queen, and Worker subtypes.
  –But DTDs have been around longer.
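Because a Schema is itself XML, any XML parser can read it. The Python sketch below parses a small, hypothetical schema for the Bee example and lists which complex types extend which base, along with the simple types it declares. The Bee, Drone, Queen and Worker names follow the slide; the child elements (wingspan, eggsPerDay) are invented for illustration, and the code only inspects the schema, it does not validate anything against it.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: type names follow the slide, child elements are invented.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Bee">
    <xs:sequence>
      <xs:element name="wingspan" type="xs:float"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Queen">
    <xs:complexContent>
      <xs:extension base="Bee">
        <xs:sequence>
          <xs:element name="eggsPerDay" type="xs:integer"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="Drone">
    <xs:complexContent><xs:extension base="Bee"/></xs:complexContent>
  </xs:complexType>
  <xs:complexType name="Worker">
    <xs:complexContent><xs:extension base="Bee"/></xs:complexContent>
  </xs:complexType>
</xs:schema>"""

root = ET.fromstring(SCHEMA)

# Inheritance: derived complex type -> base type named in xs:extension.
base_of = {}
for ctype in root.findall(f"{XS}complexType"):
    ext = ctype.find(f"{XS}complexContent/{XS}extension")
    if ext is not None:
        base_of[ctype.get("name")] = ext.get("base")

# Simple types: element name -> declared built-in type.
typed = {el.get("name"): el.get("type") for el in root.iter(f"{XS}element")}

print(base_of)
print(typed)
```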
Now What?
  –Get a parser for your favorite language
    –Apache XML Project’s Xerces parser supports Java, C++, Perl
  –Write code using the parser:
    –Validates XML files.
    –Returns the DOM.
    –You can now navigate the XML document tree
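As a minimal sketch of the parse-then-navigate workflow, the Python example below uses the standard library's xml.dom.minidom in place of Xerces. Unlike a validating parser, minidom only checks that the document is well formed. The document itself is an assumption: the element names and the values balloon.dat and ASCII appear in this tutorial, but how they fit together here is invented.

```python
from xml.dom.minidom import parseString

# Assumed document layout; only the names and values come from the tutorial.
XML_TEXT = """<ProjectDesc>
  <GridData format="ASCII">balloon.dat</GridData>
  <MaterialList/>
</ProjectDesc>"""

# Parsing returns a DOM Document; malformed XML raises an ExpatError.
document = parseString(XML_TEXT)
root = document.documentElement

# Navigate the tree: list the element children of the root.
child_tags = [node.tagName for node in root.childNodes
              if node.nodeType == node.ELEMENT_NODE]

# Look up one element, then read its attribute and its text content.
grid = root.getElementsByTagName("GridData")[0]
grid_format = grid.getAttribute("format")
grid_file = grid.firstChild.data

print(root.tagName, child_tags)
print(grid_format, grid_file)
```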
Document Object Model
  –Defines general entities that make up the document.
  –Forms a tree
  –Objects include
    –Document
    –Node
    –Element
    –Attribute
  [Figure: example document tree with nodes ProjectDesc, GridData, MaterialList]
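To make the object types concrete, here is a small Python sketch that walks a DOM tree recursively and labels each node by kind. It uses xml.dom.minidom from the standard library; the describe helper is hypothetical, and the sample document is an assumed arrangement of names that appear in this tutorial.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

NODE_KIND = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
}


def describe(node, depth=0, lines=None):
    """Return one indented line per node, depth first (hypothetical helper)."""
    lines = [] if lines is None else lines
    label = NODE_KIND.get(node.nodeType, "Node")
    lines.append("  " * depth + f"{label}: {node.nodeName}")
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes hang off their element rather than appearing as children.
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            lines.append("  " * (depth + 1) + f"Attribute: {attr.name}={attr.value}")
    for child in node.childNodes:
        # Skip whitespace-only text nodes that come from indentation.
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            continue
        describe(child, depth + 1, lines)
    return lines


doc = parseString(
    '<ProjectDesc><GridData format="ASCII">balloon.dat</GridData>'
    '<MaterialList/></ProjectDesc>'
)
tree = describe(doc)
print("\n".join(tree))
```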
Practical Drawbacks
  –The DOM classes are very general.
  –They only provide you with the most general way of navigating the tree.
  –Typically, for every XML dialect you create, you will have to write new code to extract the information.
  –It would be nice if there were a better way to do this….
Automatic JavaBeans with Castor
  –XML trees map nicely into Java Bean components.
  –Get/Set methods return the information.
  –Castor: automatically generates JavaBeans from XML and vice versa.
  –You just write the Bean classes (simple) and Castor handles the mapping to XML.
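Castor is a Java library and its API is not reproduced here. To illustrate only the underlying idea (a plain data class whose fields map one-to-one onto XML elements), the Python sketch below uses a dataclass as a stand-in for a bean. The GridData class, its fields, and the to_xml and from_xml helpers are all hypothetical.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class GridData:
    # Hypothetical bean; the field names are assumptions, not the E&M dialect.
    filename: str = ""
    format: str = ""
    dimension: int = 0


def to_xml(bean):
    """Marshal: emit one child element per field, named after the field."""
    root = ET.Element(type(bean).__name__)
    for f in fields(bean):
        ET.SubElement(root, f.name).text = str(getattr(bean, f.name))
    return ET.tostring(root, encoding="unicode")


def from_xml(cls, text):
    """Unmarshal: read each field's child element and convert it to the field type."""
    root = ET.fromstring(text)
    values = {f.name: f.type(root.findtext(f.name)) for f in fields(cls)}
    return cls(**values)


original = GridData(filename="balloon.dat", format="ASCII", dimension=2)
xml_text = to_xml(original)
restored = from_xml(GridData, xml_text)
print(xml_text)
print(restored)
```

The mapping code is written once and works for any such class, which is the convenience the slide describes: no per-dialect tree-walking code.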
Some Standard XML Dialects
  –Don’t reinvent what already exists. See
  –MathML
  –CML: Chemical Markup Language
  –SVG: Scalable Vector Graphics
  –SOAP: Simple Object Access Protocol
  –RDF: Resource Description Framework
Scientific Visualization with SVG
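Since SVG is just XML, a plot can be produced by writing elements. The following Python sketch turns a short list of (x, y) samples into an SVG polyline. The polyline_svg helper, the canvas size, and the sample data are all assumptions made for illustration, and the scaling assumes at least two distinct x values and two distinct y values.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def polyline_svg(samples, width=200, height=100):
    """Return an SVG document that plots samples as one polyline (hypothetical helper)."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)

    def scale(x, y):
        # Map data coordinates onto the canvas; the SVG y axis points down.
        px = (x - x0) / (x1 - x0) * width
        py = height - (y - y0) / (y1 - y0) * height
        return f"{px:.1f},{py:.1f}"

    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    svg = ET.Element(f"{{{SVG_NS}}}svg", width=str(width), height=str(height))
    ET.SubElement(
        svg,
        f"{{{SVG_NS}}}polyline",
        points=" ".join(scale(x, y) for x, y in samples),
        fill="none",
        stroke="black",
    )
    return ET.tostring(svg, encoding="unicode")


svg_text = polyline_svg([(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)])
print(svg_text)
```

Saving svg_text to a file with an .svg extension gives something an SVG-capable viewer can display.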
XML Namespaces
  –Namespaces allow you to mix different types of XML.
    –You can combine custom and standard tags
    –Ex: combine GEMML plus MathML
Namespace Example
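As a stand-in illustration (not the original slide's listing), the Python sketch below parses a document that mixes custom tags with MathML and groups the element names by namespace. The MathML namespace URI is the standard one; the GEMML URI is a placeholder, and the gem:Formula element is hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML = "http://www.w3.org/1998/Math/MathML"
GEMML = "http://example.org/gemml"  # placeholder URI, not the real GEMML namespace

# Custom tags use the gem: prefix; the formula uses standard MathML tags.
DOC = f"""<gem:ProjectDesc xmlns:gem="{GEMML}" xmlns:m="{MATHML}">
  <gem:GridData>balloon.dat</gem:GridData>
  <gem:Formula>
    <m:math><m:mi>E</m:mi><m:mo>=</m:mo><m:mi>V</m:mi><m:mo>/</m:mo><m:mi>d</m:mi></m:math>
  </gem:Formula>
</gem:ProjectDesc>"""

root = ET.fromstring(DOC)

# ElementTree stores each tag as "{namespace-uri}localname"; split the two parts.
by_namespace = {}
for element in root.iter():
    uri, _, local = element.tag[1:].partition("}")
    by_namespace.setdefault(uri, []).append(local)

for uri, names in by_namespace.items():
    print(uri, names)
```

The prefixes (gem, m) are only local shorthand; it is the namespace URI that tells a program which dialect each tag belongs to.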
Additional References and Resources
  –Inside XML by Steven Holzner. New Riders (2001).
  –The W3C has a nice schema tutorial at
  –The ARL ICE project mixes XML and HDF5:
  –XSIL is a markup language for scientific data: