XML in the Oceanographic Community An Open Discussion Woods Hole Information Technology Group Smith Conference Room November 9, 2004 Robert C. Groman
What’s XML eXtensible Markup Language - A subset of SGML making up a particular text markup language for interchange of structured data. XML allows one to create customized tags that describe data structure, but not the style. Originally created to deal with electronic publishing. SGML – Standard Generalized Markup Language
What or Who is W3C®? “The World Wide Web Consortium (W3C®) develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential.” XML is usually accompanied by a Document Type Definition (DTD) file or schema file (XSD), and by style sheets (XSLT, eXtensible Stylesheet Language Transformations, and CSS, Cascading Style Sheets).
Ontology Ontology combines both a vocabulary and a taxonomy, together with a hierarchy of concepts, relationships, and axioms (that is, relevance). [Edgington, 2004] But I digress ….
What’s XML Look Like? Chris A. Jones WHOI "Redfield 226, MS #23" "Woods Hole, MA 02543"
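A minimal sketch of how a contact record like this one could be marked up and read back, assuming hypothetical tag names (contact, name, institution, address_line). Only the text values come from the slide; the tags are illustrative, not the original markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names; only the text values are taken from the slide.
RECORD = """\
<contact>
  <name>Chris A. Jones</name>
  <institution>WHOI</institution>
  <address_line>Redfield 226, MS #23</address_line>
  <address_line>Woods Hole, MA 02543</address_line>
</contact>
"""


def parse_contact(xml_text):
    """Parse the contact record into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "institution": root.findtext("institution"),
        # Repeated elements are collected in document order.
        "address": [el.text for el in root.findall("address_line")],
    }


if __name__ == "__main__":
    print(parse_contact(RECORD))
```

The tags describe what each value is (a name, an institution, an address line) and say nothing about how it should be displayed, which is the separation of structure from style noted earlier.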
XML Tools XML toolbox for Matlab Growing support within existing word processing applications Ties to NetCDF; Perl and other languages Altova’s XMLSPY
Why Am I Playing with XML? WebCOAST is the portal for data and information products developed by COOA researchers at the University of New Hampshire. “Info file” → DIF record → XML
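One way the step from an “info file” to an XML record might look, as a rough sketch only. The info file layout (one "key: value" pair per line) and the sample values are assumptions, not WebCOAST's actual format; the field names are modelled on DIF-style names for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical info file: one "key: value" pair per line.
# Field names are modelled on DIF-style names; the values are placeholders.
INFO_TEXT = """\
Entry_ID: EXAMPLE-0001
Entry_Title: Example cruise data set
Summary: Placeholder description for illustration only.
"""


def info_to_dif(info_text):
    """Build a DIF-like XML element from 'key: value' lines."""
    dif = ET.Element("DIF")
    for line in info_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        child = ET.SubElement(dif, key.strip())
        child.text = value.strip()
    return dif


if __name__ == "__main__":
    print(ET.tostring(info_to_dif(INFO_TEXT), encoding="unicode"))
```

A real converter would also need to validate the result against the DIF schema, which this sketch does not attempt.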
What’s Happening in Oceanography Circles There is much evidence that other oceanographic sites are implementing XML technologies, and WHOI is not part of it. What is “it”? Using XML technologies to foster data and information exchange. WHOI isn’t fully in the game (yet).
Review of Some of the Projects I was initially confused by the multiplicity of efforts. There seemed to be many groups doing the same thing. Why? It is still in the “research phase.” Texas A&M ICES-IOC Study Group on the Development of Marine Data Exchange Ocean Biogeographic Information System Monterey Bay Aquarium Research Institute WHOI
Texas A&M University SURA SCOOP Project Gerry Creager “The problem’s going to be getting the researchers to A) agree to get involved with data standards, metadata standards, data transport and archive, retrieval and catalogs, etc.; and B) perform the work they agreed to in (A) above.”
Texas A&M University (cont.) “I've been working with Ken Keiser and Sara Graves from UA/Huntsville on some of the data issues. I think we can have a serious positive impact on the Oceans Community if we can get some demonstrations of LDM and THREDDS running to provide the model community with some easier methods of getting the data. …”
Texas A&M University (cont.) “The SURA SCOOP project has funded a study to extend the OCEAN.US work on metadata standards and see about codifying some of these recommendations into a concrete document. I'm not directly working on that ….”
An Aside: Ocean Observing Systems and Ocean Observatories and Gulf of Maine Ocean Observing System (GoMOOS, Philip Bogden). Texas Mesonet is a network of meteorological monitoring instruments, broadly dispersed across the state. Funding is only just starting for the OOS.
Texas A&M University (cont.) “The SURA SCOOP project has funded a study to extend the OCEAN.US work on metadata standards.” “IODE and MBARI have some things.” “UCAR/NCAR CF has potential.” “MBARI should, by now, have some NSF funding from last year's money to do about a $500K piece on […] data standards work for tethered observatories.” “SURA (Southeast Universities' Research Association) has provided about $150K additional funds to […] SCOOP ….”
Texas A&M University (cont.) “ … Earth Sciences Markup Language from the University of Alabama/Huntsville…” [information exchange] “… a project from UAH called SensorML (and the associated Sensor Web) which have also seen interest from the OpenGeospatial Consortium.” [data exchange] “Graphics are another issue.” “ … using the Wide Area Information Server (WAIS) which is a natural language index/database system.” Sees benefits in working with WHOI to benefit TAMU and SCOOP.
International Oceanographic Data and Information Exchange (IODE) IOC and the International Council for the Exploration of the Sea (ICES) are cooperating in the development of a marine XML and have formed the ICES-IOC Study Group on the Development of Marine Data Exchange Systems using XML (SGXML).
Ocean Biogeographic Information System (OBIS) OBIS is the information component of the Census of Marine Life (CoML), a growing network of researchers in more than 45 nations engaged in a 10-year initiative to assess and explain the diversity, distribution, and abundance of life in the oceans - past, present, and future. OBIS is a web-based provider of global geo-referenced information on marine species.
Monterey Bay Aquarium Research Institute (MBARI) Mike McCann Shore Side Data System (SSDS) “[W]e are still working on our SSDS system and use XML quite a bit for describing our metadata. We have a schema and a database and all of the supporting Java code to make it all work. It's still very much a work in progress.”
Monterey Bay Aquarium Research Institute (cont.) John Graybeal SensorML work under the auspices of the OpenGIS Consortium (http://vast.uah.edu/SensorML) [Likes this effort; some XML schema specified] Ontologies for marine terminology – SWEET, GCMD, and BODC – http://wiki.mbari.org/marinemetadatawiki/StandardNameExercise
Monterey Bay Aquarium Research Institute (cont.) Leading the Marine Metadata project “Al Plueddemann and Bob Weller are strong participants in the Marine Metadata project.” “We are particularly interested in exemplars (as you are clearly working on one) and references.” [Referring to our effort to create XML-based tags for US GLOBEC metadata.]
Marine Biological Laboratory (MBL) and WHOI Dave Remsen, with WHOI’s Ralph Stephen, Andy Maffei, etc. XML-based ontology and classification systems. Has experience with Web Services using SOAP. OWL - Web Ontology Language
Conclusion There is much activity in the oceanographic community, which is embracing XML technologies in order to share, exchange, and (re)serve data. WHOI is a new player here but can offer a lot, including experience with lots of data.
References
- Central and Northern California Ocean Observing System
http:// - International Oceanographic Data and Information Exchange (IOC)
Edgington, Theresa, Beomjin Choi, Katherine Henson, T.S. Raghu, and Ajay Vinze, Adopting Ontology to Facilitate Knowledge Sharing, Communications of the ACM, November 2004, Vol 47, No
– Monterey Bay Area Workshop on Data Management and Visualization, 2003
http:// World Wide Web Consortium
http:// - DODS Developer's Web site (evolving DDS plus DAS into XML version (DDX))
References (cont.)
eport/ - W3C Semantic Web Advanced Development for Europe (EU MarineXML project, SEEGrid, IOC-UNESCO)
eport/ Dublin Core, FGDC and Global Change Master Directory DIF records
– – – – [Federal Geographic Data Committee]
References (cont.) - GEOsciences Network (GEON)