EARTH SCIENCE MARKUP LANGUAGE “Define Once Use Anywhere” INFORMATION TECHNOLOGY AND SYSTEMS CENTER UNIVERSITY OF ALABAMA IN HUNTSVILLE.

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
Data Formats: Using self-describing data formats Curt Tilmes NASA Version 1.0 Review Date.
Interoperability of Distributed Component Systems Bryan Bentz, Jason Hayden, Upsorn Praphamontripong, Paul Vandal.
XML Parsing Using Java APIs AIP Independence project Fall 2010.
1 SWE Introduction to Software Engineering Lecture 22 – Architectural Design (Chapter 13)
DCS Architecture Bob Krzaczek. Key Design Requirement Distilled from the DCS Mission statement and the results of the Conceptual Design Review (June 1999):
Introduction to Databases
Data Management I DBMS Relational Systems. Overview u Introduction u DBMS –components –types u Relational Model –characteristics –implementation u Physical.
Data format translation and migration Future possibilities Alasdair Crockett, Data Standards Manager UK Data Archive.
Russell Taylor Lecturer in Computing & Business Studies.
Automatic Data Ramon Lawrence University of Manitoba
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Computer Software.
HDF 1 NCSA HDF XML Activities Robert E. McGrath Mike Folk National Center for Supercomputing Applications.
OCLC Online Computer Library Center Two Paths to Interoperable Metadata Jean Godby, Devon Smith, Eric Childress DC-2003 September 29, 2003.
Introduction to Database Systems 1.  Assignments – 3 – 9%  Marked Lab – 5 – 10% + 2% (Bonus)  Marked Quiz – 3 – 6%  Mid term exams – 2 – (30%) 15%
MIT CSAIL/IBM Watson Research © 2004 IBM Corporation Haystack: Bringing Good Metadata to Life Dennis Quan
MDC Open Information Model West Virginia University CS486 Presentation Feb 18, 2000 Lijian Liu (OIM:
August Chapter 1 - Introduction Learning XML by Erik T. Ray Slides were developed by Jack Davis College of Information Science and Technology Radford.
9 Feb 2004Mikko Mäkinen & Saija Ylönen Joint UNECE/Eurostat/OECD work session on statistical metadata (METIS) Geneva, 9-11 February 2004, Topic (ii): Metadata.
Data Formats: Using Self-describing Data Formats Curt Tilmes NASA Version 1.0 February 2013 Section: Local Data Management Copyright 2013 Curt Tilmes.
University of Illinois at Urbana-ChampaignHDF 9/19/2000 McGrath 9/19/ Transition from HDF4 to HDF5: Issues Robert E. McGrath NCSA University of Illinois.
HDF-EOS Workshop VII, An XML Approach to HDF-EOS5 Files Jingli Yang 1, Bob Bane 1, Muhammad Rabi 1, Zhangshi Yin 1, Richard Ullman 1, Robert McGrath.
Calculation BIM Curriculum 07. Topics  Calculation with BIM  List Types  Output.
DCS Overview MCS/DCS Technical Interchange Meeting August, 2000.
Introduction to XML. XML - Connectivity is Key Need for customized page layout – e.g. filter to display only recent data Downloadable product comparisons.
Introduction to MDA (Model Driven Architecture) CYT.
A Metadata Based Approach For Supporting Subsetting Queries Over Parallel HDF5 Datasets Vignesh Santhanagopalan Graduate Student Department Of CSE.
1 Technologies for distributed systems Andrew Jones School of Computer Science Cardiff University.
Stephen Booth EPCC Stephen Booth GridSafe Overview.
Scalable Metadata Definition Frameworks Raymond Plante NCSA/NVO Toward an International Virtual Observatory How do we encourage a smooth evolution of metadata.
1 Module Objective & Outline Module Objective: After completing this Module, you will be able to, appreciate java as a programming language, write java.
© 2007 by Prentice Hall 1 Introduction to databases.
Why do I want to know about HDF and HDF- EOS? Hierarchical Data Format for the Earth Observing System (HDF-EOS) is NASA's primary format for standard data.
EARTH SCIENCE MARKUP LANGUAGE Why do you need it? How can it help you? INFORMATION TECHNOLOGY AND SYSTEMS CENTER UNIVERSITY OF ALABAMA IN HUNTSVILLE.
UAH The University of Alabama in Huntsville SUBSETTING Matt Smith Information Technology and Systems Center (ITSC) University of Alabama in Huntsville.
XML Registries Source: Java TM API for XML Registries Specification.
Ontologies and Lexical Semantic Networks, Their Editing and Browsing Pavel Smrž and Martin Povolný Faculty of Informatics,
Archival Information Packages for NASA HDF-EOS Data R. Duerr, Kent Yang, Azhar Sikander.
Integrated Grid workflow for mesoscale weather modeling and visualization Zhizhin, M., A. Polyakov, D. Medvedev, A. Poyda, S. Berezin Space Research Institute.
Creating Archive Information Packages for Data Sets: Early Experiments with Digital Library Standards Ruth Duerr, NSIDC MiQun Yang, THG Azhar Sikander,
The european ITM Task Force data structure F. Imbeaux.
CS370 Spring 2007 CS 370 Database Systems Lecture 1 Overview of Database Systems.
Easily Serving and Accessing HDF-EOS2 Datasets Using DODS Technologies Richard Chinman, UCAR-IITA, DODS Project Manager
1 Chapter 1 Introduction to Databases Transparencies.
HDF4 OPeNDAP Project Progress Report MuQun Yang and Hyo-Kyung Lee 1 HDF Developers' Meeting11/24/2015.
ITSC/University of Alabama in Huntsville ADaM System Architecture Rahul Ramachandran, Sara Graves and Ken Keiser Mathematical Challenges in Scientific.
OWL Representing Information Using the Web Ontology Language.
ESML, Subsetting, Mining Tools Sara Graves Rahul Ramachandran Information Technology and Systems Center (ITSC) University of Alabama in Huntsville (UAH)
User Profiling using Semantic Web Group members: Ashwin Somaiah Asha Stephen Charlie Sudharshan Reddy.
EARTH SCIENCE MARKUP LANGUAGE Tutorial on how to write an ESML Description File (for ESML Schema v3.0) “Define Once Use Anywhere” INFORMATION TECHNOLOGY.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
The HDF Group Data Interoperability The HDF Group Staff Sep , 2010HDF/HDF-EOS Workshop XIV1.
Exporting WaterML from the Earth System Modeling Framework Xinqi Wang Louisiana State University NCAR SIParCS Program August 4, 2009.
 Programming - the process of creating computer programs.
August 2003 At A Glance The IRC is a platform independent, extensible, and adaptive framework that provides robust, interactive, and distributed control.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Digital Data Preservation: a schema-driven model Student: Stacy Kowalczyk Co-Authors: Clare McInerney and Phil Mitchell Digital Data Preservation – the.
1 Lecture1 Introduction to Databases Systems Database 1.
By ILTAF MEHDI (MCS, MCSE, CCNA) 1 Remember: Examination is a chance not ability. 6/12/2016.
XML and Distributed Applications By Quddus Chong Presentation for CS551 – Fall 2001.
Java Web Services Orca Knowledge Center – Web Service key concepts.
ESML, Subsetting, Mining Tools
NASA Earth Science Data Stewardship
Transition from HDF4 to HDF5: Issues
XML QUESTIONS AND ANSWERS
XML Data Introduction, Well-formed XML.
XML Problems and Solutions
Introduction of Week 11 Return assignment 9-1 Collect assignment 10-1
Presentation transcript:

EARTH SCIENCE MARKUP LANGUAGE “Define Once Use Anywhere” INFORMATION TECHNOLOGY AND SYSTEMS CENTER UNIVERSITY OF ALABAMA IN HUNTSVILLE

Overview Introduction to ESML –Data/Application Interoperability Problem –Interchange Technology Solution via ESML Writing ESML Description Files using the Editor –ASCII Format –Binary –HDF-EOS Writing ESML reader using Python

Earth Science Data Characteristics Different formats, types and structures (18 and counting for Atmospheric Science alone!) Different states of processing ( raw, calibrated, derived, modeled or interpreted ) Enormous volumes Heterogeneity leads to Data usability problem HDF HDF-EOS netCDF ASCII BinaryGRIB

One approach: Standard data formats  Difficult to implement and enforce  Can’t anticipate all needs  Some data can’t be modeled or is lost in translation  The cost of converting legacy data A better approach: Interchange Technologies  Earth Science Markup Language Interoperability: Accessing Heterogeneous Data The Problem DATA FORMAT 1 DATA FORMAT 1 DATA FORMAT 2 DATA FORMAT 2 DATA FORMAT 3 DATA FORMAT 3 READER 1 READER 2 FORMAT CONVERTER APPLICATION ESML LIBRARY APPLICATION DATA FORMAT 1 DATA FORMAT 1 DATA FORMAT 2 DATA FORMAT 2 DATA FORMAT 3 DATA FORMAT 3 The ESML Solution ESML FILE ESML FILE ESML FILE ESML FILE ESML FILE ESML FILE

What is ESML? It is a specialized markup language for Earth Science metadata based on XML It is a machine-readable and -interpretable representation of the structure of any data file, regardless of data format ESML description files contain external metadata that can be generated by either data producer or data consumer (at collection, data set, and/or granule level) ESML provides the benefits of a standard, self-describing data format (like HDF, HDF-EOS, netCDF, geoTIFF, …) without the cost of data conversion ESML is the basis for core Interchange Technology that allows data/application interoperability

Components of the ESML Interchange Technology DATA FORMAT1 DATA FORMAT2 DATA FORMAT3 OTHER FORMATS ESML FILE ESML SCHEMA ESML LIBRARY OTHER APPLICATIONS ESML EDITOR ESML CONSISTS OF: MARKUPS ESML FILE RULES FOR THE MARKUPS ESML SCHEMA MIDDLEWARE FOR AUTOMATION ESML LIBRARY ESML DATA BROWSER ADaM DATA MINING SYSTEM External description file for dataset or formats Rules that govern the description of the data files Library parses and interprets the description file and figures out how to read the data

Components of the ESML Interchange Technology DATA FORMAT1 DATA FORMAT2 DATA FORMAT3 OTHER FORMATS ESML FILE ESML SCHEMA ESML LIBRARY ESML DATA BROWSER ADaM DATA MINING SYSTEM OTHER APPLICATIONS ESML EDITOR ESML FILE ESML SCHEMA INTERCHANGE TECHNOLOGY These three key components allow applications to use data in a wide variety of formats

Interchange Technology for Data Users and Application Developers DATA FORMAT1 DATA FORMAT2 DATA FORMAT3 OTHER FORMATS ESML FILE ESML SCHEMA ESML DATA BROWSER ADaM DATA MINING SYSTEM OTHER APPLICATIONS ESML EDITOR ESML FILE ESML SCHEMA INTERCHANGE TECHNOLOGY DATA PRODUCERS OR CONSUMERS APPLICATION DEVELOPERS ESML can be used by both scientists and application developers ESML LIBRARY

Advantages of using ESML Scientist (Data Producer/Consumer) –ESML will let them use virtually any data format in their applications –ESML files are external description files that can be easily created, modified and viewed by any text editor –ESML has a few simple concepts which can be used to describe numerous data sets –An ESML file can be seen as a set of instructions to the application on how to read and understand a data file –If the format of the data changes for whatever reason (e.g., new version of data set) no software changes are required, just a new ESML file. Does that mean a scientist has to write an ESML file for every data file? –No, in fact the beauty of ESML is that it allows scientist to write ONE ESML file to describe MANY data files that are structurally and semantically similar

Advantages of using ESML Data Archiving Centers (Data Producers) –ESML files can be used to store not only the structural but also embed semantic information about the data sets –Since ESML files are independent separate files, they can be generated on the fly utilizing metadata databases as datasets are ordered –Centers can archive data in its native formats and not have to store them in any “selected” format –Centers can now also “ESMLize” all their legacy datasets with minimal efforts –The existing legacy datasets now become a more valuable data resource for scientists, because they can be used more efficiently and effectively Application Developers –By using the ESML library, developers can build “ESML enabled” applications! –ONE single reader component can read all the various data formats instead of having separate reader module for different formats

ESML file ESML file ESML file ESML Library Collocation Algorithm MODISCERES MISR/ Others Scientists can: Select a variety of data in different formats for the collocation analysis Purpose: To study the relationship between shortwave flux and cloud/aerosol properties Important for climate change studies ESML IN ACTION: Collocation Algorithm Analysis

Skin temperatures come in a variety of data formats - GOES - McIDAS Reanalysis Data - GRIB MM5 Model - MM5 Binary AVHRR - HDF MODIS - EOS-HDF ESML IN ACTION: Ingest surface skin temperature data in Numerical Models Reanalysis GRIB files Reanalysis GRIB files MM5 GOES ESML FILE ESML FILE ESML FILE ESML LIBRARY APPLICATION

For More Information esml.itsc.uah.eduURL: esml.itsc.uah.edu Become a member and post ESML related news items on the website Schema and related documents available to all Download the latest products Source code available via Source Forge Join the ESML mailing list

ESML Schema ESML schema defines Syntactic metadata that describe the structure of the file in machine-readable and -interpretable terms

Design follows the layered cake approach where: –Lowest (core) level provides the basic functionality of reading the structural metadata from the ESML file and returning data to the user –Additional software layers can be added to provide other functionalities such as using semantics from an ontology to “use” the data intelligently Includes plug-in modules for each individual format, allowing packaging of libraries Provides a simple API for easy addition of new formats as plug-in modules Provides a more intuitive user API based on the analogy of file access in a directory structure Provides the library source code via Source Forge repository ESML Library

ESML Library Design User Level 0 API Plug-in API DOM Tree Binary HDF - EOS HDF Semantic Parser Level 1 API Semantic Parser Level 1 API Layered Cake Design Additional modules for other formats can be easily added Additional layers of functionality can be easily added on top of the Core Library (shown in grey) Intuitive user API based on the analogy of file access in a directory structure Versions Available: C++ for Windows and Linux Python (pyESML)

PyESML Example List all the API functions Open a data file using ESML List all the structures List all the fields Get the data for a field Print the data