AUKEGGSWorkshop ANU, Canberra, 29 November 2006 Implementing CSML Feature Types in applications within the NERC DataGrid Dominic Lowe, British Atmospheric.


Presentation transcript:

AUKEGGS Workshop, ANU, Canberra, 29 November 2006
Implementing CSML Feature Types in applications within the NERC DataGrid
Dominic Lowe, British Atmospheric Data Centre, CCLRC, and NDG team (Andrew Woolf, Bryan Lawrence et al.)

Complexity + Volume + Remote Access = Grid Challenge
British Atmospheric Data Centre, British Oceanographic Data Centre, NCAR, NERC DataGrid

CSML Feature Model
Intermediary XML data format.
Represents atmospheric and oceanographic data.
GML application schema.
Features based on geometry, e.g. GridSeriesFeature, PointFeature, TrajectoryFeature...


CSML Parser
XML elements ... Python classes (class GridFeature: ...)
Provides mappings (and conversion) between the XML model and the Python object model: toXML(), fromXML().
Implemented in Python; uses the C module cElementTree.
Performance table: length of XML (lines) against time to parse (CPU secs).
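
The mapping between an XML element and a Python class can be sketched roughly as follows. This is a minimal illustration, not the CSML parser itself: the `PointFeature` class here is a hypothetical stand-in, the element and attribute names are simplified (real CSML is a GML application schema with namespaces), and the standard-library `xml.etree.ElementTree` is used because the separate `cElementTree` module is no longer shipped with current Python versions.

```python
import xml.etree.ElementTree as ET


class PointFeature:
    """Hypothetical stand-in for a parser class; not the real CSML class."""

    TAG = "PointFeature"

    def __init__(self, id=None, description=None):
        self.id = id
        self.description = description

    def fromXML(self, elem):
        # Copy values from the XML element onto Python attributes.
        self.id = elem.get("id")
        self.description = elem.findtext("description")
        return self  # returning self is a convenience assumed for this sketch

    def toXML(self):
        # Build an XML element from the current attribute values.
        elem = ET.Element(self.TAG, {"id": self.id})
        ET.SubElement(elem, "description").text = self.description
        return elem


xml_text = '<PointFeature id="p1"><description>Rain gauge</description></PointFeature>'
feature = PointFeature().fromXML(ET.fromstring(xml_text))
round_tripped = ET.tostring(feature.toXML(), encoding="unicode")
```

After the round trip, `round_tripped` holds the same markup as `xml_text`, so the object model and the XML model stay interchangeable.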

CSML Parser
Hierarchical: calling the fromXML() or toXML() method of the root element calls the methods of its child elements, and so on:
    Dataset.toXML()
    FeatureCollection.toXML()
    GridSeriesFeature(AbstractFeature).toXML()
    RangeSet.toXML()
    ... and so on
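
One way to picture the hierarchical delegation is the sketch below. The class names follow the slide, but the `Node` base class and the `children` attribute are assumptions made for illustration; the real parser classes carry typed attributes rather than a generic child list.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical base class: each node serialises itself, then its children."""

    tag = "Node"

    def __init__(self, children=None):
        self.children = list(children or [])

    def toXML(self):
        elem = ET.Element(self.tag)
        for child in self.children:
            # Delegation: the parent never builds the child's XML itself.
            elem.append(child.toXML())
        return elem


class RangeSet(Node):
    tag = "RangeSet"


class GridSeriesFeature(Node):
    tag = "GridSeriesFeature"


class FeatureCollection(Node):
    tag = "FeatureCollection"


class Dataset(Node):
    tag = "Dataset"


dataset = Dataset([FeatureCollection([GridSeriesFeature([RangeSet()])])])
root = dataset.toXML()  # one call at the root walks the whole hierarchy
nesting = [elem.tag for elem in root.iter()]
```

Here `nesting` lists the tags in document order, mirroring the call chain shown on the slide.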

CSML Parser
Process the whole document or parts of the document.
To convert the whole document, from XML to Python:
    tree = ElementTree(file='mycsmlfile.xml')
    dataset = csml.parser.Dataset()
    dataset.fromXML(tree.getroot())
From Python to XML:
    csmldoc = dataset.toXML()
Or just convert a fragment:
    gsFeature = GridSeriesFeature()
    gsFeature.fromXML(xmlFragment)
    gsXML = gsFeature.toXML()
Allows for easy editing of documents, addition of new features, or redefining existing features, e.g. subsetting.
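
Editing a single fragment inside a larger document might look something like this. It is a sketch under stated assumptions: the `GridSeriesFeature` class below is a simplified stand-in, the document structure and the `description` element are invented for the example, and only standard-library `xml.etree.ElementTree` calls are used.

```python
import xml.etree.ElementTree as ET


class GridSeriesFeature:
    """Hypothetical stand-in for the parser class of the same name."""

    def __init__(self, id=None, description=None):
        self.id = id
        self.description = description

    def fromXML(self, elem):
        self.id = elem.get("id")
        self.description = elem.findtext("description")
        return self

    def toXML(self):
        elem = ET.Element("GridSeriesFeature", {"id": self.id})
        ET.SubElement(elem, "description").text = self.description
        return elem


# An invented document with two features; only one of them is converted.
doc = ET.fromstring(
    "<Dataset><FeatureCollection>"
    '<GridSeriesFeature id="gs1"><description>full series</description></GridSeriesFeature>'
    '<PointFeature id="p1"/>'
    "</FeatureCollection></Dataset>"
)

collection = doc.find("FeatureCollection")
fragment = collection.find("GridSeriesFeature")

# Convert just the fragment to an object, edit it, and write it back in place.
gs = GridSeriesFeature().fromXML(fragment)
gs.description = "subset: 2006 only"
position = list(collection).index(fragment)
collection[position] = gs.toXML()
```

The rest of the document (the `PointFeature` here) is never parsed into objects, which is the point of fragment-level conversion.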

CSML Parser: Creating CSML
Can be used to create CSML documents (or fragments) from scratch:
    gs = GridSeriesFeature(id='myGS', domain=mydomain, rangeSet=myRS)
    ps = PointFeature(id='mypoint', domain=mydomain2, rangeSet=myrs2)
    fc = FeatureCollection(members=[gs, ps, ...])
    ds = Dataset(id='mycsmldocument', featureCollection=fc)
    ds.toXML()    # yields mycsmldocument.xml
No need for data providers to understand XML APIs.
Very useful for performing operations on features.

CSML Parser
Two-level API to the parser object model.
Parser level (mainly used by me + NDG data providers):
    dataset.featureCollection.members[3].profileSeriesDomain
Wrapped by a higher-level interface (used by client applications):
    getListOfFeatures(csmldoc)       # get list of available features
    getDomain(feature)               # get domain info
    getAffordances(feature)          # get available operations
    subsetFeature(feature, subset)   # operation: request subset
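
The relationship between the two levels can be illustrated with a toy object model. The four function names come from the slide, but their bodies, their return types, and the `Feature`, `FeatureCollection` and `Dataset` dataclasses below are assumptions made for this sketch; the real API may behave differently.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical feature: a domain maps an axis name to coordinate values."""
    id: str
    domain: dict
    affordances: tuple = ("subset",)


@dataclass
class FeatureCollection:
    members: list = field(default_factory=list)


@dataclass
class Dataset:
    featureCollection: FeatureCollection


def getListOfFeatures(csmldoc):
    # Hides the attribute path that parser-level code would walk by hand.
    return list(csmldoc.featureCollection.members)


def getDomain(feature):
    # Summarise each axis by its extent (assumed representation).
    return {axis: (min(v), max(v)) for axis, v in feature.domain.items()}


def getAffordances(feature):
    return list(feature.affordances)


def subsetFeature(feature, subset):
    """subset maps an axis name to an inclusive (low, high) range."""
    new_domain = {}
    for axis, values in feature.domain.items():
        low, high = subset.get(axis, (min(values), max(values)))
        new_domain[axis] = [v for v in values if low <= v <= high]
    return Feature(feature.id + "_subset", new_domain, feature.affordances)


doc = Dataset(FeatureCollection([Feature("gs1", {"time": [1, 2, 3, 4]})]))
first = getListOfFeatures(doc)[0]
smaller = subsetFeature(first, {"time": (2, 3)})
```

A client application only ever touches the four functions, so the object model underneath can change without breaking callers.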

Access to underlying data
Multiple I/O libraries: cdms, NAppy, others...
CSML code talks to a single DataInterface class that provides a uniform wrapper for different file access methods.
Easy to add more data formats: just need to write the correct wrapper methods (getData, getSubsetOfData, getVariable...).
Similar interface needed for RDBMS access (not yet implemented).
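
The uniform-wrapper idea can be sketched with two toy backends. The `DataInterface` name and the `getData` and `getSubsetOfData` method names come from the slide; the constructor arguments, the method signatures, and the `DictBackend` and `CSVBackend` classes are hypothetical, chosen so the example runs without cdms or NAppy installed.

```python
import csv
import io


class DictBackend:
    """Toy backend: variables held in a dict of lists."""

    def __init__(self, source):
        self._data = source

    def getData(self, variable):
        return list(self._data[variable])


class CSVBackend:
    """Toy backend: each CSV column is a variable."""

    def __init__(self, source):
        self._rows = list(csv.DictReader(io.StringIO(source)))

    def getData(self, variable):
        return [float(row[variable]) for row in self._rows]


class DataInterface:
    """Single entry point; callers never see which backend is in use."""

    _backends = {"dict": DictBackend, "csv": CSVBackend}

    def __init__(self, fmt, source):
        self._backend = self._backends[fmt](source)

    def getData(self, variable):
        return self._backend.getData(variable)

    def getSubsetOfData(self, variable, start, stop):
        # Assumed semantics: a half-open index range along the only axis.
        return self.getData(variable)[start:stop]


csv_text = "time,temp\n0,280.5\n1,281.0\n2,279.9\n"
from_csv = DataInterface("csv", csv_text).getSubsetOfData("temp", 1, 3)
from_dict = DataInterface("dict", {"temp": [280.5, 281.0, 279.9]}).getSubsetOfData("temp", 1, 3)
```

Supporting another format then means writing one more backend class and registering it in `_backends`; the calling code is unchanged.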


CSML tooling - Use Case #1: Scanning at BADC
Multiple data formats: NetCDF, NASA Ames, GRIB, PP.
Feature identification challenges.
Scanner has the concept of a FeatureFileMap + config options.
Creates parser objects, then calls csml.parser.dataset.toXML() to create the document.
By using the parser, the scanner does not have to worry about XML details.

CSML tooling - Use Case #2: Scanning at BODC
Metadata in an Oracle database.
Python-Oracle link to extract metadata.
Creates parser objects, then calls toXML().
By using the parser, the scanner does not have to worry about XML details.

CSML tooling - Use Case #3: Subsetting operation
Query a CSML document: getFeatureList() etc.
Subset a CSML dataset: return a CSML document + new NetCDF file.
Subset multiple datasets: return a CSML document describing both + NetCDF files.
Subset datasets from different data providers and supply them in a single CSML file + NetCDF files.
All simplified by use of parser 'objects'.
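
The multi-provider case could be sketched as below: each dataset is subset separately, and a single document then describes all the resulting files. Everything here is illustrative. The helper names `subset_series` and `describe_subsets`, the element names, and the output file names are hypothetical, and no NetCDF file is actually written; the document only records the names the files would have.

```python
import xml.etree.ElementTree as ET


def subset_series(values, times, t_min, t_max):
    """Keep only the samples whose time lies in the inclusive range."""
    keep = [i for i, t in enumerate(times) if t_min <= t <= t_max]
    return [times[i] for i in keep], [values[i] for i in keep]


def describe_subsets(subsets):
    """Build one document for several subsets.

    subsets is a list of (provider, feature_id, output_file) tuples.
    """
    root = ET.Element("Dataset")
    collection = ET.SubElement(root, "FeatureCollection")
    for provider, feature_id, output_file in subsets:
        feature = ET.SubElement(
            collection, "Feature", {"id": feature_id, "provider": provider}
        )
        ET.SubElement(feature, "fileName").text = output_file
    return root


times, values = subset_series([10, 20, 30, 40], [1, 2, 3, 4], 2, 3)
combined = describe_subsets(
    [
        ("BADC", "gs1_subset", "gs1_subset.nc"),
        ("BODC", "pt7_subset", "pt7_subset.nc"),
    ]
)
files = [elem.text for elem in combined.iter("fileName")]
```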

CSML tooling - Use Case #4: CSML updates
Datasets change! BADC has automatic ingest scripts. Datasets change often, and without warning!
Feasible to automate metadata updates:
using the parser to update the existing CSML document when a dataset changes,
or rescanning the dataset periodically and writing a new CSML document.
Not implemented, by the way...
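
Since the slide notes that automated updates were not implemented, the following is purely a sketch of how change detection for the periodic-rescan option might work. The function names and the choice of a SHA-256 content hash are assumptions; for very large files, comparing modification times would be a cheaper alternative.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Content hash of one data file (reads the whole file into memory)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def files_needing_rescan(paths, known):
    """Return the files whose content differs from the recorded fingerprint.

    known maps a path string to its last recorded fingerprint and is
    updated in place, so a second call reports only newer changes.
    """
    changed = []
    for path in paths:
        digest = fingerprint(path)
        if known.get(str(path)) != digest:
            changed.append(str(path))
            known[str(path)] = digest
    return changed
```

A scheduled job could call `files_needing_rescan` and regenerate the CSML document only for the files it returns.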

CSML tooling - Use Case #5: NDG power user
Writes bespoke Python scripts to access data.
E.g. for a dataset that is updated daily: uses the CSML API to download a subset every day, e.g. temperature at certain locations.

CSML tooling - Use Case #6: Integrate CSML into applications
High-level API is easy to use.
Integrated with: BADC DataExtractor, TPAC WCS, Meteorologisk institutt (Norway).

BADC Data Extractor

TPAC WCS

Norwegian Met Office

Summary
Modular set of tools.
“Features as Objects” instead of just XML.
Many use cases simplified by the object model.
High-level API: easy integration of features with applications.
CSML v2 parser under development; more sophisticated than the v1 parser and may be adaptable for other domains.