CAS2K3, September 2003 BADC: badc.nerc.ac.uk, NERC DataGrid: www.ndg.badc.rl.ac.uk The NERC DataGrid – Building Bridges for the Environmental Sciences.

Presentation transcript:

The NERC DataGrid – Building Bridges for the Environmental Sciences
Bryan Lawrence
Kerstin Kleese, Roy Lowry, Kevin O’Neill, Andrew Woolf & others
Head, NCAS/British Atmospheric Data Centre
Rutherford Appleton Laboratory, CCLRC

NDG Partners
As funded, a partnership between:
– British Atmospheric Data Centre (BADC, PI: Bryan Lawrence)
– British Oceanographic Data Centre (BODC, Co-I: Roy Lowry)
– CLRC E-science Centre (Co-I: Kerstin Kleese)
– PCMDI at LLNL in the US (Dean Williams, Bob Drach, Mike Fiorino)
The project has caught the imagination; extra funding now supports:
– A number of groups at the NERC Centre for Ecology and Hydrology (CEH: Ecology DataGrid)
– NERC Earth Observation Data Centre & Plymouth Marine Lab Remote Sensing
Major collaborators not directly funded will include:
– ClimatePrediction.net, GODIVA (NERC e-science projects)
– NCAS/CGAM: The Centre for Global Atmospheric Modelling at the University of Reading (via Lois Steenman-Clark and Katherine Bouton)
– Already required to provide technology to support the major UK project HIGEM (a collaboration between the Hadley Centre and the NERC academic community to develop the next generation of high resolution GCMs based on HadGEM).

Outline
Motivation:
– The BADC, BODC, and the Metadata Gateway
The NDG Goal
NDG Metadata Structures and Architecture
– Metadata Model
– Data Model
– ISO Context
NDG Prototype Status
Summary & Challenges

The British Oceanographic Data Centre
(not for much longer; moving to a site on the Liverpool University campus imminently)

BODC Mission Statement
To operate a world class data centre in support of UK marine science by:
– providing data management support for UK marine science projects
– maintaining and developing the UK’s national oceanographic database
– developing innovative marine data products and digital atlases
– collaborating, on behalf of the UK, in the international exchange and management of oceanographic data
– making high quality data readily available to UK research scientists in academia, government and industry

British Atmospheric Data Centre
The Role: Key words: Curation and Facilitation!

BADC Users
3800 registered in March 2003
~300 individual users per month
[Chart: Users by Discipline, November 2002, 2150 users]

BADC Storage Capacity
Approx 50 TB (Nov 2002)
Projected to quadruple well within the next couple of years given existing commitments
Planning exercise under way now.
Committed to keeping as much as possible on spinning disk
Further backup and extra storage at national archival centre (ATLAS, PB soon)
2.5Gb

Huge variety of Data Sets

Querying datasets
Complex metadata, held in Ingres database: export DIF and Z39.50
No possibility of automatic data usage …

Different types of data returned: Wallingford
Supporting very diverse user community: NetCDF is not enough …

NERC Metadata Gateway - SST
No clean handover from discovery to browse and use!
Geospatial coordinates forgotten. Time reference forgotten.
Need to get entire field(s), and find correct time!
And if I want to compare data from different locations?
- multiple logins
- multiple formats
- discovery?

The NERC DataGrid

Metadata Origins
[Diagram labels: Wider Internet, Research Group, Satellite, SuperComputer, Shared Resources, DB, Research Group]
Consider a hierarchy of data users beginning with an individual scientist, who may herself be part of a research group, itself part of a community sharing resources, lying in the wider internet …
To be well integrated the metadata should have a role at each level! (The data portal client and server interface may be different at each level.)
At each level “extra” metadata will be required, probably produced by dedicated staff at the research group or data centre.

A google for data; the metadata carrot!
[Diagram labels: Wider Internet, Research Group, Satellite, SuperComputer, Shared Resources, DB, Research Group]

NDG Metadata Taxonomy

Separate data (A) and metadata (B) models
Clear separation of function
– Difference between data use and discovery etc.
– “Tuning” of metadata to include relevant detail
Allows increased reuse of metadata model
– Avoids tie-in to details of a particular field’s data formats
– Can plug in another data model
[Diagram labels: Metadata Model, Data Model, Data granule ID, Data summary]

(A) NDG Data Model: Overview
Dataset: named container for a number of variables
Variable: physical parameters within the dataset; controlled vocabularies, e.g. BODC data dictionary, CF standard names
Array: multidimensional container for other arrays or numeric data
Coordinate: may be shared between multiple Arrays; ‘anonymous’ if not georeferenced; MappedCoordinate vs ProductCoordinate; with respect to a Coordinate Reference System (ref ISO 19111, ISO 19115)
GranuleDescriptor: describes data granule in terms of file storage; enables file aggregation; SQL/OGSA-DAI for RDBMS; physical or logical (e.g. SRB) files
“Profiles” of model defined for important data types
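The entities named on this slide could be sketched as plain Python dataclasses. This is only an illustration of how the pieces might relate: every class layout and field name below is an assumption made for the example, not the NDG schema, and the MappedCoordinate/ProductCoordinate distinction is not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Coordinate:
    """An axis; 'anonymous' when it has no coordinate reference system."""
    name: str
    values: List[float]
    reference_system: Optional[str] = None  # hypothetical CRS label

    @property
    def is_anonymous(self) -> bool:
        return self.reference_system is None


@dataclass
class Array:
    """Multidimensional container for other arrays or numeric data."""
    coordinates: List[Coordinate]
    contents: List[Union["Array", float]] = field(default_factory=list)


@dataclass
class Variable:
    """A physical parameter, named from a controlled vocabulary."""
    name: str
    vocabulary: str  # e.g. "CF standard names"
    array: Array


@dataclass
class GranuleDescriptor:
    """Where a data granule is stored; several files may be aggregated."""
    locations: List[str]
    logical: bool = False  # True for logical (e.g. SRB) rather than physical files


@dataclass
class Dataset:
    """Named container for a number of variables."""
    name: str
    variables: List[Variable] = field(default_factory=list)
    granules: List[GranuleDescriptor] = field(default_factory=list)


# One Coordinate object shared between two Arrays, as the slide allows.
lat = Coordinate("latitude", [-90.0, 0.0, 90.0], reference_system="example-crs")
index = Coordinate("index", [0.0, 1.0, 2.0])  # not georeferenced
sst = Variable("sea_surface_temperature", "CF standard names", Array([lat]))
air = Variable("air_temperature", "CF standard names", Array([lat, index]))
example = Dataset("example", [sst, air], [GranuleDescriptor(["file_a.nc", "file_b.nc"])])
```

Sharing the `lat` object rather than copying it is what lets a client see that two variables sit on the same axis.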

NDG Data Model
Array

(B) Metadata Model

(B) Metadata Model: an NDG Intermediate Schema, Conceptual Overview

ISO TC211
ISO 19101: Geographic information – Reference model
ISO 19103: Geographic information – Conceptual schema language
ISO 19107: Geographic information – Spatial schema
ISO 19108: Geographic information – Temporal schema
ISO 19109: Geographic information – Rules for application schema
ISO 19111: Geographic information – Spatial referencing by coordinates
ISO 19115: Geographic information – Metadata
ISO 19118: Geographic information – Encoding
ISO 19121: Geographic information – Imagery and gridded data

ISO19115
Dataset title
Dataset reference date
Dataset responsible party
Metadata point of contact
Dataset language
Dataset character set
Dataset topic category
Abstract describing dataset
Spatial resolution of dataset
Spatial representation type
Geographic location of dataset
Vertical/temporal extent for dataset
Reference system
Lineage
Distribution format
On-line resource
Metadata character set
Metadata date stamp
Metadata standard name
Metadata standard version
Metadata file identifier
Metadata language
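As a rough illustration of how a discovery record might be checked against the elements listed on this slide, the sketch below reports which of them a record leaves unfilled. The slide does not say which elements are mandatory, so the sketch treats them all alike; the dictionary-based record and the function name are assumptions made for the example.

```python
# Element names copied from the slide, in slide order.
ISO19115_ELEMENTS = (
    "Dataset title",
    "Dataset reference date",
    "Dataset responsible party",
    "Metadata point of contact",
    "Dataset language",
    "Dataset character set",
    "Dataset topic category",
    "Abstract describing dataset",
    "Spatial resolution of dataset",
    "Spatial representation type",
    "Geographic location of dataset",
    "Vertical/temporal extent for dataset",
    "Reference system",
    "Lineage",
    "Distribution format",
    "On-line resource",
    "Metadata character set",
    "Metadata date stamp",
    "Metadata standard name",
    "Metadata standard version",
    "Metadata file identifier",
    "Metadata language",
)


def missing_elements(record):
    """Return the listed elements that are absent or empty in `record`."""
    return [name for name in ISO19115_ELEMENTS if not record.get(name)]


# Hypothetical, partly filled record.
record = {
    "Dataset title": "Example dataset",
    "Dataset language": "en",
    "Lineage": "",
}
gaps = missing_elements(record)
```

A report of this kind could be run before a record is exported to a discovery service, so that gaps are found at the data centre rather than by the user.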

Metadata extensions and profiles
ISO
Direct relationship between ISO19115 and our (B) Intermediate schema.

Profiling of ISO 191xx
“The comprehensiveness and large number of options available in various base standards make it difficult to combine them for practical applications. … A profile integrates a set of base standards and/or modules (predefined subsets) of base standards to meet a specific implementation requirement.”
Registration of profiles
“A profile that is registered through an ISO registration procedure becomes an International Standardized Profile (ISP). National standards that are expressed as profiles of ISO base standards may be registered at a national level.”
ISO19101

Further Application in NERC DataGrid
e.g. Data model “Coordinates”
ISO ISO 19108

The Data Use Chain

Key Components – need APIs and standards
[Diagram labels: Globus, Harvest]

NDG Discovery Service Element
Traditional and Grid Service (GT3) Interfaces

Starting with the LAS
Deployment for UK users within a few weeks (constraint is primarily access control)

LAS – Simple Box fill Output
Work for us to do: labelling is inadequate as yet.
ERA40

Cache management in LAS/CDAT
[Diagram labels: Internet User; LAS; CDAT; BADC/CDAT; localCache.py; 18 TB virtual dataset; ERA-40 4 TB Spectral Archive; ERA-40 < 1TB Grid Cache]
LAS calls cdms.open to open data file.
BADC/CDAT intercepts command and checks cache (localCache.py):
– Locks access to cache.
– Checks if regular gridded file is in cache list.
– NO: Spectral file is converted on-the-fly and placed in cache. Cache also checks if enough room, deletes oldest files if necessary and checks against disk space limit.
– YES: Cache unlocked. New cdms.open command sent to CDAT and cache file opened.
Data object delivered to LAS.
NetCDF file, plot or animations delivered to user.
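The cache behaviour described on this slide (lock, look up the cache list, convert on a miss, delete oldest files to stay under a disk limit, unlock) might look roughly like the sketch below. It is not the BADC localCache.py: the class name, the `convert` callback, the `.nc` suffix and the in-memory cache list are all assumptions made for illustration, and no CDAT call is made.

```python
import threading
from collections import OrderedDict
from pathlib import Path


class GridCache:
    """Disk cache of converted files with oldest-first eviction (illustrative)."""

    def __init__(self, cache_dir, limit_bytes, convert):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.limit_bytes = limit_bytes      # disk space limit for the cache
        self.convert = convert              # hypothetical: convert(source, target) writes target
        self._lock = threading.Lock()       # serialises access to the cache
        self._entries = OrderedDict()       # cache list: path -> size, oldest first

    def _make_room(self, newest):
        """Delete oldest files until the cache fits within the limit."""
        total = sum(self._entries.values())
        for path in list(self._entries):
            if total <= self.limit_bytes:
                break
            if path == newest:              # never evict the file just requested
                continue
            total -= self._entries.pop(path)
            path.unlink()

    def fetch(self, source):
        """Return the cached gridded file for `source`, converting on a miss."""
        target = self.cache_dir / (Path(source).name + ".nc")
        with self._lock:                    # lock, check the list, then unlock on exit
            if target not in self._entries:
                self.convert(source, target)
                self._entries[target] = target.stat().st_size
                self._make_room(newest=target)
        return target
```

A caller would pass the returned path to whatever opens the gridded file, which corresponds to the second cdms.open call on the slide. Keeping the cache list in memory makes eviction order deterministic here; a real service would presumably need to rebuild that list from disk on restart.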

NERC DataGrid Prototype (by hand)
Ingestion of ACSOE data from BADC and BODC.
NASA GCMD DIF based discovery
– Exported from Intermediate Schema
– Harvested by hand
Working on hand-over mechanism to pass dataset info to DataModel based LAS service
– Generate and populate LAS database in response
– Use standard LAS delivery
Next Steps: GT3 based services, improve LAS, improve delivery, implement multiple datamodel profiles, implement multiple discovery services.

Summary
NDG project running for a year now, aiming to provide grid-enabled tools to support:
– a diverse community
– with diverse datasets
NDG is part of the UK National E-science programme, and will leverage off other projects to implement grid solutions.
– initial prototype web-service based
– GT3 prototype due early in the new year
Software development based on plagiarising the maximum amount from other groups, and a standards based approach within the NDG.
– All code will be in the public domain
Major challenge will not be technical; policy, attitudes, legal issues.

You’ve gone TOO FAR!