ESM Meeting, Cambridge 2003 BADC: badc.nerc.ac.uk, NERC DataGrid: www.ndg.badc.rl.ac.uk Finding and utilising atmospheric/oceanic data in a distributed.

Presentation transcript:

ESM Meeting, Cambridge 2003 BADC: badc.nerc.ac.uk, NERC DataGrid: Finding and utilising atmospheric/oceanic data in a distributed world: the UK NERC DataGrid. Bryan Lawrence (Kerstin Kleese, Roy Lowry, Kevin O’Neill, Andrew Woolf & others) NCAS/British Atmospheric Data Centre Rutherford Appleton Laboratory, CCLRC

NDG Partners
As funded, a partnership between:
- British Atmospheric Data Centre (BADC, PI: Bryan Lawrence)
- British Oceanographic Data Centre (BODC, Co-I: Roy Lowry)
- CLRC E-science Centre (Co-I: Kerstin Kleese)
- PCMDI at LLNL in the US (Dean Williams, Bob Drach, Mike Fiorino)
The project has caught the imagination; extra funding now supports:
- A number of groups at the NERC Centre for Ecology and Hydrology (CEH: Ecology DataGrid)
- NERC Earth Observation Data Centre & Plymouth Marine Lab Remote Sensing
Major collaborators that are not directly funded will include:
- ClimatePrediction.net, GODIVA (NERC e-science projects)
- NCAS/CGAM: The Centre for Global Atmospheric Modelling at the University of Reading (via Lois Steenman-Clark and Katherine Bouton)
The project will support HIGEM.

Outline
- Motivation: The NDG Goals
- Working in a standards-based world: ISO and OGC …
- NDG Metadata
- Summary

The British Oceanographic Data Centre (not for much longer, moving to a site on Liverpool University campus imminently)

British Atmospheric Data Centre
The Role: Key words: Curation and Facilitation!

Easily catalogued, but successful preservation?
One could argue that the writers of these documents did a brilliant job of preserving the bits-and-bytes of their time …
And yes, they’ve both been translated … many times; it’s a shame the meanings are different …
Phaistos Disk, 1700BC

NERC Metadata Gateway - SST
No clean handover from discovery to browse and use!
- Geospatial coordinates forgotten.
- Time reference forgotten.
- Need to get entire field(s), and find correct time!
And if I want to compare data from different locations?
- multiple logins
- multiple formats
- discovery?

A priori, would any user know to look in the COAPEC data set?
Earth system science means we have to remove these boundaries!
Detailed file-level metadata isn’t visible, so data mining applications are impossible.
NB: Dynamic catalogues! How good is our metadata?

Finding Data
The Goal: Very simple interface, hide the complex software!

A newer “dataset”
The extreme relevance of this example from Amazon was pointed out by Jon Callahan (LAS project, PMEL)!

PCMDI - Best practice! (if you know where to look)
- Final references are papers! Is the information coupled to the datasets?
- What if I take a dataset home, and another, and another … and then forget which is which?
- Can I ask the question: what datasets used the Semtner sea ice parameterisation?

Huge variety of Data Sets

Querying datasets
Complex Metadata, held in Ingres database: export DIF and Z39.50
No possibility of automatic data usage …

Different types of data returned: Wallingford
Supporting very diverse user community: NetCDF is not enough …

Modelling advances: Baseline Numbers
- T42 CCSM (current, 280km): 7.5GB/yr, 100 years -> 0.75TB
- T85 CCSM (140km): 29GB/yr, 100 years -> 2.9TB
- T170 CCSM (70km): 110GB/yr, 100 years -> 11TB
(NCAR, Don Middleton)

Capacity-related Improvements
- Increased turnaround, model development, ensemble of runs
- Increase by a factor of 10, linear data
- Current T42 CCSM: 7.5GB/yr, 100 years -> 0.75TB * 10 = 7.5TB
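The arithmetic behind these baseline and capacity numbers can be checked with a few lines of Python. This is only a sketch: the output rates are the ones quoted on the slides, while the function and variable names are invented for illustration, and 1 TB is taken as 1000 GB, which is what the quoted totals imply.

```python
# Output rates quoted on the slides, in GB per simulated year.
GB_PER_YEAR = {"T42": 7.5, "T85": 29.0, "T170": 110.0}


def run_volume_tb(gb_per_year, years=100):
    """Volume of one run in TB, assuming 1 TB = 1000 GB."""
    return gb_per_year * years / 1000.0


# 100-year baseline volume for each resolution.
baseline_tb = {res: run_volume_tb(rate) for res, rate in GB_PER_YEAR.items()}

# Capacity-related improvement: the quoted factor of 10 applied to the T42 run.
t42_capacity_tb = baseline_tb["T42"] * 10

for res, volume in baseline_tb.items():
    print(f"{res}: {volume:.2f} TB per 100-year run")
print(f"T42 with 10x capacity: {t42_capacity_tb:.1f} TB")
```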

Capability-related Improvements
- Spatial Resolution: T42 -> T85 -> T170. Increase by factor of ~10-20, linear data
- Temporal Resolution: Study diurnal cycle, 3 hour data. Increase by factor of ~4, linear data
Figure: CCM3 at T170 (70km)

Capability-related Improvements
- Quality: Improved boundary layer, clouds, convection, ocean physics, land model, river runoff, sea ice. Increase by another factor of 2-3, data flat
- Scope: Atmospheric chemistry (sulfates, ozone…), biogeochemistry (carbon cycle, ecosystem dynamics), middle Atmosphere Model… Increase by another factor of 10+, linear data

Model Improvement Wishlist
Grand Total: Increase compute by a Factor O( )
(NCAR, Don Middleton)

Climate in – A graphic Illustration Figures from Gary Strand, NCAR, ESG website

Summary thus far
Contentions:
- The average atmospheric science project spends about 1/3 of its time on data handling (getting, reformatting, etc.)!
- The problem for earth system model projects is about to get worse, for everyone: from the initiator, to the archiver, to the analyst, to the contributor, to the improver.
(Remember, the documentation problem is growing exponentially too: new sub-components etc.)

The NERC DataGrid


The Data Use Chain

Requirements: Information (1)
Amazon Discovery gives good examples:
- Browse
- Similar datasets
- Details
- Content examples
Learn from the library and book handling community!
Our domain issues require:
- Dealing with Volume
- Formats
- Providing Tools
All require documentation (aka metadata); we need to improve our information handling.

What is metadata? The answer depends on who you are! Firstly: information to help one use one’s own data: e.g. calibration data (A) Metadata can help one find other people’s data … and then help one obtain and use it. (C) Metadata can be used to enable the preservation of data for posterity (all of ABCD) It is information passed with the data to enable someone else to use it. It describes the data. (B) Metadata can be used to enable automatic software to manipulate data. (D)

NDG Metadata Taxonomy

ISO TC211
- ISO 19101: Geographic information - Reference model
- ISO 19103: Geographic information - Conceptual schema language
- ISO 19107: Geographic information - Spatial schema
- ISO 19108: Geographic information - Temporal schema
- ISO 19109: Geographic information - Rules for application schema
- ISO 19111: Geographic information - Spatial referencing by coordinates
- ISO 19115: Geographic information - Metadata
- ISO 19118: Geographic information - Encoding
- ISO 19121: Geographic information - Imagery and gridded data

ISO19115
- Dataset title
- Dataset reference date
- Dataset responsible party
- Metadata point of contact
- Dataset language
- Dataset character set
- Dataset topic category
- Abstract describing dataset
- Spatial resolution of dataset
- Spatial representation type
- Geographic location of dataset
- Vertical/temporal extent for dataset
- Reference system
- Lineage
- Distribution format
- On-line resource
- Metadata character set
- Metadata date stamp
- Metadata standard name
- Metadata standard version
- Metadata file identifier
- Metadata language
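As a rough illustration of how a discovery record could be checked against this list of ISO19115 elements, the sketch below holds a record as a plain dictionary and reports which listed elements are still empty. The snake_case keys, the helper name and the example values are assumptions made for this sketch; they are not the ISO 19115 encoding or the NDG schema.

```python
# The 22 elements listed on the slide, as hypothetical snake_case keys.
CORE_ELEMENTS = [
    "dataset_title", "dataset_reference_date", "dataset_responsible_party",
    "metadata_point_of_contact", "dataset_language", "dataset_character_set",
    "dataset_topic_category", "abstract", "spatial_resolution",
    "spatial_representation_type", "geographic_location",
    "vertical_temporal_extent", "reference_system", "lineage",
    "distribution_format", "online_resource", "metadata_character_set",
    "metadata_date_stamp", "metadata_standard_name",
    "metadata_standard_version", "metadata_file_identifier",
    "metadata_language",
]


def missing_core_elements(record):
    """Return the listed elements that are absent or empty in a record."""
    return [name for name in CORE_ELEMENTS if not record.get(name)]


# Illustrative, partly populated record (values invented for the example).
record = {
    "dataset_title": "Example reanalysis dataset",
    "abstract": "Illustrative abstract text.",
    "dataset_language": "en",
}

missing = missing_core_elements(record)
print(f"{len(missing)} of {len(CORE_ELEMENTS)} elements still to populate")
```

A check like this says nothing about which elements the standard makes mandatory; it only shows how much of the list a given record covers.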

Metadata extensions and profiles ISO Direct relationship between ISO19115 and our (B) Intermediate schema.

Profiling of ISO 191xx
“The comprehensiveness and large number of options available in various base standards make it difficult to combine them for practical applications. … A profile integrates a set of base standards and/or modules (predefined subsets) of base standards to meet a specific implementation requirement.”
Registration of profiles
“A profile that is registered through an ISO registration procedure becomes an International Standardized Profile (ISP). National standards that are expressed as profiles of ISO base standards may be registered at a national level.”
(ISO19101)

NDG A and B metadata in practice
- Clear separation of function between use and discovery.
- Standards compliant.
- Avoid tie-in to details of particular fields or data formats or even components.
Metadata model (B): “Intermediate” schema, supports multiple discovery formats.
NDG Data Model (A): provides an abstract semantic model for the structure of data within NDG; enables the specification of concrete instances for use by NDG Data Services.

(B) Metadata Model

(B) Metadata Model: an NDG Intermediate Schema, Conceptual Overview

NDG Data Model
Dataset
Variables
Multidimensional array... of other arrays... or from aggregated storage
Rich spatiotemporal referencing (standards-compliant: ISO19108, ISO19111)

NDG Data Model Schema
This is the root element of the document. It is a limited (and incomplete!) implementation of the root element described in ISO (clause 5.4). The latter describes the format of a compliant XML exchange file, intended for encoding a single dataset. The application of ISO for the current NDG Data Model, which may contain multiple datasets, needs to be resolved. A dataset contains one or more elements that encode objects, grouped in a choice group that shall be used to restrict the legal objects in a dataset (ISO 19118, clause A.5.4.2).
UML conceptual model: ISO (conceptual schema language) ISO (rules for application schema) XML schema ISO (encoding)

Instantiating the NDG Data Model
Diagram labels: SAX demarshalling; extraction; serialisation; writeData(selectedComponents)
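One way to picture the stages named on this slide (SAX demarshalling, extraction of selected components, serialisation) is the Python sketch below, which uses the standard library's xml.sax module. The XML layout, the element and attribute names, and the extract_components function are all hypothetical; the real NDG schema and the writeData implementation are not shown in the slides.

```python
import json
import xml.sax


class VariableHandler(xml.sax.ContentHandler):
    """Collects the text of <variable> elements whose name was selected."""

    def __init__(self, selected):
        super().__init__()
        self.selected = set(selected)
        self.current = None   # name of the variable being read, if selected
        self.values = {}

    def startElement(self, name, attrs):
        if name == "variable" and attrs.get("name") in self.selected:
            self.current = attrs.get("name")
            self.values[self.current] = ""

    def characters(self, content):
        if self.current is not None:
            self.values[self.current] += content

    def endElement(self, name):
        if name == "variable":
            self.current = None


def extract_components(xml_text, selected_components):
    """Demarshal with SAX and return the selected variables as float lists."""
    handler = VariableHandler(selected_components)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return {name: [float(v) for v in text.split()]
            for name, text in handler.values.items()}


# Hypothetical instance document with two variables.
DOC = ('<dataset><variable name="sst">271.5 272.0</variable>'
       '<variable name="salinity">35.1</variable></dataset>')

extracted = extract_components(DOC, ["sst"])
serialised = json.dumps(extracted)  # stand-in for the serialisation stage
print(serialised)
```

SAX is event driven, so only the selected components are ever held in memory, which matters when the instance document describes a large dataset.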

Further Application in NERC DataGrid eg Data model “Coordinates” ISO ISO 19108

NDG Semantic Data Model

NDG Prototype
Layout not important (yet!) It’s what’s under the hood that counts …
- … the data is NOT in NetCDF. The original data is available …
- … the search covered data that could have been harvested …
- … the architecture works!


NDG Discovery Service Element
Traditional and Grid Service (GT3) Interfaces

NDG Metadata Status
We have built a SIMPLE prototype based primarily on our data model and used our structures to find, locate, reformat and deliver data typical of BODC and BADC observational data. (This is a first.)
We are about to re-engineer. Key issues to address will be:
- Vocabularies, and
- Ontologies
- Developing a Model Attribute Language (with CGAM, PRISM, PCMDI and others).
Populating our metadata: a boring and laborious job!

Metadata Origins
Diagram labels: Wider Internet; Research Group; Satellite; SuperComputer; Shared Resources; DB; Research Group
Consider a hierarchy of data users beginning with an individual scientist, who may herself be part of a research group, itself part of a community sharing resources, lying in the wider internet …
To be well integrated the metadata should have a role at each level! (The data portal client and server interface may be different at each level.)
At each level “extra” metadata will be required, probably produced by dedicated staff at the research group, or data centre.

Vocabularies
- BODC has a parameter dictionary with O(10K) entries
- CF standard name vocabulary, O(100) entries
- NASA Global Change Master Directory, O(1000) entries
- … there are more.
Need methods of mapping namespaces; communities will not sacrifice their existing taxonomies …
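A minimal sketch of what "mapping namespaces" could mean in practice is a lookup keyed on (vocabulary, term) pairs, so that each community keeps its own terms and only the mapping table is shared. The table layout and the translate helper are assumptions for illustration, and the BODC and GCMD values are placeholders, not real dictionary entries.

```python
# Hypothetical mapping table: (source vocabulary, term) -> terms elsewhere.
# The BODC and GCMD values are placeholders, not real vocabulary entries.
MAPPINGS = {
    ("CF", "sea_surface_temperature"): {
        "BODC": "HYPOTHETICAL_BODC_CODE",
        "GCMD": "HYPOTHETICAL > GCMD > KEYWORD",
    },
}


def translate(vocabulary, term, target):
    """Return the target-vocabulary term, or None if no mapping is recorded."""
    return MAPPINGS.get((vocabulary, term), {}).get(target)


bodc_term = translate("CF", "sea_surface_temperature", "BODC")
unmapped = translate("CF", "air_temperature", "BODC")
print(bodc_term, unmapped)
```

Returning None for an unmapped term keeps the gap visible instead of guessing at an equivalence between taxonomies.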

What is an Ontology?
An ontology defines the terms used to describe and represent an area of knowledge by specifying the following kinds of concepts:
- Classes (general things) in the many domains of interest
- The relationships that can exist among things
- The properties (or attributes) those things may have
Ontologies are usually expressed in a logic-based language, so that detailed, accurate, consistent, sound, and meaningful distinctions can be made among the classes, properties, and relations.

RDF: Resource Description Framework
W3C language which builds on the hierarchical attribute/entity structures of XML.
Used to create a collection of assertions, specified as triples.
Now we can build tools which use these concepts:
- Aslan has a mane!
- Aslan will also have animal properties.
RDF Schema vocabulary builds on this to allow namespaces and more … (ranges etc).
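The Aslan example can be sketched with plain Python tuples standing in for RDF triples. The specific triples and predicate names below are assumptions, since the triples shown on the original slide are not in the transcript, and the code is a toy inference over subject-predicate-object tuples, not an RDF library.

```python
# Hypothetical (subject, predicate, object) assertions.
TRIPLES = {
    ("Aslan", "type", "Lion"),
    ("Lion", "subClassOf", "Animal"),
    ("Lion", "hasPart", "mane"),
    ("Animal", "hasProperty", "breathes"),
}


def classes_of(subject, triples):
    """All classes a subject belongs to, following subClassOf links."""
    found = set()
    frontier = [o for s, p, o in triples if s == subject and p == "type"]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            frontier.extend(o for s, p, o in triples
                            if s == cls and p == "subClassOf")
    return found


def inherited(subject, predicate, triples):
    """Objects asserted with this predicate on any class of the subject."""
    classes = classes_of(subject, triples)
    return {o for s, p, o in triples if s in classes and p == predicate}


print(inherited("Aslan", "hasPart", TRIPLES))       # Aslan has a mane
print(inherited("Aslan", "hasProperty", TRIPLES))   # and animal properties
```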

Real Ontologies … immature …
SWEET: Semantic Web for Earth and Environmental Terminology (NASA, JPL)
Earth realms, Numerics, Physical Properties, Units, Phenomena
I believe they are attempting a CF mapping…

Requirements (2)
We need to think about our networks and our tools for moving and keeping track of data!
We can’t rely on the “leave it at the supercomputer site” approach:
- How do we do joint analysis?
- How do we process the data at all?
Malcolm Atkinson, quoting Jim Gray, pointed out that it takes:
- ~ O(minute) to grep or ftp a GB
- ~ O(2 days) to grep or ftp a TB
- ~ O(3 years) to grep or ftp a PB
Requires:
- Sophisticated “fire and forget” file transfer (that has to outperform “sneaker net”).
- Disk and compute resources for processing.
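The scaling behind these figures can be sketched with one assumption: a constant rate of about 1 GB per minute, taken from the first figure. Linear scaling then gives roughly 0.7 days per TB and 1.9 years per PB, the same order of magnitude as the quoted ~2 days and ~3 years, which are order-of-magnitude estimates and not exact multiples. The function name and the constant rate are illustrative assumptions.

```python
SECONDS_PER_DAY = 86400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def transfer_seconds(size_gb, gb_per_minute=1.0):
    """Time to scan or move size_gb at an assumed constant rate."""
    return size_gb / gb_per_minute * 60


gb_seconds = transfer_seconds(1)                          # one gigabyte
tb_days = transfer_seconds(1_000) / SECONDS_PER_DAY       # one terabyte
pb_years = transfer_seconds(1_000_000) / SECONDS_PER_YEAR  # one petabyte

print(f"GB: {gb_seconds:.0f} s, TB: {tb_days:.2f} days, PB: {pb_years:.2f} years")
```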

ESG1 Results (Supercomputing, 2001) Allcock et al Dallas to Chicago:

Starting with the LAS
Deployment for UK users within a few weeks (constraint is primarily access control)

LAS - Simple Box fill Output
Work for us to do: Labelling is inadequate as yet.
ERA40

Cache management in LAS/CDAT
Components: Internet user; LAS (18 TB virtual ERA-40 dataset); ERA-40 spectral archive (4 TB); grid cache (< 1 TB), managed by localCache.py.
1. LAS calls cdms.open to open a data file.
2. BADC/CDAT intercepts the command and checks the cache: it locks access to the cache and checks whether the regular gridded file is in the cache list.
3. If it is not, the spectral file is converted on-the-fly and placed in the cache. The cache also checks whether there is enough room, deletes the oldest files if necessary, and checks against the disk space limit.
4. The cache is unlocked. A new cdms.open command is sent to CDAT and the cache file is opened.
5. The data object is delivered to LAS, and a NetCDF file, plot or animation is delivered to the user.
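The cache behaviour described on this slide (lock, look up the gridded file, convert on a miss, delete the oldest files to stay under a size limit, unlock) can be sketched as below. This is not localCache.py: the GridCache class, its in-memory storage and the fake_convert stand-in for spectral-to-grid conversion are all hypothetical, and a real cache would hold files on disk.

```python
import threading


class GridCache:
    """Toy cache of converted files with oldest-first eviction."""

    def __init__(self, limit_bytes, convert):
        self.limit_bytes = limit_bytes
        self.convert = convert          # callable: file name -> bytes
        self.files = {}                 # insertion ordered, oldest first
        self.lock = threading.Lock()

    def open(self, name):
        # The lock is held for the whole check-convert-store sequence.
        with self.lock:
            if name not in self.files:
                data = self.convert(name)   # convert on-the-fly on a miss
                self._make_room(len(data))
                self.files[name] = data
            return self.files[name]

    def _used(self):
        return sum(len(data) for data in self.files.values())

    def _make_room(self, needed):
        # Delete the oldest files until the new one fits under the limit.
        while self.files and self._used() + needed > self.limit_bytes:
            oldest = next(iter(self.files))
            del self.files[oldest]


conversions = []


def fake_convert(name):
    """Stand-in for the spectral-to-grid conversion; records each call."""
    conversions.append(name)
    return b"x" * 4


cache = GridCache(limit_bytes=10, convert=fake_convert)
for name in ["a", "b", "a", "c"]:
    cache.open(name)

print(list(cache.files), conversions)
```

Eviction here is by insertion age, matching "deletes oldest files"; a least-recently-used policy would be a different design choice.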

Summary
Earth System Modelling extends the data handling challenge.
- We need better information management
- We need better tools for moving things around
- We need better tools for using remote data
- … and we need data manipulation hardware!
The NDG is attempting (with help) to address:
- Information management
- Data movement
- Tools to manipulate large volumes of data.
… and doing this all in as standards compliant a manner as possible.

You’ve gone TOO FAR!