VO Sandpit, November 2009 CEDA Mission: “curation and facilitation” “Managing complex datasets and accompanying information for reuse and repurpose” Sam.

Presentation transcript:

VO Sandpit, November 2009 CEDA Mission: “curation and facilitation” “Managing complex datasets and accompanying information for reuse and repurpose” Sam Pepler Slides stolen from Bryan N. Lawrence University of Reading and STFC Centre for Environmental Data Archival

Outline
Context: What is CEDA? Why is CEDA? Who uses CEDA?
Science Challenges: Climate: CMIP5; Atmospheric Science: FAAM; Earth Observation: CEMS and ISIC; Implications: volume, heterogeneity, diversity of users
Organisational Issues: How is CEDA funded? STFC and NERC

What is CEDA?
Approximate sizes (FTE): BADC, 8; NEODC, 3.5; SSDC, ( ); DDC, 1.5; Projects, 8.5; Other, 1. Total (2012/13): 24
Lots more
582 logical filesets
953 TB primary data, 1.3 PB primary storage, 2.2 PB total disk
93 servers, 30 hypervisors, 265 distinct computer systems (inc. VMs)
140 distinct disk partitions
89 million primary files

Why is CEDA?
NERC Data Policy:
Ensure the continuing availability of environmental data of long-term value for research, teaching, and for wider exploitation for the public good, by individuals, government, business and other organisations.
Support the integrity, transparency and openness of the research it supports.
Help in the formal publication of data sets, as well as enabling their usage to be tracked through citation and data licences.
Meet relevant legislation and government guidance on the management and distribution of environmental information.
Difference between preservation and curation
Preservation
Digital curation entails (Wikipedia, 29/04/12):
Collecting (CEDA: ingestion)
Providing search and retrieval (Services)
Certification of the trustworthiness and integrity (documentation/metadata/provenance)
Semantic and ontological continuity (an active process!)
The Phaistos Disk, 1700 BC: preserved, but information content is zero!

Who uses CEDA? (Consumer Perspective)
Breakdown of 3713 users registered for specific CEDA data or services. We don't have details for the other 14,000 users!
April
Geographic Area: 61% UK, 13% EU, 24% Rest of the world
Discipline: 38% Atmospheric and EO. Full spectrum of other fields.
User type: 72% University Researchers.

Science and Impact: CMIP5/AR5
CMIP5: Fifth Coupled Model Intercomparison Project. Major intellectual challenge to organise the data. BADC in the forefront of delivering the global federated data structure. BADC has a key role as one of three “core” data centres, eventually to have a complete copy of requested output.
AR5: Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC).
February 2011: First model output available for analysis.
July 31, 2012: By this date papers must be submitted for publication to be eligible for assessment by WG1.
March 15, 2013: By this date papers cited by Working Group One (WG1) must be published or accepted.
The IPCC’s AR5 is scheduled to be published in September
Data in the CMIP5 archive that is used by WG1, WG2 or WG3 must be tagged in the BADC archive, for exposure by the (DECC funded) IPCC Data Distribution Centre.
Over 20 modelling groups around the world are sharing data from over 100 prescribed experiments, representing thousands of years of simulations, with millions of output datasets (parameter/experiment/model), up to 3 PB of requested output, and tens of PB of likely output!

Science and Impact: FAAM
NERC/Met Office aircraft, deployed anywhere in the world! (Photo courtesy of Alan Gadian, NCAS.)
Depend on access to BADC wherever and whenever they are (although we only provide UK 9-5 support, we have one staff member in the USA, which helps).
Deployed on science missions measuring atmospheric properties, and occasionally in support of civil contingency (e.g. Eyjafjallajökull and recently over Elgin).
Figures from FAAM flight briefs B688 and B689.

EO data: Sea Surface Temperature from space

Science and Impact: CEMS & ISIC
Facility for Climate and Environmental Monitoring from Space (CEMS): "To provide robust evidence of how our planet is changing, and to enable better predictions"
From the CEDA perspective: (1) a vehicle to support engagement with the commercial community in exploiting EO and climate data; and (2) a vehicle to provide resources for more innovative approaches to exploring how we provide services (including computational virtualisation) for data users.
Visualisation: supported from CEDA (& e-Science) (photo credit: Bennett).
Complex relationship between CEMS and CEDA (diagram courtesy of Reburn, Bennett, and Kershaw).

UPSCALE
The largest ever PRACE computational project, led by the UK, dependent on BADC to provide the data links and data analysis environment!
25 km resolution model run (picture courtesy of P-L Vidale & R. Schiemann, NCAS): ocean temperatures (in colour, going from blue=cold to violet=warm) are shown in the background, while clouds (B/W scale) and precipitation (colour) are shown in the foreground. Over land, snow cover is shown in white.

Science and Impact Implications: Volume, Heterogeneity, Diversity of Users
… and all the observations of this diversity of processes are needed to underpin and evaluate the simulations.
More Numbers! (Overpeck et al., Science, 2011) … probably a vast underestimate in volume terms, and definitely a vast underestimate in terms of the different versions needed for differing communities!
Data Analysis Problem!
Diagrams from IPCC AR5.

Centre for Environmental Data Archival: CEDA Activities

CEDA Funding
Key points to note:
Roughly half of the funding comes from NERC (NCAS and NCEO).
Major input from project funding, including from the European Union and UK government (e.g. for the European Network for Earth Simulation and the IPCC Data Distribution Centre, respectively).
Significant funding for “informatics”, e.g. “data modelling” to support the European Commission's INSPIRE geospatial directive, and research funding from the international G8 “exascale” challenge for the ExArch project (climate analytics on distributed exascale data archives, looking beyond what we're doing for CMIP5!).

CEDA in both STFC and NERC
[Organisation chart: RCUK; NERC, STFC; Operations board; RAL Space; CEDA; Earth Observation and Atmos Sci Div; NCEO, NCAS; NEODC, BADC]