The Climate-G testbed towards a large scale data sharing environment for climate change S. Fiore Scientific Computing and Operations Division, CMCC, Italy.

Slides:



Advertisements
Similar presentations
1 NASA CEOP Status & Demo CEOS WGISS-25 Sanya, China February 27, 2008 Yonsook Enloe.
Advertisements

Particle physics – the computing challenge CERN Large Hadron Collider –2007 –the worlds most powerful particle accelerator –10 petabytes (10 million billion.
Plateforme de Calcul pour les Sciences du Vivant SRB & gLite V. Breton.
Towards a Virtual European Supercomputing Infrastructure Vision & issues Sanzio Bassini
An overview of the EGEE project Bob Jones EGEE Technical Director DTI International Technology Service-GlobalWatch Mission CERN – June 2004.
SEVENPRO – STREP KEG seminar, Prague, 8/November/2007 © SEVENPRO Consortium SEVENPRO – Semantic Virtual Engineering Environment for Product.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
The LHC Computing Grid – February 2008 The Worldwide LHC Computing Grid Dr Ian Bird LCG Project Leader 15 th April 2009 Visit of Spanish Royal Academy.
Robust Tools for Archiving and Preserving Digital Data Joseph JaJa, Mike Smorul, and Mike McGann Institute for Advanced Computer Studies Department of.
DataGrid Kimmo Soikkeli Ilkka Sormunen. What is DataGrid? DataGrid is a project that aims to enable access to geographically distributed computing power.
The Earth System Grid Discovery and Semantic Web Technologies Line Pouchard Oak Ridge National Laboratory Luca Cinquini, Gary Strand National Center for.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Client/Server Grid applications to manage complex workflows Filippo Spiga* on behalf of CRAB development team * INFN Milano Bicocca (IT)
INFSO-RI Enabling Grids for E-sciencE Intelligent Distributed Data Management in Earth system science K. Ronneberger, DKRZ, Germany.
TPAC Digital Library Talk Overview Presenter:Glenn Hyland Tasmanian Partnership for Advanced Computing & Australian Antarctic Division Outline: TPAC Overview.
GRACE Project IST EGAAP meeting – Den Haag, 25/11/2004 Giuseppe Sisto – Telecom Italia Lab.
Co-ordination & Harmonisation of Advanced e-Infrastructures Research Infrastructures – Grant Agreement n Regional progress report: Mediterranean.
Presenter: Dipesh Gautam.  Introduction  Why Data Grid?  High Level View  Design Considerations  Data Grid Services  Topology  Grids and Cloud.
Presented by The Earth System Grid: Turning Climate Datasets into Community Resources David E. Bernholdt, ORNL on behalf of the Earth System Grid team.
Publishing and Visualizing Large-Scale Semantically-enabled Earth Science Resources on the Web Benno Lee 1 Sumit Purohit 2
Nicholas LoulloudesMarch 3 rd, 2009 g-Eclipse Testing and Benchmarking Grid Infrastructures using the g-Eclipse Framework Nicholas Loulloudes On behalf.
INFSO-RI Enabling Grids for E-sciencE ES applications in EGEEII – M. Petitdidier –11 February 2008 Earth Science session Wrap up.
The Climate-G testbed towards a large scale data sharing environment for climate change S. Fiore Scientific Computing and Operations Division, CMCC, Italy.
Grid Computing & Semantic Web. Grid Computing Proposed with the idea of electric power grid; Aims at integrating large-scale (global scale) computing.
The GRelC Project: architecture, history and a use case in the environmental domain G. Aloisio - S. Fiore The Climate-G testbed is an interdisciplinary.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
IODE Ocean Data Portal - ODP  The objective of the IODE Ocean Data Portal (ODP) is to facilitate and promote the exchange and dissemination of marine.
CEOS WGISS-21 CNES GRID related R&D activities Anne JEAN-ANTOINE PICCOLO CEOS WGISS-21 – Budapest – 2006, 8-12 May.
The Earth System Grid (ESG) Computer Science and Technologies DOE SciDAC ESG Project Review Argonne National Laboratory, Illinois May 8-9, 2003.
- Vendredi 27 mars PRODIGUER un nœud de distribution des données CMIP5 GIEC/IPCC Sébastien Denvil Pôle de Modélisation, IPSL.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Ruth Pordes November 2004TeraGrid GIG Site Review1 TeraGrid and Open Science Grid Ruth Pordes, Fermilab representing the Open Science.
May 6, 2002Earth System Grid - Williams The Earth System Grid Presented by Dean N. Williams PI’s: Ian Foster (ANL); Don Middleton (NCAR); and Dean Williams.
US LHC OSG Technology Roadmap May 4-5th, 2005 Welcome. Thank you to Deirdre for the arrangements.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Services for advanced workflow programming.
A GRID solution for Gravitational Waves Signal Analysis from Coalescing Binaries: preliminary algorithms and tests F. Acernese 1,2, F. Barone 2,3, R. De.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The GILDA t-Infrastructure Roberto Barbera.
Glite. Architecture Applications have access both to Higher-level Grid Services and to Foundation Grid Middleware Higher-Level Grid Services are supposed.
26/05/2005 Research Infrastructures - 'eInfrastructure: Grid initiatives‘ FP INFRASTRUCTURES-71 DIMMI Project a DI gital M ulti M edia I nfrastructure.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGI Operations Tiziana Ferrari EGEE User.
29 March 2004 Steven Worley, NSF/NCAR/SCD 1 Research Data Stewardship and Access Steven Worley, CISL/SCD Cyberinfrastructure meeting with Priscilla Nelson.
NeuroLOG ANR-06-TLOG-024 Software technologies for integration of process and data in medical imaging A transitional.
1 Accomplishments. 2 Overview of Accomplishments  Sustaining the Production Earth System Grid Serving the current needs of the climate modeling community.
1 Overall Architectural Design of the Earth System Grid.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
SEE-GRID-2 The SEE-GRID-2 initiative is co-funded by the European Commission under the FP6 Research Infrastructures contract no
XMC Cat: An Adaptive Catalog for Scientific Metadata Scott Jensen and Beth Plale School of Informatics and Computing Indiana University-Bloomington Current.
INFSO-RI Enabling Grids for E-sciencE The EGEE Project Owen Appleton EGEE Dissemination Officer CERN, Switzerland Danish Grid Forum.
The Earth Information Exchange. Portal Structure Portal Functions/Capabilities Portal Content ESIP Portal and Geospatial One-Stop ESIP Portal and NOAA.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE: Enabling grids for E-Science Bob Jones.
1 2.5 DISTRIBUTED DATA INTEGRATION WTF-CEOP (WGISS Test Facility for CEOP) May 2007 Yonsook Enloe (NASA/SGT) Chris Lynnes (NASA)
A Collaborative e-Science Architecture towards a Virtual Research Environment Tran Vu Pham 1, Dr. Lydia MS Lau 1, Prof. Peter M Dew 2 & Prof. Michael J.
Breaking the frontiers of the Grid R. Graciani EGI TF 2012.
1 Open Science Grid: Project Statement & Vision Transform compute and data intensive science through a cross- domain self-managed national distributed.
INFSO-RI Enabling Grids for E-sciencE EGEE general project update Fotis Karayannis EGEE South East Europe Project Management Board.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Bob Jones EGEE project director.
EGI-InSPIRE EGI-InSPIRE RI Services for Earth Sciences Services for the EGI Heavy User Communities – EGITF 2010 Horst Schwichtenberg.
INFSO-RI Enabling Grids for E-sciencE ESR Database Access K. Ronneberger,DKRZ, Germany H. Schwichtenberg, SCAI, Germany S. Kindermann,
Accessing the VI-SEEM infrastructure
Data Browsing/Mining/Metadata
Un sistema avanzato per la gestione di immagini biomedicali in ambiente GRID G. Aloisio, S. Fiore SPACI Consortium & University of Salento, Italy, Lecce.
Tools and Services Workshop
Joslynn Lee – Data Science Educator
GGF OGSA-WG, Data Use Cases Peter Kunszt Middleware Activity, Data Management Cluster EGEE is a project funded by the European.
Outline Pursue Interoperability: Digital Libraries
Data Management Components for a Research Data Archive
gLite The EGEE Middleware Distribution
Presentation transcript:

The Climate-G testbed towards a large scale data sharing environment for climate change S. Fiore Scientific Computing and Operations Division, CMCC, Italy G. Aloisio Scientific Computing and Operations Division Head, CMCC, Italy On behalf of Climate-G Team EGEE-III Review - June 24, 2009

2 Scenario, issues and needs Climate community produces huge amount of data at an international level There is a strong need to share and integrate data among several centres Petabytes of data related to: Climate Change data -> century types simulations Seasonal to Decadal data -> decennal types simulations Next generation climate change infrastructures must provide a seamless environment Open, distributed and service-based approach Issues: data distribution, data format heterogeneity, metadata management, security, transparent access to the system, scalable approach, …

EGEE-III Review - June 24, Grid and Climate Change: Climate-G The main goal of Climate-G is to create an open and unified environment for climate change enabling geographical and cross- institutional data discovery, access, analysis, visualization and sharing. This effort has been conceived as a proof of concept for the involved grid technologies (in particular GRelC grid metadata service) and supported by the Earth Science Cluster Community (EGEE Project). A virtual laboratory involving partners both in Europe and US Interdisciplinary effort: both Climate Change and Computational Scientists

EGEE-III Review - June 24, Climate-G partnership

EGEE-III Review - June 24, Climate-G: main challenges and requirements Management of PBs of distributed data performance scalability fault tolerance autonomy security transparency interoperability Data Distribution Centre pervasive easy ubiquitous Integrated Environment Several tools integrated in the same web context. Modular approach Easily extensible

EGEE-III Review - June 24, Why Grid? A Grid based approach provides the proper basis at an infrastructural level It ensures the right level of flexibility, scalability and manageability Data virtualization is a key point to build a transparent environment for the climate community Grid metadata management gives an efficient answer to climate metadata access, management, sharing and integration Computational tasks related to post-processing and data analysis can take advantage of a grid infrastructure

EGEE-III Review - June 24, Climate-G and EGEE: Grid Services GRelC Data Access and Integration Service (GRelC DAIS) - EGEE RESPECT Grid Metadata Management Convergence between Grid & P2P systems LHC File Catalog (LFC) - EGEE gLite Grid filesystem for the distributed climate data production User-defined data collections as subsets of the whole data production Virtual Organization Membership Service (VOMS) - EGEE gLite Flexible and scalable role-based management EGEE Farms, WMS and User Interface Computational and storage services for post-processing and analysis tasks Workload Management System and Information Index to distribute jobs on the grid User Interface to access to the Grid

EGEE-III Review - June 24, Climate-G and EGEE Middleware

EGEE-III Review - June 24, Grid Metadata Service: GRelC (EGEE RESPECT) Metadata are information about data Fundamental to perform search and discovery of climate datasets A centralized approach is not suitable (problem dimension, site autonomy requirement, scalability of the approach, etc.) Distributed and grid-based metadata management provides the basis for an efficient, transparent and effective climate metadata management GRelC provides a Grid & P2P based approach to metadata management Fully compliant with gLite Part of the EGEE RESPECT Program Expand the functionality of the grid infrastructure for users Manages several data sources XML and Relational General purpose, that is domain independent solution

EGEE-III Review - June 24, Grid Metadata Service: GRelC (EGEE RESPECT)

EGEE-III Review - June 24, Climate-G: Grid Metadata Service Network RESPECT

EGEE-III Review - June 24, Climate-G: domain based services/tools Climate-G includes domain-based services & tools into the infrastructure - User community requirement: coexistence of grid and domain-based services - Provides domain specific tasks. Well known, tested and widely adopted. - Legacy systems already available and accessible Some examples: OPeNDAP (OPeNDAP Consortium) Provides access to climate data sources Widely adopted in the Climate community nc Web Map Service (Univ. of Reading) HTTP interface for requesting geo-registered map images from geospatial databases Integrated Data Viewer (UNIDATA,UCAR) and Godiva2 (Univ. of Reading) Data visualization tools widely adopted by the Climate community

EGEE-III Review - June 24, Climate-G and EGEE In April, Climate-G has been recognized as a new VO by the EGEE Resource Allocation Group (climate-g.vo.eu-egee.org) First VO for climate change purposes About 50 users joined the VO since April (less than 2 months) Most of them (about 80%) come from the climate context and are using a grid infrastructure for the first time -> new users Interesting level of feedback from our users in terms of: suggestions to improve the portal new data sources and new tools to be included into the portal Several EGEE sites have been configured to support the “Climate-G VO” (Fraunhofer SCAI, SPACI-LECCE, IPSL/CNRS IPGP,UniCantabria) More than 300 CPUs are now available for preliminary tests The whole Climate-G infrastructure (data and computational) must be accessible through the Climate-G DDC Portal, our scientific gateway

EGEE-III Review - June 24, Climate-G Data Distribution Centre

EGEE-III Review - June 24, Climate-G and EU Projects EU FP7 EGEE-III Climate-G infrastructure is strongly based on the EGEE middleware (gLite and EGEE RESPECT) EU FP7 METAFOR Metafor CIM schema is under evaluation. Some interoperability issues could be part of the Climate-G metadata activity EU FP6 ENSEMBLES Climate-G publishes about 2 TB of datasets. Most of them come from the ENSEMBLES data providers (IPSL, UniCantabria) University of Cantabria (in the context of the ENSEMBLES project) will extend its own data post-processing and downscaling system to access to the grid through the Climate-G Grid Metadata Infrastructure

EGEE-III Review - June 24, Limitations and future work Presently Climate-G manages online data –In the next future access to deep storage will be managed via SRM interfaces Today Climate-G DDC manages the entire data infrastructure. Access to the computational part is now carried out via CLI –In the next future access to the computational part will be performed via the Climate-G Portal Climate-G DDC now manages Atmospheric and Oceanographic data –Climate-G will manage both climate and economic data. Economic impacts of climate change on health, coasts, soil, agriculture, etc. represent an important goal for our community Analysis & visualization tools currently supported: IDV and Godiva2 –Climate-G will soon integrate into the portal support for the Grid Analysis and Display System (GrADS)

EGEE-III Review - June 24, Conclusions Climate-G has a strong relationship with the EGEE Project A new EGEE VO for the Climate-G testbed has been created (April 2009) GRelC DAIS provides a grid based distributed metadata management as well as harvesting solution Climate-G Data Grid Portal to ease Metadata management via Web Interface Visualization tools have been integrated (IDV, Godiva2) Climate-G is conceived as a Virtual Laboratory for the involved people and technologies MoA between CMCC and University of Reading has been signed on advanced data management and data visualization topics MoA between CMCC and University of Cantabria has been signed on distributed/grid metadata management topics

EGEE-III Review - June 24, Acknowledgments Giovanni Aloisio (CMCC) Sandro Fiore (CMCC) Monique Petitdidier (CNRS/IPSL) Horst Schwichtenberg (Fraunhofer-SCAI) Sébastien Denvil (IPSL) Peter Fox (RPI, NCAR) Jon Blower (Univ. Reading) Antonio Cofino (Univ. of Cantabria) Many thanks to all of the involved people in the Climate-G testbed