© 2008 Open Grid Forum File Catalog Development in Japan e-Science Project GFS-WG, OGF24 Singapore Hideo Matsuda Osaka University.



2 Japan e-Science Project
A 3.5-year project starting in September 2008.
Sponsored by MEXT (the Ministry of Education, Culture, Sports, Science and Technology), Japan.
Two major sub-projects:
  System Software (Leader: Yutaka Ishikawa, Univ. Tokyo)
  Grid Software (Leader: Ken-ichi Miura, NII)

3 Overview of e-Science Grid Software Project
[Diagram. Recoverable labels: Data Sharing (nation-wide distributed FS, file catalog); Computation (workflow, job submission, application management); DB Federation (DB access control, AuthN info management); Application I/F (app control script, API, app monitoring); Info. Tech. Center; Laboratory; End users; Grid Middleware; DB; Middleware Evaluation; Grid Operation; Infrastructure / Application Evaluation.]

4 Nation-wide Distributed File System
Goal: develop distributed file system technology that spans the whole country while delivering performance comparable to a local file server.
Research topics:
  Optimal automatic placement of file replicas, based on Gfarm 2.0.
  Fault tolerance using file replicas.
[Diagram: clients access a virtual distributed file system backed by File Servers 1 to 3, each with its own storage; file replicas are distributed across the servers (optimal replica placement).]

5 File Catalog Service
Goal: develop a file catalog service that interoperates between heterogeneous Grid environments.
Current file catalog systems (LFC in EGEE gLite, MCAT in SRB, etc.) do not interoperate with each other.
Approach: develop a standardized file catalog based on the RNS (Resource Namespace Service) specification.
[Diagram: a client sends (1) a logical file name to the file catalog system, receives (2) the physical file location (EPR), and performs (3) file access with GridFTP against an EGEE gLite file server, an SRB or iRODS file server, or the Japan e-Science distributed file system.]
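The three numbered steps in the diagram can be sketched as follows. This is only an illustration under stated assumptions: the catalog is a plain in-memory dictionary, the host names and paths are made up, and `PhysicalLocation`, `lookup`, and `access` are hypothetical names rather than part of any real catalog API. No GridFTP transfer is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    """Hypothetical record for one physical copy of a file."""
    backend: str  # label only, e.g. "gLite" or "iRODS"; no real client is used
    url: str      # GridFTP-style URL; the hosts below are invented


# Hypothetical in-memory catalog: logical file name -> physical locations.
CATALOG = {
    "/grid/demo/sample.dat": [
        PhysicalLocation("gLite", "gsiftp://se.example.org/data/sample.dat"),
        PhysicalLocation("iRODS", "gsiftp://irods.example.org/zone/sample.dat"),
    ],
}


def lookup(logical_name):
    """Steps (1) and (2): send a logical file name, get physical locations."""
    try:
        return CATALOG[logical_name]
    except KeyError:
        raise FileNotFoundError(logical_name) from None


def access(location):
    """Step (3): placeholder for a GridFTP transfer; only names the target."""
    return f"would fetch {location.url} from {location.backend}"


locations = lookup("/grid/demo/sample.dat")
print(access(locations[0]))
```

The point of the sketch is that the client only ever handles logical names and returned locations, so the storage system behind each location can differ.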

6 File Catalog in e-Science
A file catalog can be used not only for file-location management but also for metadata in e-Science, since metadata is often described hierarchically in many sciences.
[Diagram: example hierarchies. High Energy Physics: CMS, ATLAS; run1, run2; track1, track2. Molecular Biology: Proteome, Genome; Human Genome, Plant Genome, Bacterial Genome; Functional Analysis, Structure Analysis; entries gb|AY, sp|P37231, pdb|1FM6.]
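One way to read a hierarchical namespace as metadata is to give each path level a name and then filter on those names. The sketch below assumes an experiment/run/track layout loosely modeled on the High Energy Physics example; the level names, the paths, and the functions `path_to_metadata` and `find` are hypothetical and not taken from any catalog implementation.

```python
def path_to_metadata(path, level_names):
    """Turn a hierarchical catalog path into a metadata dictionary."""
    parts = [p for p in path.split("/") if p]
    if len(parts) > len(level_names):
        raise ValueError("path is deeper than the declared levels")
    return dict(zip(level_names, parts))


def find(paths, level_names, **criteria):
    """Return the paths whose metadata match every given criterion."""
    hits = []
    for path in paths:
        meta = path_to_metadata(path, level_names)
        if all(meta.get(key) == value for key, value in criteria.items()):
            hits.append(path)
    return hits


# Assumed layout: /<experiment>/<run>/<track>
LEVELS = ("experiment", "run", "track")
PATHS = [
    "/ATLAS/run1/track1",
    "/ATLAS/run1/track2",
    "/ATLAS/run2/track1",
    "/CMS/run1/track1",
]

print(find(PATHS, LEVELS, experiment="ATLAS", run="run1"))
# prints ['/ATLAS/run1/track1', '/ATLAS/run1/track2']
```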

7 Metadata Management using File Catalog
Currently, metadata are mainly stored in file catalogs, using their hierarchical namespace functionality:
  gLite: LFC, Fireman
  iRODS (SRB): ICAT
  Globus: RLS
  NAREGI: Gfarm
It is not easy to exchange metadata between different Grid middleware systems.

8 Resource Namespace Service (1)
RNS lets you map any resource into a single, hierarchical namespace.
Resources are referred to in the form of an EndpointReference (WS-Addressing).
The RNS specification is published as GFD-R-P.101.
RNS implementations are available from U. Virginia and U. Tsukuba.

9 Resource Namespace Service (2)
Hierarchical namespace management that provides name-to-resource mapping.
Basic namespace components:
  Virtual Directory: a non-leaf node in the hierarchical namespace tree.
  Junction: a name-to-resource mapping that links a reference to any existing resource into the hierarchical namespace.
[Diagram: a namespace tree rooted at /grid with nodes ogf, jp, data and gfs and file entries (file1, file2, file3, file4; file1, file2); junctions point to EPR1 and EPR2. EPR: Endpoint Reference.]
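The two components can be modeled as a small tree. This is a minimal in-memory sketch, not the RNS interface: the class and method names are hypothetical, EPRs are represented as plain strings, and the tree layout under /grid is assumed, because the exact structure of the diagram is not recoverable.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Junction:
    """Leaf entry: binds a name to a resource reference (EPR placeholder)."""
    epr: str


@dataclass
class VirtualDirectory:
    """Non-leaf node holding named children."""
    children: Dict[str, Union["VirtualDirectory", Junction]] = field(
        default_factory=dict
    )

    def mkdir(self, name):
        """Return the child directory `name`, creating it if needed."""
        node = self.children.setdefault(name, VirtualDirectory())
        if not isinstance(node, VirtualDirectory):
            raise NotADirectoryError(name)
        return node

    def add_junction(self, name, epr):
        """Bind `name` in this directory to a resource reference."""
        self.children[name] = Junction(epr)


def resolve(root, path):
    """Walk `path` from `root` and return the node it names."""
    node = root
    for part in (p for p in path.split("/") if p):
        if not isinstance(node, VirtualDirectory):
            raise NotADirectoryError(part)
        node = node.children[part]  # KeyError if the name is not bound
    return node


root = VirtualDirectory()
gfs = root.mkdir("grid").mkdir("jp").mkdir("gfs")  # assumed layout
gfs.add_junction("file1", "EPR1")
gfs.add_junction("file2", "EPR2")

print(resolve(root, "/grid/jp/gfs/file1").epr)  # prints EPR1
```

Only junctions carry a reference; directories exist purely to organize names, which is what lets one tree span resources held by different systems.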

10 Development of File Catalog System (Plan)
RNS can link a reference to any existing resource into a hierarchical namespace.
Most Grid middleware provides GridFTP for data transfer.
=> Use RNS as a standardized file catalog.
Use the GridFTP URL “gsiftp://.../” as the address of the Endpoint Reference.
[Diagram: the client (1) queries RNS, (2) receives an EPR list (including addresses), and (3) accesses the data with the GridFTP protocol on a gLite file server (SRM), an iRODS file server, a Japan e-Science file server, or a Globus GridFTP server.]
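An Endpoint Reference carrying a GridFTP URL as its address might look like the output of the sketch below. The WS-Addressing 1.0 namespace is an assumption (the version actually used by a given RNS implementation may differ), the host names are invented, and `make_epr`, `address_of`, and `gridftp_addresses` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Assumption: WS-Addressing 1.0 namespace; an implementation may use another.
WSA_NS = "http://www.w3.org/2005/08/addressing"
ET.register_namespace("wsa", WSA_NS)


def make_epr(address):
    """Build a minimal EndpointReference element holding only an Address."""
    epr = ET.Element(f"{{{WSA_NS}}}EndpointReference")
    ET.SubElement(epr, f"{{{WSA_NS}}}Address").text = address
    return epr


def address_of(epr):
    """Extract the Address text from an EndpointReference element."""
    addr = epr.find(f"{{{WSA_NS}}}Address")
    if addr is None or not addr.text:
        raise ValueError("EPR has no Address element")
    return addr.text


def gridftp_addresses(epr_list):
    """Step (3) preparation: keep only addresses usable with GridFTP."""
    return [a for a in map(address_of, epr_list) if a.startswith("gsiftp://")]


# Step (2): an EPR list as a catalog query might return it (made-up hosts).
eprs = [
    make_epr("gsiftp://gridftp.example.org/data/file1"),
    make_epr("http://www.example.org/other-resource"),
]

print(ET.tostring(eprs[0], encoding="unicode"))
print(gridftp_addresses(eprs))
```

Because the address is an ordinary URL inside the EPR, a client can hand it to any GridFTP-capable tool without knowing which middleware hosts the file.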

11 Comparison with gLite LFC
Comments from Erwin Laure (OGF22 GFS-WG):
  add EPR: RNS is missing the detailed attributes of the replicas.
  query EPR: the attributes of a namespace entry should be defined, allowing specialized queries and lookups.
  RNS lacks bulk operations, sessions, and transactions; adopting them may improve performance.
  Access control and VO management have not been introduced yet either.
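To illustrate why bulk operations may improve performance, the sketch below counts simulated round trips for per-entry lookups versus a single bulk lookup. `CountingCatalog` and its methods are hypothetical and exist only for this illustration; neither RNS nor LFC is being modeled, and the count stands in for network latency that a real service would incur.

```python
class CountingCatalog:
    """Toy catalog that counts one simulated round trip per call."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.round_trips = 0

    def lookup(self, name):
        """Resolve a single name: one round trip per entry."""
        self.round_trips += 1
        return self._entries.get(name)

    def bulk_lookup(self, names):
        """Resolve many names at once: one round trip in total."""
        self.round_trips += 1
        return {name: self._entries.get(name) for name in names}


names = [f"/grid/data/file{i}" for i in range(100)]
catalog = CountingCatalog({n: f"EPR{i}" for i, n in enumerate(names)})

one_by_one = {n: catalog.lookup(n) for n in names}
single_calls = catalog.round_trips

catalog.round_trips = 0
in_bulk = catalog.bulk_lookup(names)
bulk_calls = catalog.round_trips

print(single_calls, bulk_calls)  # prints 100 1
```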

12 Comparison with iRODS
Comments from Reagan Moore (OGF23 GFS-WG):
  Applications now manipulate structured information.
  iRODS can generate and manipulate structured information with micro-services.
  There are multiple standards for describing structured information.

13 Summary
A standardized file catalog is useful for federating heterogeneous Data Grids.
A File Catalog Profile needs to be established for interoperation between different file catalogs (and for their standardization).