Gfarm v2: A Grid file system that supports high-performance distributed and parallel data computing

Osamu Tatebe (1), Noriyuki Soda (2), Youhei Morita (3), Satoshi Matsuoka (4), Satoshi Sekiguchi (1)
(1) Grid Technology Research Center, AIST  (2) SRA, Inc.  (3) KEK  (4) Tokyo Institute of Technology / NII

CHEP 04, Sep 27, 2004, Interlaken, Switzerland

[Background] Petascale Data-Intensive Computing

High Energy Physics (CERN LHC, KEK-B Belle)
- ~MB/collision, 100 collisions/sec, ~PB/year
- 2,000 physicists, 35 countries
- (Figure: detectors for the ALICE and LHCb experiments)

Astronomical Data Analysis
- Data analysis of the whole data set: TB~PB/year/telescope
- Subaru telescope: 10 GB/night, 3 TB/year

Petascale Data-Intensive Computing Requirements

- Peta/Exabyte-scale files; millions of millions of files
- Scalable computational power: > 1 TFLOPS, hopefully > 10 TFLOPS
- Scalable parallel I/O throughput: > 100 GB/s, hopefully > 1 TB/s, within a system and between systems
- Efficient global sharing with group-oriented authentication and access control
- Fault tolerance / dynamic re-configuration
- Resource management and scheduling
- System monitoring and administration
- Global computing environment

Goal and Features of Grid Datafarm

Goal
- Dependable data sharing among multiple organizations
- High-speed data access, high-performance data computing

Grid Datafarm
- Gfarm File System: a global, dependable virtual file system that federates scratch disks in PCs
- Parallel and distributed data computing
- Associates the Computational Grid with the Data Grid

Features
- Secured, based on the Grid Security Infrastructure
- Scalable depending on data size and usage scenarios
- Location-transparent data access
- Automatic and transparent replica selection for fault tolerance
- High-performance data access and computing by accessing multiple dispersed storages in parallel (file affinity scheduling)

Grid Datafarm (1): Gfarm file system - a world-wide virtual file system [CCGrid 2002]

- Transparent access to dispersed file data in a Grid
  - POSIX I/O APIs
  - Applications can access the Gfarm file system without any modification, as if it were mounted at /gfarm
- Automatic and transparent replica selection for fault tolerance and access-concentration avoidance
- (Figure: a virtual directory tree under /gfarm (ggf, jp, aist, gtrc; file1 ... file4) mapped by file system metadata onto dispersed storage, with file replica creation across nodes)
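A minimal sketch of the transparent access described above: an unmodified application reads a file under the /gfarm virtual mount point with ordinary POSIX calls. The path /gfarm/ggf/jp/file1 follows the figure's example tree and is otherwise hypothetical.

/* Minimal sketch: an unmodified application reading a file that happens
 * to live in the Gfarm file system.  Assumes /gfarm is the virtual mount
 * point as on the slide; the path below is hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    ssize_t n;

    /* Ordinary POSIX open(); Gfarm resolves the path to a suitable replica. */
    int fd = open("/gfarm/ggf/jp/file1", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(fd);
    return 0;
}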

Grid Datafarm (2): High-performance data access and computing support [CCGrid 2002]

- Do not separate storage and CPU
- Parallel and distributed file I/O

Scientific Applications

ATLAS Data Production
- Distribution kit (binary)
- Atlfast (fast simulation): input data stored in the Gfarm file system, not NFS
- G4sim (full simulation)
- (Collaboration with ICEPP, KEK)

Belle Monte-Carlo Production
- 30 TB of data needs to be generated
- 3 M events (60 GB) per day are being generated using a 50-node PC cluster
- Simulation data will be generated in a distributed manner at tens of universities and KEK
- (Collaboration with KEK, U-Tokyo)

Gfarm™ v1

- Open source development
- Gfarm™ version released on July 5, 2004
- scp, GridFTP server, Samba server, ...
- Existing applications can access the Gfarm file system without any modification, using LD_PRELOAD
- (Figure: an application linked with the Gfarm library talks to the metadata server (gfmd, slapd) and to the gfsd daemons on the compute and file system nodes)
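The LD_PRELOAD approach mentioned above works by interposing a shared library between the application and libc. The following is a generic sketch of that mechanism, not Gfarm's actual syscall-hooking library; the /gfarm prefix check is only illustrative.

/* Sketch of the LD_PRELOAD mechanism the slide refers to: a shared
 * library interposes on open() and could redirect /gfarm paths to the
 * Gfarm client library.  Generic illustration only, not Gfarm code. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <string.h>
#include <sys/types.h>

int open(const char *path, int flags, ...)
{
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    if (strncmp(path, "/gfarm/", 7) == 0) {
        /* A real hook would call into the Gfarm library here. */
    }
    /* Fall through to the real open() for everything else. */
    int (*real_open)(const char *, int, ...) =
        (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
    return real_open(path, flags, mode);
}

Such a library would be built as a shared object (e.g. with -shared -fPIC) and activated by setting LD_PRELOAD before starting the unmodified application.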

Problems of Gfarm™ v1

Functionality of file access
- File open in read-write mode*, file locking (* supported in version 1.0.4)

Robustness
- Consistency between metadata and physical files
  - at unexpected application crash
  - at unexpected modification of physical files

Security
- Access control of file system metadata
- Access control of files by group

File model
- A Gfarm file is a group of files (collection, container)
- Flexibility of file grouping

Design of Gfarm™ v2

- Supports more than ten thousand clients and file server nodes
- Provides scalable file I/O performance

Gfarm v2: towards a *true* global virtual file system
- POSIX compliant: supports read-write mode, advisory file locking, ...
- Robust, dependable, and secure
- Can be substituted for NFS, AFS, ...

Related work (1): Lustre

- > 1,000 clients
- Object (file) based management; objects placed in any OST
- No replica management; writeback cache; collaborative read cache (planned)
- GSSAPI, ACL, StorageTek SFS
- Kernel module

Related work (2): Google File System

- > 1,000 storage nodes
- Fixed-size chunks, placed in any chunkserver; three replicas by default
- User-level client library; no client or server cache
- Not a POSIX API; supports Google's data processing needs [SOSP'03]

Opening files in read-write mode (1)

Semantics (the same as AFS)
- [Without advisory file locking] Updated content is available only when the file is opened after a writing process has closed it
- [With advisory file locking] Among processes that lock a file, up-to-date content is available in the locked region; this is not ensured when a process writes the same file without file locking
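A minimal sketch of the close-to-open behaviour stated above, using plain POSIX calls on a hypothetical /gfarm path: the reader sees the writer's update because it opens the file only after the writer's close().

/* Sketch of the close-to-open semantics described above.  A reader that
 * opened the file before the writer's close() may still see the old
 * content; a reader that opens it afterwards sees the update. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static void writer(void)
{
    int fd = open("/gfarm/jp/file2", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    const char *msg = "new content\n";
    if (fd >= 0) {
        write(fd, msg, strlen(msg));
        close(fd);             /* the update becomes visible from here on */
    }
}

static void reader(void)
{
    char buf[256];
    int fd = open("/gfarm/jp/file2", O_RDONLY);  /* opened after the close() */
    if (fd >= 0) {
        read(fd, buf, sizeof(buf));              /* sees "new content" */
        close(fd);
    }
}

int main(void)
{
    writer();
    reader();
    return 0;
}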

Opening files in read-write mode (2)

(Figure: Process 1 opens /grid/jp/file2 in read-write mode and accesses its copy on file system node FSN1; Process 2 opens the same file read-only and accesses the copy on FSN2; the metadata server tracks both copies)

- Before the writer closes the file, any file copy can be accessed
- On the writer's fclose(), the now-invalid copy is deleted from the metadata, but ongoing file access to it is continued

Advisory file locking

(Figure: Process 1 opens /grid/jp/file2 in read-write mode on FSN1; Process 2 opens it read-only on FSN2; when Process 2 issues a read lock request, the writer's cache is flushed, caching is disabled, and Process 2 accesses the up-to-date copy on FSN1)
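A small sketch of what advisory locking looks like from the application side, using the standard POSIX fcntl() record-lock interface that Gfarm v2 is designed to honour; the path is hypothetical, and the cache flush shown in the figure happens inside Gfarm.

/* Sketch: advisory record locking through the standard POSIX fcntl()
 * interface.  How the lock request reaches the file system node is
 * internal to Gfarm (see the figure above). */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/gfarm/jp/file2", O_RDONLY);
    if (fd < 0)
        return 1;

    struct flock fl = {
        .l_type   = F_RDLCK,   /* read lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,         /* 0 = lock the whole file */
    };

    /* Block until the read lock is granted; within the locked region,
     * up-to-date content is guaranteed. */
    if (fcntl(fd, F_SETLKW, &fl) == 0) {
        char buf[4096];
        read(fd, buf, sizeof(buf));

        fl.l_type = F_UNLCK;   /* release the lock */
        fcntl(fd, F_SETLK, &fl);
    }
    close(fd);
    return 0;
}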

Consistent update of metadata (1)

Gfarm v1: the Gfarm library updates the metadata
- (Figure: the application's Gfarm library opens the file on file system node FSN1, closes it, and then updates the metadata server)
- Metadata is not updated if the application crashes unexpectedly

Consistent update of metadata (2)

Gfarm v2: the file system node updates the metadata
- (Figure: the application's Gfarm library opens the file on FSN1; on close, or on a broken pipe, FSN1 itself updates the metadata server)
- Metadata is updated by the file system node even if the application crashes unexpectedly
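An illustrative sketch, not Gfarm's actual code, of the v2 behaviour above: the file system node commits the metadata update both on a normal close and when it merely detects the client's death through a broken connection. All names below are hypothetical stand-ins.

/* Illustrative sketch of the v2 design: the file system node, not the
 * client library, reports the close to the metadata server, even when it
 * only notices the client has died.  Stubs stand in for Gfarm internals. */
#include <stdio.h>

struct open_file { long inode; long size; };

static void metadata_server_update(const struct open_file *f)
{
    printf("metadata updated: inode=%ld size=%ld\n", f->inode, f->size);
}

enum client_event { EV_CLOSE, EV_BROKEN_PIPE };

/* A real server would block on the client connection here. */
static enum client_event wait_for_client_event(void)
{
    return EV_BROKEN_PIPE;     /* simulate an unexpected application crash */
}

static void fsn_serve_client(struct open_file *f)
{
    switch (wait_for_client_event()) {
    case EV_CLOSE:             /* normal close request from the client */
    case EV_BROKEN_PIPE:       /* application crashed without closing */
        metadata_server_update(f);   /* committed in either case */
        break;
    }
}

int main(void)
{
    struct open_file f = { 42, 1024 };
    fsn_serve_client(&f);
    return 0;
}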

Generalization of the file grouping model

(Figure: image files taken by the Subaru telescope, arranged as N sets of 10 files each)
- 10 files executed in parallel
- N files executed in parallel
- 10 x N files executed in parallel

File grouping by directory

(Figure: a directory tree night1/shot1 ... shotN, each shot containing ccd0 ... ccd9, and a directory night1-ccd1/shot1 ... shotN whose entries are symlinks/hardlinks to night1/<shot>/ccd1)

- gfs_pio_open("night1/shot2", &gf): open a Gfarm file that concatenates ccd0, ..., ccd9
- gfs_pio_set_view_section(gf, "ccd1"): set the file view to the ccd1 section
- gfs_pio_open("night1", &gf): open a Gfarm file that concatenates shot1/ccd0, ..., and shotN/ccd9
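The calls on the slide can be read as the following sketch. The stub definitions exist only so the example compiles; the real Gfarm gfs_pio_* functions take additional arguments and return error codes, so this mirrors the slide's simplified usage rather than the exact API.

/* Sketch of the grouping calls shown on the slide.  The stubs below are
 * stand-ins so the example compiles; they are NOT the actual Gfarm API. */
#include <stdio.h>

typedef const char *GFS_File;   /* stand-in handle type */

static void gfs_pio_open(const char *path, GFS_File *gfp)
{ printf("open grouped file %s\n", path); *gfp = path; }

static void gfs_pio_set_view_section(GFS_File gf, const char *section)
{ printf("set view of %s to section %s\n", gf, section); }

int main(void)
{
    GFS_File gf;

    /* Open a Gfarm file that concatenates night1/shot2/ccd0 ... ccd9. */
    gfs_pio_open("night1/shot2", &gf);

    /* Narrow the view to just the ccd1 section of that grouped file. */
    gfs_pio_set_view_section(gf, "ccd1");

    /* Open a Gfarm file that concatenates shot1/ccd0 ... shotN/ccd9. */
    gfs_pio_open("night1", &gf);

    return 0;
}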

Summary and future work

Gfarm™ v2 aims at a global virtual file system with
- Scalability up to more than ten thousand clients and file system nodes
- Scalable file I/O performance
- POSIX compliance (read-write mode, file locking, ...)
- Fault tolerance, robustness, and dependability
Its design and implementation have been discussed.

Future work
- Implementation and performance evaluation
- Evaluation of scalability up to more than ten thousand nodes
- Data preservation, automatic replica creation