Building Hierarchical Grid Storage Using the Gfarm Global File System and the JuxMem Grid Data-Sharing Service
Gabriel Antoniu, Loïc Cudennec, Majd Ghareeb (INRIA/IRISA, Rennes, France)
Osamu Tatebe (University of Tsukuba, Japan)

2 Context: Grid Computing
Target architecture: cluster federations (e.g. Grid'5000)
Focus: large-scale data sharing
[Figure: example application domains - solid mechanics, thermodynamics, optics, dynamics, satellite design]

3 Approaches for Data Management on Grids
Use of data catalogs: Globus GridFTP, Replica Location Service, etc.
Logistical networking of data: IBP (buffers available across the Internet)
Unified access to data: SRB (from file systems to tapes and databases)
Limitations:
No transparency => increased complexity at large scale
No consistency guarantees for replicated data

4 Towards Transparent Access to Data
Desirable features:
Uniform access to distributed data via global identifiers
Transparent data localization and transfer
Consistency models and protocols for replicated data
Examples of systems taking this approach:
On clusters: at the memory level, DSM systems (Ivy, TreadMarks, etc.); at the file level, NFS-like systems
On grids: at the memory level, data-sharing services (JuxMem, INRIA Rennes, France); at the file level, global file systems (Gfarm, AIST/University of Tsukuba, Japan)

5 Idea: Collaborative Research on Memory- and File-Level Data Sharing
Study possible interactions between:
The JuxMem grid data-sharing service
The Gfarm global file system
Goals:
Enhance global data-sharing functionality
Improve performance and reliability
Build a memory hierarchy for global data sharing by combining the memory level and the file-system level
Approach: enhance JuxMem with persistent storage using Gfarm
Support: the DISCUSS Sakura bilateral collaboration ( ), NEGST ( )

6 JuxMem: a Grid Data-Sharing Service
Generic grid data-sharing service:
Grid-scale: nodes
Transparent data localization
Data consistency
Fault tolerance
JuxMem ~= DSM + P2P
Implementation:
Multiple replication strategies
Configurable consistency protocols
Based on JXTA 2.0
Integrated into 2 grid programming models: GridRPC (DIET, ENS Lyon) and component models (CCM & CCA)
[Figure: a JuxMem group composed of cluster groups A, B, C and a data group D]

7 JuxMem's Data Group: a Fault-Tolerant, Self-Organizing Group
GDG: Global Data Group; LDG: Local Data Group
Data availability despite failures is ensured through replication and fault-tolerant building blocks
Hierarchical self-organizing groups:
Cluster level: Local Data Group (LDG)
Grid level: Global Data Group (GDG)
Building blocks: group membership, atomic multicast, consensus, failure detectors, adaptation layer, self-organizing group
[Figure: a data group D whose GDG spans per-cluster LDGs, accessed by a client]
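To make the hierarchy concrete, here is a minimal data-structure sketch of the two replication levels; all type and field names are hypothetical and only illustrate the GDG/LDG organization described above, not JuxMem's actual (JXTA-based) internals.

    /* Illustrative sketch of the two-level replication hierarchy.
     * Hypothetical types: JuxMem organizes peers into self-organizing
     * groups rather than plain arrays. */
    #define MAX_PROVIDERS 16
    #define MAX_LDGS       8

    typedef struct {
        char provider_id[64];             /* one memory provider holding a replica */
    } Provider;

    typedef struct {                      /* cluster level: Local Data Group */
        char     cluster_name[64];
        Provider replicas[MAX_PROVIDERS]; /* copies of the data within one cluster */
        int      nb_replicas;
        int      leader;                  /* index of the LDG leader */
    } LDG;

    typedef struct {                      /* grid level: Global Data Group */
        char data_id[64];                 /* global identifier of the shared data */
        LDG  ldgs[MAX_LDGS];              /* one LDG per cluster hosting a copy */
        int  nb_ldgs;
        int  leader_ldg;                  /* LDG whose leader acts as GDG leader */
    } GDG;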

8 JuxMem: Memory Model and API
Memory model (currently): entry consistency
Explicit association of data to locks
Multiple Readers, Single Writer (MRSW): juxmem_acquire, juxmem_acquire_read, juxmem_release
Explicit lock acquire/release before/after access
API:
Allocate memory for JuxMem data: ptr = juxmem_malloc(size, #clusters, #replicas per cluster, &ID…)
Map existing JuxMem data to local memory: ptr = juxmem_mmap(ID); juxmem_unmap(ptr)
Synchronization before/after data access: juxmem_acquire(ptr), juxmem_acquire_read(ptr), juxmem_release(ptr)
Read and write data: direct access through pointers! int n = *ptr; *ptr = …
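A minimal usage sketch of this API in C follows. The prototypes are inferred from the call forms shown on this slide; the real JuxMem headers and exact signatures may differ, so treat this as an illustration rather than the library's actual interface.

    /* Hypothetical prototypes inferred from the slide; the real JuxMem
     * headers and exact signatures may differ. */
    #include <stddef.h>
    #include <stdio.h>

    extern void *juxmem_malloc(size_t size, int nb_clusters,
                               int nb_replicas_per_cluster, char *id_out);
    extern void *juxmem_mmap(const char *id);
    extern void  juxmem_unmap(void *ptr);
    extern void  juxmem_acquire(void *ptr);      /* exclusive (write) lock */
    extern void  juxmem_acquire_read(void *ptr); /* shared (read) lock, MRSW */
    extern void  juxmem_release(void *ptr);

    void writer(char *id_out)
    {
        /* Allocate a shared int, replicated on 2 clusters, 3 replicas per cluster. */
        int *ptr = juxmem_malloc(sizeof(int), 2, 3, id_out);

        juxmem_acquire(ptr);   /* enter the critical section (write access) */
        *ptr = 42;             /* direct access through the pointer */
        juxmem_release(ptr);   /* entry consistency: updates propagate on release */
    }

    void reader(const char *id)
    {
        int *ptr = juxmem_mmap(id);  /* attach to existing data by its identifier */
        juxmem_acquire_read(ptr);
        printf("shared value = %d\n", *ptr);
        juxmem_release(ptr);
        juxmem_unmap(ptr);
    }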

Gfarm: a Global File System [CCGrid 2002]
Commodity-based distributed file system that federates the storage of each site
It can be mounted from all cluster nodes and clients
It provides scalable I/O performance with respect to the number of parallel processes and users
It avoids access concentration by automatic replica selection
[Figure: the /gfarm global namespace mapped onto per-site storage, with file replica creation]

Gfarm: a Global File System (2)
Files can be shared among all nodes and clients
Physically, a file may be replicated and stored on any file system node
Applications can access it regardless of its location
File system nodes can be distributed
[Figure: a Gfarm file system spanning sites in the US and Japan - a Gfarm metadata server, compute & file-system nodes also acting as GridFTP/samba/NFS servers, and client/notebook PCs, all sharing the /gfarm namespace; files A, B, C replicated across sites]
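Because the /gfarm namespace can be mounted on cluster nodes and clients, an application can reach a Gfarm file with ordinary POSIX I/O; here is a minimal sketch (the mount-point usage matches the slide, but the file path is purely illustrative):

    /* Minimal sketch: reading a file through a mounted Gfarm namespace with
     * plain POSIX calls. The path below is an example, not a real file. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int read_from_gfarm(void)
    {
        char buf[4096];
        int fd = open("/gfarm/demo/fileA", O_RDONLY);  /* illustrative path */
        if (fd < 0) {
            perror("open");
            return -1;
        }
        ssize_t n = read(fd, buf, sizeof(buf));  /* replica selection is handled by Gfarm */
        if (n > 0)
            printf("read %zd bytes\n", n);
        close(fd);
        return 0;
    }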

11 Our Goal: Build a Memory Hierarchy for Global Data Sharing
Approach:
Applications use JuxMem's API (memory-level sharing)
Applications DO NOT use Gfarm directly
JuxMem uses Gfarm to enhance data persistence
Without Gfarm, JuxMem tolerates some crashes of memory providers thanks to the self-organizing groups
With Gfarm, persistence is further enhanced thanks to secondary storage
How does it work?
Basic principle: on each lock release, data can be flushed to Gfarm
The flush frequency can be tuned to trade off efficiency against fault tolerance
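A minimal sketch of this flush-on-release principle is given below, assuming Gfarm is mounted as a file system and using an illustrative flush_period counter (every Nth release triggers a flush); names and structure are hypothetical, not the actual JuxMem/Gfarm integration code.

    /* Illustrative flush-on-release hook: every Nth release of the lock on a
     * data item writes its contents to a file under a mounted Gfarm path.
     * All names (shared_data, flush_period, gfarm_path) are hypothetical. */
    #include <stdio.h>

    typedef struct {
        void  *data;             /* local copy held by this provider */
        size_t size;
        char   gfarm_path[256];  /* e.g. a file under the /gfarm namespace */
        int    flush_period;     /* flush every Nth release (1 = every release) */
        int    releases;         /* releases seen since the last flush */
    } shared_data;

    static int flush_to_gfarm(shared_data *d)
    {
        FILE *f = fopen(d->gfarm_path, "wb");  /* Gfarm mounted as a file system */
        if (f == NULL)
            return -1;
        size_t written = fwrite(d->data, 1, d->size, f);
        fclose(f);
        return (written == d->size) ? 0 : -1;
    }

    /* Called by the provider when the writer releases the lock on the data. */
    void on_lock_release(shared_data *d)
    {
        d->releases++;
        if (d->releases >= d->flush_period) {  /* efficiency / fault-tolerance trade-off */
            if (flush_to_gfarm(d) == 0)
                d->releases = 0;
        }
    }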

12 Step 1: A Single Flush by One Provider
One particular JuxMem provider (the GDG leader) flushes data to Gfarm
Then, other Gfarm copies can be created using Gfarm's gfrep command
[Figure: two clusters; among the JuxMem providers of the Global Data Group (GDG), only the GDG leader writes to a Gfarm storage node (gfsd)]
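As an illustration of the second point, the leader could invoke gfrep once its flush completes; the sketch below shells out to the command, but the "-N 2" option (target replica count) and the file path are assumptions to be checked against the installed Gfarm version.

    /* Illustrative follow-up to Step 1: after the GDG leader's flush, ask
     * Gfarm to create additional replicas of the flushed file via gfrep.
     * The gfrep option shown is an assumption, not verified syntax. */
    #include <stdio.h>
    #include <stdlib.h>

    int replicate_flushed_file(const char *gfarm_file)
    {
        char cmd[512];
        snprintf(cmd, sizeof(cmd), "gfrep -N 2 %s", gfarm_file);
        int rc = system(cmd);  /* e.g. request 2 replicas of the Gfarm file */
        if (rc != 0)
            fprintf(stderr, "gfrep failed (rc=%d)\n", rc);
        return rc;
    }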

13 Step 2: Parallel Flush by LDG Leaders
One particular JuxMem provider in each cluster (the LDG leader) flushes data to Gfarm (parallel copy creation, one copy per cluster)
The copies are registered as the same Gfarm file
Then, extra Gfarm copies can be created using Gfarm's gfrep command
[Figure: two clusters; the leader of each Local Data Group (LDG #1, LDG #2) writes to its local Gfarm storage node (gfsd)]

14 Step 3: Parallel Flush by All Providers
All JuxMem providers in each cluster (not only the LDG leaders) flush data to Gfarm
All copies are registered as the same Gfarm file
Useful to create multiple copies of the Gfarm file per cluster
No further replication with gfrep is needed
[Figure: two clusters; every JuxMem provider of the GDG writes to a Gfarm storage node (gfsd)]
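The three steps differ only in which providers flush on lock release; a minimal sketch of that decision is given below, with hypothetical enum and field names used purely to contrast the policies.

    /* Illustrative decision logic for Steps 1-3: which providers flush to
     * Gfarm depends on the configured policy and on this provider's role.
     * All names are hypothetical. */
    typedef enum {
        FLUSH_GDG_LEADER,    /* Step 1: a single flush by the GDG leader */
        FLUSH_LDG_LEADERS,   /* Step 2: one parallel flush per cluster (LDG leaders) */
        FLUSH_ALL_PROVIDERS  /* Step 3: every provider flushes (no gfrep needed) */
    } flush_policy;

    typedef struct {
        int is_gdg_leader;
        int is_ldg_leader;
    } provider_role;

    /* Returns nonzero if this provider should flush the data to Gfarm. */
    int should_flush(flush_policy policy, const provider_role *role)
    {
        switch (policy) {
        case FLUSH_GDG_LEADER:    return role->is_gdg_leader;
        case FLUSH_LDG_LEADERS:   return role->is_ldg_leader;
        case FLUSH_ALL_PROVIDERS: return 1;
        }
        return 0;
    }

In Steps 2 and 3, the parallel copies are registered as the same Gfarm file, so Gfarm treats them as replicas of one logical file.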

15 Deployment Issues
Application deployment on large-scale infrastructures:
Reserve resources
Configure the nodes
Manage dependencies between processes
Start processes
Monitor and clean up the nodes
Mixed deployment of Gfarm and JuxMem:
Manage dependencies between the processes of both applications
Make the JuxMem provider able to act as a Gfarm client
Approach: use a generic deployment tool, ADAGE (INRIA, Rennes, France), and design specific plugins for Gfarm and JuxMem

16 ADAGE: Automatic Deployment of Applications in a Grid Environment
IRISA/INRIA, PARIS research group
Deploys the same application on different kinds of resources, from clusters to grids
Supports multi-middleware applications: MPI+CORBA+JXTA+Gfarm...
Network topology description: latency and bandwidth hierarchy; NAT, non-IP networks; firewalls, asymmetric links
Planner as a plugin: round robin & random
Preliminary support for dynamic applications
Some successes: 29,000 JXTA peers on ~400 nodes; 4,003 components on 974 processors on 7 sites
[Figure: deployment workflow - Gfarm and JuxMem application descriptions are translated into a generic application description, which is combined with a resource description and control parameters for deployment planning, deployment plan execution and application configuration]

17 Roadmap Overview (1)
Design of the common architecture: 2006
Discussions on possible interactions between JuxMem and Gfarm: May 2006, Singapore (CCGRID 2006); June 2006, Paris (HPDC 2006 and the NEGST workshop)
October 2006: Gabriel Antoniu and Loïc Cudennec visited the Gfarm team; first deployment tests of Gfarm on Grid'5000; overall Gfarm/JuxMem design
December 2006: Osamu Tatebe visited the JuxMem team; refinement of the Gfarm/JuxMem design
Implementation of JuxMem on top of Gfarm: 2007
April 2007: Gabriel Antoniu and Loïc Cudennec visited the Gfarm team; one JuxMem provider (the GDG leader) flushes data to Gfarm after each critical section (Step 1 done); master's internship: Majd Ghareeb
December 2007: Osamu Tatebe visited the JuxMem team; joint paper at Euro-Par 2008

Read performance: Gfarm: 69 MB/s; worst case: 39 MB/s; usual case: 100 MB/s

Write performance: Gfarm: 42 MB/s; worst case: 28.5 MB/s; usual case: 89 MB/s

20 Roadmap (2)
Design the Gfarm plugin for ADAGE (April 2007):
Propose a specific application description language for Gfarm
Translate the specific description into a generic description
Start processes while respecting the dependencies
Transfer the Gfarm configuration files from the metadata server to the agents, and from the agents to their gfsd servers and clients
Deployment of JuxMem on top of Gfarm (May 2007, first prototype running on Grid'5000):
ADAGE deploys Gfarm, then JuxMem (separate deployments)
Limitations: the user still needs to indicate the Gfarm client hostname and the Gfarm configuration file location
Design a meta-plugin for ADAGE that automatically deploys a mixed Gfarm+JuxMem description (December 2007), for Gfarm v1 and v2
Work in progress (2008): fault-tolerant, distributed metadata server (Gfarm on top of JuxMem); master's internship: Andre Lage