Dynamic Data Grid Replication Strategy based on Internet Hierarchy
Sang Min Park, Jai-Hoon Kim, and Young-Bae Ko
Ajou University, South Korea
GCC 2003 Presentation

Contents
- Introduction to Data Grid
- Optimizations in Data Grid
- Novel Replication Strategy based on Internet Hierarchy
- Simulation
- Simulation Results
- Conclusions

Introduction to Data Grid

Data Grid motivations
- Petabyte-scale data production
- Distributed data storage, with each site storing part of the data
- Distributed computing resources that process the data

Two most important approaches for Data Grid
- Secure, reliable, and efficient data transport protocol (e.g., GridFTP)
- Replication (e.g., replica catalog)

Replication
- Large files are partially replicated among sites
- Reduces data access time
- Application scheduling and dynamic replication issues are emerging

Introduction to Data Grid
[Figure: Typical job execution scenario]

Optimizations in Data Grid

Goal: reducing the overall job execution time

Scheduling optimization
- Deciding where to allocate the job
- Considers the location of replicas and the computational capabilities of sites

Short-term optimization
- Deciding from where to fetch replicas
- Considers the available network bandwidth between sites

Long-term optimization (dynamic replication strategy)
- Needed when a site runs short of storage
- Deciding which files should remain as replicas
- Better to keep popular files replicated, because they are likely to be used again
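
The short-term optimization above can be illustrated with a minimal sketch: among the sites that hold a replica, pick the one with the highest available bandwidth to the requesting site. The function name, the site names, and the bandwidth map are hypothetical and are not taken from the presentation or from OptorSim.

```python
def pick_replica_source(replica_sites, bandwidth_mbps, dest):
    """Pick the site to fetch a replica from (illustrative sketch only).

    replica_sites  -- iterable of site names holding a replica (assumed input)
    bandwidth_mbps -- dict mapping (source, dest) to available bandwidth in Mbps
    dest           -- name of the site that needs the file
    """
    sites = list(replica_sites)
    if not sites:
        raise ValueError("no replica available")
    # A local replica needs no network transfer at all.
    if dest in sites:
        return dest
    # Otherwise prefer the source with the most available bandwidth to dest.
    return max(sites, key=lambda s: bandwidth_mbps[(s, dest)])


# Hypothetical example: two remote replicas, one on a faster link.
links = {("siteB", "siteA"): 1000, ("siteC", "siteA"): 100}
source = pick_replica_source(["siteB", "siteC"], links, "siteA")
```

Available bandwidth is treated here as a known number; in a real grid it would have to be measured or estimated.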

Existing Dynamic Replication Strategies

Replica optimization based on site-level locality
- Replicate the files that are predicted to be used in the future from the perspective of a site
- Try to reduce the number of fetches
- Delete Oldest and Delete LRU methods

Economic strategy from the European Data Grid
- Developing OptorSim, a Data Grid optimization simulator
- Uses an auction protocol to trigger long-term optimization
- Site-level locality based on file access patterns
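
As a rough illustration of the Delete LRU idea, the sketch below models one site's storage as a fixed number of file slots and evicts the least recently used file on a miss. The class name and the slot-based capacity are assumptions made for this example; this is not how OptorSim implements the strategy.

```python
from collections import OrderedDict


class LruSiteStorage:
    """Toy model of one site's replica storage with LRU eviction."""

    def __init__(self, capacity_files):
        if capacity_files < 1:
            raise ValueError("capacity must be at least one file")
        self.capacity_files = capacity_files
        self.files = OrderedDict()  # least recently used entry comes first

    def access(self, name):
        """Return True on a local hit, False if the file had to be fetched."""
        if name in self.files:
            self.files.move_to_end(name)  # mark as most recently used
            return True
        if len(self.files) >= self.capacity_files:
            self.files.popitem(last=False)  # evict the least recently used file
        self.files[name] = True  # store the newly fetched replica
        return False


storage = LruSiteStorage(capacity_files=2)
hits = [storage.access(f) for f in ["a", "b", "a", "c"]]
# "b" is evicted when "c" arrives, because "a" was used more recently.
```

Delete Oldest would differ only in skipping the `move_to_end` call, so that eviction order follows arrival time rather than last use.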

Existing Dynamic Replication Strategies

Limitations of site-level optimization
- A site's storage is limited, so the degree of request locality it can capture is also limited
- Site-level strategies rely on predictable file access patterns, but we do not know whether such patterns will exist

Replication Strategy based on Bandwidth Hierarchy (BHR)

Network-level locality
- A site is not the only possible source of locality
- Another source of locality: network-level locality
- If the replica is located at a nearby site, fetching it does not take long

[Figure labels: Fast replica transmission; Slow replica transmission; Network region (e.g., a country)]
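
A back-of-the-envelope calculation shows why network-level locality matters. The sketch below estimates the transfer time of a 1 GB file (the file size used in the simulation) over two hypothetical link speeds. It assumes decimal units (1 GB = 8000 megabits) and ignores latency, protocol overhead, and contention, so the numbers are only indicative.

```python
def transfer_seconds(size_gb, bandwidth_mbps):
    """Idealized transfer time: file size in megabits divided by link speed."""
    size_megabits = size_gb * 8000  # assumes 1 GB = 1000 MB = 8000 Mb
    return size_megabits / bandwidth_mbps


# Hypothetical link speeds for a fast nearby link and a slower distant one.
fast = transfer_seconds(1, 1000)  # 8.0 seconds
slow = transfer_seconds(1, 100)   # 80.0 seconds
```

Under these assumptions, a replica behind a link ten times slower takes ten times as long to fetch, which is the gap BHR tries to avoid by keeping needed files somewhere inside the region.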

Replication Strategy based on Bandwidth Hierarchy (BHR)
[Figure: Bandwidth hierarchy]

Replication Strategy based on Bandwidth Hierarchy (BHR)

Maximizing network-level locality
1. Avoiding replica duplication in a region
2. Considering popularity of file requests at the region level

[Figure: sites in a region holding replicas X and A. Callouts: "Receiving new replica"; "No space here! We should remove some file"; "Delete this one!"; "Replica X is duplicated here!"]
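
One way to read the two rules above is as a victim-selection policy for a full site: first look for files that another site in the same region already holds, then use region-level request counts to break the choice. The sketch below follows that reading. The function name, the data layout, and the exact ordering of the two rules are assumptions made for illustration, not the authors' implementation.

```python
def choose_victim(site, region_files, region_requests):
    """Pick a file to delete at a full site (illustrative reading of BHR).

    site            -- name of the site that needs space
    region_files    -- dict: site name -> set of files stored there (one region)
    region_requests -- dict: file name -> request count across the whole region
    """
    local = region_files[site]
    if not local:
        return None
    # Files held by any other site in the same region.
    elsewhere = set().union(
        *(files for name, files in region_files.items() if name != site)
    )
    # Rule 1: prefer deleting a file that is duplicated within the region.
    duplicated = sorted(f for f in local if f in elsewhere)
    candidates = duplicated if duplicated else sorted(local)
    # Rule 2: among candidates, delete the least requested file in the region.
    return min(candidates, key=lambda f: region_requests.get(f, 0))


# Hypothetical region: X is stored at both sites, Y only at siteA.
region = {"siteA": {"X", "Y"}, "siteB": {"X"}}
requests = {"X": 10, "Y": 1}
victim = choose_victim("siteA", region, requests)
```

In this example X is chosen even though it is requested more often than Y, because a copy of X remains reachable over fast intra-region links.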

Simulation

OptorSim
- Data Grid dynamic replication simulation tool
- Developed as part of the European Data Grid project
- Implemented in Java
- We implemented our own region-based optimizer in OptorSim

Simulation
[Figure: Simulation environment]

Simulation

General configuration of parameters

Parameter                          Value
Number of jobs                     1000
Number of job types                50
Number of files accessed per job   15
Size of single file                1 GB
Total size of files                750 GB

Bandwidth and storage size

Parameter                          Value
Intra-region bandwidth             1000 Mbps
Inter-region bandwidth             1000 Mbps
Master-router bandwidth            2000 Mbps
Storage space at site              50 GB

Simulation Results
[Figure: Total job times of three strategies]

Simulation Results
[Figure: Total job time with varying bandwidth and storage size]

Conclusions
- Existing dynamic replication strategies are based only on site-level locality of file requests
- The BHR strategy is based on network-level locality
- BHR performs well when the bandwidth hierarchy is clearly pronounced and the storage at a site is small
- We extend the current study of site-level replica optimization in a more scalable direction

References
- William H. Bell, David G. Cameron, Luigi Capozza, A. Paul Millar, Kurt Stockinger, and Floriano Zini. Simulation of Dynamic Grid Replication Strategies in OptorSim. In Proc. of the 3rd Int'l. IEEE Workshop on Grid Computing (Grid'2002), Baltimore, USA, November. Springer Verlag, Lecture Notes in Computer Science.
- William H. Bell, David G. Cameron, Ruben Carvajal-Schiaffino, A. Paul Millar, Kurt Stockinger, and Floriano Zini. Evaluation of an Economy-Based File Replication Strategy for a Data Grid. In International Workshop on Agent based Cluster and Grid Computing at CCGrid 2003, Tokyo, Japan, May. IEEE Computer Society Press.
- Mark Carman, Floriano Zini, Luciano Serafini, and Kurt Stockinger. Towards an Economy-Based Optimisation of File Access and Replication on a Data Grid. In International Workshop on Agent based Cluster and Grid Computing at International Symposium on Cluster Computing and the Grid (CCGrid'2002), Berlin, Germany, May. IEEE Computer Society Press.
- Ann Chervenak, Ian Foster, Carl Kesselman, Charles Salisbury, and Steven Tuecke. The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets. Journal of Network and Computer Applications, 23:
- EU Data Grid Project:

References (continued)
- I. Foster, C. Kesselman, and S. Tuecke. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International J. Supercomputer Applications, 15(3).
- Wolfgang Hoschek, Javier Jaen-Martinez, Asad Samar, Heinz Stockinger, and Kurt Stockinger. Data Management in an International Data Grid Project. 1st IEEE/ACM International Workshop on Grid Computing (Grid'2000), Bangalore, India, Dec.
- OptorSim – A Replica Optimizer Simulation: wp2/optimization/optorsim.html
- Sang-Min Park and Jai-Hoon Kim. Chameleon: A Resource Scheduler in a Data Grid Environment. IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'2003), Tokyo, Japan, May. IEEE Computer Society Press.
- Kavitha Ranganathan and Ian Foster. Design and Evaluation of Dynamic Replication Strategies for a High Performance Data Grid. International Conference on Computing in High Energy and Nuclear Physics, Beijing, September.
- Kavitha Ranganathan and Ian Foster. Identifying Dynamic Replication Strategies for a High Performance Data Grid. International Workshop on Grid Computing, Denver, November 2001.