Redundant Parallel File Transfer with Anticipative Adjustment Mechanism in Data Grids Chao-Tung Yang, Yao-Chun Chi, Chun-Pin Fu, High-Performance Computing.

Redundant Parallel File Transfer with Anticipative Adjustment Mechanism in Data Grids Chao-Tung Yang, Yao-Chun Chi, Chun-Pin Fu, High-Performance Computing Laboratory Department of Computer Science and Information Engineering Tunghai University, Taichung City, 40704, Taiwan R.O.C. Presented by 張肇烜

Outline: Introduction; Co-Allocation Mechanism; System Design And Implementation; Related Works; Experimental Results And Analyses; System Components; Conclusion

Introduction Most Data Grid applications execute simultaneously and access large numbers of data files in the Grid environment. In a Data Grid environment, large data sets are replicated and distributed to multiple servers.

Introduction (cont.) Downloading a large dataset from different replica locations may yield very different transfer rates. The download speed is limited by the bandwidth and traffic congestion on the links connecting the server and the client.

Introduction (cont.) Methods used to improve the download speed: replica selection techniques; co-allocation architecture; Anticipative Recursive-Adjustment Co-Allocation.

Co-Allocation Mechanism We use the Globus Toolkit grid middleware as the Data Grid infrastructure. One of its primary components is MDS (Monitoring and Discovery System). It also uses GridFTP for file transfer and the Replica Location Service (RLS).

Co-Allocation Mechanism (cont.)

System Design And Implementation

System Design And Implementation (cont.)

A new section of a file to be allocated is first defined:

$$SE_j = (\mathit{UnassignedFileSize} + \mathit{TotalUnfinishedFileSize}) \times \alpha, \qquad 0 < \alpha < 1$$

SEj denotes section j, such that 1 <= j <= k.
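The section-size rule above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name, the byte units, and the rounding down to a whole number of bytes are assumptions, and `alpha` stands for the scaling factor between 0 and 1 from the formula.

```python
def next_section_size(unassigned: int, total_unfinished: int, alpha: float) -> int:
    """Return the size (in bytes) of the next section SE_j.

    unassigned       -- bytes of the file not yet assigned to any server
    total_unfinished -- bytes assigned earlier but not yet transferred
    alpha            -- scaling factor, strictly between 0 and 1
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Rounding down to whole bytes is an assumption of this sketch.
    return int((unassigned + total_unfinished) * alpha)


if __name__ == "__main__":
    # 1000 bytes unassigned, 200 bytes left over from earlier sections.
    print(next_section_size(1000, 200, 0.5))
```

Because the factor is below 1, each new section covers only part of the remaining work, which leaves room to adjust later sections to the observed transfer rates.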

System Design And Implementation (cont.) In the next step, SEj is divided into several blocks and assigned to n servers.
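One plausible way to divide a section among n servers is in proportion to each server's measured bandwidth, so that all servers are expected to finish at about the same time. The slide does not state the division rule, so the proportional split, the function name `split_section`, and the handling of the rounding remainder are assumptions made for this sketch.

```python
from typing import List


def split_section(section_size: int, bandwidths: List[float]) -> List[int]:
    """Split section_size bytes among servers in proportion to bandwidth.

    bandwidths[i] is the measured transfer rate of server i (any unit,
    as long as all servers use the same one).
    """
    total = sum(bandwidths)
    if total <= 0:
        raise ValueError("at least one server must have positive bandwidth")
    shares = [int(section_size * b / total) for b in bandwidths]
    # Hand any rounding remainder to the fastest server so no byte is lost.
    fastest = max(range(len(bandwidths)), key=bandwidths.__getitem__)
    shares[fastest] += section_size - sum(shares)
    return shares


if __name__ == "__main__":
    # Three servers whose rates are in the ratio 1 : 2 : 3.
    print(split_section(600, [1.0, 2.0, 3.0]))
```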

System Design And Implementation (cont.) A faster channel finishes its assigned data blocks at the real finish time RTj, which may be later or earlier than the expected time Tj. TSi denotes the size actually transferred by server i at the real finish time RTj.
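The transcript does not reproduce the formula for the transferred size, but the leftover work that feeds the next section can be illustrated under a simple assumption: each server's unfinished size is its assigned size minus what it had actually transferred at the real finish time. The function and parameter names below are hypothetical.

```python
from typing import List


def total_unfinished(assigned: List[int], transferred: List[int]) -> int:
    """Bytes still outstanding across all servers at the real finish time.

    assigned[i]    -- bytes assigned to server i for the current section
    transferred[i] -- bytes server i had actually transferred (TS_i)
    """
    if len(assigned) != len(transferred):
        raise ValueError("assigned and transferred must have the same length")
    # A server cannot owe a negative amount, so clamp at zero.
    return sum(max(s - ts, 0) for s, ts in zip(assigned, transferred))


if __name__ == "__main__":
    # The third server finished its share; the other two are lagging.
    print(total_unfinished([100, 200, 300], [60, 150, 300]))
```

The result plays the role of the total unfinished file size that is added to the unassigned size when the next section is defined.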

System Design And Implementation (cont.)

Related Works: Brute-Force Co-Allocation; History-based Co-Allocation; Conservative Load Balancing; Aggressive Load Balancing; Co-Allocation Scheme with Duplicate Assignments (DCDA)

Experimental Results and Analyses Network variation between client and servers.

Experimental Results and Analyses (cont.) Completion time of different methods.

System Components

System Components (cont.) The portal simplifies replica management and file transfer operations. Its components: Replica Selection Service; Anticipative Recursive-Adjustment Co-Allocation; One-way Replica Consistency Service; Dynamic Maintenance Service.

Conclusion The dynamic co-allocation scheme has led to performance improvements, but it has some shortcomings. The goal is to reduce the total idle time spent waiting for the slowest server.
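To make the idle-time goal concrete, the sketch below measures how long servers sit idle while the slowest one completes. The definition used here (each server's idle time is the gap between its own finish time and the latest finish time) is an assumption for illustration, and the function name is hypothetical.

```python
from typing import List


def total_idle_time(finish_times: List[float]) -> float:
    """Sum of the time each server waits for the slowest server to finish."""
    if not finish_times:
        return 0.0
    slowest = max(finish_times)
    return sum(slowest - t for t in finish_times)


if __name__ == "__main__":
    # Finish times in seconds for three servers in one transfer round.
    print(total_idle_time([10.0, 12.0, 20.0]))
```

An allocation in which all servers finish at nearly the same moment drives this value toward zero.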