A Web-based Parallel File Transferring System on Grid and Cloud Environments
Chao-Tung Yang, Yu-Hsiang Lo, Lung-Teng Chen
Department of Computer Science, Tunghai University, Taichung 40704, Taiwan
Chairman: Shih-Chung Chen
Presenter: Hoe Jing Tey
Advisor: Dr. Yen-Ting Chen
Date: 2015.05.06
Outline
Introduction
Background Review
Algorithm Review
Technologies Used
Experiments
Conclusion
References
Introduction
A Data Grid supports the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources.
Introduction
The Internet is usually the underlying network of a grid. In this paradigm, several challenges need to be solved:
Reduce differences in finish times among selected replica servers
Avoid traffic congestion
Manage network performance variations among parallel transfers
MIFAS
A tool with a user-friendly web-based transfer interface
Implemented to manage file replications on a data grid environment in parallel
Integrates nine kinds of co-allocation algorithms
Also supports hospital PACS and a DICOM image viewer via the Hospital Information System (HIS), for viewing patient CT, MRI, and other images
Background Review
Replica Management
Replica Selection
Globus Toolkit and GridFTP
Java Commodity Grid Kit (Java CoG Kit)
Co-Allocation Scheme
Replica Management
The process of keeping track of where portions of a data set can be found, and of creating and deleting replicas at storage sites
Replicas are created only to harness certain performance benefits
A replica manager typically maintains a replica catalog containing replica site addresses and file instances
Replica Selection
The process of choosing a replica from among those spread across the Grid, based on characteristics specified by the application
Commonly consists of three steps:
Data preparation
Preprocessing
Prediction
Globus Toolkit and GridFTP
We used the Globus Toolkit grid middleware as the Data Grid infrastructure
It provides solutions for:
Security
Resource management
Data management
Information services, via the Monitoring and Discovery System (MDS)
Java Commodity Grid Kit (Java CoG Kit)
Provides access to Grid services through Java via a higher-level framework
Provides a framework for utilizing the many Globus services that are part of the Globus Toolkit
Combines Java technology with Grid computing to develop advanced Grid services and to access basic Globus resources: GSI, GridFTP, MyProxy, and GRAM
Co-Allocation Scheme
Algorithm Review
Brute-Force Co-allocation
History-Based Co-allocation
Conservative Load Balancing Co-allocation
Aggressive Load Balancing Co-allocation
DCDA Co-allocation
DAS Co-allocation
RAM Co-allocation
ARAM Co-allocation
ARAM+ Co-allocation
Brute-Force Co-allocation
Divides the file into equally sized blocks, one per selected replica server
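A minimal sketch of the equal-split idea; the class and method names are illustrative, not taken from the MIFAS code base.

```java
// Brute-force partitioning: every server receives the same share of the
// file, regardless of its bandwidth.
public final class BruteForceCoallocation {
    /** Returns the number of bytes assigned to each of n servers. */
    public static long[] partition(long fileSize, int n) {
        long[] blocks = new long[n];
        java.util.Arrays.fill(blocks, fileSize / n);
        blocks[n - 1] += fileSize % n; // last server takes the remainder
        return blocks;
    }
}
```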
History-Based Co-allocation
Keeps the block size assigned to each flow proportional to that server's historical transfer rate
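A sketch of the proportional split, assuming each server's historical rate is available (e.g., in bytes per second from earlier transfers); names are illustrative.

```java
// History-based partitioning: a server's share of the file is its
// historical transfer rate divided by the sum of all rates.
public final class HistoryBasedCoallocation {
    public static long[] partition(long fileSize, double[] histRates) {
        double total = 0;
        for (double r : histRates) total += r;
        int n = histRates.length;
        long[] blocks = new long[n];
        long assigned = 0;
        for (int i = 0; i < n; i++) {
            blocks[i] = (i == n - 1) ? fileSize - assigned // absorb rounding
                                     : (long) (fileSize * histRates[i] / total);
            assigned += blocks[i];
        }
        return blocks;
    }
}
```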
Conservative Load Balancing Co-allocation
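The slide presents this scheme only as a figure. In the co-allocation work the deck cites (Vazhkudai 2003), conservative load balancing splits the file into many small equal blocks that servers pull one at a time, so faster servers naturally deliver more of them. A minimal sketch under that reading; all names and the transfer call are illustrative.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Conservative load balancing (sketch): a shared queue of small equal
// blocks; each server thread pulls the next block as soon as it finishes
// the previous one, so faster servers deliver more blocks.
public final class ConservativeLoadBalancing {
    public static void download(String[] servers, int totalBlocks) throws InterruptedException {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int b = 0; b < totalBlocks; b++) queue.add(b);

        Thread[] workers = new Thread[servers.length];
        for (int i = 0; i < servers.length; i++) {
            final String server = servers[i];
            workers[i] = new Thread(() -> {
                Integer block;
                while ((block = queue.poll()) != null) {
                    fetchBlock(server, block); // hypothetical ranged-transfer call
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
    }

    private static void fetchBlock(String server, int block) {
        // placeholder: issue a ranged GridFTP/HTTP request for this block
    }
}
```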
Aggressive Load Balancing Co-allocation
Adds functions that change the block size during delivery, using two methods
DCDA Co-allocation
Dynamic Co-allocation with Duplicate Assignment, based on an algorithm that uses a circular queue
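A hedged sketch of the circular-queue idea: blocks are handed out from a queue, and when the queue empties, still-unfinished blocks are assigned again (duplicated) to idle servers so a single slow server cannot stall the transfer. The duplicate-on-idle detail is our reading of the scheme, not text from the slide.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// DCDA sketch: blocks circulate through a queue; once it is empty,
// unfinished blocks are re-enqueued so idle servers duplicate them.
public final class DcdaCoallocation {
    private final int totalBlocks;
    private final Queue<Integer> queue = new ArrayDeque<>();
    private final Set<Integer> finished = new HashSet<>();

    public DcdaCoallocation(int totalBlocks) {
        this.totalBlocks = totalBlocks;
        for (int b = 0; b < totalBlocks; b++) queue.add(b);
    }

    /** Called by an idle server; returns the next block to fetch, or -1 when all are done. */
    public synchronized int nextBlock() {
        if (queue.isEmpty()) {
            if (finished.size() == totalBlocks) return -1;
            for (int b = 0; b < totalBlocks; b++)
                if (!finished.contains(b)) queue.add(b); // duplicate unfinished blocks
        }
        return queue.poll();
    }

    public synchronized void markFinished(int block) { finished.add(block); }
}
```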
DAS Co-allocation
Dynamic Adjustment Strategy: enables clients to download data from multiple locations by establishing multiple connections in parallel
A new data transfer strategy with three phases (sketched below):
Initial phase
Steady phase
Completion phase
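The slide names the three phases without defining them. The schematic below attaches one plausible behavior to each (probe with small blocks, track measured rates, stop feeding slow servers near the end); those behaviors and thresholds are assumptions on our part, not details from the slide.

```java
// Three-phase transfer loop (schematic only). Thresholds and per-phase
// behavior are assumptions about a typical dynamic adjustment strategy.
enum Phase { INITIAL, STEADY, COMPLETION }

final class DasPhases {
    static Phase phaseFor(long deliveredBytes, long fileSize) {
        double progress = (double) deliveredBytes / fileSize;
        if (progress < 0.10) return Phase.INITIAL;    // small probe blocks to measure rates
        if (progress < 0.90) return Phase.STEADY;     // block sizes track measured rates
        return Phase.COMPLETION;                      // avoid queuing new work on slow servers
    }
}
```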
RAM Co-allocation
Recursively-Adjusting Mechanism: continuously adjusts each replica server's workload to match its real-time bandwidth during file transfers
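A minimal sketch of the adjusting loop: each round assigns a slice of the not-yet-assigned bytes in proportion to the bandwidth measured in the previous round. The slice fraction and all names are illustrative assumptions.

```java
// Recursively-adjusting allocation (sketch): each round splits a slice of
// the remaining bytes in proportion to the latest bandwidth readings.
public final class RamCoallocation {
    public static void transfer(long fileSize, double[] measuredBw, double sliceFraction) {
        int n = measuredBw.length;
        long remaining = fileSize;
        while (remaining > 0) {
            long slice = Math.min(remaining, Math.max(1, (long) (remaining * sliceFraction)));
            double total = 0;
            for (double bw : measuredBw) total += bw;
            long assigned = 0;
            for (int i = 0; i < n; i++) {
                long share = (i == n - 1) ? slice - assigned // absorb rounding
                                          : (long) (slice * measuredBw[i] / total);
                assigned += share;
                // issue a ranged request for `share` bytes to server i (omitted)
            }
            remaining -= slice;
            // measuredBw would be refreshed here from the observed throughput
        }
    }
}
```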
ARAM Co-allocation
Anticipative Recursively-Adjusting Mechanism: assigns transfer requests to selected replica servers according to the finish rates of previous transfers, and adjusts workloads on those servers according to anticipated bandwidth status
The goal is for the expected finish times of all servers to be the same
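Equal expected finish times lead to the same proportional split, but with each bandwidth estimate weighted by how well the server met its previous deadline. A sketch under that reading; the weighting formula is our assumption, not taken from the slide.

```java
// ARAM sketch: weight each server's bandwidth by its finish rate (actual
// vs. expected completion of the previous round), then split the next
// slice so that the expected finish times come out equal.
public final class AramCoallocation {
    public static long[] nextRound(long slice, double[] bw, double[] finishRate) {
        int n = bw.length;
        double[] anticipated = new double[n];
        double total = 0;
        for (int i = 0; i < n; i++) {
            anticipated[i] = bw[i] * finishRate[i]; // penalize servers that ran late
            total += anticipated[i];
        }
        long[] shares = new long[n];
        long assigned = 0;
        for (int i = 0; i < n; i++) {
            shares[i] = (i == n - 1) ? slice - assigned
                                     : (long) (slice * anticipated[i] / total);
            assigned += shares[i];
        }
        return shares; // shares[i] / anticipated[i] is roughly equal across servers
    }
}
```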
ARAM+ Co-allocation
Anticipative Recursively-Adjusting Mechanism Plus: adds a TCP Bandwidth Estimation Model (TCPBEM) and a Burst Mode (BM)
Adjusts the workloads on selected replica servers by measuring actual bandwidth performance
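The slide does not define TCPBEM. A common way to estimate TCP throughput from link measurements is the Mathis et al. model, throughput ≈ (MSS/RTT) × (C/√p); the sketch below uses that well-known formula as a stand-in, which is an assumption on our part rather than necessarily the paper's model.

```java
// Stand-in TCP bandwidth estimator using the Mathis et al. model:
//   throughput <= (MSS / RTT) * (C / sqrt(lossRate)), with C ~ 0.93.
// Whether TCPBEM uses exactly this formula is an assumption.
public final class TcpBandwidthEstimate {
    private static final double C = 0.93;

    /** mss in bytes, rttSeconds in seconds, lossRate in (0, 1]; returns bytes/s. */
    public static double estimate(int mss, double rttSeconds, double lossRate) {
        return (mss / rttSeconds) * (C / Math.sqrt(lossRate));
    }

    public static void main(String[] args) {
        // e.g., 1460-byte MSS, 20 ms RTT, 0.1% loss -> about 2.1 MB/s
        System.out.printf("%.0f bytes/s%n", estimate(1460, 0.020, 0.001));
    }
}
```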
Technologies Used
JSP and Servlets
MIFAS Components
Hadoop Distributed File System
MIFAS Transaction
Ganglia
Log4J
JSP and Servlets
A component-based, platform-independent method for building web-based applications
Servlets can access the entire family of Java APIs and a library of HTTP-specific calls, and receive all the benefits of the mature Java language
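A minimal servlet, for orientation only; it is not code from MIFAS. It uses the classic javax.servlet API that web containers of that era shipped with.

```java
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal HTTP servlet: answers GET with a plain-text status line.
public class StatusServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.setContentType("text/plain");
        resp.getWriter().println("MIFAS transfer service is up");
    }
}
```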
MIFAS Components
Measures the timing of each step of a file transfer to gauge performance
Built with JSP technology
Accesses the Hadoop Distributed File System (HDFS) via FTP
Hadoop Distributed File System
Stores large files across multiple machines
Achieves reliability by replicating data across nodes
Serves data over HTTP, allowing access to all content from a web browser or other clients
Data nodes can talk to each other to rebalance data, move copies around, and keep replication high
A name node tracks which data nodes hold each block
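A short example of reading a file through the standard HDFS Java client API; the namenode URI and file path are placeholders, not MIFAS values.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Read a file from HDFS through the standard Java client API.
public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(fs.open(new Path("/data/sample.txt"))))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```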
MIFAS Transaction
Ganglia
A distributed monitoring system for high-performance computing systems such as clusters and Grids
Based on a hierarchical design targeted at federations of clusters
Ganglia Monitor
Log4J
Used to measure data transfer speed and transaction time in MIFAS
Enables logging at runtime without modifying the application binary
Logging behavior is controlled by editing a configuration file, again without touching the application binary
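A small illustration of configuration-driven timing with the Log4J 1.x API (current when this work was done); the class name and message format are ours, not from MIFAS.

```java
import org.apache.log4j.Logger;

// Timing a transfer step with Log4J 1.x. Which levels and appenders fire
// is decided entirely by log4j.properties, not by this code, e.g.:
//   log4j.rootLogger=INFO, stdout
//   log4j.appender.stdout=org.apache.log4j.ConsoleAppender
//   log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
//   log4j.appender.stdout.layout.ConversionPattern=%d %p %c - %m%n
public class TransferTimer {
    private static final Logger log = Logger.getLogger(TransferTimer.class);

    public void downloadBlock(String server, int block) {
        long start = System.currentTimeMillis();
        // ... perform the block transfer (omitted) ...
        long elapsed = System.currentTimeMillis() - start;
        log.info("block " + block + " from " + server + " took " + elapsed + " ms");
    }
}
```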
Experiments
Experimental Environment
Experimental Results
Experimental Environment
MIFAS on a Hadoop environment
Distributed over 10 clusters located at two educational institutions and one hospital: THU, CSMU, and CSH
Globus 4.2.1
Experimental Results
We first measured the network bandwidth among the three institutes: THU, CSH, and CSMU
Three experiments were conducted to measure the download speed (time) and performance of MIFAS on the Hadoop system:
All co-allocation schemes with different file sizes (10 to 1500 MB) under normally fluctuating network bandwidth
All co-allocation schemes with different file sizes (10 to 1500 MB) under unstable (broken-line) network bandwidth
Average download speed and performance of each MIFAS Hadoop co-allocation scheme with different file sizes (10 to 1000 MB)
First Case
Second Case
Third Case
Conclusion
Implemented a web system called MIFAS to manage files and a PACS viewer
Used the Hadoop Distributed File System (HDFS) as our back-end cloud file storage manager
Implemented nine kinds of file co-allocation schemes across three institutes
Deployed at each institute to test our co-allocation schemes
References
A. Chervenak, E. Deelman, I. Foster, L. Guy, W. Hoschek, A. Iamnitchi, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwarz, H. Stockinger, K. Stockinger, and B. Tierney, "Giggle: A Framework for Constructing Scalable Replica Location Services," Proceedings of SC2002, pp. 1-17, 2002.
A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets," Journal of Network and Computer Applications, 23(3), 2000.
K. Czajkowski, S. Fitzgerald, I. Foster, and C. Kesselman, "Grid Information Services for Distributed Resource Sharing," Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), pp. 181-194, August 2001.
Global Grid Forum, http://www.ggf.org/.
IBM Redbooks, Introduction to Grid Computing with Globus, IBM Press, www.redbooks.ibm.com/redbooks/pdfs/sg246895.pdf.
S. Vazhkudai, "Enabling the Co-Allocation of Grid Data Transfers," Proceedings of the Fourth International Workshop on Grid Computing, pp. 44-51, November 2003.
S. Vazhkudai, S. Tuecke, and I. Foster, "Replica Selection in the Globus Data Grid," Proceedings of the 1st International Symposium on Cluster Computing and the Grid (CCGRID 2001), pp. 106-113, May 2001.
C.T. Yang, Y.C. Chi, T.F. Han, and C.H. Hsu, "Redundant Parallel File Transfer with Anticipative Recursively-Adjusting Scheme in Data Grids," ICA3PP 2007, pp. 242-253, 2007.