A Web-based Parallel File Transferring System on Grid and Cloud Environments


1 A Web-based Parallel File Transferring System on Grid and Cloud Environments
Chao-Tung Yang, Yu-Hsiang Lo, Lung-Teng Chen
Department of Computer Science, Tunghai University, Taichung, 40704, Taiwan
Chairman: Shih-Chung Chen
Presenter: Hoe Jing Tey
Advisor: Dr. Yen-Ting Chen
Date: 2015.05.06

2 Outline
- Introduction
- Background Review
- Algorithm Review
- Technologies Used
- Experiment
- Conclusion
- References

3 Introduction
- A Data Grid enables the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources

4 Introduction
- The Internet is usually the underlying network of a grid
- In this paradigm, several challenges need to be solved:
  - Reduce differences in finish times between selected replica servers
  - Avoid traffic congestion
  - Manage network performance variations among parallel transfers

5 MIFAS
- A tool with a user-friendly Web Transformer Interface
- Implemented and used to manage file replications in a data grid environment in parallel
- Integrates nine co-allocation algorithms
- Also supports hospital PACS and a DICOM image viewer
  - "Hospital Information System" (HIS)
  - View patients' CT, MRI, and other images

6 Background Review
- Replica Management
- Replica Selection
- Globus Toolkit and GridFTP
- Java Commodity Grid Kit (Java CoG Kit)
- Co-Allocation Scheme

7 Replica Management
- The process of keeping track of where portions of the data set can be found
- The process of creating and deleting replicas at a storage site
- Replicas are created only to harness a certain performance benefit
- A replica manager typically maintains a replica catalog containing replica site addresses and the file instances
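The replica catalog described on this slide can be pictured as a mapping from a logical file name to the sites that hold a copy. The following is a minimal sketch under that assumption only; the class name `ReplicaCatalog`, its methods, and the example addresses are hypothetical and are not taken from MIFAS or the Globus Toolkit.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Hypothetical catalog: logical file name -> set of replica site addresses."""

    def __init__(self):
        self._sites = defaultdict(set)

    def register(self, logical_name, site_address):
        # Record that a replica of the file was created at this site.
        self._sites[logical_name].add(site_address)

    def unregister(self, logical_name, site_address):
        # Record that the replica at this site was deleted.
        self._sites[logical_name].discard(site_address)
        if not self._sites[logical_name]:
            del self._sites[logical_name]

    def lookup(self, logical_name):
        # Return every known site holding the file, in a stable order.
        return sorted(self._sites.get(logical_name, ()))


catalog = ReplicaCatalog()
catalog.register("scan-001.dcm", "gsiftp://site-b.example/data")
catalog.register("scan-001.dcm", "gsiftp://site-a.example/data")
print(catalog.lookup("scan-001.dcm"))
```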

8 Replica Selection
- The process of choosing a replica from among those spread across the Grid
- Based on characteristics specified by the application
- Commonly consists of three steps:
  - Data preparation
  - Preprocessing
  - Prediction
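The three steps on this slide can be sketched as a small function. This is an illustrative assumption, not the selection method used in the presentation: data preparation is taken to mean a per-site history of observed transfer rates, preprocessing drops unusable samples, and prediction is a plain mean over the most recent samples. The function name, the `window` parameter, and the site labels are hypothetical.

```python
from statistics import mean


def select_replica(history, window=3):
    """Pick the site with the highest predicted transfer rate.

    history: dict mapping site name -> list of past transfer rates
             (the prepared data; any consistent unit works).
    """
    predictions = {}
    for site, samples in history.items():
        # Preprocessing: discard failed or empty measurements.
        valid = [rate for rate in samples if rate > 0]
        if valid:
            # Prediction: average of the most recent `window` samples.
            predictions[site] = mean(valid[-window:])
    if not predictions:
        raise ValueError("no usable bandwidth history")
    return max(predictions, key=predictions.get)


history = {"THU": [10, 12, 11], "CSMU": [30, 0, 28], "CSH": []}
print(select_replica(history))
```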

9 Globus Toolkit and GridFTP
- The Globus Toolkit grid middleware is used as the Data Grid infrastructure
- It provides solutions for:
  - Security
  - Resource management
  - Data management
  - Information services
- Monitoring and Discovery System (MDS)

10 Java Commodity Grid Kit (Java CoG Kit)
- Provides access to Grid services through Java via a higher-level framework
- Provides a framework for utilizing the many Globus services that are part of the Globus Toolkit
- Combines Java technology with Grid computing to develop advanced Grid services and provide access to basic Globus resources
- Java CoG Kit components:
  - GSI
  - GridFTP
  - MyProxy
  - GRAM

11 Co-Allocation Scheme

12 Algorithm Review
- Brute-Force Co-allocation
- History-Based Co-allocation
- Conservative Load Balancing Co-allocation
- Aggressive Load Balancing Co-allocation
- DCDA Co-allocation
- DAS Co-allocation
- RAM Co-allocation
- ARAM Co-allocation
- ARAM+ Co-allocation

13 Brute-Force Co-allocation
- Divides the file equally among the selected replica servers
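A minimal sketch of the equal split, assuming the file is addressed by byte offset and that any remainder bytes go to the first servers. The function name and the (offset, length) return shape are illustrative choices, not part of the presentation.

```python
def brute_force_split(file_size, num_servers):
    """Split file_size bytes into num_servers near-equal (offset, length) ranges."""
    base, extra = divmod(file_size, num_servers)
    ranges = []
    offset = 0
    for i in range(num_servers):
        # The first `extra` servers take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


print(brute_force_split(10, 3))  # [(0, 4), (4, 3), (7, 3)]
```

Because the split ignores server speed, the transfer finishes only when the slowest server delivers its share, which is the weakness the later schemes address.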

14 History-Based Co-allocation
- Keeps block sizes per flow proportional to transfer rates
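The proportional rule can be sketched as follows, assuming the transfer rates come from previously observed transfers and that workloads are counted in whole blocks. The function name, the rounding policy (leftover blocks go to the fastest server), and the server labels are assumptions made for illustration.

```python
def history_based_split(total_blocks, past_rates):
    """Assign blocks to servers in proportion to their past transfer rates.

    past_rates: dict mapping server name -> previously observed rate.
    """
    total_rate = sum(past_rates.values())
    shares = {
        server: int(total_blocks * rate / total_rate)
        for server, rate in past_rates.items()
    }
    # Rounding down can leave a few blocks unassigned; give them to the
    # fastest server (an arbitrary but simple tie-breaking choice).
    fastest = max(past_rates, key=past_rates.get)
    shares[fastest] += total_blocks - sum(shares.values())
    return shares


print(history_based_split(100, {"a": 30, "b": 10}))  # {'a': 75, 'b': 25}
```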

15 Conservative Load Balancing Co-allocation

16 Aggressive Load Balancing Co-allocation
- Adds functions that change the block size of deliveries, using two methods

17 DCDA Co-allocation
- Dynamic Co-allocation with Duplicate Assignment
- Based on an algorithm that uses a circular queue
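One way to picture a circular queue with duplicate assignment is sketched below. The slide gives no algorithmic detail, so the behavior here is an assumption: an undelivered block stays in the rotation until some server reports it complete, which means an idle server can be handed a block that a slower server is still transferring. The class and method names are hypothetical.

```python
from collections import deque


class DuplicateAssignmentQueue:
    """Hypothetical circular queue of undelivered block indices."""

    def __init__(self, num_blocks):
        self.pending = deque(range(num_blocks))
        self.done = set()

    def next_block(self):
        """Return the next undelivered block, or None when all are delivered."""
        while self.pending:
            block = self.pending.popleft()
            if block in self.done:
                continue  # delivered in the meantime; drop it from the queue
            # Rotate the block to the back: it stays assignable (possibly as a
            # duplicate) until a server reports it complete.
            self.pending.append(block)
            return block
        return None

    def complete(self, block):
        self.done.add(block)


queue = DuplicateAssignmentQueue(3)
first_pass = [queue.next_block() for _ in range(3)]  # one block per server
queue.complete(0)
queue.complete(1)
print(first_pass, queue.next_block())  # block 2 is handed out a second time
```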

18 DAS Co-allocation
- Dynamic Adjustment Strategy
- Enables clients to download data from multiple locations by establishing multiple connections in parallel
- New data transfer strategy with three phases:
  - Initial phase
  - Steady phase
  - Completion phase

19 RAM Co-allocation
- Recursively-Adjusting Mechanism
- Continuously adjusts each replica server's workload to correspond to its real-time bandwidth during file transfers
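The adjustment loop can be sketched as below. The slide only states that workloads track real-time bandwidth; the round structure, the `alpha` fraction of remaining data released per round, and the function name are assumptions introduced for illustration.

```python
def recursively_adjust(file_size, bandwidth_rounds, alpha=0.5):
    """Plan per-round workloads from per-round bandwidth measurements.

    bandwidth_rounds: list of dicts (one per round), server -> measured bandwidth.
    alpha: assumed fraction of the remaining data released in each round;
           the final round takes whatever is left.
    """
    remaining = float(file_size)
    plan = []
    for index, bandwidths in enumerate(bandwidth_rounds):
        is_last = index == len(bandwidth_rounds) - 1
        section = remaining if is_last else remaining * alpha
        total = sum(bandwidths.values())
        # Split this round's section in proportion to current bandwidth.
        plan.append({s: section * b / total for s, b in bandwidths.items()})
        remaining -= section
    return plan


rounds = [{"a": 1, "b": 1}, {"a": 3, "b": 1}]
print(recursively_adjust(100, rounds))
```

In this example the second round shifts work toward server "a" because its measured bandwidth tripled relative to "b".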

20 ARAM Co-allocation
- Anticipative Recursively-Adjusting Mechanism
- Assigns transfer requests to selected replica servers according to the finish rates of previous transfers, and adjusts workloads on selected replica servers according to anticipated bandwidth statuses
- The goal is for the expected finish times of all servers to be the same
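The equal-finish-time goal has a simple closed form if each server i is assumed to have an anticipated bandwidth B_i and an unfinished backlog r_i from the previous round. With S units of new data, a common finish time T = (S + sum of r_i) / (sum of B_i) gives each server the workload T * B_i - r_i. The sketch below applies that arithmetic; the function name and the backlog model are assumptions, not the exact ARAM formulation.

```python
def equalize_finish_times(new_data, anticipated_bw, backlog):
    """Return per-server workloads that make all expected finish times equal.

    anticipated_bw: dict server -> anticipated bandwidth (data units per second).
    backlog: dict server -> data still unfinished from the previous round.
    Both dicts are assumed to use the same server names.
    """
    total_work = new_data + sum(backlog.values())
    finish_time = total_work / sum(anticipated_bw.values())
    # A server with a large backlog can receive a negative value here, meaning
    # it should get no new work; a fuller version would clamp and rebalance.
    return {
        server: finish_time * bw - backlog.get(server, 0)
        for server, bw in anticipated_bw.items()
    }


shares = equalize_finish_times(100, {"a": 10, "b": 30}, {"a": 20, "b": 0})
print(shares)  # {'a': 10.0, 'b': 90.0}
```

Both servers then finish after 3 seconds: (20 + 10) / 10 for "a" and 90 / 30 for "b".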

22 ARAM+ Co-allocation
- Anticipative Recursively-Adjusting Mechanism Plus
- Adds a TCP Bandwidth Estimation Model (TCPBEM) and a Burst Mode (BM)
- Adjusts the workloads on selected replica servers by measuring actual bandwidth performance

24 Technologies Used
- JSP and Servlet
- MIFAS component
- Hadoop Distributed File System
- MIFAS Transaction
- Ganglia
- Log4J

25 JSP and Servlet
- A component-based, platform-independent method for building Web-based applications
- Access to the entire family of Java APIs
- Access to a library of HTTP-specific calls, with all the benefits of the mature Java language

26 MIFAS component
- Measures the file transfer performance of each step
- JSP technology
- Hadoop Distributed File System (HDFS) via FTP

28 Hadoop Distributed File System
- Stores large files across multiple machines
- Achieves reliability by replicating data across machines
- Serves the data over HTTP, allowing access to all content from a web browser or other client
- Data nodes can talk to each other to rebalance data, to move copies around, and to keep the replication of data high
- Name node

29 MIFAS Transaction

30 Ganglia
- A distributed monitoring system for high-performance computing systems such as clusters and Grids
- Based on a hierarchical design targeted at federations of clusters

31 Ganglia Monitor

32 Log4J
- Used to measure data transfer speed and transaction time in MIFAS
- Enables logging at runtime without modifying the application binary
- Logging behavior can be controlled by editing a configuration file

33 Experiment
- Experimental Environment
- Experimental Results

34 Experimental Environment
- MIFAS on a Hadoop environment
- Distributed over 10 clusters located at 2 educational institutions and 1 hospital:
  - THU
  - CSMU
  - CSH
- Globus 4.2.1

35 Experimental Environment

36 Experimental Results
- Measure the network bandwidth of the three institutes: THU, CSH, and CSMU

37
- Three experiments measure the download speed (time) and performance of the MIFAS Hadoop system:
  - Measure all co-allocation schemes with different file sizes under normally fluctuating network bandwidth (10 to 1500 megabytes)
  - Measure all co-allocation schemes with different file sizes under unstable (broken line) network bandwidth (10 to 1500 megabytes)
  - Measure the average download speed of all MIFAS Hadoop co-allocation schemes with different file sizes, and test the performance of each co-allocation scheme (10 to 1000 megabytes)

38 First Case

39 Second Case

40 Third Case

41 Conclusion
- Implemented a Web-based system called MIFAS
- Manages files and provides a PACS viewer
- Uses the Hadoop Distributed File System (HDFS) as the back-end cloud file storage manager
- Covers nine file co-allocation schemes and three institutes
- Deployed at each institute to test the co-allocation schemes

42 References
- A. Chervenak, E. Deelman, I. Foster, L. Guy, W. Hoschek, A. Iamnitchi, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwarz, H. Stockinger, K. Stockinger, and B. Tierney, "Giggle: A Framework for Constructing Scalable Replica Location Services," in Proc. SC, pp. 1-17, 2002.
- A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets," Journal of Network and Computer Applications, 23(3).
- K. Czajkowski, S. Fitzgerald, I. Foster, and C. Kesselman, "Grid information services for distributed resource sharing," in Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10'01), pp. 181–194, August 2001.
- Global Grid Forum, http://www.ggf.org/.
- IBM Red Books, Introduction to Grid Computing with Globus, IBM Press. www.redbooks.ibm.com/redbooks/pdfs/sg246895.pdf.
- S. Vazhkudai, "Enabling the Co-Allocation of Grid Data Transfers," Proceedings of Fourth International Workshop on Grid Computing, pp. 44-51, 17 November 2003.
- S. Vazhkudai, S. Tuecke, and I. Foster, "Replica Selection in the Globus Data Grid," Proceedings of the 1st International Symposium on Cluster Computing and the Grid (CCGRID 2001), pp. 106-113, May 2001.
- C.T. Yang, Y.C. Chi, T.F. Han, and C.H. Hsu, "Redundant Parallel File Transfer with Anticipative Recursively-Adjusting Scheme in Data Grids," ICA3PP 2007, pp. 242–253, 2007.

