Gueyoung Jung, Nathan Gnanasambandam, and Tridib Mukherjee. International Conference on Cloud Computing, 2012.

1
Gueyoung Jung, Nathan Gnanasambandam, and Tridib Mukherjee
International Conference on Cloud Computing 2012

2
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

3
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

4
• Collected data can exceed hundreds of terabytes and is generated continuously
  ◦ sensors, social media, click-streams, log files, and mobile devices
• The solution: cloud computing
  ◦ Analyze big-data by leveraging vast amounts of computing resources, available on demand at low resource usage cost

5
• Parallel data mining
  ◦ topic mining, pattern mining
  ◦ analyze large amounts of unstructured data
  ◦ time constraint
• Big-data are partly analyzed on local private resources while the rest are transferred to external computing nodes
  ◦ more flexible, with obvious cost benefits

6
• The considerations for optimizing parallel data mining
  ◦ Node determination
  ◦ Synchronized completion
  ◦ Data partition determination
• Maximally Overlapped Bin-packing driven Bursting (MOBB)

7
• The goals of the MOBB algorithm
  ◦ Balancing load across computing nodes
  ◦ Overlapping data transfer delay with computation time in each computing node

8
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

9
• Load distribution
  ◦ the overhead of data transfer
• Maximum overlap between data transfer and computation
  ◦ determine the order in which data chunks of different sizes are transferred to each node
• Task scheduling among computing nodes
  ◦ load-balancing (CometCloud)
  ◦ heterogeneous clouds

10
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

11
SLA: Service Level Agreement

14
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

15
made by the unit of data

16
• Estimation of computation time
  ◦ Response surface model
  ◦ Queueing model
• Estimation of data transfer delay
  ◦ more dynamic than computation time
  ◦ Auto-regressive moving average (ARMA) model
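
The ARMA-based delay estimate on slide 16 can be illustrated with a minimal sketch. It assumes an ARMA(1,1) model whose coefficients are already known; the function name and the parameters `c`, `phi`, and `theta` are hypothetical and do not come from the presentation, which does not state the model order or how it is fitted.

```python
def arma11_forecast(delays, c, phi, theta):
    """One-step-ahead ARMA(1,1) forecast of the next transfer delay.

    Model: x[t] = c + phi * x[t-1] + theta * e[t-1] + e[t],
    where e[t] is the forecast error observed at step t.
    c, phi, theta are assumed to be given (hypothetical values).
    """
    if not delays:
        raise ValueError("need at least one observed delay")
    prediction = delays[0]  # assumption: the first forecast equals the first observation
    for observed in delays:
        error = observed - prediction          # how far off the last forecast was
        prediction = c + phi * observed + theta * error
    return prediction


# Example with made-up delay measurements (seconds) and coefficients.
next_delay = arma11_forecast([10.0, 8.0], c=2.0, phi=0.5, theta=0.3)
```

The moving-average term lets the forecast react to the most recent error, which is one way to track a quantity described as more dynamic than computation time.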

18
• Determination of the bucket size of each node
• Sorting of data chunks in descending order
• Sorting of node bucket sizes in descending order (higher delay = smaller bucket size)
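
The three steps on slide 18 can be sketched as follows. This is only an illustration under a stated assumption: each node has a single estimated time per data unit (`unit_time`, a hypothetical combined transfer and computation estimate), and bucket sizes are made inversely proportional to it. The presentation does not give the exact sizing formula.

```python
def bucket_sizes(total_data, unit_time):
    """Split total_data across nodes in inverse proportion to unit_time.

    unit_time maps node name -> estimated seconds per data unit
    (hypothetical estimate; higher delay gives a smaller bucket).
    Returns (node, size) pairs sorted by size, largest first.
    """
    weights = {node: 1.0 / t for node, t in unit_time.items()}
    total_weight = sum(weights.values())
    sizes = {node: total_data * w / total_weight for node, w in weights.items()}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


def sort_chunks(chunks):
    """Return chunk sizes in descending order."""
    return sorted(chunks, reverse=True)


# Made-up numbers: 200 GB in total, MRW three times slower per GB than HLW.
buckets = bucket_sizes(200, {"MRW": 3.0, "HLW": 1.0})
```

With these made-up numbers HLW receives about 150 GB and MRW about 50 GB, so both nodes would need roughly the same time to finish their share.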

22
• Weighted load distribution
• Delay-based preference
• Buckets are completely filled one at a time
  ◦ reduce fragmentation of buckets
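
Filling buckets one at a time can be sketched with a simple greedy bin-packing pass. This is a hedged illustration in the style of first-fit decreasing, not necessarily the exact procedure used by MOBB; the function name and the handling of leftover chunks are assumptions.

```python
def fill_buckets(chunks, buckets):
    """Greedily fill buckets one at a time with chunks sorted largest first.

    chunks: list of chunk sizes.
    buckets: list of (node, capacity) pairs, already in preference order.
    Returns (assignment dict, list of chunks that fit in no bucket).
    """
    remaining = sorted(chunks, reverse=True)
    assignment = {}
    for node, capacity in buckets:
        free = capacity
        placed, skipped = [], []
        for size in remaining:
            if size <= free:
                placed.append(size)   # chunk fits: put it in the current bucket
                free -= size
            else:
                skipped.append(size)  # leave it for a later bucket
        assignment[node] = placed
        remaining = skipped
    return assignment, remaining


# Made-up chunk sizes and bucket capacities.
assignment, leftover = fill_buckets([10, 40, 30, 20, 50],
                                    [("HLW", 100), ("MRW", 50)])
```

Each bucket is packed as fully as possible before the next one is touched, which is how this sketch tries to keep fragmentation low.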

23
• Organize the sequence of chunks to maximize the overlap between data transfer and computation
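
Why the chunk order matters can be seen with a small simulation. The sketch below assumes a simple two-stage pipeline per node: chunks are transferred one after another, and a chunk can be processed as soon as it has arrived and the node is free. The rates and chunk sizes are made up, and the model itself is an assumption rather than the one used in the presentation.

```python
def completion_time(chunks, transfer_rate, compute_rate):
    """Finish time when the next chunk transfers while the current one is processed.

    chunks: chunk sizes in the order they are sent to one node.
    transfer_rate, compute_rate: data units per second (hypothetical estimates).
    """
    transfer_done = 0.0
    compute_done = 0.0
    for size in chunks:
        transfer_done += size / transfer_rate
        # Computation starts once the chunk has arrived and the node is free.
        compute_done = max(compute_done, transfer_done) + size / compute_rate
    return compute_done


# Transfer is the bottleneck: sending large chunks first finishes sooner.
slow_link_desc = completion_time([4, 2, 1], transfer_rate=1, compute_rate=2)  # 7.5
slow_link_asc = completion_time([1, 2, 4], transfer_rate=1, compute_rate=2)   # 9.0

# Computation is the bottleneck: sending small chunks first finishes sooner.
slow_cpu_desc = completion_time([4, 2, 1], transfer_rate=2, compute_rate=1)   # 9.0
slow_cpu_asc = completion_time([1, 2, 4], transfer_rate=2, compute_rate=1)    # 7.5
```

In this toy model the best order depends on whether transfer or computation dominates, which suggests why the sequence is chosen per node rather than fixed globally.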

26
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

27
• Frequent pattern mining
  ◦ A phone call log obtained from a call center and a web access log
  ◦ Size: 200 GB (collected over one year)
  ◦ Objective: obtain patterns of each user's activities on human resource information systems

28
• Four computing nodes
  ◦ Low-end Local Central node (LLC)
    ▪ 5 VMs, each with two 2.8 GHz cores, 1 GB memory, and a 1 TB hard drive
  ◦ Low-end Local Worker (LLW)
    ▪ similar to LLC
  ◦ High-end Local Worker (HLW)
    ▪ 6 non-virtualized servers, each with 24 2.6 GHz cores, 48 GB memory, and a 10 TB hard drive
    ▪ Shared by other applications
  ◦ Mid-end Remote Worker (MRW)
    ▪ 9 VMs, each with two 2.8 GHz cores, 4 GB memory, and a 1 TB hard drive

32
HLW+MRW

33
• Ideal optimal data allocation
  ◦ The slack time must be 0
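
The zero-slack condition can be checked with a few lines of code. The sketch assumes that a node's slack time is the gap between its own completion time and that of the slowest node; the presentation does not spell out the definition, so both the definition and the function name are assumptions.

```python
def slack_times(completion):
    """Idle gap between each node's finish time and the slowest node's finish time.

    completion maps node name -> completion time in seconds (hypothetical values).
    """
    latest = max(completion.values())
    return {node: latest - t for node, t in completion.items()}


# Made-up completion times: a balanced and an unbalanced allocation.
balanced = slack_times({"HLW": 150.0, "MRW": 150.0})
unbalanced = slack_times({"HLW": 120.0, "MRW": 150.0})
```

Under this definition an allocation is ideal when every value in the returned mapping is 0, that is, when all nodes finish at the same moment.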

34
• Introduction
• Related Work
• Problem Statement
• Maximally Overlapped Cloud-Bursting (MOBB) approach
• Experimental Evaluation
• Conclusion

35
• A cloud-bursting approach based on a maximally overlapped load-balancing algorithm is proposed to optimize the performance of big-data analytics
• Results show that performance can be improved by 20% to 60% compared with other approaches
