CX: A Scalable, Robust Network for Parallel Computing Peter Cappello & Dimitrios Mourloukos Computer Science UCSB.


CX: A Scalable, Robust Network for Parallel Computing Peter Cappello & Dimitrios Mourloukos Computer Science UCSB

2 Outline 1.Introduction 2.Related work 3.API 4.Architecture 5.Experimental results 6.Current & future work

3 Introduction “Listen to the technology!” Carver Mead

4 Introduction “Listen to the technology!” Carver Mead What is the technology telling us?

5 Introduction “Listen to the technology!” Carver Mead What is the technology telling us? –Internet’s idle cycles/sec growing rapidly

6 Introduction “Listen to the technology!” Carver Mead What is the technology telling us? –Internet’s idle cycles/sec growing rapidly –Bandwidth increasing & getting cheaper

7 Introduction “Listen to the technology!” Carver Mead What is the technology telling us? –Internet’s idle cycles/sec growing rapidly –Bandwidth is increasing & getting cheaper –Communication latency is not decreasing

8 Introduction “Listen to the technology!” Carver Mead What is the technology telling us? –Internet’s idle cycles/sec growing rapidly –Bandwidth increasing & getting cheaper –Communication latency is not decreasing –Human technology is getting neither cheaper nor faster.

9 Introduction Project Goals 1.Minimize job completion time despite large communication latency

10 Introduction Project Goals 1.Minimize job completion time despite large communication latency 2.Jobs complete with high probability despite faulty components

11 Introduction Project Goals 1.Minimize job completion time despite large communication latency 2.Jobs complete with high probability despite faulty components 3.Application program is oblivious to: Number of processors Inter-process communication Fault tolerance

12 Introduction Fundamental Issue: Heterogeneity Heterogeneous machines/OSs: M1/OS1, M2/OS2, M3/OS3, M4/OS4, M5/OS5, …

13 Introduction Fundamental Issue: Heterogeneity Heterogeneous machines/OSs: M1/OS1, M2/OS2, M3/OS3, M4/OS4, M5/OS5, … made functionally homogeneous by a JVM on each machine.

14 Outline 1.Introduction 2.Related work 3.API 4.Architecture 5.Experimental results 6.Current & future work

15 Related work Cilk → Cilk-NOW → Atlas – DAG computational model – Work-stealing

16 Related work Linda → Piranha → JavaSpaces – Space-based coordination – Decoupled communication

17 Related work Charlotte (Milan project / Calypso prototype) – High performance ⇒ fault tolerance not achieved via transactions – Fault tolerance via eager scheduling

18 Related work SuperWeb → Javelin → Javelin++ – Architecture: client, broker, host

19 Outline 1.Introduction 2.Related work 3.API 4.Architecture 5.Experimental results 6.Current & future work

20 API DAG Computational model int f( int n ) { if ( n < 2 ) return n; else return f( n-1 ) + f( n-2 ); }

21 DAG Computational Model int f( int n ) { if ( n < 2 ) return n; else return f( n-1 ) + f( n-2 ); } f(4) Method invocation tree

22 DAG Computational Model int f( int n ) { if ( n < 2 ) return n; else return f( n-1 ) + f( n-2 ); } f(4) f(3)f(2) Method invocation tree

23 DAG Computational Model int f( int n ) { if ( n < 2 ) return n; else return f( n-1 ) + f( n-2 ); } f(4) f(3)f(2) f(1) f(0) Method invocation tree

24 DAG Computational Model int f( int n ) { if ( n < 2 ) return n; else return f( n-1 ) + f( n-2 ); } f(4) f(3)f(2) f(1) f(0) f(1)f(0) Method invocation tree f(2)

25–28 DAG Computational Model / API Task f(n): execute( ) { if ( n < 2 ) setArg( ArgAddr, n ); else { spawn ( + ); spawn ( f(n-1) ); spawn ( f(n-2) ); } } Task +: execute( ) { setArg( ArgAddr, in[0] + in[1] ); } Alongside the code, the method invocation tree for f(4) is built up node by node: f(4); then f(3), f(2) with their + task; then the f(1) and f(0) leaves.
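The API is shown only pictorially on these slides. Below is a minimal Java sketch of what the two task classes might look like, following the slides' execute / spawn / setArg vocabulary; the Environment interface, class names, and signatures are assumptions for illustration, not CX's actual API.

    // Hedged sketch of the DAG/API slides: a decompose task f(n) and a compose (+) task.
    // The Environment interface and all signatures are hypothetical.
    interface Environment {
        void spawn(Task t);                    // add a successor or child task to the DAG
        void setArg(int argAddr, int value);   // deliver a value to a waiting task's argument slot
        int argAddr();                         // the argument slot this task's result feeds
    }

    abstract class Task {
        int[] in = new int[2];                 // argument slots filled by predecessor tasks
        abstract void execute(Environment env);
    }

    class F extends Task {                     // the f(n) task
        final int n;
        F(int n) { this.n = n; }
        void execute(Environment env) {
            if (n < 2)
                env.setArg(env.argAddr(), n);  // base case: pass n to the waiting + task
            else {                             // decompose into +, f(n-1), f(n-2)
                env.spawn(new Sum());
                env.spawn(new F(n - 1));
                env.spawn(new F(n - 2));
            }
        }
    }

    class Sum extends Task {                   // the "+" task: compose two partial results
        void execute(Environment env) {
            env.setArg(env.argAddr(), in[0] + in[1]);
        }
    }

A + task stays waiting until both of its in slots have been set, which matches the WAITING → READY transitions shown in the cluster walkthrough of the Architecture section.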

29 Outline 1.Introduction 2.Related work 3.API 4.Architecture 5.Experimental results 6.Current & future work

30 Architecture: Basic Entities Diagram: a Consumer connected to the Production Network, which is a network of clusters. The consumer's session protocol is register ( spawn | getResult )* unregister.
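A hedged Java sketch of that session protocol follows; the ProductionNetwork interface and its method names are invented for illustration (CX's real consumer API may differ), and F is the Fibonacci task sketched in the API section.

    // Illustrative consumer session: register, spawn / getResult, unregister.
    interface ProductionNetwork {
        void register();
        void spawn(Task rootTask);             // submit the root task of a computation
        Object getResult();                    // block until a result object is available
        void unregister();
    }

    class ConsumerSession {
        void run(ProductionNetwork net) {
            net.register();
            net.spawn(new F(18));              // e.g., compute Fibonacci f(18)
            Object result = net.getResult();   // the production network returns the result object
            System.out.println("f(18) = " + result);
            net.unregister();
        }
    }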

31 Architecture: Cluster Diagram: one Task Server serving its Producers.

32–35 A Cluster at Work The method invocation tree for f(4) is to be computed. The task f(4) arrives at the task server, enters its READY queue, and is assigned to an idle producer while remaining registered at the server.

36 Decompose execute( ) { if ( n < 2 ) setArg( ArgAddr, n ); else { spawn ( + ); spawn ( f(n-1) ); spawn ( f(n-2) ); } }

37–44 A Cluster at Work The producer decomposes f(4): its successor + task enters the server's WAITING queue (awaiting two arguments), while f(3) and f(2) enter the READY queue and are assigned to producers. Each of these decomposes in turn, adding + tasks to WAITING and smaller f tasks to READY.

45 Compute Base Case execute( ) { if ( n < 2 ) setArg( ArgAddr, n ); else { spawn ( + ); spawn ( f(n-1) ); spawn ( f(n-2) ); } }

46–54 A Cluster at Work The base-case tasks f(1) and f(0) are computed on producers; each result is delivered via setArg to its waiting + task at the server. When both arguments of a + task have arrived, it moves from the WAITING queue to the READY queue and is assigned to a producer.

55 Compose execute( ) { setArg( ArgAddr, in[0] + in[1] ); }

56–75 A Cluster at Work The + tasks compose partial results as their arguments arrive, each moving from WAITING to READY and then to a producer, until the root + task computes the final result R on a producer.

76 A Cluster at Work The final result object R is at the task server. 1. The result object is sent to the Production Network. 2. The Production Network returns it to the Consumer.

77 Task Server Proxy: Overlap Communication with Computation Each producer runs a task server proxy with an INBOX and an OUTBOX; its COMM component exchanges tasks and results with the task server (which keeps its READY and WAITING tasks in priority queues), while the COMP component executes tasks.
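A minimal sketch of how such a proxy could overlap communication with computation is below, using one COMM thread and one COMP thread around blocking INBOX/OUTBOX queues. The TaskServerStub and ComputeTask interfaces are assumptions for illustration; CX's actual proxy surely differs in detail.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Hypothetical minimal interfaces, for illustration only.
    interface ComputeTask { Object run(); }
    interface TaskServerStub {
        ComputeTask takeReadyTask() throws InterruptedException; // blocks until a READY task is available
        void sendResult(Object result);
    }

    class TaskServerProxy {
        private final BlockingQueue<ComputeTask> inbox = new LinkedBlockingQueue<>();
        private final BlockingQueue<Object> outbox = new LinkedBlockingQueue<>();
        private final TaskServerStub server;

        TaskServerProxy(TaskServerStub server) { this.server = server; }

        void start() {
            new Thread(this::communicate, "COMM").start();
            new Thread(this::compute, "COMP").start();
        }

        // COMM: keep one task prefetched in the INBOX while results ship from the OUTBOX,
        // so the network round-trip is hidden behind the current task's computation.
        private void communicate() {
            try {
                inbox.put(server.takeReadyTask());        // prime the pipeline
                while (true) {
                    inbox.put(server.takeReadyTask());    // prefetch the next task during computation
                    server.sendResult(outbox.take());     // ship the previous task's result
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }

        // COMP: repeatedly executes whatever task is waiting in the INBOX.
        private void compute() {
            try {
                while (true)
                    outbox.put(inbox.take().run());
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
    }

While COMP executes the current task, COMM is already fetching the next one, so communication latency overlaps with computation.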

78 Architecture Work stealing & eager scheduling A task is removed from server only after a complete signal is received. A task may be assigned to multiple producers –Balance task load among producers of varying processor speeds –Tasks on failed/retreating producers are re-assigned.
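A small sketch of the eager-scheduling bookkeeping this implies is given below: a task is registered until its completion signal arrives, an unfinished task can be handed to more than one producer, and duplicate completions are ignored. The data structures and method names are assumptions, not CX's implementation.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // Sketch of eager scheduling: a task is removed only when its "complete" signal arrives,
    // and an unfinished task may be (re)assigned to several producers. Illustrative only.
    class EagerScheduler {
        private final Deque<String> ready = new ArrayDeque<>();         // task ids eligible to run
        private final Map<String, Object> unfinished = new HashMap<>(); // id -> task, until completed
        private final Set<String> completed = new HashSet<>();

        synchronized void addReady(String taskId, Object task) {
            unfinished.put(taskId, task);
            ready.addLast(taskId);
        }

        // Called when a producer asks for work.
        synchronized Object assignTask() {
            while (!ready.isEmpty()) {
                String id = ready.pollFirst();
                if (unfinished.containsKey(id)) {     // still not completed
                    ready.addLast(id);                // keep it eligible for re-assignment
                    return unfinished.get(id);
                }
            }
            return null;                              // nothing to hand out right now
        }

        // Called when any producer reports the task's result; duplicates are ignored.
        synchronized boolean complete(String taskId) {
            if (!completed.add(taskId)) return false; // duplicate result
            unfinished.remove(taskId);
            ready.remove(taskId);
            return true;                              // first completion wins
        }
    }

A real server would prefer never-assigned tasks and re-issue unfinished ones only when producers would otherwise sit idle; this sketch simply keeps every unfinished task eligible.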

79 Architecture: Scalability A cluster tolerates producer: – Retreat – Failure A single task server, however, is a: – Bottleneck – Single point of failure. We introduce a network of task servers.

80 Scalability: Class loading 1.CX class loader loads classes (Consumer JAR) in each server’s class cache 2. Producer loads classes from its server

81 Scalability: Fault-tolerance Replicate a server’s tasks on its sibling

82 Scalability: Fault-tolerance Replicate a server’s tasks on its sibling

83 Scalability: Fault-tolerance Replicate a server’s tasks on its sibling When server fails, its sibling restores state to replacement server

84 Architecture Production network of clusters. The network tolerates a single server failure, then restores its ability to tolerate a single failure ⇒ it tolerates a sequence of single failures.

85 Outline 1.Introduction 2.Related work 3.API 4.Architecture 5.Experimental results 6.Current & future work

86 Preliminary experiments Experiments run on a Linux cluster – 100-port Lucent P550 Cajun Gigabit Switch – Each machine: 2 Intel EtherExpress Pro 100 Mb/s Ethernet cards, Red Hat Linux 6.0, JDK 1.2.2_RC3 – Heterogeneous processor speeds and processors/machine

87 Fibonacci Tasks with Synthetic Load Task f(n): execute( ) { if ( n < 2 ) { synthetic workload( ); setArg( ArgAddr, n ); } else { synthetic workload( ); spawn ( + ); spawn ( f(n-1) ); spawn ( f(n-2) ); } } Task +: execute( ) { synthetic workload( ); setArg( ArgAddr, in[0] + in[1] ); }

88 T_SEQ vs. T_1 (seconds), computing F(8) — table with columns: Workload, T_SEQ, T_1, Efficiency.

89 Parallel efficiency for F(13) = 0.87 Parallel efficiency for F(18) = 0.99 Average task time: Workload 1 = 1.8 sec. Workload 2 = 3.7 sec.

90 Outline 1.Introduction 2.Related work 3.API 4.Architecture 5.Experimental results 6.Current & future work

91 Current work Implement CX market maker (broker) – solves the discovery problem between Consumers & Production Networks. Enhance Producer with Lea's Fork/Join Framework – see gee.cs.oswego.edu. Diagram: Consumers and Production Networks are matched through the Market Maker, a Jini service.

92 Current work Enhance computational model: branch & bound. – Propagate new bounds through the production network: 3 steps. Diagram: search tree over the production network, with BRANCH and TERMINATE! steps.

93 Current work Enhance computational model: branch & bound. – Propagate new bounds through the production network: 3 steps. Diagram: search tree over the production network, with a TERMINATE! step.

94 Current work Investigate computations that appear ill-suited to adaptive parallelism –SOR –N-body.

95 End of CX Presentation Next release: End of June, includes source.

96 Introduction Fundamental Issues Communication latency: long latency ⇒ overlap computation with communication. Robustness: massive parallelism ⇒ faults. Scalability: massive parallelism ⇒ login privileges cannot be required. Ease of use: Jini ⇒ easy upgrade of system components.

97 Related work Market mechanisms –Huberman, Waldspurger, Malone, Miller & Drexler, Newhouse & Darlington

98 Related work CX integrates –DAG computational model –Work-stealing scheduler –Space-based, decoupled communication –Fault-tolerance via eager scheduling –Market mechanisms (incentive to participate)

99 Architecture Task identifier The DAG has a spawn tree; TaskID = path id within the spawn tree; Root.TaskID = 0. The TaskID is used to detect duplicate: – Tasks – Results. (Diagram: the spawn tree for F(4) with its F(3), F(2), F(1), F(0) and + nodes.)
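A hedged illustration of "TaskID = path id": encode each task's position in the spawn tree as the child-index path from the root, so that a server can recognize duplicate tasks or results by comparing ids. The string encoding below is an assumption, not CX's representation.

    // Sketch: a task id as the path from the root of the spawn tree (root = "0"),
    // so duplicate tasks/results can be recognized by comparing ids. Illustrative encoding.
    final class TaskId {
        private final String path;                       // e.g. "0", "0.1", "0.1.2"

        private TaskId(String path) { this.path = path; }

        static TaskId root() { return new TaskId("0"); } // Root.TaskID = 0

        TaskId child(int index) {                        // id of the index-th spawned child
            return new TaskId(path + "." + index);
        }

        @Override public boolean equals(Object o) {
            return o instanceof TaskId && path.equals(((TaskId) o).path);
        }
        @Override public int hashCode() { return path.hashCode(); }
        @Override public String toString() { return path; }
    }

A server can then keep a Set<TaskId> of already-recorded results and silently drop any duplicate produced by eager scheduling.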

100 Architecture: Basic Entities Consumer Seeks computing resources. Producer Offers computing resources. Task Server Coordinates task distribution among its producers. Production Network A network of task servers & their associated producers.

101 Defining Parallel Efficiency Scalar: homogeneous set of P machines: parallel efficiency = (T_1 / P) / T_P. Vector: heterogeneous set of P machines, P = [ P_1, P_2, …, P_d ], where there are P_1 machines of type 1, P_2 machines of type 2, …, P_d machines of type d: parallel efficiency = ( P_1 / T_1 + P_2 / T_2 + … + P_d / T_d )^(-1) / T_P.
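As a worked example with made-up numbers (not from the experiments reported here): suppose P_1 = 2 machines of type 1 each solve the job alone in T_1 = 100 s, P_2 = 3 machines of type 2 in T_2 = 200 s, and the heterogeneous pool finishes in T_P = 35 s. Then

\[
\Big( \frac{P_1}{T_1} + \frac{P_2}{T_2} \Big)^{-1}
  = \Big( \frac{2}{100} + \frac{3}{200} \Big)^{-1}
  = \frac{1}{0.035} \approx 28.6\ \text{s},
\qquad
\text{parallel efficiency} = \frac{28.6}{35} \approx 0.82 .
\]

The inverted sum is the job time of an ideal machine with the pool's aggregate speed, so the definition reduces to (T_1 / P) / T_P when all machines are identical.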

102 Future work Support special hardware / data: inter-server task movement. –Diffusion model: Tasks are homogeneous gas atoms diffusing through network. –N-body model: Each kind of atom (task) has its own: Mass (resistance to movement: code size, input size, …) attraction/repulsion to different servers Or other “massive” entities, such as: »special processors »large data base.

103 Future Work CX preprocessor to simplify API.