Java-Based Parallel Computing on the Internet: Javelin 2.0 & Beyond Michael Neary & Peter Cappello Computer Science, UCSB.

1 Java-Based Parallel Computing on the Internet: Javelin 2.0 & Beyond Michael Neary & Peter Cappello Computer Science, UCSB

2 Introduction Goals Service parallel applications that are: –Large: too big for a cluster –Coarse-grain: to hide communication latency Simplicity of use –Design focus: decomposition [composition] of computation. Scalable high performance –despite large communication latency Fault-tolerance –1000s of hosts, each dynamically [dis]associates.

3 Introduction Some Related Work

4 Introduction Some Applications Search for extra-terrestrial life Computer-generated animation Computer modeling of drugs for: –Influenza –Cancer –Reducing chemotherapy’s side-effects Financial modeling Storing nuclear waste

5 Outline Architecture Model of Computation API Scalable Computation Experimental Results Conclusions & Future Work

6 Architecture Basic Components Brokers Clients Hosts

7 Architecture Broker Discovery [Animated diagram, slides 7-11: brokers (B), a host (H), and the Broker Naming System; slide 10 adds the message PING (BID?)]

12 Architecture Network of Broker-Managed Host Trees Each broker manages a tree of hosts

13 Architecture Network of Broker-Managed Host Trees Brokers form a network

14 Architecture Network of Broker-Managed Host Trees Brokers form a network Client contacts broker

15 Architecture Network of Broker-Managed Host Trees Brokers form a network Client contacts broker Client gets host trees

16 Scalable Computation Deterministic Work-Stealing Scheduler Task container: addTask( task ), getTask( ), stealTask( ) [Diagram label: HOST]

17 Scalable Computation Deterministic Work-Stealing Scheduler
Task getWork( ) {
    if ( my deque has a task ) return task;
    else if ( any child has a task ) return child’s task;
    else return parent.getWork( );
}
[Diagram labels: CLIENT, HOSTS]
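The getWork( ) rule above can be sketched in Java as follows. This is a minimal illustration and not Javelin's actual code: the class name `HostNode`, the use of strings as tasks, the fixed child order, and the choice to let the owner take from the front of its deque while thieves take from the back are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical host in a host tree; names are illustrative, not Javelin's API.
class HostNode {
    final String name;
    final HostNode parent;                               // null at the root of the tree
    final List<HostNode> children = new ArrayList<>();
    private final Deque<String> tasks = new ArrayDeque<>();

    HostNode(String name, HostNode parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    void addTask(String task) { tasks.addFirst(task); }

    // The owner takes the newest task from the front of its own deque.
    String getTask() { return tasks.pollFirst(); }

    // A thief takes the oldest task from the back of the victim's deque (assumption).
    String stealTask() { return tasks.pollLast(); }

    // 1. own deque, 2. direct children in a fixed order, 3. ask the parent.
    String getWork() { return getWork(null); }

    private String getWork(HostNode asker) {
        // When a child is asking, this host is the victim, so it gives up its oldest task.
        String task = (asker == null) ? getTask() : stealTask();
        if (task != null) return task;
        for (HostNode child : children) {
            if (child == asker) continue;                // the asker is already empty
            task = child.stealTask();
            if (task != null) return task;
        }
        return (parent == null) ? null : parent.getWork(this);
    }

    public static void main(String[] args) {
        HostNode root = new HostNode("root", null);
        HostNode a = new HostNode("a", root);
        HostNode b = new HostNode("b", root);
        root.addTask("t1");
        root.addTask("t2");
        a.addTask("t3");
        // b has no work of its own, so every call climbs to the parent.
        for (int i = 0; i < 4; i++) {
            System.out.println(b.name + " got " + b.getWork());
        }
    }
}
```

Because children are always visited in the same order, the same request sequence yields the same assignment of tasks, which is one plausible reading of "deterministic" here.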

18 Models of Computation Master-slave –AFAIK all proposed commercial applications Branch-&-bound optimization –A generalization of master-slave.

19 Models of Computation Branch & Bound [Search-tree diagram] UPPER = ∞ LOWER = 0

20 Models of Computation Branch & Bound [Search-tree diagram] UPPER = ∞ LOWER = 2

21 Models of Computation Branch & Bound [Search-tree diagram] UPPER = ∞ LOWER = 3

22 Models of Computation Branch & Bound [Search-tree diagram] UPPER = 4 LOWER = 4

23 Models of Computation Branch & Bound [Search-tree diagram] UPPER = 3 LOWER = 3

24 Models of Computation Branch & Bound [Search-tree diagram] UPPER = 3 LOWER = 6

25 Models of Computation Branch & Bound [Search-tree diagram] UPPER = 3 LOWER = 7

26 Models of Computation Branch & Bound Tasks are created dynamically. The upper bound is shared. To detect termination, the scheduler detects tasks that have been: –Completed –Killed (“bounded”)
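A sequential sketch of the branch & bound idea may help here. It is an illustration under stated assumptions, not the Javelin implementation: the class `BranchAndBoundSketch`, its `Node` type, and the example tree are made up, and each node's `cost` is taken to be the path cost from the root, which is a valid lower bound for every leaf below it as long as edge costs are non-negative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sequential branch & bound: find the cheapest leaf of a tree.
class BranchAndBoundSketch {
    static final class Node {
        final int cost;          // path cost from the root = lower bound for this subtree
        final Node[] children;
        Node(int cost, Node... children) {
            this.cost = cost;
            this.children = children;
        }
        boolean isComplete() { return children.length == 0; }
    }

    static int visited;          // nodes actually examined, to make the pruning visible

    static int minLeafCost(Node root) {
        int upperBound = Integer.MAX_VALUE;      // plays the role of UPPER = ∞
        visited = 0;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (node.cost >= upperBound) continue;   // killed ("bounded")
            visited++;
            if (node.isComplete()) {
                upperBound = node.cost;              // new best complete solution
            } else {
                // Push in reverse so the first child is explored first.
                for (int i = node.children.length - 1; i >= 0; i--) {
                    if (node.children[i].cost < upperBound) stack.push(node.children[i]);
                }
            }
        }
        return upperBound;
    }

    // Made-up example tree with 7 nodes.
    static Node exampleTree() {
        return new Node(0,
            new Node(2, new Node(4), new Node(3)),
            new Node(6, new Node(7), new Node(9)));
    }

    public static void main(String[] args) {
        System.out.println("best cost = " + minLeafCost(exampleTree()));
        System.out.println("nodes visited = " + visited);
    }
}
```

On this example the upper bound drops to 4 and then to 3, after which the subtree whose lower bound is 6 is killed without being expanded, so only 4 of the 7 nodes are visited.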

27 API
public class Host implements Runnable {
    ...
    public void run() {
        while ( (node = jDM.getWork()) != null ) {
            if ( isAtomic() )
                compute(); // search space; return result
            else {
                child = node.branch(); // put children in child array
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        jDM.addWork( child[i] );
                    // else child is killed implicitly
            }
        }
    }
}

28 API
private void compute() {
    ...
    boolean newBest = false;
    while ( (node = stack.pop()) != null ) {
        if ( node.isComplete() ) {
            if ( node.getCost() < UpperBound ) {
                newBest = true;
                UpperBound = node.getCost();
                jDM.propagateValue( UpperBound );
                best = node; // record the complete node as the new best
            }
        } else {
            child = node.branch();
            for (int i = 0; i < node.numChildren; i++)
                if ( child[i].setLowerBound() < UpperBound )
                    stack.push( child[i] );
                // else child is killed implicitly
        }
    }
    if ( newBest ) jDM.returnResult( best );
}

29 Scalable Computation Weak Shared Memory Model Slow propagation of bound affects performance not correctness. Propagate bound
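Why a slowly propagated bound costs performance but not correctness can be shown with a small sketch. It assumes, for illustration only, that each host keeps a local copy of the upper bound in a hypothetical `WeakBound` class and merges incoming values by taking the minimum; the slides do not show how Javelin stores or transports the bound.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-host copy of the shared upper bound.
class WeakBound {
    private final AtomicInteger local = new AtomicInteger(Integer.MAX_VALUE);

    int get() { return local.get(); }

    // Merge a bound found locally or received from another host.
    // Returns true only if the candidate improved the local copy.
    boolean offer(int candidate) {
        int previous = local.getAndAccumulate(candidate, Math::min);
        return candidate < previous;
    }

    public static void main(String[] args) {
        WeakBound hostA = new WeakBound();
        WeakBound hostB = new WeakBound();

        hostA.offer(3);                       // host A finds a solution of cost 3
        // Before the message arrives, host B still prunes against its stale bound.
        System.out.println("B before propagation: " + hostB.get());

        hostB.offer(4);                       // an older, weaker bound arrives first
        hostB.offer(hostA.get());             // the better bound arrives later
        hostB.offer(4);                       // a late duplicate is harmless
        System.out.println("B after propagation: " + hostB.get());
    }
}
```

A stale copy is never smaller than the true best cost, so a host with a stale bound may explore nodes it could have skipped, but it never kills a node that could still lead to a better solution. Taking the minimum also makes delayed, duplicated, or reordered messages safe.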

34 Scalable Computation Fault Tolerance via Eager Scheduling When: All tasks have been assigned Some results have not been reported A host wants a new task Re-assign a task! Eager scheduling tolerates faults & balances the load. –Computation completes, if at least 1 host communicates with client.

35 Scalable Computation Fault Tolerance via Eager Scheduling Scheduler must know which: –Tasks have completed –Nodes have been killed Performance ⇒ balance –Centralized schedule info –Decentralized computation
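The eager scheduling rule described on the two slides above (once all tasks are assigned, hand a still-unreported task to any host that asks for work) can be sketched as follows. The class `EagerScheduler`, the integer task ids, and the oldest-first re-assignment order are assumptions for the example, not Javelin's scheduler.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical centralized eager scheduler over tasks numbered 0..taskCount-1.
class EagerScheduler {
    private final Deque<Integer> unassigned = new ArrayDeque<>();
    private final Set<Integer> outstanding = new LinkedHashSet<>(); // assigned, no result yet
    private final Set<Integer> done = new LinkedHashSet<>();

    EagerScheduler(int taskCount) {
        for (int i = 0; i < taskCount; i++) unassigned.add(i);
    }

    // Returns a task id, or -1 once every task has a reported result.
    synchronized int nextTask() {
        Integer id = unassigned.poll();
        if (id == null) {
            if (outstanding.isEmpty()) return -1;
            // Everything is assigned but some results are missing: re-assign the
            // task that has been outstanding longest, then move it to the back.
            id = outstanding.iterator().next();
            outstanding.remove(id);
        }
        outstanding.add(id);
        return id;
    }

    // The first result for a task wins; later duplicates return false and are ignored.
    synchronized boolean reportResult(int id) {
        outstanding.remove(id);
        return done.add(id);
    }

    synchronized boolean isComplete() {
        return unassigned.isEmpty() && outstanding.isEmpty();
    }

    public static void main(String[] args) {
        EagerScheduler s = new EagerScheduler(2);
        int taskForA = s.nextTask();          // host A takes task 0, then stops responding
        int taskForB = s.nextTask();          // host B takes task 1
        s.reportResult(taskForB);
        int retry = s.nextTask();             // host B asks again and is given task 0
        s.reportResult(retry);
        System.out.println("re-assigned task " + retry + ", complete = " + s.isComplete());
    }
}
```

In this sketch a single live host is enough to finish the computation, since it eventually receives every outstanding task. The same mechanism balances load, because a fast host picks up work that a slow host has not yet reported.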

36 Experimental Results

37 Example of a “bad” graph [Search-tree diagram]

38 Conclusions Javelin 2 relieves the designer/programmer of managing a set of [inter-]networked processors that is: –Dynamic –Faulty A wide set of applications is covered by: –Master-slave model –Branch & bound model Weak shared memory performs well. Use multicast (?) for: –Code distribution –Propagating values

39 Future Work Improve support for long-lived computation: –Do not require that the client run continuously. A dag model of computation –with limited weak shared memory.

40 Future Work Jini/JavaSpaces Technology TaskManager aka Broker [Diagram: hosts (H)] “Continuously” disperse Tasks among brokers via a physics model

41 Future Work Jini/JavaSpaces Technology TaskManager uses persistent JavaSpace –Host management: trivial –Eager scheduling: simple No single point of failure –Fat tree topology

42 Future Work Advanced Issues Privacy of data & algorithm Algorithms –New computation-communication complexity model –N-body problem, … Accounting: Associate specific work with specific host –Correctness –Compensation (how to quantify?) Create open source organization –System infrastructure –Application codes

