Cilk NOW Based on a paper by Robert D. Blumofe & Philip A. Lisiecki.


1 Cilk NOW Based on a paper by Robert D. Blumofe & Philip A. Lisiecki

2 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

3 Introduction: Cilk-NOW features
- Ease of use: standard command-line interface for running Cilk-NOW programs.
- Adaptive parallelism: machines join and retreat from a job transparently to users.
- Fault tolerance: Cilk programs are oblivious to checkpointing and to failure detection & recovery.

4 Introduction: Cilk-NOW features (continued)
- Flexibility: the sovereignty of the workstation's owner is preserved. The owner defines "idle".
- Security: customary Unix user security. Users must have a Unix login on the system.
- Guaranteed performance: uses Cilk's work-stealing thread scheduler, which is provably efficient and therefore gives predictable performance.

5 Introduction: Cilk-NOW features (continued)
- No distributed shared memory
- No fault tolerance for I/O
- All workstations share a file system.
Work focuses on:
- Adaptive parallelism
- Fault tolerance

6 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

7 Cilk language & work-stealing scheduler
The language and scheduler are the same as in Cilk. The standard Fibonacci example follows.

8 Compute the nth Fibonacci number

    thread fib ( cont int k, int n )
    {
        if ( n < 2 )
            send_argument ( k, n );        /* base case: send n to the continuation k */
        else
        {
            cont int x, y;
            spawn_next sum ( k, ?x, ?y );  /* successor waits for x and y */
            spawn fib ( x, n - 1 );        /* children fill x and y */
            spawn fib ( y, n - 2 );
        }
    }

    thread sum ( cont int k, int x, int y )
    {
        send_argument ( k, x + y );        /* pass the sum on to k */
    }
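The continuation-passing structure of the example above can be mimicked in ordinary Python. The following is only a minimal single-process sketch, not Cilk's runtime: the names `Closure`, `Continuation`, `EMPTY`, `ready`, `collect`, and `run_fib` are assumptions made for illustration, and there is no stealing or parallelism here. It shows how a `spawn_next` closure waits with empty argument slots until `send_argument` fills them.

```python
from collections import deque

EMPTY = object()  # marks an argument slot that has not been filled yet


class Closure:
    """A thread plus its argument slots; runnable once no slot is EMPTY."""
    def __init__(self, thread, *args):
        self.thread = thread
        self.args = list(args)
        self.join_counter = sum(1 for a in self.args if a is EMPTY)


class Continuation:
    """Refers to one empty slot of a waiting closure."""
    def __init__(self, closure, slot):
        self.closure = closure
        self.slot = slot


ready = deque()  # closures whose arguments are all present


def spawn(thread, *args):
    closure = Closure(thread, *args)
    if closure.join_counter == 0:
        ready.append(closure)
    return closure


def send_argument(k, value):
    closure = k.closure
    closure.args[k.slot] = value
    closure.join_counter -= 1
    if closure.join_counter == 0:
        ready.append(closure)  # the waiting closure becomes ready


def sum_thread(k, x, y):
    send_argument(k, x + y)


def fib(k, n):
    if n < 2:
        send_argument(k, n)
    else:
        s = spawn(sum_thread, k, EMPTY, EMPTY)  # like: spawn_next sum(k, ?x, ?y)
        spawn(fib, Continuation(s, 1), n - 1)   # fills slot x
        spawn(fib, Continuation(s, 2), n - 2)   # fills slot y


def collect(box, value):
    box.append(value)


def run_fib(n):
    box = []
    root = Closure(collect, box, EMPTY)  # receives the final answer
    spawn(fib, Continuation(root, 1), n)
    while ready:
        closure = ready.pop()
        closure.thread(*closure.args)
    return box[0]
```

Calling `run_fib(10)` drains the ready queue and returns the value delivered to the root closure. In a work-stealing runtime, several workers would each hold such a queue and take closures from one another.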


10 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

11 Cilk-NOW job architecture
A Cilk-NOW job consists of:
- A clearinghouse process
- One or more worker processes
Begin a job by typing the command
    CilkChouse -- pfold 3 7
This starts a worker that forks a clearinghouse process, which:
- Sends the job description to the macro-scheduler
- Waits for messages from its workers


13 (b) An idle machine joins the job
- Another machine's node manager goes "idle".
- It sends a job request to the macro-scheduler.
- The macro-scheduler returns the pfold job.
- The node manager forks a new worker with no associated clearinghouse.
- The worker registers with the pfold clearinghouse.
- The clearinghouse gives the worker:
  - Its name (worker names are integers, starting from 0)
  - A list of the other workers on this job
- The worker steals a closure from another worker.

14 (c) A no-longer-idle machine retreats
- The machine's owner touches the keyboard.
- The node manager sends a kill signal to its worker.
- The worker catches the signal and:
  - Offloads its closures to other workers
  - Unregisters from the clearinghouse
  - Terminates

15 Maintaining the worker lists
- Each worker checks in with the clearinghouse every 2 seconds.
- If a worker's "lease" expires (no check-in for 30 seconds), the clearinghouse removes that worker from its list.
- On each check-in, the clearinghouse returns a list of revisions: workers to add to and delete from the worker's local list.
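The lease bookkeeping described above can be sketched as a small table keyed by worker name. This is a hypothetical illustration, not Cilk-NOW's code: the class `Clearinghouse`, its method names, and the choice to pass the current time in explicitly (so the example is deterministic) are all assumptions. Only the 30-second lease and the integer worker names starting from 0 come from the slides.

```python
class Clearinghouse:
    LEASE_SECONDS = 30.0  # a worker silent for longer than this is dropped

    def __init__(self):
        self.last_seen = {}  # worker name -> time of its last check-in
        self.next_name = 0   # worker names are integers, starting from 0

    def register(self, now):
        """Admit a new worker; return its name and the other workers."""
        name = self.next_name
        self.next_name += 1
        self.last_seen[name] = now
        return name, sorted(w for w in self.last_seen if w != name)

    def unregister(self, name):
        """Called by a worker that retreats cleanly."""
        self.last_seen.pop(name, None)

    def expire(self, now):
        """Drop every worker whose lease has run out."""
        dead = [w for w, t in self.last_seen.items()
                if now - t > self.LEASE_SECONDS]
        for w in dead:
            del self.last_seen[w]
        return dead

    def check_in(self, name, known, now):
        """Renew a lease; return (to_add, to_delete) for the caller's list.

        `known` is the set of worker names the caller currently lists.
        """
        if name in self.last_seen:
            self.last_seen[name] = now
        self.expire(now)
        current = set(self.last_seen) - {name}
        return sorted(current - known), sorted(known - current)


ch = Clearinghouse()
w0, _ = ch.register(now=0.0)
w1, others_seen_by_w1 = ch.register(now=1.0)
first_revision = ch.check_in(w0, known=set(), now=2.0)
# Worker 1 now falls silent; worker 0 keeps checking in.
later_revision = ch.check_in(w0, known={w1}, now=40.0)
```

In this run the first check-in tells worker 0 to add worker 1, and the check-in at time 40 tells it to delete worker 1, whose lease has lapsed. A real worker would call `check_in` every 2 seconds with the actual clock.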

16 UDP
UDP is used between:
- Workers
- Clearinghouse & worker
It is faster than TCP for the common case, and it makes no pretense of reliability when none exists.
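To illustrate what "no pretense of reliability" means for the sender, here is a small loopback sketch using Python's standard `socket` module. The message payloads (`b"checkin 0"`, `b"revisions +1"`) and the helper names are invented for the example and are not Cilk-NOW's wire format. The point is only that a UDP receiver either gets a datagram or times out, and the application decides what to do about silence.

```python
import socket


def make_endpoint(timeout=0.2):
    """Open a UDP socket on an OS-chosen loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(timeout)
    return sock


def recv_or_none(sock):
    """Return (data, sender), or None if nothing arrives before the timeout."""
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        return None


worker = make_endpoint()
clearinghouse = make_endpoint()

# The worker sends one datagram; there is no connection setup.
worker.sendto(b"checkin 0", clearinghouse.getsockname())

request = recv_or_none(clearinghouse)
if request is not None:
    data, sender = request
    clearinghouse.sendto(b"revisions +1", sender)

reply = recv_or_none(worker)    # the answer, if it arrived
silence = recv_or_none(worker)  # nothing further was sent, so this times out

worker.close()
clearinghouse.close()
```

A lost datagram looks exactly like the `silence` case, which is why a lease-based scheme with periodic check-ins fits naturally on top of UDP.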

17 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

18 Adaptive parallelism
What happens when a waiting closure gets offloaded to another worker?
- How do send_argument invocations get their values to the moved waiting closure?
The paper describes a notion of subcomputation and uses it to handle this situation.
To be continued …

19 A simple way?
Have the waiting closure's unfilled arguments refer back to the continuations that refer to them.
- When the waiting closure is offloaded to a new worker, it informs its continuations of its new address.
- For this to work, the waiting closure must be informed whenever a continuation is passed to another closure.
This may be a lot of work.
To be continued …
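The back-reference idea above can be sketched as follows. This is a single-process illustration of the slide's proposal only, not the subcomputation mechanism from the paper. The classes `WaitingClosure` and `Continuation`, the integer worker names, and the `messages` counter (which stands in for network messages) are all assumptions.

```python
class Continuation:
    def __init__(self, closure, slot, target_worker, holder):
        self.closure = closure
        self.slot = slot
        self.target_worker = target_worker  # where the waiting closure lives
        self.holder = holder                # worker holding this continuation


class WaitingClosure:
    def __init__(self, worker, nslots):
        self.worker = worker
        # Each unfilled slot refers back to the continuation that targets it.
        self.back_refs = [None] * nslots
        self.messages = 0  # stand-in for messages that would cross the network

    def make_continuation(self, slot, holder):
        k = Continuation(self, slot, self.worker, holder)
        self.back_refs[slot] = k
        return k

    def note_passed(self, k, new_holder):
        """The continuation moved to another closure: tell the waiting closure."""
        k.holder = new_holder
        self.messages += 1

    def offload(self, new_worker):
        """The closure moves: tell every continuation its new address."""
        self.worker = new_worker
        for k in self.back_refs:
            if k is not None:
                k.target_worker = new_worker
                self.messages += 1


wc = WaitingClosure(worker=0, nslots=2)
kx = wc.make_continuation(slot=0, holder=0)
ky = wc.make_continuation(slot=1, holder=0)
wc.note_passed(kx, new_holder=3)  # kx handed to a closure on worker 3
wc.offload(new_worker=5)          # the waiting closure migrates to worker 5
```

After these three steps `wc.messages` is 3: one message for the hand-off and one per continuation for the move. That per-continuation, per-move cost is the "lot of work" the slide worries about.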

20 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

21 Fault tolerance
To be continued, based on a fuller understanding of closure migration under worker retreat.

22 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

23 Cilk-NOW macro-scheduling

24 Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

