Published by Rodney Beames. Modified over 10 years ago.

Cilk-NOW
Based on a paper by Robert D. Blumofe & Philip A. Lisiecki

Organization
1. Introduction
2. The Cilk language & work-stealing scheduler
3. Cilk-NOW job architecture
4. Adaptive parallelism
5. Fault tolerance
6. Cilk-NOW macro-scheduling
7. Conclusion

Introduction: Cilk-NOW features
Ease of use
– Standard command-line interface for running Cilk-NOW programs.
Adaptive parallelism
– Machines join and retreat from a job transparently to users.
Fault tolerance
– Cilk programs are oblivious to checkpointing and to failure detection & recovery.

Introduction: Cilk-NOW features (continued)
Flexibility
– Sovereignty of the workstation’s owner is preserved: the owner defines “idle”.
Security
– Customary Unix user security: users must have a Unix login on the system.
Guaranteed performance
– Uses Cilk’s work-stealing thread scheduler.
– Provably efficient, predictable performance.

Introduction: Cilk-NOW features (continued)
No distributed shared memory
No fault tolerance for I/O
All workstations share a file system.
Work focuses on:
– Adaptive parallelism
– Fault tolerance

Cilk language & work-stealing scheduler
The language and the work-stealing scheduler are the same as in Cilk. The standard Fibonacci example follows.
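
Compute the nth Fibonacci number

    thread fib ( cont int k, int n )
    {
        if ( n < 2 )
            send_argument ( k, n );          /* base case: send n to continuation k */
        else
        {
            cont int x, y;
            spawn_next sum ( k, ?x, ?y );    /* successor waits for x and y */
            spawn fib ( x, n - 1 );          /* child fills x */
            spawn fib ( y, n - 2 );          /* child fills y */
        }
    }

    thread sum ( cont int k, int x, int y )
    {
        send_argument ( k, x + y );          /* pass the sum on to continuation k */
    }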
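The continuation-passing style of the Fibonacci example can be mimicked in a few lines of Python. This is only a single-worker sketch meant to show how spawn_next, spawn and send_argument interact. The names Closure, EMPTY, ready, spawn and run_fib are illustrative assumptions and are not part of the Cilk runtime.

```python
from collections import deque

EMPTY = object()   # marks an argument slot that has not been filled yet
ready = deque()    # ready closures; a one-worker stand-in for the scheduler


class Closure:
    """A thread plus its argument slots; ready once no slot is EMPTY."""

    def __init__(self, thread, args):
        self.thread = thread
        self.args = list(args)
        self.missing = sum(1 for a in self.args if a is EMPTY)


def spawn(thread, *args):
    """Create a closure; queue it right away if all arguments are present."""
    closure = Closure(thread, args)
    if closure.missing == 0:
        ready.append(closure)
    return closure


def send_argument(cont, value):
    """A continuation is modelled here as a (closure, slot index) pair."""
    closure, slot = cont
    closure.args[slot] = value
    closure.missing -= 1
    if closure.missing == 0:
        ready.append(closure)


def fib(k, n):
    if n < 2:
        send_argument(k, n)
    else:
        s = spawn(sum_thread, k, EMPTY, EMPTY)   # spawn_next sum(k, ?x, ?y)
        spawn(fib, (s, 1), n - 1)                # x is slot 1 of sum
        spawn(fib, (s, 2), n - 2)                # y is slot 2 of sum


def sum_thread(k, x, y):
    send_argument(k, x + y)


def run_fib(n):
    """Run fib(n) to completion and return the value sent to the root."""
    root = Closure(lambda value: None, [EMPTY])  # sink that receives the result
    spawn(fib, (root, 0), n)
    while ready:
        closure = ready.pop()
        closure.thread(*closure.args)
    return root.args[0]
```

Calling run_fib(10) drives the closures until the ready queue is empty and returns the value delivered to the root continuation.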

Cilk-NOW job architecture
A Cilk-NOW job consists of:
– A clearinghouse process
– 1 or more worker processes
Begin a job by typing the command
    CilkChouse -- pfold 3 7
This starts a worker that forks a clearinghouse process, which:
– Sends the job description to the macro-scheduler
– Waits for messages from its workers.

(b) An idle machine joins the job
Another machine’s node manager goes “idle”.
It sends a job request to the macro-scheduler.
The macro-scheduler returns the pfold job.
The node manager forks a new worker with no associated clearinghouse.
The worker registers with the pfold clearinghouse.
The clearinghouse gives the worker:
– Its name (worker names are integers, starting from 0)
– A list of other workers on this job
The worker steals a closure from another worker.
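The registration step can be sketched as follows. This is a minimal in-memory model, assuming names are handed out in arrival order and that the steal victim is picked at random. The class Clearinghouse, its methods, and choose_victim are hypothetical names, and a real clearinghouse would exchange these values as network messages.

```python
import random


class Clearinghouse:
    """Hands out integer worker names and tracks who is on the job."""

    def __init__(self):
        self.workers = {}     # worker name -> address (assumed (host, port))
        self.next_name = 0    # names are integers starting from 0

    def register(self, address):
        """Return the new worker's name and the other workers on the job."""
        name = self.next_name
        self.next_name += 1
        others = dict(self.workers)   # snapshot taken before adding the newcomer
        self.workers[name] = address
        return name, others


def choose_victim(others, rng=random):
    """Pick a worker to steal from; random choice is an assumption here."""
    return rng.choice(sorted(others)) if others else None
```

A newly registered worker would then send its steal request to the address stored under the name returned by choose_victim.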

(c) A no-longer-idle machine retreats
The machine’s owner touches the keyboard.
The node manager sends a kill signal to its worker.
The worker catches the signal and:
– Offloads its closures to other workers
– Un-registers from the clearinghouse
– Terminates
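The retreat sequence can be sketched as a small function. The slide does not say how closures are divided among the remaining workers, so the round-robin distribution below is an assumption, as are the names Worker and retreat. Signal handling and process termination are left out.

```python
class Worker:
    """Minimal stand-in for a worker: a name and the closures it holds."""

    def __init__(self, name):
        self.name = name
        self.closures = []


def retreat(worker, peers, registered):
    """Offload closures to peers, then un-register the retreating worker.

    `registered` is a set of worker names standing in for the clearinghouse.
    """
    if worker.closures and not peers:
        raise RuntimeError("no remaining worker can accept the closures")
    for i, closure in enumerate(worker.closures):
        peers[i % len(peers)].closures.append(closure)   # assumed round-robin
    worker.closures.clear()
    registered.discard(worker.name)
```

After retreat returns, the worker holds no closures and is no longer listed, so it can exit without losing work.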

Maintaining the worker lists
Each worker checks in with the clearinghouse every 2 seconds.
If a worker’s “lease” expires (no check-in for 30 seconds), the clearinghouse removes it from its list.
On each check-in, the clearinghouse returns a list of revisions:
– Workers to add to and delete from the worker’s local list.
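The lease bookkeeping can be sketched like this. The 2-second and 30-second values come from the slide. The class LeaseTable, the check_in signature, and the choice to compute revisions against a set of names supplied by the caller are assumptions made for illustration.

```python
CHECK_IN_INTERVAL = 2.0   # seconds between check-ins (from the slide)
LEASE_DURATION = 30.0     # seconds without a check-in before removal


class LeaseTable:
    """Clearinghouse-side record of when each worker last checked in."""

    def __init__(self):
        self.last_seen = {}   # worker name -> time of last check-in

    def check_in(self, name, now, known):
        """Renew a lease and return (to_add, to_delete) for the caller.

        `known` is the set of worker names in the caller's local list.
        """
        self.last_seen[name] = now
        expired = [w for w, t in self.last_seen.items()
                   if now - t > LEASE_DURATION]
        for w in expired:
            del self.last_seen[w]
        live = set(self.last_seen)
        to_add = sorted(live - known - {name})   # a worker need not list itself
        to_delete = sorted(known - live)
        return to_add, to_delete
```

Passing the clock value in as `now` keeps the sketch deterministic; a real worker would call check_in roughly every CHECK_IN_INTERVAL seconds.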

UDP
UDP is used between:
– Workers
– Clearinghouse & worker
Faster than TCP for the common case.
No pretense of reliability when none exists.

Adaptive parallelism
What happens when a waiting closure gets offloaded to another worker?
– How do send_argument invocations get their information to the moved waiting closure?
The paper describes a notion of subcomputation and uses it to handle this situation.
To be continued …

A simple way?
Have the waiting closure’s unfilled arguments refer to the continuations that refer to them.
– When the waiting closure is offloaded to a new worker, it informs its continuations of its new address.
– For this to work, the waiting closure must be informed whenever a continuation is passed to another closure.
This may be a lot of work.
To be continued …
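The back-reference idea above can be sketched as follows. This models only the “simple way” from this slide, not the subcomputation mechanism of the paper. All names (Continuation, WaitingClosure, make_continuation, migrate, and the workers dictionary mapping a worker name to the closures it holds) are hypothetical.

```python
class Continuation:
    """Names one argument slot of a waiting closure and where it lives."""

    def __init__(self, closure_id, slot, worker):
        self.closure_id = closure_id
        self.slot = slot
        self.worker = worker   # where the sender believes the closure lives


class WaitingClosure:
    def __init__(self, closure_id, nslots):
        self.closure_id = closure_id
        self.args = [None] * nslots
        self.continuations = []   # back-references to continuations naming us


def make_continuation(worker, closure, slot):
    """Create a continuation and record it so it can be updated later."""
    cont = Continuation(closure.closure_id, slot, worker)
    closure.continuations.append(cont)
    return cont


def migrate(workers, closure, src, dst):
    """Move the closure and tell every continuation its new address."""
    del workers[src][closure.closure_id]
    workers[dst][closure.closure_id] = closure
    for cont in closure.continuations:
        cont.worker = dst          # one update message each in a real system
    return len(closure.continuations)


def send_argument(workers, cont, value):
    """Deliver a value; raises KeyError if the continuation is stale."""
    closure = workers[cont.worker][cont.closure_id]
    closure.args[cont.slot] = value
```

The value returned by migrate is the number of address updates a single move would cost, which illustrates why the slide calls this approach a lot of work.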

Fault tolerance
To be continued, based on a fuller understanding of closure migration under worker retreat.

Cilk-NOW macro-scheduling