Download presentation
Presentation is loading. Please wait.
Published byJason Aldous Jacobs Modified over 8 years ago
1
BOINC Workshop 10 Hien Nguyen, Eshwar Rohit University of Houston Supervisors: Dr. Jaspal Subhlok University of Houston Dr. David P. Anderson SSL – U.C, Berkeley Enabling interprocess communication for BOINC applications
2
2 RESEARCH GOAL Hien Nguyen University of Houston Enable BOINC to efficiently support apps that require interprocess communication. Goals: Easier programming for communicating applications Reduce execution time (not increase throughput)
3
3 Example Applications Hien Nguyen University of Houston REMD Protein Folding application Each process runs a standard molecular simulation at different temperature 270280290300310320330340 280270290300320310340330 280290270300320340310330 P1P2P3P4P5P6P7P8 STEP-1 STEP-2 STEP-3
4
4 Example Applications Hien Nguyen University of Houston Or many other applications: Differential equation solvers (grid) (synchronous) Game playing with alpha/beta pruning (asynchronous) Search application. ….. Suitable applications: moderate amount of communication.
5
Synchronization point 5 DIFFICULTIES Hien Nguyen University of Houston Job execution Fast host Slow host X X Slow down overall execution speed X X Worse as number of host increases
6
6 OUTLINE Hien Nguyen University of Houston 1.Volpex Dataspace Overview IPC for volunteer environment 2.Integration With BOINC Process management Host selection 3.Future and Related Work
7
7 Volpex Dataspace Hien Nguyen University of Houston Dataspace: global shared space that processes can use for information exchange without a temporal or spatial coupling. Volpex Dataspace Server Put(ABC, 800) Get(ABC,?) (800)
8
8 Volpex Dataspace – Fault Tolerance Hien Nguyen University of Houston Put(ABC, 800) Get(ABC,?) (800) replicated X Volpex DSS is designed to support redundant Put/Get operations unlike Linda & variants
9
9 Volpex Dataspace Hien Nguyen University of Houston Why centralized? Scale issues But: firewalls, no incoming connections
10
10 INTEGRATION WITH BOINC Hien Nguyen University of Houston Mechanics: Process management Simultaneous process starting Fault tolerance Checkpoint/restart Policy: Host selection.
11
11 Job execution scheme Hien Nguyen University of Houston X BOINC Scheduler Volpex Dataspace Server Get work Put data item Get data item Get checkpoint Put checkpoint
12
12 PROCESSES MANAGEMENT Hien Nguyen University of Houston Simultaneous process starting: All processes start computation together: reduce wasted resources because processes will have to wait for eachothers. Volpex jobs have highest (infinite) priority: uninterruptible by other jobs. While waiting for all processes of a Volpex job to be ready: host can do other finite priority volunteer jobs. Use of boinc_temporary_exit()
13
13 PROCESSES MANAGEMENT Hien Nguyen University of Houston Fault tolerance Dead instance spotted by heartbeat mechanism: process instances regularly send heartbeat to Volpex DSS. BOINC scheduler recruits a new volunteer host to replace the dead one.
14
14 PROCESSES MANAGEMENT Hien Nguyen University of Houston Hot spare policy: BOINC Scheduler Volpex Dataspace Server X Fast replacement Hot spare group
15
15 PROCESSES MANAGEMENT Hien Nguyen University of Houston Checkpointing: Process instance commits and uploads checkpoints to Volpex DSS (only stores latest checkpoint for each process). Restarted process instance requests checkpoint from Volpex DSS.
16
16 HOST SELECTION Hien Nguyen University of Houston Volpex job: consists of processes that form a job, submitted by scientist. Has requirements on: Deadline Number of processes Has estimates of: Total flops per process. Flops between 2 consecutive checkpoints. Memory usage, disk usage
17
17 HOST SELECTION POLICY Hien Nguyen University of Houston Criteria for selecting volunteer hosts to assign to a Volpex job: Speed and Availability. Availability: the interval that a host is available w/o interruption (BOINC client allowed to compute).
18
18 HOST SELECTION POLICY Hien Nguyen University of Houston Job’s minimum requirements: Minimum speed : Fast enough to meet job’s deadline Min speed = Total flops / Deadline Minimum expected availability : host is continuously available for x hours to commit at least 1 checkpoint.
19
19 Evaluate Host's Availability Hien Nguyen University of Houston We want to predict host's length of availability interval. Method based on : Exploiting Non- Dedicated Resources for Cloud Computing Artur Andrzejak, Derrick Kondo, David P. Anderson. (NOMS10)
20
20 Evaluate Host's Availability Hien Nguyen University of Houston Last value predictor: simplistic predictor which uses the availability value in the last hourly interval before prediction as the prediction of availability for the next x hours interval. Combined with ranking hosts by predictability: number availability changes per week.
21
21 Evaluate Host's Availability Hien Nguyen University of Houston In essence: select hosts which change availability very rarely. A process assigned to a host with high predictability does not necessarily need to be replicated.
22
22 IMPLEMENTATION STATUS Hien Nguyen University of Houston Volpex utilities: for scientists to submit, abort or query status of a Volpex job. Modified BOINC scheduler: includes host selection for Volpex job. Modified Volpex DSS: handling new type of requests.
23
23 IMPLEMENTATION STATUS Hien Nguyen University of Houston BOINC Server submit job specs Database Create job & WU Volpex Dataspace Server X Hot spare group Scheduler request Scheduler reply dynamically create result from WU BOINC Scheduler heartbeat checkpoint request procID get procID procID of failed instance replacement Scientist
24
24 FUTURE WORK Hien Nguyen University of Houston Experiment and evaluate with different degrees of freedom: Number of processes 10-1M Communication pattern (local/global, synch/asynch) Size and frequency of communication Goal: study (via live experiment or simulation) the performance of Volpex/BOINC over this space Application: Eratosthenes, REMD Protein Folding Enhance host selection policy
25
25 OTHER WORK Hien Nguyen University of Houston Volpex MPI: An MPI library designed for executing parallel applications in volunteer environment. Direct communication between processes. Key Features Controlled redundancy Receiver based direct communication Distributed sender based logging More detail : “ VolpexMPI: an MPI Library for Execution of Parallel Applications on Volatile Nodes” by Troy LeBlanc, Rakhi Anand, Edgar Gabriel, and Jaspal Subhlok.
26
26 If you have application? Hien Nguyen University of Houston We would be happy to cooperate with you. Our team contacts: Dr. Jaspal Subhlok: jaspal@uh.edujaspal@uh.edu Dr. David Anderson: davea@ssl.berkeley.edudavea@ssl.berkeley.edu Dr. Edgar Gabriel: gabriel@cs.uh.edugabriel@cs.uh.edu Hien Nguyen: hien.nguyen.nx@gmail.comhien.nguyen.nx@gmail.com Eshwar Rohit: eshwar.rohit@gmail.comeshwar.rohit@gmail.com Rakhi Anand: rakhi@cs.uh.edurakhi@cs.uh.edu Our Website: http://www2.cs.uh.edu/~jsteach/volpex/index.htm http://www2.cs.uh.edu/~jsteach/volpex/index.htm
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.