BOINC
What is BOINC?
- “Berkeley Open Infrastructure for Network Computing”
- Platform for Internet-wide distributed applications
- Volunteer computing infrastructure
- Relies on many far-flung users volunteering spare CPU power
Some Facts
- 1,000,000+ active nodes
- 521 TFLOPS of computing power
- 20 active projects (SETI@Home, Folding@Home, Malaria Control…) and several more in development
- (Current as of March 2007)
Comparison to MapReduce
- Both are frameworks on which “useful” systems can be built
- BOINC does not prescribe a particular programming style
- BOINC has a much more heterogeneous architecture
- BOINC does not have a formal aggregation step
- BOINC is designed for much longer-running systems (months/years vs. minutes/hours)
Volunteer computing != Grid computing
                           Volunteer computing               Grid computing
Resource owners            anonymous, unaccountable          identified, accountable
Managed systems?           no – need plug & play software    yes – software stack requirements OK
Clients behind firewall?   yes – pull model                  no – push model
ISP bill?                  yes                               no
... nor is it “peer-to-peer computing”
System Features
- Homogeneous redundancy
- Work unit “trickling”
- Locality scheduling
- Distribution based on host parameters
- Recognition metrics to reward volunteers
- Open source
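Homogeneous redundancy can be illustrated with a small sketch: replicas of one work unit are only handed to hosts of the same class, so that results can be compared without tripping over platform-specific floating-point differences. The class definition used here (operating system plus CPU vendor) and the names `Host`, `WorkUnit`, `host_class` and `try_assign` are assumptions made for illustration, not BOINC's actual scheduler code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Host:
    host_id: str
    os_name: str
    cpu_vendor: str


def host_class(host: Host) -> Tuple[str, str]:
    # Assumption: hosts are numerically equivalent if OS and CPU vendor match.
    return (host.os_name, host.cpu_vendor)


@dataclass
class WorkUnit:
    name: str
    committed_class: Optional[Tuple[str, str]] = None  # set by the first assignment
    assigned: List[str] = field(default_factory=list)


def try_assign(wu: WorkUnit, host: Host) -> bool:
    """Give the host a replica only if it matches the work unit's class."""
    cls = host_class(host)
    if wu.committed_class is None:
        # The first host to receive a replica fixes the class for all later ones.
        wu.committed_class = cls
    if wu.committed_class != cls or host.host_id in wu.assigned:
        return False
    wu.assigned.append(host.host_id)
    return True
```

In this sketch a work unit first sent to a Linux/Intel host is refused to a Windows/AMD host, and the same host never receives two replicas of the same work unit.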
Architecture
- Central server runs a LAMP (Linux, Apache, MySQL, PHP/Perl) stack for web + database
- End-users run a client application with modules for the actual computation
- BitTorrent used to distribute data elements efficiently
Architecture
Job Life-Cycle
Replicated Computations
Client software
- Available as a regular application, a background “service”, or a screensaver
- Can be administered locally or over a LAN via RPC
- Can be configured to use only “low priority” cycles
Client/Task Interaction
- Client software runs on a variety of operating systems, each with different IPC mechanisms
- Uses shared-memory message passing to transmit information from the “manager” to the actual tasks and vice versa
Background utility compatibility
- Background utilities: disk defrag, disk indexing, virus scanning, web pre-fetch, disk backup
- Most run only when the computer is idle; with volunteer computing the machine is never idle, so they never run
- Background manager: makes an intelligent decision about when to run various activities
Why Participate?
- Sense of accomplishment, community involvement, or scientific duty
- Stress testing machines/networks
- Potential for fame (if your computer “finds” an alien planet, you can name it!)
- “Bragging rights” for computing more units
- “BOINC Credits”
Credit & Cobblestones
- Work done is rewarded with “cobblestones”
- 100 cobblestones = 1 day of CPU time for a computer with performance equaling 1,000 double-precision floating-point MIPS (Whetstone) & 1,000 integer VAX MIPS (Dhrystone)
- Computers are benchmarked by the BOINC system and receive credit appropriate to their machine
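The cobblestone definition can be turned into a small worked calculation. This is a minimal sketch: the slide only fixes the reference point (100 cobblestones per day on a 1,000/1,000 MIPS machine), so scaling linearly with CPU time and averaging the two benchmark scores are assumptions, and the function name `cobblestones` is hypothetical.

```python
SECONDS_PER_DAY = 86_400
REFERENCE_MIPS = 1_000.0                 # both Whetstone and Dhrystone, per the definition
COBBLESTONES_PER_REFERENCE_DAY = 100.0


def cobblestones(cpu_seconds: float, whetstone_mips: float, dhrystone_mips: float) -> float:
    """Credit for cpu_seconds of work on a host with the given benchmark scores."""
    # Assumption: a host's speed relative to the reference machine is the
    # average of its two benchmark scores divided by 1,000.
    relative_speed = (whetstone_mips + dhrystone_mips) / 2.0 / REFERENCE_MIPS
    cpu_days = cpu_seconds / SECONDS_PER_DAY
    return COBBLESTONES_PER_REFERENCE_DAY * cpu_days * relative_speed
```

Under these assumptions, one full day on the reference machine yields 100 cobblestones, and a machine that benchmarks twice as fast earns the same amount in half a day.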
Anti-Cheating Measures
- Work units are computed redundantly by several different machines, and results are compared by the central server for consistency
- Credit is awarded after the internal server validates the returned work units
- Work units must be returned before a deadline
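The three measures above can be combined in a short sketch of server-side validation: late results are discarded, the remaining replicas are compared, and credit goes only to hosts whose output matches the agreed answer. The quorum rule (a minimum number of identical outputs), exact string comparison of outputs, and the names `Result` and `validate` are assumptions for illustration rather than BOINC's actual validator.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Result:
    host_id: str
    output: str         # canonical form of the computed answer
    returned_at: float  # seconds after the work unit was issued


def validate(results: List[Result], deadline: float,
             quorum: int) -> Tuple[Optional[str], List[str]]:
    """Return (agreed_output, credited_host_ids), or (None, []) without a quorum."""
    # Results returned after the deadline are ignored entirely.
    on_time = [r for r in results if r.returned_at <= deadline]
    if not on_time:
        return None, []
    # Compare the redundant results: the most common output is the candidate.
    output, votes = Counter(r.output for r in on_time).most_common(1)[0]
    if votes < quorum:
        return None, []
    # Only hosts that agree with the validated output receive credit.
    credited = [r.host_id for r in on_time if r.output == output]
    return output, credited
```

A host that returns a fabricated answer is outvoted by honest replicas and earns nothing, which removes the incentive to fake results for credit.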
The hard non-technical problems
- How to increase the number of volunteers? (currently 1 in 1000 PC owners)
- How to increase the number of projects? (currently stuck at about 50)
- How to get volunteers to diversify?
How to attract and retain volunteers?
- Active hosts:
- Retention: reminder emails, frequent science updates
- Recruitment: viral “email a friend”, referral reward
- Organizational: World Community Grid “partner” program
- Media coverage: need more discoveries
- Bundling
Why aren’t there more projects?
- Lack of PR among scientists
- IT antipathy
- Creating a BOINC project is expensive:
  - Research needed: science app development, experiment design, paper writing
  - Software/IT: port/debug apps, workflow tools, server admin
  - Communications: web site development, message board admin, public relations
Conclusions
- Versatile infrastructure
  - SETI tasks take a few hours
  - Climate simulation tasks take months
  - Network monitoring tasks are not CPU-bound at all!
- Scales extremely well to Internet-wide applications
- Provides another flexible middleware layer on which to base distributed applications
- Volunteer computing comes with additional considerations (rewards, cheating)