David P. Anderson
Space Sciences Laboratory
University of California – Berkeley
Designing Middleware for Volunteer Computing
Volunteer computing
Project          Start  Where        Area            # hosts
GIMPS            1994   -            math            10,000
distributed.net  1995   -            cryptography    100,000
I                1999   UCB          SETI            600,000
United Devices   2002   commercial   biomedicine     200,000
CPDN             2003   Oxford       climate change  150,000
WCG              2004   IBM          biomedicine     200,000
II               2005   UCB          SETI            850,000
-                -      Wash         biology         100,000
SIMAP            2005   T.U. Munich  bioinformatics  10,000
Total of BOINC-based projects: 660,000 participants, 1,000,000 hosts, 450 TeraFLOPS
Why volunteer computing?
● 1 billion PCs
  – 55% privately owned
  – most are on the Internet
● If 100M participate:
  – > 100 PetaFLOPS, 1 Exabyte (10^18 bytes) storage
● Consumer products drive technology
[Diagram labels: your computers, academic, business, home PCs]
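The aggregate figures on this slide follow from simple multiplication. The sketch below assumes, purely for illustration, that each participating PC contributes about 1 GFLOPS and 10 GB of disk. Those per-host values are not stated on the slide; they are the values under which the quoted totals come out.

```python
# Back-of-the-envelope check of the "100M participants" estimate.
# ASSUMPTION: the per-host figures are illustrative, not taken from the slide.
PARTICIPANTS = 100_000_000   # 100M participating PCs (from the slide)
FLOPS_PER_HOST = 1e9         # assumed: about 1 GFLOPS per PC
BYTES_PER_HOST = 10e9        # assumed: about 10 GB of spare disk per PC


def aggregate(participants, per_host):
    """Total resource if every participant contributes `per_host` units."""
    return participants * per_host


total_flops = aggregate(PARTICIPANTS, FLOPS_PER_HOST)
total_bytes = aggregate(PARTICIPANTS, BYTES_PER_HOST)

print(f"{total_flops / 1e15:.0f} PetaFLOPS")  # 100 PetaFLOPS
print(f"{total_bytes / 1e18:.0f} Exabyte")    # 1 Exabyte
```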
What's different about volunteer computing?
● Must attract and retain volunteers
  – Credit
  – Community features
  – Easy installation; autonomic
● Volunteers are unreliable
  – one solution: redundant computing
● Heterogeneous, dynamic resource pool
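Redundant computing can be pictured as sending the same task to several volunteers and accepting a result only when enough of them agree. The following is a minimal sketch of that idea, not BOINC's actual validator: the function name `validate`, the `quorum` parameter, and the exact-match comparison are all assumptions made for illustration.

```python
from collections import Counter


def validate(results, quorum=2):
    """Return the agreed-upon result if at least `quorum` replicas match.

    `results` holds the outputs returned by different volunteers for the
    same task. Returns None when no value reaches the quorum.
    ASSUMPTION: results are compared for exact equality; a real system
    may need a tolerance for floating-point output.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


# Two hosts agree, one returns a bad result: the majority value is accepted.
accepted = validate(["a1f3", "a1f3", "ffff"])

# No two hosts agree yet: nothing is accepted, so more replicas are needed.
pending = validate(["a1f3", "ffff"])
```

A larger quorum lowers the chance of accepting a wrong result at the cost of more duplicated work.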
Berkeley Open Infrastructure for Network Computing (BOINC)
● Started in 2002; funded by NSF
  – 2.75 FTEs; lots of volunteers
● Open-source (LGPL)
  – client: 20K lines, C++
  – server: 10K lines, C++/Python
  – web: 10K lines, PHP
[Diagram labels: projects: SETI, physics, Climate, biomedical; volunteers: Joe, Alice, Jens]
Volunteers “attach” computers to projects, allocate resources
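One way to picture a volunteer allocating resources among attached projects is as a set of relative weights that are normalized into fractions of the computer's time. The sketch below assumes that model; the slide does not say how allocation is specified, and the function name and the example weights are hypothetical.

```python
def allocation_fractions(shares):
    """Convert relative per-project weights into fractions that sum to 1.

    ASSUMPTION: allocation is expressed as non-negative relative weights,
    one per attached project. This is an illustrative model only.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive share")
    return {project: share / total for project, share in shares.items()}


# Hypothetical volunteer attached to three of the projects in the diagram.
alice_shares = {"SETI": 100, "Climate": 50, "biomedical": 50}
alice_fractions = allocation_fractions(alice_shares)
# SETI gets half of the computer's time, the other two a quarter each.
```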
Client structure
[Diagram labels: App, Core client, screensaver, servers]
Server structure
[Diagram labels: MySQL, Transitioner, Scheduler, Feeder, File deleter, DB purger, Assimilator, Validator, Work creator, Shared mem, clients, Web, volunteers]
1 server can handle 8-25 million tasks per day
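To put the per-day capacity figure in perspective, it can be converted to a sustained per-second rate. The conversion below assumes the load is spread evenly over 24 hours, which is a simplification; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def tasks_per_second(tasks_per_day):
    """Average sustained rate, assuming load is spread evenly over the day."""
    return tasks_per_day / SECONDS_PER_DAY


low = tasks_per_second(8_000_000)    # lower bound quoted on the slide
high = tasks_per_second(25_000_000)  # upper bound quoted on the slide

print(f"{low:.0f} to {high:.0f} tasks per second")  # 93 to 289 tasks per second
```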
Credit
Credit information flow
Goals of BOINC
● More projects
  – Improve/simplify tools
  – World Community Grid
● More participation
  – Simplify everything
  – GridRepublic
● Handle data-intensive apps better
  – BitTorrent, use network topology
  – Task graphs