Volunteer Computing and BOINC Dr. David P. Anderson University of California, Berkeley Dec 3, 2010
Outline ● Volunteer computing ● BOINC ● Applications ● Research directions
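High-performance and high-throughput computing [Figure: options when a program runs too slow on a PC, laid out along a "# processors" axis reaching 10K-1M. High-performance computing (single job): cluster (MPI), supercomputer. High-throughput computing (multiple jobs): cluster (batch), Grid, commercial cloud, volunteer computing.]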
History of volunteer computing [Timeline, running up to "now". Applications: distributed.net, GIMPS, Climateprediction.net, IBM World Community Grid. Middleware: commercial (Entropia, United Devices,...), academic (Bayanihan, Javelin,...), BOINC.]
Terms ● FLOPS = floating point operations per second ● GigaFLOPS = 10^9 FLOPS – slow PC ● TeraFLOPS = 10^12 FLOPS – fast GPU or 100-node cluster ● PetaFLOPS = 10^15 FLOPS – fastest supercomputer ● ExaFLOPS = 10^18 FLOPS – fantasy and science fiction
The yearly cost of 10 TeraFLOPS ● Amazon EC2 – small instance: $0.09/hour = $788/year – 10 TeraFLOPS = 5,000 instances – $3.94M/year plus network, storage costs ● Build your own cluster – ~$1.5M/year ● Volunteer computing – ~$0.1M/year
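The EC2 figure on this slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the variable names are made up here, and the 8,760-hour year (an instance running continuously for 365 days) is an assumption that happens to reproduce the slide's $788/year. The cluster and volunteer figures are quoted from the slide, not derived.

```python
# Reproduce the slide's yearly cost comparison for 10 TeraFLOPS.
HOURS_PER_YEAR = 24 * 365          # assumption: instance runs all year
EC2_SMALL_RATE = 0.09              # dollars per hour, from the slide
INSTANCES_FOR_10_TFLOPS = 5_000    # from the slide

ec2_per_instance = EC2_SMALL_RATE * HOURS_PER_YEAR
ec2_total = ec2_per_instance * INSTANCES_FOR_10_TFLOPS

# Yearly cost in dollars; the last two are quoted from the slide, not computed.
yearly_cost = {
    "Amazon EC2": ec2_total,
    "Own cluster": 1.5e6,
    "Volunteer computing": 0.1e6,
}

for option, cost in yearly_cost.items():
    ratio = cost / yearly_cost["Volunteer computing"]
    print(f"{option:20s} ${cost / 1e6:5.2f}M/year  ({ratio:.0f}x volunteer)")
```

The slide's instance count also implies roughly 2 GigaFLOPS per small instance (10 TeraFLOPS / 5,000 instances). The EC2 total excludes the network and storage costs the slide mentions.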
BOINC [Diagram: volunteers linked by attachments to projects such as CPDN and WCG] ● Scientists create projects using BOINC ● Volunteers install BOINC, attach to project(s) ● Applications are silently downloaded and executed on volunteer PCs
The Utopian vision ● Better research gets more computing power ● An enlightened public decides what’s better [Diagram: the public supplies resources to scientific research; scientific research returns education/outreach to the public]
The Consumer Digital Infrastructure ● 1.5 billion PCs ● Graphics Processing Units: TeraFLOPS ● Terabyte-scale storage ● Network speed approaching 1 Gbps ● Ideal for scientific computing!
The state of volunteer computing ● 40 projects ● 500K volunteers ● 800K computers ● 10 PetaFLOPS ● would cost $3.94 billion/year on Amazon EC2
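As a rough consistency check on these numbers, the sketch below derives the per-volunteer and per-computer averages and rescales the EC2 price quoted earlier in the talk ($3.94M/year per 10 TeraFLOPS). The variable names are illustrative, and pricing that scales linearly with capacity is an assumption.

```python
TERA, PETA = 1e12, 1e15

# Figures from the slide.
volunteers = 500_000
computers = 800_000
total_flops = 10 * PETA

# Averages implied by those figures.
computers_per_volunteer = computers / volunteers
gflops_per_computer = total_flops / computers / 1e9

# EC2 price from the earlier cost slide, scaled linearly (assumption).
ec2_cost_per_10_tflops = 3.94e6
ec2_cost = total_flops / (10 * TERA) * ec2_cost_per_10_tflops

print(f"computers per volunteer: {computers_per_volunteer:.1f}")
print(f"average per computer:    {gflops_per_computer:.1f} GigaFLOPS")
print(f"EC2 equivalent:          ${ec2_cost / 1e9:.2f} billion/year")
```

The average of about 12.5 GigaFLOPS per computer is a mean over a mixed pool of CPUs and GPUs, so individual hosts will vary widely around it.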
The potential of volunteer computing ● The volunteer resource pool ● Current PetaFLOPS breakdown: [chart not reproduced] ● Potential: ExaFLOPS today – 4M GPUs * 1 TFLOPS * 0.25 availability = 1 ExaFLOPS
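The ExaFLOPS estimate is a simple product, worked through below. The variable names are illustrative. The comparison against 10 PetaFLOPS uses the current total quoted on the previous slide.

```python
# Potential capacity = number of GPUs * speed per GPU * fraction of time available.
gpus = 4_000_000
flops_per_gpu = 1e12       # 1 TeraFLOPS
availability = 0.25        # fraction of time a GPU is available to volunteer work

potential_flops = gpus * flops_per_gpu * availability
current_flops = 10e15      # 10 PetaFLOPS, from the previous slide

print(f"potential: {potential_flops / 1e18:.1f} ExaFLOPS")
print(f"headroom:  {potential_flops / current_flops:.0f}x current capacity")
```

Under these inputs the estimate is linear in each factor, so halving availability or GPU count halves the potential.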
Science areas using BOINC ● Biology: protein study, genetic analysis ● Medicine: drug discovery, epidemiology ● Physics: LHC, nanotechnology, quantum computing ● Astronomy: LIGO, radio data analysis; cosmology; galactic modeling ● Environment: climate modeling, botanical ecosystem simulation ● Math
Climateprediction.net ● Oxford University ● Climate change prediction
Einstein@Home ● U of Wisconsin, Max Planck Inst. ● Gravitational waves; gravitational pulsars
SETI@home ● UC Berkeley ● SETI
MilkyWay@home ● RPI ● Structure of the Milky Way galaxy
GPUGRID.net ● Barcelona Biomed Inst. ● Protein structure and dynamics
AQUA@home ● D-Wave Systems ● Simulation of “adiabatic quantum algorithms” for binary quadratic optimization
Quake Catcher Network
BOINC software overview [Diagram: the volunteer host (client, apps, screensaver, GUI) communicates over HTTP with the project server (scheduler, MySQL, data server, daemons)]
Anonymous platform mechanism ● Volunteer supplies app versions. Reasons: – security – optimization – unsupported platforms
Account managers
Using virtual machines ● Application is VM wrapper + virtual machine image + executable [Diagram: BOINC client, VM wrapper, hypervisor (VirtualBox), VM]
Organizational issues ● Single-scientist projects: a dead end – Barriers to entry are too high – Wrong marketing model – Doesn’t handle sporadic requirements ● Umbrella projects – IBM World Community Grid – Campus-level – Science portals (‘Hubs’)
How to realize this? A better model: ScienceUSA.org
Conclusion ● For most scientific computing, volunteer computing is far cheaper than either clouds or clusters ● Volunteer computing involves the public in science ● What is the future? – Will mobile and semi-mobile devices replace desktops and laptops?