Volunteer Computing Involving the World in Science David P. Anderson Space Sciences Lab U.C. Berkeley 13 December 2007.

1 Volunteer Computing Involving the World in Science David P. Anderson Space Sciences Lab U.C. Berkeley 13 December 2007

2 Outline Science needs more computing power What is volunteer computing? How BOINC works Projects using BOINC Future directions

3 Simulation of physical systems Biology Climate study Cosmology

4 Data analysis Physics Astronomy

5 Genetic algorithms and other new computational paradigms

6 Where’s the computing power? Owned by individuals (~1 billion), owned by companies (~100M), owned by government (~50M) Goals of volunteer computing  give science access to maximal computer power  allocate resources based on merit, not money

7 A brief history of volunteer computing (timeline, 1995 to 2005)
Projects: distributed.net, GIMPS; SETI@home, Folding@home; Climateprediction.net; Einstein, Rosetta@home; IBM World Community Grid
Platforms: Popular Power, Entropia, United Devices, Parabon; BOINC

8 The BOINC volunteer/project model Volunteers: accounts, PCs Attachments: each PC attaches to projects with resource shares (e.g., 40% and 60%) Projects: IBM WCG, Climateprediction.net, SETI@home, Rosetta@home, Einstein@home, ...

9 The volunteer computing game Internet Projects Volunteers Do more science Involve public in science

10 Participation and computing power 500K active participants, 700K computers ~40 projects Computing power: about 2 PetaFLOPS  That’s about 8X an IBM Blue Gene/L ($300M)

11 Cost per TeraFLOPS-year Cluster (6.8 TeraFLOPS)  power and A/C: $750K  network hardware: $175K  computing hardware (780 nodes): $1000K  storage (300 TB RAID-6): $250K  power: $140K/year  sysadmin: $150K/year  total: $124K/year Amazon EC2: $1.75M/year Average BOINC project: $2K/year
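The cluster figure above can be roughly reproduced by spreading the one-time costs over a number of years, adding the yearly costs, and dividing by the cluster's 6.8 TeraFLOPS. The slide does not state the amortization period, so `amortization_years` below is an assumption, as are all the names; with four years the result lands near $123K, close to the slide's $124K.

```python
# Figures from the slide, in thousands of US dollars.
CAPITAL_K = {
    "power and A/C": 750,
    "network hardware": 175,
    "computing hardware (780 nodes)": 1000,
    "storage (300 TB RAID-6)": 250,
}
OPERATING_K_PER_YEAR = {"power": 140, "sysadmin": 150}
CLUSTER_TFLOPS = 6.8


def cost_per_tflops_year(amortization_years: float) -> float:
    """Cost in $K per TeraFLOPS-year.

    amortization_years is an assumption: the slide does not say over how
    many years the one-time costs are spread.
    """
    capital_per_year = sum(CAPITAL_K.values()) / amortization_years
    operating_per_year = sum(OPERATING_K_PER_YEAR.values())
    return (capital_per_year + operating_per_year) / CLUSTER_TFLOPS


if __name__ == "__main__":
    for years in (3, 4, 5):
        print(f"{years} years: ${cost_per_tflops_year(years):.1f}K per TeraFLOPS-year")
```
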

12 Volunteer computing ≠ Grid computing
Resource owners: volunteer = anonymous, unaccountable (need to check results); grid = identified, accountable
Managed systems?: volunteer = no (need plug & play software); grid = yes (software stack requirements OK)
Clients behind firewall?: volunteer = yes (pull model); grid = no (push model)
ISP bill?: volunteer = yes; grid = no
... nor is it “peer-to-peer computing”

13 The BOINC project Location: UC Berkeley Space Sciences Lab Personnel  director: David Anderson  employees: 1.5 programmers  lots of volunteers Funding  supported by NSF since 2002  current grant runs through Aug 2010

14 What the BOINC project does We develop software for volunteer computing We enable on-line communities What we don’t do: branding, hosting, authorizing, endorsing, controlling

15 BOINC software Distributed under LGPL license Server side  uses Linux, Apache, MySQL, PHP  Job distribution: C++, 20K lines  Web features: PHP, 30K lines Client side  uses wxWidgets, OpenGL  Client: C++, 30K lines  GUI: C++, 45K lines

16 Job replication Problem: can’t trust volunteers  computational result  claimed credit No replication, application-specific checks Replicated computing  do N copies, require that M of them agree  not bulletproof (collusion)
[Figure: timeline (time 0 to 14) of one job and its three instances. Instance 1: created, sent, success. Instance 2: created, sent, error. Instance 3: created, sent, success. Job: created, then validated and assimilated.]
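The "do N copies, require that M of them agree" rule can be sketched as follows. This is a minimal illustration, not BOINC's validator: the function name is hypothetical, outputs are compared for exact equality, and a failed instance is represented as `None`.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def find_agreed_result(
    instance_outputs: Sequence[Optional[Hashable]], min_quorum: int
) -> Optional[Hashable]:
    """Return the output that at least min_quorum instances agree on.

    Instances that errored are passed as None and ignored. Returns None
    when no output reaches the quorum, in which case a project would
    typically create and send more instances.
    """
    successes = [out for out in instance_outputs if out is not None]
    if not successes:
        return None
    value, count = Counter(successes).most_common(1)[0]
    return value if count >= min_quorum else None


if __name__ == "__main__":
    # Mirrors the slide's example: instance 2 fails, instances 1 and 3 agree.
    print(find_agreed_result(["abc", None, "abc"], min_quorum=2))
```

As the slide notes, this is not bulletproof: colluding volunteers who return the same wrong output would still reach the quorum.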

17 How to compare replicated results? Problem: numerical discrepancies Stable problems: fuzzy comparison Unstable problems  Eliminate discrepancies compiler/flags/libraries  Homogeneous replication send instances only to numerically equivalent hosts (equivalence may depend on app)
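Both strategies can be sketched briefly. For stable problems, two results are treated as equal when every value matches within a tolerance; for unstable problems, instances of a job go only to hosts in the same numerical equivalence class. The tolerances, the `Host` type, and the choice of OS plus CPU vendor as the class are illustrative assumptions, since the slide says equivalence may depend on the application.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


def results_agree(
    a: Sequence[float], b: Sequence[float],
    rel_tol: float = 1e-6, abs_tol: float = 1e-12,
) -> bool:
    """Fuzzy comparison: same length and element-wise equal within tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


@dataclass(frozen=True)
class Host:
    name: str
    os: str          # assumed part of the equivalence class
    cpu_vendor: str  # assumed part of the equivalence class


def hosts_for_replication(candidates: List[Host], first_host: Host) -> List[Host]:
    """Homogeneous replication: keep hosts in first_host's equivalence class."""
    return [
        h for h in candidates
        if (h.os, h.cpu_vendor) == (first_host.os, first_host.cpu_vendor)
    ]
```
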

18 Work flow work generator (creates stream or batches of jobs) → BOINC → validator (compares replicas, selects “correct” result) → assimilator (handles correct result)

19 BOINC server software High performance, scalability (10M jobs/day) Recovery from client errors and malfeasance Components (diagram): MySQL DB (accounts, jobs, etc.); scheduler; web site features; file upload/download (executables, input files, output files); work generator; transitioner; validator; assimilator; file deleter; DB purge; clients and volunteers

20 Ways to create a BOINC project Set up a server on a Linux box Use the BOINC virtual server (VMware) Use the BOINC VM for Amazon EC2  (in development) Apply to IBM World Community Grid

21 Volunteer’s view 1-click install, zero configuration All platforms Invisible, autonomic

22 BOINC client structure (diagram): core client; application with BOINC library and runtime system; GUI and screensaver (local TCP); schedulers, data servers; user preferences, control

23 Communication: “Pull” model
Client to scheduler: “I can run Win32 and Win64; 512 MB RAM; 20 GB free disk; 2.5 GFLOPS CPU (description of current work)”
Scheduler to client: “Here are three jobs. Job 1 has application files A,B,C, input files C,D,E and output file F...”
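The exchange above can be sketched as a client-initiated request that the scheduler answers with jobs the host is able to run. All class, field, and method names here are hypothetical and do not reflect BOINC's actual protocol or data structures; the point is only that the scheduler never contacts the client, it just replies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostDescription:
    platforms: List[str]
    ram_mb: int
    free_disk_gb: float
    gflops: float


@dataclass
class Job:
    name: str
    platform: str
    ram_mb: int
    disk_gb: float


class Scheduler:
    def __init__(self, jobs: List[Job]) -> None:
        self.queue = list(jobs)

    def handle_request(self, host: HostDescription, max_jobs: int = 3) -> List[Job]:
        """Reply to a client request with up to max_jobs jobs that fit the host."""
        granted: List[Job] = []
        remaining: List[Job] = []
        for job in self.queue:
            fits = (
                job.platform in host.platforms
                and job.ram_mb <= host.ram_mb
                and job.disk_gb <= host.free_disk_gb
            )
            if fits and len(granted) < max_jobs:
                granted.append(job)
            else:
                remaining.append(job)
        self.queue = remaining  # unsent jobs wait for the next request
        return granted
```

Because the client opens the connection, this works for hosts behind firewalls, which is the case the slide on volunteer versus grid computing calls out.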

24 The BOINC community Projects Other volunteer programmers Alpha testers Online Skype-based help Translators (web, client) Documentation (Wiki) Teams

25 Some BOINC projects Climateprediction.net  Oxford University  Global climate modeling Einstein@home  LIGO scientific collaboration  gravitational wave detection SETI@home  U.C. Berkeley  Radio search for E.T.I. and black hole evaporation Leiden Classical  Leiden University  Surface chemistry using classical dynamics

26 More projects LHC@home  CERN  simulator of LHC, collisions QMC@home  Univ. of Muenster  Quantum chemistry Spinhenge@home  Bielefeld Univ.  Study nanoscale magnetism ABC@home  Leiden Univ.  Number theory

27 Biomed-related BOINC projects Rosetta@home  University of Washington  Rosetta: Protein folding, docking, and design Tanpaku  Tokyo Univ. of Science  Protein structure prediction using Brownian dynamics MalariaControl  The Swiss Tropical Institute  Epidemiological simulation

28 More projects Predictor@home  Scripps Institute  CHARMM, protein structure prediction SIMAP  Tech. Univ. of Munich  Protein similarity matrix Superlink@Technion  Technion  Genetic linkage analysis using Bayesian networks

29 More projects (IBM WCG) Dengue fever drug discovery  U. of Texas, U. of Chicago  Autodock Human Proteome Folding  New York University  Rosetta FightAIDS@home  Scripps Institute  Autodock

30 Future work How to get more volunteers?  media  bundling  social networks How to get more projects? How to use future hardware?  multicore CPUs  GPUs  video game consoles (e.g., PS3/Cell)  set-top boxes  mobile devices

31 Berkeley@home Campus-level “meta-project” Applications  6 pilot apps: climate, fluid dynamics, nanotechnology, genetics, Volunteers  1,000 instructional PCs  5,000 faculty/staff  30,000 students  400,000 alumni  general public NSF proposal submitted

32 Citizen Cyber-Science Distributed thinking  Stardust@home, Clickworkers, GalaxyZoo  Rosetta@play: protein-folding game New software initiatives: Bolt and Bossa

33 Conclusion Volunteer computing: a new paradigm  Distinct research problems, software requirements  Computing power: more, cheaper, democratic allocation  Social impact Contact me about:  Using BOINC  Research based on BOINC davea@ssl.berkeley.edu

