Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley 14 Sept 2007.


1 Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley 14 Sept 2007

2 Outline ● Goals of volunteer computing ● How BOINC works ● Projects using BOINC ● Future directions

3 Goal: Use all the computers in the world to do something worthwhile ● What do we mean by “computers”? ● Who owns the computers? – Individuals (60% and rising) – Organizations ● What does “worthwhile” mean?

4 BOINC (Berkeley Open Infrastructure for Network Computing) ● Middleware for volunteer computing ● Open-source (LGPL) ● Application-driven ● Diagram labels: PC; Projects; Accounts; Attachments with resource share (60%, 40%)

5 The volunteer computing game Internet Projects Volunteers ● Do more science ● Involve public in science

6 Computing power ● Folding@home: – 650 TeraFLOPS ● 200 from PCs; 50 from GPUs; 400 from PS3 ● BOINC-based projects:

7 Cost per TeraFLOPS-year ● Cluster (6.8 TeraFLOPS) – power and A/C: $750K – network hardware: $175K – computing hardware (780 nodes): $1000K – storage (300 TB RAID-6): $250K – power: $140K/year – sysadmin: $150K/year – total: $124K ● Amazon EC2: $1.75M ● Average BOINC project: $2K
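The slide does not say how the cluster total is derived. As a rough check, the sketch below assumes the $K figures without "/year" are one-time capital costs, amortized in a straight line over a hypothetical lifetime (the `lifetime_years` parameter is an assumption, not something the slide states), and divides the yearly cost by the cluster's 6.8 TeraFLOPS.

```python
# Figures from the slide, in thousands of dollars.
capital_k = {
    "power and A/C": 750,
    "network hardware": 175,
    "computing hardware": 1000,
    "storage": 250,
}
annual_k = {"power": 140, "sysadmin": 150}
CLUSTER_TERAFLOPS = 6.8


def cost_per_teraflops_year(lifetime_years):
    """Yearly cost per TeraFLOPS, assuming straight-line amortization.

    lifetime_years is a hypothetical parameter; the slide gives no value.
    """
    yearly_k = sum(capital_k.values()) / lifetime_years + sum(annual_k.values())
    return yearly_k / CLUSTER_TERAFLOPS


# With an assumed 4-year lifetime the result lands near the slide's total.
print(round(cost_per_teraflops_year(4), 1))
```

Under that assumed 4-year lifetime the figure comes out a little under $123K per TeraFLOPS-year, close to the $124K on the slide; the small gap suggests the original calculation used slightly different assumptions.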

8 Volunteer computing <> Grid computing ● Resource owners – Volunteer: anonymous, unaccountable; need to check results – Grid: identified, accountable ● Managed systems? – Volunteer: no; need plug & play software – Grid: yes; software stack requirements OK ● Clients behind firewall? – Volunteer: yes; pull model – Grid: no; push model ● ISP bill? – Volunteer: yes – Grid: no ● ... nor is it “peer-to-peer computing”

9 How BOINC works: server DB ● Platforms ● Applications ● App versions ● Jobs ● Job instances ● Accounts ● Hosts
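The slide only names the database entities. The sketch below is one plausible reading of how they relate; the field names and links (an app version belongs to an application and a platform, a job instance belongs to a job and is sent to a host, a host belongs to an account) are assumptions for illustration, not the actual BOINC schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Platform:
    name: str  # e.g. "Win32"


@dataclass
class Application:
    name: str


@dataclass
class AppVersion:
    app: Application
    platform: Platform
    version: int


@dataclass
class Account:
    email: str


@dataclass
class Host:
    owner: Account
    platform: Platform


@dataclass
class Job:
    app: Application
    name: str


@dataclass
class JobInstance:
    job: Job
    host: Optional[Host] = None  # unset until the instance is sent
    state: str = "created"       # created -> sent -> success / error


def versions_for(host, versions):
    """App versions that can run on the host's platform."""
    return [v for v in versions if v.platform == host.platform]


win32 = Platform("Win32")
linux = Platform("Linux")
app = Application("example_app")
versions = [AppVersion(app, win32, 1), AppVersion(app, linux, 1)]
host = Host(Account("volunteer@example.org"), win32)
usable = versions_for(host, versions)
```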

10 Job replication ● Problem: can’t trust volunteers – computational result – claimed credit ● No replication, application-specific checks ● Replicated computing – do N copies, require that M of them agree – not bulletproof (collusion) ● Timeline diagram (time 0 to 14) – Job: created; later validate, assimilate – Instance 1: created, sent, success – Instance 2: created, sent, error – Instance 3: created, sent, success – Instance 4: created, sent, success
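The "do N copies, require that M of them agree" rule can be sketched as a simple quorum check. This is a minimal illustration, not BOINC's validator: the function name, the `(status, output)` tuples, and the use of exact equality between outputs are all assumptions.

```python
from collections import Counter


def find_canonical(instances, min_quorum):
    """Return the output that at least min_quorum successful instances
    agree on, or None if no output reaches the quorum.

    instances: list of (status, output) pairs, status "success" or "error".
    """
    outputs = [out for status, out in instances if status == "success"]
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= min_quorum else None


# Mirrors the slide's timeline: instance 2 errors out, the others succeed.
instances = [
    ("success", "a1f3"),
    ("error", None),
    ("success", "a1f3"),
    ("success", "a1f3"),
]
canonical = find_canonical(instances, min_quorum=2)
```

As the slide notes, a quorum is not bulletproof: colluding hosts that return the same wrong output would pass this check.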

11 How to compare results? ● Problem: numerical discrepancies ● Stable problems: fuzzy comparison ● Unstable problems – Eliminate discrepancies ● compiler/flags/libraries – Homogeneous replication ● send instances only to numerically equivalent hosts (equivalence may depend on app)
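For numerically stable problems, a fuzzy comparison might look like the sketch below. It assumes each result is a list of floats and that the tolerances are chosen per application; both the function name and the default tolerances are illustrative, not taken from the talk.

```python
import math


def results_agree(a, b, rel_tol=1e-6, abs_tol=1e-9):
    """Treat two results as equivalent if every pair of values is close.

    rel_tol and abs_tol are hypothetical defaults; a real project would
    pick them from the numerical behaviour of its own application.
    """
    if len(a) != len(b):
        return False
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


same = results_agree([1.0, 2.0], [1.0 + 1e-9, 2.0])
different = results_agree([1.0], [1.1])
```

For unstable problems this approach fails, because small discrepancies grow into large ones; that is the case the slide addresses with homogeneous replication.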

12 Work flow ● work generator (creates stream or batches of jobs) ● BOINC ● validator (compares replicas, selects “correct” result) ● assimilator (handles correct result)

13 Ways to create a BOINC project ● Set up a server manually ● Use the BOINC virtual server ● Use the BOINC VM for Amazon EC2 – (in development) ● Apply to IBM World Community Grid

14 Volunteer’s view ● 1-click install, zero configuration ● All platforms ● Invisible, autonomic

15 BOINC client structure ● Diagram labels: core client; application; BOINC library; runtime system; GUI; screensaver; local TCP; user preferences, control; schedulers, data servers

16 Communication: “Pull” model ● client to scheduler: I can run Win32 and Win64; 512 MB RAM; 20 GB free disk; 2.5 GFLOPS CPU (description of current work) ● scheduler to client: Here are three jobs. Job 1 has application files A,B,C, input files C,D,E and output file F...
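The exchange above can be sketched as a request/reply pair in which the client always initiates contact, which is what lets it work from behind a firewall. The class names, fields, and matching rule below are assumptions for illustration and do not reflect BOINC's actual scheduler protocol.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    platform: str
    ram_mb: int
    disk_gb: float


@dataclass
class SchedulerRequest:
    platforms: list      # platforms the client can run
    ram_mb: int
    free_disk_gb: float
    gflops: float        # reported, but not used by this simple matcher
    max_jobs: int = 3


def handle_request(request, queue):
    """Pick queued jobs that fit the host described in the request."""
    sent = []
    for job in list(queue):  # iterate over a copy so we can remove safely
        if len(sent) >= request.max_jobs:
            break
        fits = (
            job.platform in request.platforms
            and job.ram_mb <= request.ram_mb
            and job.disk_gb <= request.free_disk_gb
        )
        if fits:
            queue.remove(job)
            sent.append(job)
    return sent


queue = [
    Job("job1", "Win32", 256, 1.0),
    Job("job2", "Linux", 256, 1.0),    # wrong platform
    Job("job3", "Win64", 1024, 1.0),   # needs more RAM than the host has
    Job("job4", "Win64", 512, 5.0),
]
request = SchedulerRequest(["Win32", "Win64"], 512, 20.0, 2.5)
reply = handle_request(request, queue)
```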

17 Some BOINC projects ● Climateprediction.net – Oxford University – Global climate modeling ● Einstein@home – LIGO scientific collaboration – gravitational wave detection ● SETI@home – U.C. Berkeley – Radio search for E.T.I. and black hole evaporation ● Leiden Classical – Leiden University – Surface chemistry using classical dynamics

18 More projects ● LHC@home – CERN – simulator of LHC, collisions ● QMC@home – Univ. of Muenster – Quantum chemistry ● Spinhenge@home – Bielefeld Univ. – Study nanoscale magnetism ● ABC@home – Leiden Univ. – Number theory

19 Biomed-related BOINC projects ● Rosetta@home – University of Washington – Rosetta: Protein folding, docking, and design – 90,000 hosts, 37 TeraFLOPS ● Tanpaku – Tokyo Univ. of Science – Protein structure prediction using Brownian dynamics ● MalariaControl – The Swiss Tropical Institute – Epidemiological simulation

20 More projects ● Predictor@home – Scripps Institute – CHARMM, protein structure prediction ● SIMAP – Tech. Univ. of Munich – Protein similarity matrix ● Superlink@Technion – Technion – Genetic linkage analysis using Bayesian networks

21 More projects (IBM WCG) ● Dengue fever drug discovery – U. of Texas, U. of Chicago – Autodock ● Human Proteome Folding – New York University – Rosetta ● FightAIDS@home – Scripps Institute – Autodock

22 Berkeley@home ● Campus-level “meta-project” ● Applications – 6 pilot apps: climate, fluid dynamics, nanotechnology, genetics, ● Volunteers – 1,000 instructional PCs – 5,000 faculty/staff – 30,000 students – 400,000 alumni – general public ● NSF proposal submitted

23 Multi-threading support ● What’s in a $1000 PC? – 2007: dual-core CPU, 4 GFLOPS, 1 GB RAM – 2010: 80-core CPU, 100 GFLOPS, 8 GB RAM – Volunteer computing provides a use for all those cores, but you may run out of RAM ● BOINC support for multi-thread apps – core client to app: “Try to use N cores” – app to core client: “OK, I’m using M cores” ● Languages/libraries for parallel programming – OpenMP – Titanium, Cilk, RapidMind, PeakStream...
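One way an application might answer the core client's "try to use N cores" is to cap the thread count by available memory, since the slide warns that RAM may run out before cores do. The function and its parameters are hypothetical; the talk does not describe how M is chosen.

```python
def choose_thread_count(suggested_cores, free_ram_mb, ram_per_thread_mb):
    """Return M, the number of cores the app will actually use.

    Assumes each thread needs a fixed amount of RAM (ram_per_thread_mb),
    which is a simplification for illustration.
    """
    limit_by_ram = free_ram_mb // ram_per_thread_mb
    # Always report at least one thread, even if RAM is very tight.
    return max(1, min(suggested_cores, limit_by_ram))


# The slide's projected 2010 PC: 80 cores and 8 GB RAM.
# With an assumed 512 MB per thread, memory is the limiting factor.
m = choose_thread_count(suggested_cores=80, free_ram_mb=8192, ram_per_thread_mb=512)
```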

24 Distributed thinking ● Web-based vision tasks – Stardust@home, Clickworkers, galaxy classification ● Amazon “Mechanical Turk” ● Validation ● Formulation as multi-person game – Luis von Ahn: image tagging ● Motivational axes: competition, community

25 Rosetta@home plan ● Protein structure prediction – low-res: combinatorial, spatial, intuitive; humans do better than computers – high-res: computers do better ● Interactive “protein manipulation” program ● Teams as management structures – tasks are given to (possibly multiple) teams – managers organize and schedule sub-groups with particular skills or resources – communication paths between sub-groups

26 Berkeley Open Learning Technology (BOLT) ● DB-driven CMS and analytics engine for web-based teaching ● Diagram labels: content (lessons, exercises); course structure (XML); teaching engine (PHP); sequencing, navigation; student info, interaction DB; Students; analytical tools; Educators

27 Integration ● Accounts, teams and groups ● Communication ● Credit and competition ● BOINC: hosts, applications, jobs ● BOLT: lessons, courses ● BOSSA: tasks

28 Conclusion ● Volunteer computing: a new paradigm – Distinct research problems, software requirements – Computing power ● More ● Cheaper ● Democratic allocation – Social impact ● Contact me about: – Using BOINC – Research based on BOINC davea@ssl.berkeley.edu

