Volunteer Computing in Biology David P. Anderson Space Sciences Lab U.C. Berkeley 10 Sept 2007.


Outline ● Goals of volunteer computing ● How BOINC works ● Some biology projects using BOINC ● Some new directions

Goal: Use all the computers in the world to do worthwhile things ● What do we mean by “computers”? ● Who owns the computers? – Individuals (60% and rising) – Organizations ● What does “worthwhile” mean?

BOINC (Berkeley Open Infrastructure for Network Computing) ● Middleware for volunteer computing ● Open-source (LGPL) ● Application-driven [Diagram: a PC has accounts on projects; each attachment carries a resource share, e.g. 60% and 40%]
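The 60%/40% resource shares in the diagram can be read as the fraction of a PC's computing time given to each attached project. A minimal sketch, assuming shares are positive weights that are normalized to fractions; the function name and project names are hypothetical, and this is not BOINC's actual scheduling code:

```python
def split_by_resource_share(shares, cpu_hours):
    """Divide cpu_hours among projects in proportion to their resource shares.

    shares: dict mapping a (hypothetical) project name to a positive weight.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {project: cpu_hours * share / total for project, share in shares.items()}


# Two attachments with the 60% / 40% split from the slide, over 10 CPU-hours.
alloc = split_by_resource_share({"project_a": 60, "project_b": 40}, 10.0)
print(alloc)
```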

The volunteer computing game ● Do more science ● Involve public in science [Diagram labels: Projects, Internet, Volunteers]

Computing power ● – 650 TeraFLOPS ● 200 from PCs; 50 from GPUs; 400 from PS3 ● BOINC-based projects:

Cost per TeraFLOPS-year ● Cluster (6.8 TeraFLOPS) – power and A/C: $750K – network hardware: $175K – computing hardware (780 nodes): $1000K – storage (300 TB RAID-6): $250K – power: $140K/year – sysadmin: $150K/year – total: $124K ● Amazon EC2: $1.75M ● Average BOINC project: $1.25K
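The slide does not say how the cluster total was derived. As a rough check, the sketch below spreads the one-time costs over an amortization period and adds the yearly costs, then divides by the cluster's 6.8 TeraFLOPS. The 4-year period is an assumption made here, not a figure from the talk; with it the result lands near the slide's $124K.

```python
# One-time and yearly cluster costs from the slide, in thousands of dollars.
CAPITAL_K = {
    "power and A/C": 750,
    "network hardware": 175,
    "computing hardware": 1000,
    "storage": 250,
}
ANNUAL_K = {"power": 140, "sysadmin": 150}
CLUSTER_TFLOPS = 6.8


def cluster_cost_per_tflops_year(amortization_years):
    """Cost per TeraFLOPS-year in $K, spreading capital cost over the given period."""
    capital_per_year = sum(CAPITAL_K.values()) / amortization_years
    return (capital_per_year + sum(ANNUAL_K.values())) / CLUSTER_TFLOPS


# ASSUMPTION: 4-year amortization (not stated on the slide).
cluster_k = cluster_cost_per_tflops_year(4)

# Ratios against the average BOINC project ($1.25K per TeraFLOPS-year).
ec2_vs_boinc = 1750 / 1.25
cluster_vs_boinc = 124 / 1.25
print(round(cluster_k), ec2_vs_boinc, cluster_vs_boinc)
```

On the slide's own numbers, Amazon EC2 costs about 1400 times as much per TeraFLOPS-year as the average BOINC project, and the cluster about 100 times as much.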

Volunteer computing <> Grid computing
● Resource owners – volunteer: anonymous, unaccountable; need to check results – grid: identified, accountable
● Managed systems? – volunteer: no (need plug & play software) – grid: yes (software stack requirements OK)
● Clients behind firewall? – volunteer: yes (pull model) – grid: no (push model)
● ISP bill? – volunteer: yes – grid: no
... nor is it “peer-to-peer computing”

How BOINC works: server DB [Diagram of database tables: Platforms, Applications, App versions, Jobs, Job instances, Accounts, Hosts]

Job replication ● Problem: can’t trust volunteers – computational result – claimed credit ● No replication, application-specific checks ● Replicated computing – do N copies, require that M of them agree – not bulletproof (collusion) [Timeline diagram: the job is created, then validated and assimilated; Instances 1, 3 and 4 are created, sent, and return success; Instance 2 is created, sent, and returns an error]
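The "do N copies, require that M of them agree" rule can be sketched as a simple quorum check. This is an illustration only, assuming outputs can be compared for exact equality; the function name, the quorum value and the output strings are hypothetical, and BOINC's real validator is application-specific.

```python
from collections import Counter


def find_canonical(outputs, quorum):
    """Return the output reported by at least `quorum` instances, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= quorum else None


# Mirrors the timeline: instances 1, 3 and 4 succeed, instance 2 errors out.
instances = [
    ("success", "0x1f"),
    ("error", None),
    ("success", "0x1f"),
    ("success", "0x1f"),
]
# Only successful instances take part in the comparison.
outputs = [out for status, out in instances if status == "success"]
canonical = find_canonical(outputs, quorum=2)  # quorum of 2 is an assumed M
print(canonical)
```

As the slide notes, a check like this is not bulletproof: colluding volunteers who return the same wrong output would still reach the quorum.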

How to compare results? ● Problem: numerical discrepancies ● Stable problems: fuzzy comparison ● Unstable problems – Eliminate discrepancies ● compiler/flags/libraries – Homogeneous replication ● send instances only to numerically equivalent hosts (equivalence may depend on app)
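Both strategies can be sketched briefly. The first function is a fuzzy comparison for numerically stable problems; the second groups hosts into equivalence classes so that all instances of a job go to hosts that should produce identical output. The tolerance and the host fields used as the class key (`os`, `cpu_vendor`) are assumptions chosen for illustration; the slide only says that equivalence may depend on the app.

```python
import math
from collections import defaultdict


def fuzzy_equal(a, b, rel_tol=1e-6):
    """Treat two result vectors as equal if every element agrees within rel_tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def group_by_equivalence_class(hosts):
    """Group hosts by a hypothetical numerical-equivalence key (OS, CPU vendor)."""
    classes = defaultdict(list)
    for host in hosts:
        classes[(host["os"], host["cpu_vendor"])].append(host["id"])
    return dict(classes)


hosts = [
    {"id": 1, "os": "windows", "cpu_vendor": "intel"},
    {"id": 2, "os": "linux", "cpu_vendor": "amd"},
    {"id": 3, "os": "windows", "cpu_vendor": "intel"},
]
classes = group_by_equivalence_class(hosts)
print(fuzzy_equal([1.0, 2.0], [1.0000001, 2.0]), classes)
```

With homogeneous replication, all instances of one job would be sent only to hosts from a single class, so exact comparison becomes possible even for unstable problems.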

Work flow ● work generator (creates stream or batches of jobs) ● validator (compares replicas, selects “correct” result) ● assimilator (handles correct result) [Diagram: the three components are attached to BOINC]
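The three project-supplied components can be pictured as a small pipeline. The sketch below is a toy simulation, not BOINC's API: every name is hypothetical, volunteers are simulated by a local function, and the validator uses a plain majority rule.

```python
from collections import Counter


def work_generator(n_jobs):
    """Create a stream of jobs (here: square a number)."""
    for job_id in range(n_jobs):
        yield {"id": job_id, "input": job_id}


def validator(replica_outputs, quorum=2):
    """Compare replicas and select the "correct" result, or None if no quorum."""
    value, count = Counter(replica_outputs).most_common(1)[0]
    return value if count >= quorum else None


def assimilator(store, job, result):
    """Handle the correct result (here: record it in a dict)."""
    store[job["id"]] = result


def fake_volunteer(job, faulty=False):
    """Stand-in for a volunteer host; a faulty host returns garbage."""
    return -1 if faulty else job["input"] ** 2


store = {}
for job in work_generator(3):
    # Three replicas per job; one replica of job 1 is deliberately wrong.
    outputs = [
        fake_volunteer(job),
        fake_volunteer(job, faulty=(job["id"] == 1)),
        fake_volunteer(job),
    ]
    result = validator(outputs)
    if result is not None:
        assimilator(store, job, result)
print(store)
```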

Ways to create a BOINC project ● Set up a server manually ● Use the BOINC virtual server ● Use the BOINC VM for Amazon EC2 – (in development) ● Apply to IBM World Community Grid

Volunteer’s view ● 1-click install, zero configuration ● All platforms ● Invisible, autonomic

BOINC client structure [Diagram labels: core client; application; BOINC library; runtime system; GUI; screensaver; local TCP; user preferences, control; schedulers, data servers]

Communication: “Pull” model ● Client to scheduler: “I can run Win32 and Win MB RAM 20GB free disk 2.5 GFLOPS CPU (description of current work)” ● Scheduler to client: “Here are three jobs. Job 1 has application files A,B,C, input files C,D,E and output file F...”
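In the pull model the client always initiates the exchange: it describes itself, and the scheduler answers with jobs that fit. A minimal sketch of the scheduler side, assuming a request carries a platform list and free disk space; the field names, job list and matching rule are hypothetical and far simpler than a real scheduler.

```python
def scheduler_reply(request, job_queue, max_jobs=3):
    """Pick up to max_jobs queued jobs that the requesting host can run."""
    sent = []
    for job in job_queue:
        if len(sent) == max_jobs:
            break
        fits_platform = job["platform"] in request["platforms"]
        fits_disk = job["disk_gb"] <= request["free_disk_gb"]
        if fits_platform and fits_disk:
            sent.append(job["id"])
    return sent


# Client-initiated request, loosely modelled on the slide's example.
request = {"platforms": ["win32"], "free_disk_gb": 20, "gflops": 2.5}
job_queue = [
    {"id": 1, "platform": "win32", "disk_gb": 1},
    {"id": 2, "platform": "linux", "disk_gb": 1},
    {"id": 3, "platform": "win32", "disk_gb": 50},
    {"id": 4, "platform": "win32", "disk_gb": 2},
    {"id": 5, "platform": "win32", "disk_gb": 2},
]
reply = scheduler_reply(request, job_queue)
print(reply)
```

Because the server never opens a connection to the client, this works for hosts behind firewalls, which is the point of the pull model.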

Biomed-related BOINC projects ● – University of Washington – Rosetta: Protein folding, docking, and design – 90,000 hosts, 37 TeraFLOPS ● Tanpaku – Tokyo Univ. of Science – Protein structure prediction using Brownian dynamics ● MalariaControl – The Swiss Tropical Institute – Epidemiological simulation

More projects ● – Scripps Institute – CHARMM, protein structure prediction ● SIMAP – Tech. Univ. of Munich – Protein similarity matrix ● – Technion – Genetic linkage analysis using Bayesian networks

More projects (IBM WCG) ● Dengue fever drug discovery – U. of Texas, U. of Chicago – Autodock ● Human Proteome Folding – New York University – Rosetta ● – Scripps Institute – Autodock

● Campus-level “meta-project” ● Applications – 6 pilot apps: climate, fluid dynamics, nanotechnology, genetics, ● Volunteers – 1,000 instructional PCs – 5,000 faculty/staff – 30,000 students – 400,000 alumni – general public ● NSF proposal submitted

plan ● Protein structure prediction – low-res: combinatorial, spatial, intuitive; humans do better than computers – high-res: computers do better ● Interactive “protein manipulation” program ● Teams as management structures – tasks are given to (possibly multiple) teams – managers organize and schedule sub-groups with particular skills or resources – communication paths between sub-groups

Multi-threading support ● What’s in a $1000 PC? – 2007: dual-core CPU, 4 GFLOPS, 1 GB RAM – 2010: 80-core CPU, 100 GFLOPS, 8 GB RAM – Volunteer computing provides a use for all those cores, but you may run out of RAM ● BOINC support for multi-thread apps ● Languages/libraries for parallel programming – OpenMP – Titanium, Cilk, RapidMind, PeakStream... [Diagram: core client to app: “Try to use N cores”; app to core client: “OK, I’m using M cores”]
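The exchange in the diagram ("Try to use N cores" / "OK, I’m using M cores") together with the RAM caveat suggests a simple rule for how an app might choose M. This is a sketch under assumptions: the function name and the per-thread memory figure are hypothetical, and the slide does not describe the actual negotiation logic.

```python
def cores_to_use(offered_cores, free_ram_mb, ram_per_thread_mb):
    """Reply to "try to use N cores": use as many as the offer and RAM both allow."""
    by_ram = free_ram_mb // ram_per_thread_mb
    return max(0, min(offered_cores, by_ram))


# The slide's projected 2010 PC: 80 cores, 8 GB RAM.
# ASSUMPTION: each thread of the app needs 256 MB.
m = cores_to_use(offered_cores=80, free_ram_mb=8 * 1024, ram_per_thread_mb=256)
print(m)
```

Under these assumed numbers the app is limited by RAM rather than by the core count, which is the situation the slide warns about.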

Skill aggregation (human computing) ● Web-based vision tasks – Clickworkers, galaxy classification ● Amazon “Mechanical Turk” ● Validation ● Formulation as multi-person game – Luis von Ahn: image tagging ● Motivational axes: competition, community

Berkeley Open Learning Technology (BOLT) ● DB-driven CMS and analytics engine for web-based teaching [Diagram labels: content (lessons, exercises); course structure (XML); teaching engine (PHP); sequencing, navigation; student info, interaction DB; students; analytical tools; educators]

DB and web integration ● Shared: accounts, teams and groups; communication; credit and competition ● BOINC: hosts, applications, jobs ● BOLT: lessons, courses ● BOSSA: tasks

Conclusion ● Volunteer computing: a new paradigm – distinct research problems, software requirements – big accomplishments, potential ● Social impacts ● Contact me about: – Using BOINC – Research based on BOINC