Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley May 2, 2007.

Slides:

Advertisements

Similar presentations

BOINC: A System for Public-Resource Computing and Storage David P. Anderson University of California, Berkeley.

Advertisements

BOINC Berkeley Open Infrastructure for Network Computing An open-source middleware system for volunteer and grid computing (much of the images and text.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Volunteer Computing.

High-Performance Task Distribution for Volunteer Computing Rom Walton

BOINC The Year in Review David P. Anderson Space Sciences Laboratory U.C. Berkeley 22 Oct 2009.

Scientific Computing on Smartphones David P. Anderson Space Sciences Lab University of California, Berkeley April 17, 2014.

Volunteer Computing and Hubs David P. Anderson Space Sciences Lab University of California, Berkeley HUBbub September 26, 2013.

Public-resource computing for CEPC Simulation Wenxiao Kan Computing Center/Institute of High Physics Energy Chinese Academic of Science CEPC2014 Scientific.

1 port BOSS on Wenjing Wu (IHEP-CC)

Achievements and Opportunities in Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley 18 April 2008.

A Guided Tour of BOINC David P. Anderson Space Sciences Lab University of California, Berkeley TACC November 8, 2013.

Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley May 7, 2008.

Volunteer Computing with BOINC David P. Anderson Space Sciences Laboratory University of California, Berkeley.

Scientific Computing in the Consumer Digital Infrastructure David P. Anderson Space Sciences Lab University of California, Berkeley The Austin Forum November.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Designing Middleware for Volunteer Computing.

Exa-Scale Volunteer Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.

Introduction to the BOINC software David P. Anderson Space Sciences Laboratory University of California, Berkeley.

David Cameron Riccardo Bianchi Claire Adam Bourdarios Andrej Filipcic Eric Lançon Efrat Tal Hod Wenjing Wu on behalf of the ATLAS Collaboration CHEP 15,

Lessons Learned from David P. Anderson Director, Spaces Sciences Laboratory U.C. Berkeley April 2, 2002.

David P. Anderson Space Sciences Lab U.C. Berkeley Exa-Scale Volunteer Computing.

Volunteer Computing with GPUs David P. Anderson Space Sciences Laboratory U.C. Berkeley.

and Citizen Cyber-Science David P. Anderson Space Sciences Laboratory U.C. Berkeley.

BOINC: Progress and Plans David P. Anderson Space Sciences Lab University of California, Berkeley BOINC:FAST August 2013.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Designing Middleware for Volunteer Computing.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Public and Grid Computing.

TEMPLATE DESIGN © BOINC: Middleware for Volunteer Computing David P. Anderson Space Sciences Laboratory University of.

Dr Jukka Klem CHEP06 1 Public Resource Computing at CERN – Philippe Defert, Markku Degerholm, Francois Grey, Jukka Klem, Juan Antonio.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Public Distributed Computing with BOINC.

BOINC: An Open Platform for Public-Resource Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.

Celebrating Diversity in Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley Sept. 1, 2008.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Public Distributed Computing with BOINC.

CernVM and Volunteer Computing Ivan D Reid Brunel University London Laurence Field CERN.

Exa-Scale Volunteer Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.

Volunteer Computing Involving the World in Science David P. Anderson Space Sciences Lab U.C. Berkeley 13 December 2007.

Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley 14 Sept 2007.

Volunteer Computing: SETI and Beyond David P. Anderson University of California, Berkeley 7 June 2007.

Volunteer Computing and BOINC Dr. David P. Anderson University of California, Berkeley Dec 3, 2010.

Frontiers of Volunteer Computing David Anderson Space Sciences Lab UC Berkeley 30 Dec

The Future of Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab UH CS Dept. March 22, 2007.

Volunteer Computing in Biology David P. Anderson Space Sciences Lab U.C. Berkeley 10 Sept 2007.

Emulating Volunteer Computing Scheduling Policies Dr. David P. Anderson University of California, Berkeley May 20, 2011.

David P. Anderson Space Sciences Laboratory University of California – Berkeley A Million Years of Computing.

Volunteer Computing: Involving the World in Science David P. Anderson U.C. Berkeley Space Sciences Lab February 16, 2007.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Supercomputing with Personal Computers.

The Limits of Volunteer Computing Dr. David P. Anderson University of California, Berkeley March 20, 2011.

All the computers in the world (~1 billion) BOINC: high-level goal Computational science biology, medicine Earth sciences, physics, astronomy, math, A.I.,...

Volunteer Computing Involving the World in Science David P. Anderson Space Sciences Lab U.C. Berkeley 13 December 2007.

David P. Anderson UC Berkeley Gilles Fedak INRIA The Computational and Storage Potential of Volunteer Computing.

Volunteer Computing and Large-Scale Simulation David P. Anderson U.C. Berkeley Space Sciences Lab February 3, 2007.

Local Scheduling for Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab John McLeod VII Sybase March 30, 2007.

Using volunteered resources for data-intensive computing and storage David Anderson Space Sciences Lab UC Berkeley 10 April 2012.

Technology for Citizen Cyberscience Dr. David P. Anderson University of California, Berkeley May 2011.

Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab Nov. 15, 2006.

Volunteer Computing with BOINC: a Tutorial David P. Anderson Space Sciences Laboratory University of California – Berkeley May 16, 2006.

Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab January 30, 2007.

An Overview of Volunteer Computing

Volunteer Computing and BOINC

University of California, Berkeley

Building a Global Brain David P. Anderson U. C

Volunteer computing PC owners donate idle cycles to science projects

Volunteer Computing: Planting the Flag David P

Volunteer Computing: SETI and Beyond David P

Volunteer Computing for Science Gateways

Designing a Runtime System for Volunteer Computing David P

Exa-Scale Volunteer Computing

David P. Anderson Space Sciences Lab UC Berkeley LASER

The Global Status of Citizen Cyberscience

University of California, Berkeley

Exploring Multi-Core on

Presentation transcript:

Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley May 2, 2007

Outline Goals of volunteer computing How BOINC works Some volunteer computing projects Some research directions Non-technical problems

Goal: Use all the computers in the world, all the time, to do worthwhile things What do we mean by “computers”? Who owns the computers?  Individuals (60% and rising)  Organizations What does “worthwhile” mean?

BOINC (Berkeley Open Infrastructure for Network Computing) Middleware for volunteer computing Open-source (LGPL) Application-driven PC Projects Accounts Attachments with resource share 60% 40%

The volunteer computing game Internet Projects Volunteers Do more science Involve public in science

Volunteer computing != Grid computing Resource owners Managed systems? Clients behind firewall? anonymous, unaccountable; need to check results no – need plug & play software yes – pull model yes – software stack requirements OK no – push model identified, accountable ISP bill? yes no... nor is it “peer-to-peer computing”

Types of utility Compute cycles (usually FLOPS)  with latency bound  with RAM  with storage  guaranteed availability Storage  network bandwidth  availability Network bandwidth at specific time Wide/diverse deployment Human steering of computation

What is resource share? Given: set of hosts, attachments “Schedule”: what projects use resources when Ideally:  Weighted Pareto optimality of utility?  Across multiple hosts In BOINC  determines usage of bottleneck resources on a single host, e.g.: Resource shareDemand Usage A40%15% 15% B40%large 50% C20%large 25%

Credit: approximation of utility Currently  projects can grant however they want  default: FLOPS benchmarks * CPU time application provides FLOP count application does FP benchmark  cheat-proofing To do:  cheat-proof measurements of other resources  projects publish “credit per day” functions Normalization rule Accounting rule

How BOINC works: server DB Platforms Applications Jobs Job instances Accounts App versions Hosts

Job replication Problem: can’t trust volunteers  computational result  claimed credit Application-specific checks, no replication Replicated computing  do N copies, require that M of them agree  not bulletproof (collusion) time created validate; assimilate Job x x x created sent success Instance 1 x x x created sent error Instance 2 x x x created sent success Instance 3 x x x created sent success Instance 4 x x x

How to compare results? Problem: numerical discrepancies Stable problems: fuzzy comparison Unstable problems  Eliminate discrepancies compiler/flags/libraries  Homogeneous replication send instances only to numerically equivalent hosts (equivalence may depend on app)

Work flow work generator (creates stream or batches of jobs) assimilator (handles correct result) validator (compares replicas, selects “correct” result) BOINC

Volunteer’s view 1-click install, zero configuration All platforms Invisible, autonomic

BOINC client structure core client application BOINC library GUI screensave r local TCP schedulers, data servers Runtime system user preferences, control

Communication: “Pull” model client scheduler I can run Win32 and Win MB RAM 20GB free disk 2.5 GFLOPS CPU (description of current work) Here are three jobs. Job 1 has application files A,B,C, input files C,D,E and output file F...

Example: ClimatePrediction.net Application: UK Met Office Unified Model State-of-the-art global climate model  1 million lines of FORTRAN High-dimensional search space  model parameters  boundary conditions  perturbed initial conditions

ClimatePrediction.net Using supercomputers:  1 day per run  total runs Using BOINC:  6 months per run  50,000 active hosts  171,343 runs completed  Nature papers  60-fold savings

Some other BOINC-based projects  LIGO; gravitational wave astronomy  U. Washington; protein study  U.C. Berkeley; SETI  CERN; accelerator simulation  STI, U. of Geneva; malaria epidemiology IBM World Community Grid  several biomedical applications...and about 30 others

Computing power  650 TeraFLOPS 200 from PCs; 50 from GPUs; 400 from PS3 BOINC-based projects:

A sampling of research problems Data-intensive computing Low-latency computing Background utility compatibility Credit mechanism Efficient validation Game consoles and graphics chips Simulation

Data-intensive computing – client limits Q = network transfer per GFLOPS/hr Q = 0.1 MB but wider range is OK:

Server-side limits Internet Server Client $ $

Using free networks commodity Internet Internet2 Client $ Server$ $ $

Using more free networks commodity Internet Internet2 Client $ Server $ $ $ LAN Client

Low-latency computing VC usually minimizes connection frequency What if you want to do 10,000 1-minute jobs in 6 minutes of wall time? job submission time 2 min deadline 4 min

Background utility compatibility Background utilities  disk defrag  disk indexing  virus scanning  web pre-fetch  disk backup Most run only when computer is idle  volunteer computing ==> they never run A) ignore zero-priority CPU activity B) Background manager  intelligent decision about when to run various activities

Credit mechanism Already described

Efficient validation How to validate provably and efficiently with replication factor approaching 1?

Game consoles and graphics chips NVIDIA, ATI, Cell  10X CPU and gaining?  ATI version (50 GFLOPS)  Sony PS3 version (100 GFLOPS) BOINC and on PS3 How to make this available to other projects?

Simulating volunteer computing Ad-hoc development of scheduling policies  slow, noisy  jeopardizes running projects Simulation-based R&D  client simulator client scheduling policies  Project simulator server scheduling policies  Global simulator study data-intensive, low-latency, etc.

The hard non-technical problems How to increase the number of volunteers?  currently 1 in 1000 PC owners How to increase the number of projects?  currently stuck at about 50 How to get volunteers to diversify?

How to attract and retain volunteers? Retention  reminder s  frequent science updates Recruitment  Viral “ a friend”, referral reward  Organizational World Community Grid: “partner” program  Media coverage need more discoveries  Bundling Active hosts:

Why aren’t there more projects? Lack of PR among scientists IT antipathy Creating a BOINC project is expensive: Science App development Experiment design Paper writing Research group Software/IT Port/debug apps workflow tools server admin Communications Web site development message board admin public relations

Meta-projects Virtual Campus Supercomputing Center  Deployment and publicity: PC labs, staff/faculty desktops students alumni public IBM World Community Grid Science App development Experiment design Paper writing Research groups Software/IT Port/debug apps workflow tools server admin Communications Web site development message board admin public relations UCB staff

Encouraging change Cross-project credit system  encourage competition in total credit, not per-project Account Managers  Make it easier to discover/attach/detach projects  GridRepublic, BAM! Science Stock Market?  encourage participation in new high-potential projects Scientific Mutual Funds?  e.g. American Cancer Society BOINC “portfolio”

Conclusion Volunteer computing: a new paradigm  distinct research problems, software requirements  big accomplishments, potential Social impacts Contact me about:  Using BOINC  Research based on BOINC