Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley May 7, 2008.



Where’s the computing power?
 Individuals (~1 billion PCs)
 Companies (~100M PCs)
 Government (~50M PCs)
Volunteer computing taps the largest pool: individuals’ PCs.

A brief history of volunteer computing
 Projects: distributed.net, GIMPS, Climateprediction.net, Einstein@Home, IBM World Community Grid
 Platforms: Popular Power, Entropia, United Devices, Parabon, BOINC

The BOINC project
Based at the UC Berkeley Space Sciences Lab; funded by NSF since 2002.
Personnel:
 director: David Anderson
 other employees: 1.5 programmers
 lots of volunteers
What we do:
 develop open-source software
 enable online communities
What we don’t do:
 branding, hosting, authorizing, endorsing, controlling

The BOINC community
 Projects
 Volunteer programmers
 Alpha testers
 Online Skype-based help
 Translators (web, client)
 Documentation (Wiki)
 Teams

The BOINC model
A volunteer’s PC attaches to one or more BOINC-based projects, e.g.:
 Climateprediction.net (Oxford; climate study)
 U. of Washington (biology)
 MalariaControl.net (STI; malaria epidemiology)
 World Community Grid (IBM; several applications)
 ...
Properties of the model:
 simple (but configurable)
 secure
 invisible
 independent: no central authority; a project’s unique ID is its URL

The volunteer computing ecosystem
An exchange between projects and the public:
 projects get to do more science and involve the public in science
 volunteers are taught and motivated

Participation and computing power
BOINC:
 330K active participants
 580K computers
 ~40 projects
 1.2 PetaFLOPS average throughput (about 3X an IBM Blue Gene L)
Folding@home (non-BOINC):
 200K active participants
 1.4 PetaFLOPS (mostly PS3)

Cost per TeraFLOPS-year
 Cluster: $124K
 Amazon EC2: $1.75M
 BOINC: $2K

The road to ExaFLOPS
CPUs in PCs (desktop, laptop)
 1 ExaFLOPS = 50M PCs x 80 GFLOPS x 0.25 avail.
GPUs
 1 ExaFLOPS = 4M x 1 TFLOPS x 0.25 avail.
Video-game consoles (PS3, Xbox)
 0.25 ExaFLOPS = 10M x 100 GFLOPS x 0.25 avail.
Mobile devices (cell phone, PDA, iPod, Kindle)
 0.05 ExaFLOPS = 1B x 100 MFLOPS x 0.5 avail.
Home media (cable box, Blu-ray player)
 0.1 ExaFLOPS = 100M x 1 GFLOPS x 1.0 avail.
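Each estimate above is the same back-of-envelope formula: device count times per-device speed times average availability. A minimal sketch of the arithmetic:

```cpp
// Aggregate capacity = number of devices x per-device FLOPS x availability.
double capacity(double devices, double flops_each, double availability) {
    return devices * flops_each * availability;
}

// Examples from the slide:
//   capacity(50e6, 80e9, 0.25)  -> 1e18  (PCs: 1 ExaFLOPS)
//   capacity(4e6, 1e12, 0.25)   -> 1e18  (GPUs: 1 ExaFLOPS)
//   capacity(10e6, 100e9, 0.25) -> 2.5e17 (consoles: 0.25 ExaFLOPS)
```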

But it’s not about the numbers
The real goals:
 enable new computational science
 change the way resources are allocated
 avoid a return to the Dark Ages
And that means we must:
 make volunteer computing feasible for all scientists
 involve the entire public, not just the geeks
 solve the “project discovery” problem
Progress toward these goals: nonzero but small.

BOINC server software
Goals:
 high performance (10M jobs/day)
 scalability
Architecture (diagram): clients contact the scheduler (a CGI program); a feeder keeps a shared-memory segment (~1K jobs) filled from the MySQL DB (~1M jobs); various daemons run alongside.

Database tables
 Application
 Platform: Win32, Win64, Linux x86, Java, etc.
 App version
 Job: resource usage estimates and bounds, latency bound, input file descriptions
 Job instance: output file descriptions
 Account, team, etc.

Data model
Files:
 have both logical and physical names
 are immutable (per physical name)
 may originate on client or server
 may be “sticky”
 may be compressed in various ways
 are transferred via HTTP or BitTorrent
 app files must be signed
Upload/download directory hierarchies.

Submitting jobs
Create an XML job description:
 input and output files
 resource usage estimates and bounds
 latency bound
Put input files into the directory hierarchy.
Call create_work(), which creates the DB record.
Mass production:
 bags of tasks
 flow-controlled streams of tasks
 self-propagating computations
 trickle messages
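The XML job description might look like the following sketch of a BOINC-style input template. The tag names (file_ref, rsc_fpops_est, delay_bound, etc.) follow BOINC's template vocabulary; the specific values are placeholders:

```xml
<file_info>
    <number>0</number>
</file_info>
<workunit>
    <file_ref>
        <file_number>0</file_number>
        <open_name>in.txt</open_name>
    </file_ref>
    <rsc_fpops_est>1e12</rsc_fpops_est>       <!-- estimated FP operations -->
    <rsc_fpops_bound>1e13</rsc_fpops_bound>   <!-- abort if exceeded -->
    <rsc_memory_bound>1e8</rsc_memory_bound>  <!-- bytes of RAM needed -->
    <delay_bound>604800</delay_bound>         <!-- latency bound: one week -->
</workunit>
```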

Server scheduling policy
The request message contains:
 platform(s)
 description of hardware: CPU, memory, disk, coprocessors
 description of availability
 current jobs queued and in progress
 work request (CPU seconds)
Send a set of jobs that:
 are feasible (will fit in memory/disk)
 will probably get done by deadline
 satisfy the work request
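The selection criteria above can be sketched as a feasibility filter plus a running total against the work request. The structs and thresholds here are illustrative stand-ins, not BOINC's actual data structures:

```cpp
#include <vector>

// Illustrative host description (from the scheduler request message).
struct Host {
    double ram_bytes, disk_bytes;
    double flops;            // effective speed
    double availability;     // fraction of time available
};

// Illustrative job description (from the DB record).
struct Job {
    double ram_bound, disk_bound;   // resource bounds
    double fpops_est;               // estimated FP operations
    double delay_bound;             // seconds until deadline
};

// Feasible: fits in memory/disk and will probably meet its deadline.
bool feasible(const Job& j, const Host& h) {
    if (j.ram_bound > h.ram_bytes) return false;
    if (j.disk_bound > h.disk_bytes) return false;
    double expected_runtime = j.fpops_est / (h.flops * h.availability);
    return expected_runtime <= j.delay_bound;
}

// Send feasible jobs until the work request (CPU seconds) is satisfied.
std::vector<Job> pick_jobs(const std::vector<Job>& queue, const Host& h,
                           double work_request_secs) {
    std::vector<Job> out;
    double granted = 0;
    for (const Job& j : queue) {
        if (granted >= work_request_secs) break;
        if (!feasible(j, h)) continue;
        out.push_back(j);
        granted += j.fpops_est / h.flops;   // CPU seconds this job represents
    }
    return out;
}
```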

Multithread and coprocessor support
(scheduler diagram) The client reports its list of platforms, coprocessors, and #CPUs; for each app version, a project-supplied app planning function tells the scheduler the avg/max #CPUs, coprocessor usage, and command line; the scheduler then matches each job to a platform and app version.

Result validation
Problem: volunteers can’t be trusted
 computational results may be wrong
 claimed credit may be inflated
Approaches:
 application-specific checking
 job replication: do N copies, require that M of them agree
 adaptive replication
 spot-checking
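The M-of-N replication rule can be sketched as grouping returned results into agreement classes and accepting one once a class reaches the quorum. Here results are plain doubles and "agree" means exact equality; a real validator uses an app-defined comparison:

```cpp
#include <vector>
#include <cstddef>

// Return the index of a result whose agreement class has >= quorum
// members (the candidate canonical result), or -1 if no quorum yet.
int find_canonical(const std::vector<double>& results, int quorum) {
    for (std::size_t i = 0; i < results.size(); i++) {
        int agree = 0;
        for (std::size_t k = 0; k < results.size(); k++)
            if (results[k] == results[i]) agree++;
        if (agree >= quorum) return (int)i;
    }
    return -1;   // keep issuing replicas until a quorum forms
}
```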

How to compare results?
Problem: numerical discrepancies between hosts.
Stable problems: fuzzy comparison.
Unstable problems:
 eliminate discrepancies: same compiler, flags, and libraries everywhere
 homogeneous replication: send instances of a job only to numerically equivalent hosts (equivalence may depend on the app)
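For stable problems, "fuzzy comparison" just means accepting results whose relative difference is within a tolerance. A minimal sketch (the tolerance value is an arbitrary example):

```cpp
#include <cmath>
#include <algorithm>

// Two values "agree" if their difference is small relative to their
// magnitude. An app-specific validator picks the tolerance.
bool fuzzy_equal(double a, double b, double rel_tol = 1e-5) {
    double scale = std::max(std::fabs(a), std::fabs(b));
    if (scale == 0) return true;    // both exactly zero
    return std::fabs(a - b) <= rel_tol * scale;
}
```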

Server scheduling policy revisited
Goals (possibly conflicting):
 send retries to fast/reliable hosts
 send long jobs to fast hosts
 send demanding jobs (RAM, disk, etc.) to qualified hosts
 send jobs already committed to a homogeneous redundancy class
Mechanism: a project-defined “score” function; scan N jobs and send those with the highest scores.
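The score-function mechanism can be sketched as scoring each job in a bounded window against the requesting host and sending the top scorers. The score terms and weights below are invented examples, not BOINC's actual policy:

```cpp
#include <vector>
#include <algorithm>

struct JobInfo {
    bool is_retry;
    double est_runtime_secs;
};

struct HostInfo {
    bool reliable;          // fast/reliable host?
    double speed_factor;    // relative speed
};

// Example score: prefer sending retries to reliable hosts, and weight
// long jobs toward fast hosts (weights are arbitrary for illustration).
double score(const JobInfo& j, const HostInfo& h) {
    double s = 0;
    if (j.is_retry && h.reliable) s += 10;
    s += j.est_runtime_secs * h.speed_factor * 1e-5;
    return s;
}

// Scan a window of candidate jobs, return indices of the best `count`.
std::vector<int> best_jobs(const std::vector<JobInfo>& window,
                           const HostInfo& h, int count) {
    std::vector<int> idx(window.size());
    for (int i = 0; i < (int)idx.size(); i++) idx[i] = i;
    std::stable_sort(idx.begin(), idx.end(), [&](int a, int b) {
        return score(window[a], h) > score(window[b], h);
    });
    if ((int)idx.size() > count) idx.resize(count);
    return idx;
}
```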

Server daemons
Per application:
 work generator
 validator
 assimilator
Transitioner:
 manages replication, creates job instances
 triggers the other daemons
File deleter
DB purger

Ways to create a BOINC server
 Install BOINC on a Linux box (lots of software dependencies)
 Run the BOINC server VM in VMware (you still need to manage the hardware)
 Run the BOINC server VM on Amazon EC2

BOINC API
Typical application structure:
  boinc_init()
  loop
    ...
    boinc_fraction_done(x)
    if boinc_time_to_checkpoint()
      write checkpoint file
      boinc_checkpoint_completed()
  boinc_finish(0)
Also provided: graphics, multi-program apps, a wrapper for legacy apps.
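The structure above can be fleshed out as follows. The boinc_* calls here are stand-in stubs so the sketch compiles on its own; a real app gets them from the BOINC API (boinc_api.h, linked with the API library), and boinc_time_to_checkpoint() is then answered by the core client rather than by an iteration counter:

```cpp
static int iters_since_checkpoint = 0;
static int checkpoints_written = 0;

int  boinc_init() { return 0; }             // stub: initialize the runtime
void boinc_fraction_done(double) {}         // stub: report progress to client
bool boinc_time_to_checkpoint() {           // stub: checkpoint every 100 iters
    return iters_since_checkpoint >= 100;
}
void boinc_checkpoint_completed() { iters_since_checkpoint = 0; }
void boinc_finish(int) {}                   // stub: the real call exits the app

void write_checkpoint(int i) {
    // a real app would serialize its state (here, just i) to a file
    checkpoints_written++;
}

// Run n_iters units of work with the canonical init / loop / finish shape.
int run_app(int n_iters) {
    boinc_init();
    for (int i = 0; i < n_iters; i++) {
        // ... one unit of computation ...
        iters_since_checkpoint++;
        boinc_fraction_done((double)(i + 1) / n_iters);
        if (boinc_time_to_checkpoint()) {
            write_checkpoint(i);
            boinc_checkpoint_completed();
        }
    }
    boinc_finish(0);
    return checkpoints_written;
}
```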

The volunteer’s view
 1-click install
 all platforms
 invisible, autonomic
 highly configurable (optionally)

BOINC client structure
(diagram) The core client talks to schedulers and data servers; applications are linked with the BOINC library (the runtime system); the GUI and screensaver communicate with the core client over local TCP for user preferences and control.

Some BOINC projects
 Climateprediction.net (Oxford University): global climate modeling
 Einstein@Home (LIGO Scientific Collaboration): gravitational wave detection
 SETI@home (U.C. Berkeley): radio search for E.T.I. and black hole evaporation
 Leiden Classical (Leiden University): surface chemistry using classical dynamics

More projects
 LHC@home (CERN): simulation of the LHC and its collisions
 Univ. of Muenster: quantum chemistry
 Bielefeld Univ.: nanoscale magnetism
 Leiden Univ.: number theory

Biomed-related BOINC projects
 Rosetta@home (University of Washington): protein folding, docking, and design
 Tanpaku (Tokyo Univ. of Science): protein structure prediction using Brownian dynamics
 MalariaControl.net (Swiss Tropical Institute): epidemiological simulation

More projects
 Univ. of Michigan: protein structure prediction with CHARMM
 SIMAP (Tech. Univ. of Munich): protein similarity matrix
 Technion: genetic linkage analysis using Bayesian networks
 Quake-Catcher Network (Stanford): distributed seismograph

More projects (IBM World Community Grid)
 Dengue fever drug discovery (U. of Texas, U. of Chicago): AutoDock
 Human Proteome Folding (New York University): Rosetta
 Scripps Research Institute: AutoDock

Organizational models
 Single-scientist projects: a dead end?
 Campus-level meta-projects
  UC Berkeley: 1,000 instructional PCs; 5,000 faculty/staff; 30,000 students; 400,000 alumni
 Lattice (U. Maryland Center for Bioinformatics)
 MindModeling.org (ACT-R community, ~20 universities)
 IBM World Community Grid (~8 applications from various institutions)
 Extremadura, Spain (consortium of 5–10 universities)
 SZTAKI (Hungary)

Conclusion
The computing power is with individuals (~1 billion PCs), companies (~100M PCs), and government (~50M PCs); volunteer computing taps the first and largest of these pools.
Contact me about:
 using BOINC
 research based on BOINC