Exa-Scale Volunteer Computing
David P. Anderson, Space Sciences Lab, U.C. Berkeley


A brief history of volunteer computing
[Timeline figure with two rows: Applications and Platforms]
  - distributed.net, GIMPS
  - Entropia, United Devices, ...
  - BOINC
  - Climateprediction.net
  - WCG, Einstein, Rosetta
  - Bayanihan, Javelin, ...

Applications
Computational biology
  - protein folding and structure prediction
      Rosetta++
      Biomedical, plant genomics
  - virtual drug design
      Autodock, CHARMM
      Cancer, AIDS, Alzheimer's, Dengue fever
  - genetic linkage analysis
  - phylogenetics
Epidemiology
  - Malaria model
Environmental studies
  - "Virtual Prairie" simulation

More applications
High-energy physics
  - CERN: accelerator, collision simulations
Climate prediction
  - HADSM3 (U.K.)
  - WRF (NCAR)
Astronomy
  - gravitational wave detection
  - SETI
  - Milky Way, Big Bang studies
Nanotechnology
Mathematics
Distributed seismography

The PetaFLOPS milestone
Sept 19, 2007
  - current average: 2.67 PetaFLOPS
  - 40% Cell (40K Sony PS3)
  - 40% GPU (10K NVIDIA)
  - 20% CPU (250,000 computers)
BOINC: Jan 31, 2008
  - current average: 1.2 PetaFLOPS
  - 568,000 computers (87% Windows)
First supercomputer: May 25, 2008
  - IBM RoadRunner
  - PetaFLOPS
  - $133M

Cost per TeraFLOPS-year
  - Cluster: $124,000
  - Amazon EC2: $1,750,000
  - Volunteer computing: $2,000
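To put the three figures above side by side, the short sketch below expresses each option as a multiple of the volunteer-computing cost. The dollar amounts are the ones quoted on the slide; the dictionary and function names are illustrative only.

```python
# Cost per TeraFLOPS-year in US dollars, as quoted on the slide.
COST_PER_TFLOPS_YEAR = {
    "cluster": 124_000,
    "amazon_ec2": 1_750_000,
    "volunteer": 2_000,
}


def cost_ratio(option: str, baseline: str = "volunteer") -> float:
    """Return how many times more expensive `option` is than `baseline`."""
    return COST_PER_TFLOPS_YEAR[option] / COST_PER_TFLOPS_YEAR[baseline]


if __name__ == "__main__":
    for name in COST_PER_TFLOPS_YEAR:
        print(f"{name}: {cost_ratio(name):.0f}x the volunteer-computing cost")
```

On these numbers a cluster costs 62 times as much per TeraFLOPS-year as volunteer computing, and Amazon EC2 costs 875 times as much.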

The real goals
Enable paradigm-shifting science
  - change the way resources are allocated
Revive public interest in science
  - avoid return to the Dark Ages
So we need to:
  - make volunteer computing feasible for all scientists
  - involve the entire public, not just the geeks
  - solve the "project discovery" problem
Progress: non-zero but small

The road to ExaFLOPS
Consumer computing resources
  - CPUs in PCs (desktop, laptop)
  - GPUs in PCs
  - Video-game consoles
  - mobile devices
  - home media devices
For each type
  - what is performance potential? how will it change over time?
  - ease of programming?
  - energy efficiency?
  - network connectivity?
  - how to publicize and deploy?

CPUs
2 billion PCs by 2015
Performance increases largely from multicore
  - need to develop parallel apps
Availability will decline (green computing)
1 ExaFLOPS:
  - 40,000,000 PCs x 100 GFLOPS x 0.25 availability
Promotional partner: MS? HP? Dell?

GPUs
NVIDIA 8800: ~500 GFLOPS
Programmability: CUDA; OpenCL?
1 ExaFLOPS:
  - 4,000,000 x 1,000 GFLOPS x 0.25 availability

Video-game consoles
Sony Playstation 3
  - Cell (~100 GFLOPS) + GPU
  - Ships with
  - Hard to program
Microsoft Xbox
  - 3 PowerPC cores (~30 GFLOPS) + GPU
0.25 ExaFLOPS:
  - 10,000,000 consoles x 100 GFLOPS x 0.25 availability

Mobile devices (recharging)
Cell phones, PDAs, media players, Kindle, etc.
Hardware convergence
  - 0.5 GFLOPS CPU (Freescale i.mx37, 65 nm), low power (best FLOPS/watt)
  - >256MB RAM
  - >10GB stable storage
  - Internet access
Software
  - Google Android?
3.3 billion cell phones in
ExaFLOPS:
  - 1B x 1 GFLOPS x 0.5 availability

Home media players
Cable set-top box, Blu-Ray player
Hardware: low-end PC
Software environment: Java-based Multimedia Home Platform (MHP)
0.1 ExaFLOPS:
  - 100M x 2 GFLOPS x 0.5 availability
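Each device slide above estimates aggregate capacity as device count x per-device GFLOPS x availability. The sketch below simply multiplies out those factors so the totals can be checked. The factors are copied from the slides; the function and table names are illustrative. The total on the mobile-devices slide is not legible in this transcript, so the value computed here is only what that slide's own factors imply.

```python
GFLOPS = 1e9      # floating-point operations per second in one GFLOPS
EXAFLOPS = 1e18   # floating-point operations per second in one ExaFLOPS


def aggregate_exaflops(devices: float, gflops_per_device: float,
                       availability: float) -> float:
    """Aggregate capacity in ExaFLOPS for a pool of consumer devices."""
    return devices * gflops_per_device * GFLOPS * availability / EXAFLOPS


# (device count, GFLOPS per device, availability) as given on each slide.
ESTIMATES = {
    "CPUs": (40_000_000, 100, 0.25),
    "GPUs": (4_000_000, 1_000, 0.25),
    "Video-game consoles": (10_000_000, 100, 0.25),
    "Mobile devices": (1_000_000_000, 1, 0.5),
    "Home media players": (100_000_000, 2, 0.5),
}

if __name__ == "__main__":
    for name, factors in ESTIMATES.items():
        print(f"{name}: {aggregate_exaflops(*factors):.2f} ExaFLOPS")
```

The CPU and GPU pools each come to 1 ExaFLOPS, consoles to 0.25, and home media players to 0.1, matching the slide headings.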

The BOINC project
NSF-funded, based at UC Berkeley
  - 2.5 FTEs
  - many volunteers
Functions:
  - develop technology for volunteer and desktop grid computing
  - enable online communities
  - do research related to volunteer computing

BOINC server software
Job scheduling
  - high performance (10M jobs/day)
  - scalability
Web code (PHP)
  - community, social network
Ways to create a project:
  - Set up a server on a Linux box
  - Run BOINC server VM (VMware)
  - Run BOINC server VM on Amazon EC2
[Architecture diagram: MySQL DB (~1M jobs), feeder, shared memory (~1K jobs), scheduler (CGI), clients, various daemons]
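The slide names a database holding about a million jobs, a feeder, and a shared-memory segment holding about a thousand. One plausible reading is that the feeder keeps a small in-memory cache topped up from the database so that scheduler requests never query the database directly. The sketch below illustrates that idea only; `JobCache`, `feeder_pass`, and `scheduler_request` are hypothetical names, a Python list stands in for shared memory, and a deque stands in for the MySQL job table. It is not BOINC's actual code.

```python
from collections import deque
from typing import List, Optional


class JobCache:
    """Stand-in for the shared-memory segment: a fixed number of job slots."""

    def __init__(self, slots: int = 1000):
        self.slots: List[Optional[int]] = [None] * slots

    def free_slots(self) -> List[int]:
        """Indices of slots that currently hold no job."""
        return [i for i, job in enumerate(self.slots) if job is None]


def feeder_pass(db_queue: deque, cache: JobCache) -> int:
    """Refill empty cache slots from the database queue; return jobs moved."""
    moved = 0
    for i in cache.free_slots():
        if not db_queue:
            break
        cache.slots[i] = db_queue.popleft()
        moved += 1
    return moved


def scheduler_request(cache: JobCache, max_jobs: int) -> List[int]:
    """Hand up to max_jobs cached jobs to one client and free their slots."""
    sent: List[int] = []
    for i, job in enumerate(cache.slots):
        if len(sent) == max_jobs:
            break
        if job is not None:
            sent.append(job)
            cache.slots[i] = None
    return sent
```

A typical cycle would alternate `feeder_pass` with many `scheduler_request` calls, so the cost of a request depends on the cache size rather than on the size of the job table.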

BOINC client software
[Architecture diagram: core client, application, BOINC library, graphics app, GUI, screensaver, local TCP, schedulers, data servers, user preferences and control]
Cross-platform (Win/Mac/Linux)
Simple, configurable, secure, invisible

BOINC's project/volunteer model
[Diagram: volunteer PC, attachments, projects (Climateprediction.net, World Comm. Grid)]
Projects
  - Independent
  - No central authority
  - ID: URL

Facilitating project discovery
[Diagram: volunteer PC, account manager, web services, BOINC-based projects (Climateprediction.net, World Comm. Grid)]

Application platform
Multithread and coprocessor support
[Diagram: client and scheduler exchange a list of platforms, coprocessors, and #CPUs for jobs and app versions; app versions, platform, app version, job]
App planning function
  - Inputs: host, app class
  - Outputs: avg/max #CPUs, coprocessor usage, estimated FLOPS
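The planning function on this slide takes a host and an app class and returns CPU usage, coprocessor usage, and estimated FLOPS. A minimal sketch of that interface follows. The `Host` and `Plan` fields, the three class names, and the 0.1 CPU share for a GPU app are all assumptions made for illustration and do not come from the slide or from BOINC itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    ncpus: int
    flops_per_cpu: float        # estimated FLOPS of one CPU core
    gpu_count: int = 0
    flops_per_gpu: float = 0.0  # estimated FLOPS of one GPU


@dataclass
class Plan:
    avg_ncpus: float            # average number of CPUs used
    max_ncpus: float            # peak number of CPUs used
    gpu_usage: float            # number of coprocessors used
    estimated_flops: float


def plan_app(host: Host, app_class: str) -> Optional[Plan]:
    """Return a resource plan, or None if the class cannot run on this host."""
    if app_class == "single_thread":
        return Plan(1, 1, 0, host.flops_per_cpu)
    if app_class == "multithread":
        if host.ncpus < 2:
            return None
        n = host.ncpus
        return Plan(n, n, 0, n * host.flops_per_cpu)
    if app_class == "gpu":
        if host.gpu_count < 1:
            return None
        # Hypothetical numbers: a GPU app that needs a small share of one CPU.
        return Plan(0.1, 1, 1, host.flops_per_gpu)
    return None  # unknown app class
```

Returning None lets a scheduler skip app versions the host cannot run, and the estimated FLOPS gives it a basis for choosing among the versions that remain.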

Adaptive replication
Volunteer PCs are anonymous and untrusted
  - how do we know results are correct?
Replicated computing
  - require consensus of equivalent results
  - 2x throughput penalty
Adaptive replication
  - maintain estimate of host "validity rate" V(h)
  - if V(h) > K, replicate
  - else replicate with probability V(h)/K
  - goal: reduce throughput penalty to 1+ε
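The replication rule above can be sketched in a few lines. The code follows the slide literally: replicate always when V(h) > K, otherwise with probability V(h)/K. That rule only lowers the penalty if V(h) is small for hosts that rarely return bad results, so this sketch assumes V(h) behaves like an estimated rate of invalid results. That reading, the function names, and the assumption that a replicated job runs exactly twice are not stated in the source.

```python
import random
from typing import Sequence


def should_replicate(v_h: float, k: float, rng: random.Random) -> bool:
    """Decide whether a job sent to host h also goes to a second host.

    v_h is the host's estimated rate V(h); k is the threshold K (k > 0).
    """
    if v_h > k:
        return True
    return rng.random() < v_h / k


def expected_throughput_penalty(rates: Sequence[float], k: float) -> float:
    """Expected copies per job over a host population.

    Assumes one job per host and that a replicated job runs exactly twice.
    """
    probs = [1.0 if v > k else v / k for v in rates]
    return 1.0 + sum(probs) / len(probs)
```

When every host sits above K the penalty is 2, as in plain replication. As the hosts' rates approach zero the penalty approaches 1, which is the 1+ε goal on the slide.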

Simulators
Scheduling policies
  - client: when to fetch work? what project? how much? CPU scheduling
  - server: what jobs to send to a given client?
Problems with in situ experimentation
  - hard to control
  - can do a lot of damage
Simulators
  - client simulator: 1 client, N projects
  - server simulator (EmBA): 1 project, N clients

Volunteer-facing features
Motivators
  - competition
  - community
Credit
  - cross-project statistics
Web features
  - friend lists, private messages, message boards
  - teams
MySpace and Facebook widgets and apps

Organizational models
Single-scientist projects: a dead-end?
Campus-level meta-project: e.g. U. of Houston:
  - 1,000 instructional PCs
  - 5,000 faculty/staff
  - 30,000 students
  - 400,000 alumni
Lattice: U. Maryland Center for Bioinformatics
MindModeling.org
  - ACT-R community (~20 universities)
IBM World Community Grid
  - ~8 applications from various institutions
Extremadura (Spain)
  - consortium of 5-10 universities
EDGeS (SZTAKI)
Almere Grid: community

Distributed thinking
Clickworkers, GalaxyZoo, Fold It!
What can people do better than computers?

New software initiatives
Bossa: middleware for distributed thinking
  - job queueing and replication
  - volunteer skill estimation
Bolt: middleware for web-based training and education
Shared infrastructure:
[Diagram: volunteers ranging from malicious and useless to useful and savants]
  - BOINC: volunteer computing
  - Bolt: teaching, training
  - Bossa: distributed thinking
  - BOINC Basics: accounts, groups, credit, communication

Conclusion
Volunteer computing
  - Some big achievements, but not close to potential
  - Problems are organizational/political, not technical
  - Volunteer computing + GPUs = ExaFLOPS
Distributed thinking
  - What are the apps?
  - What are middleware requirements?
Interested in either one? Let's talk!