Public Distributed Computing with BOINC
David P. Anderson, Space Sciences Laboratory, University of California – Berkeley


Public-resource computing
● GIMPS, distributed.net, climateprediction.net
● names:
– public-resource computing
– peer-to-peer computing (no!)
– public distributed computing
– computing
[Diagram labels: your computers; academic; business; home PCs]

The potential of public computing
● 500,000 CPUs, 65 TeraFLOPs
● 1 billion Internet-connected PCs in 2010, 50% privately owned
● If 100M participate:
– ~100 PetaFLOPs
– ~1 Exabyte (10^18) storage
[Chart: public computing, Grid computing, cluster computing, supercomputing; axes: CPU power, storage capacity; cost]
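The per-host figures implied by this estimate can be checked with a few lines of arithmetic. The sketch below only divides the slide's totals by its 100M-participant figure; the variable names are illustrative and not from the talk.

```python
# Back-of-the-envelope check of the slide's "If 100M participate" estimate.
PARTICIPANTS = 100_000_000        # 100M participating PCs
TOTAL_FLOPS = 100e15              # ~100 PetaFLOPs
TOTAL_STORAGE_BYTES = 1e18        # ~1 Exabyte

# Average contribution each PC would need to make.
flops_per_host = TOTAL_FLOPS / PARTICIPANTS
storage_per_host = TOTAL_STORAGE_BYTES / PARTICIPANTS

print(f"{flops_per_host / 1e9:.0f} GFLOPS per host")
print(f"{storage_per_host / 1e9:.0f} GB per host")
```

In other words, the estimate assumes roughly 1 GFLOPS and 10 GB contributed per participating PC.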

Public/Grid differences

Economics (0th order)
● cluster/Grid computing: resources ($$), network (free)
● public-resource computing: resources (free), Internet ($$)
● $1 buys 1 computer-day or 20 GB data transfer on commercial Internet
● Suppose processing 1 GB data takes X computer-days
● Cost of processing 1 GB:
– cluster/Grid: $X
– PRC: $1/20
● So PRC is cheaper if X > 1/20 (X = 1,000)
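The break-even rule on this slide can be written out as a small calculation. This is a minimal sketch using only the slide's two prices ($1 per computer-day, $1 per 20 GB transferred); the function names are made up for illustration.

```python
# Zeroth-order cost model from the slide; function names are hypothetical.
DOLLARS_PER_COMPUTER_DAY = 1.0   # $1 buys 1 computer-day
GB_PER_DOLLAR = 20.0             # $1 buys 20 GB of commercial Internet transfer


def cluster_cost_per_gb(x_computer_days: float) -> float:
    """Cluster/Grid: the network is free, so you pay only for compute."""
    return x_computer_days * DOLLARS_PER_COMPUTER_DAY


def prc_cost_per_gb() -> float:
    """Public-resource computing: compute is free, so you pay only for transfer."""
    return 1.0 / GB_PER_DOLLAR


def prc_is_cheaper(x_computer_days: float) -> bool:
    """True when shipping 1 GB costs less than computing on it locally."""
    return prc_cost_per_gb() < cluster_cost_per_gb(x_computer_days)


for x in (0.01, 0.05, 1.0, 1000.0):
    print(x, cluster_cost_per_gb(x), prc_cost_per_gb(), prc_is_cheaper(x))
```

The comparison flips at X = 1/20 computer-days per GB: below that, moving the data costs more than the computation it buys.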

Economics revisited
● Underutilized free Internet (e.g. Internet2)
● Bursty, underutilized flat-rate ISP connection
● Traffic shapers can send at zero priority
==> bandwidth may be free also
[Diagram labels: you; commodity Internet; other institutions]

Why isn't PRC more widely used?
● Lack of platform
– jxta, Jabber: not a solution
– Java: apps are in C, FORTRAN
– commercial platforms: business issues
– cosm, XtremWeb: not complete
● Need to make PRC technology easy to use for scientists

BOINC: Berkeley Open Infrastructure for Network Computing
● Goals for computing projects
– easy/cheap to create and operate projects
– wide range of applications possible
– no central authority
● Goals for participants
– easy to participate in multiple projects
– invisible use of disk, CPU, network
● NSF-funded; open source; in beta test

requirements
● 0.3 MB = 8 hrs CPU
[Diagram: "ideal" vs. "current" data distribution. Labels: tapes; Berkeley; Stanford; USC; Internet2; commercial Internet; participants; 50 Mbps]

Climateprediction.net
● Global climate study (Oxford Univ.)
● Input: ~10MB executable, 1MB data
● CPU time: 2-3 months (can't migrate)
● Output per workunit:
– 10 MB summary (always upload)
– 1 GB detail file (archive on client, may upload)
● Chaotic (incomparable results)

(planned)
● Gravity wave detection; LIGO; UW/CalTech
● 30, MB data sets
● Each data set is analyzed w/ 40,000 different parameter sets; each takes ~6 hrs CPU
● Data distribution: replicated 2TB servers
● Scheduling problem is more complex than “bag of tasks”

Intel/UCB Network Study (planned)
● Goal: map/measure the Internet
● Each workunit lasts for 1 day but is active only briefly (pings, UDP)
● Need to control time-of-day when active
● Need to turn off other apps
● Need to measure system load indices (network/CPU/VM)

General structure of BOINC
● Project:
– Scheduling server (C++)
– BOINC DB (MySQL)
– Data servers (HTTP)
– Web interfaces (PHP)
– Work generation
– Project back end: retry generation, result validation, result processing, garbage collection
● Participant:
– Core client (C++)
– App

Project web site features
● Download core client
● Create account
● Edit preferences
– General: disk usage, work limits, buffering
– Project-specific: allocation, graphics
– venues (home/school/work)
● Profiles
● Teams
● Message boards, adaptive FAQs

General preferences

Project-specific preferences

Data architecture
● Files
– immutable, replicated
– may originate on client or project
– may remain resident on client
● Executables are digitally signed
● Upload certificates: prevent DoS
[Diagram labels: arecibo_ _jun_23_; uwi7eyufiw8e972h8f9w]
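One way to picture how an upload certificate could limit denial-of-service attacks is a server-issued token that binds a file name to a maximum size, so the data server can reject uploads it never authorized. The sketch below is an assumption made for illustration, not BOINC's actual mechanism: it uses a shared-secret HMAC where a real deployment would more likely use public-key signatures, and the key, file name, and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical project secret; a real system would likely use a key pair instead.
PROJECT_KEY = b"example-project-secret"


def issue_upload_certificate(file_name: str, max_bytes: int,
                             key: bytes = PROJECT_KEY) -> str:
    """Bind a file name and a size limit together under the project's key."""
    message = f"{file_name}:{max_bytes}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def accept_upload(file_name: str, max_bytes: int, certificate: str,
                  payload: bytes, key: bytes = PROJECT_KEY) -> bool:
    """Accept only uploads with a valid certificate and a payload within the limit."""
    expected = issue_upload_certificate(file_name, max_bytes, key)
    return hmac.compare_digest(expected, certificate) and len(payload) <= max_bytes


cert = issue_upload_certificate("result_0001.out", 1024)
print(accept_upload("result_0001.out", 1024, cert, b"x" * 100))    # within limit
print(accept_upload("result_0001.out", 1024, cert, b"x" * 5000))   # too large
print(accept_upload("other_file.out", 1024, cert, b"x" * 100))     # wrong file
```

Because the size limit is part of the signed message, a client cannot raise it without invalidating the certificate.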

Computation abstractions
● Applications
● Platforms
● Application versions
– may involve many files
● Work units: inputs to a computation
– soft deadline; CPU/disk/mem estimates
● Results: outputs of a computation
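These abstractions map naturally onto a handful of record types. The following is a minimal sketch of how they might relate to one another; the class and field names are assumptions chosen to mirror the slide, not BOINC's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Platform:
    name: str                      # e.g. an OS/CPU combination


@dataclass(frozen=True)
class Application:
    name: str


@dataclass
class AppVersion:
    """One build of an application for one platform; may involve many files."""
    app: Application
    platform: Platform
    version: int
    files: List[str] = field(default_factory=list)


@dataclass
class WorkUnit:
    """Inputs to a computation, plus a soft deadline and resource estimates."""
    app: Application
    input_files: List[str]
    soft_deadline_seconds: float
    est_cpu_seconds: float
    est_disk_bytes: int
    est_mem_bytes: int


@dataclass
class Result:
    """Outputs of one execution of a work unit."""
    workunit: WorkUnit
    output_files: List[str] = field(default_factory=list)


app = Application("example_app")
version = AppVersion(app, Platform("example_platform"), 1, ["main_binary", "data_table"])
wu = WorkUnit(app, ["input_0001"], 7 * 86400.0, 8 * 3600.0, 10_000_000, 64_000_000)
result = Result(wu, ["output_0001"])
print(version.files, result.workunit.app.name)
```

Keeping work units separate from results lets one work unit be executed several times, which matters for redundant computing.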

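Scheduling: pull model
[Diagram: core client, scheduling server, data server]
● core client to scheduling server: request X seconds of work, host description
● scheduling server to core client: result 1 ... result n
● core client and data server: download ... compute ... upload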
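The pull model can be illustrated with a toy scheduler in which the client asks for a number of seconds of work and the server hands out queued jobs until their estimated CPU time covers the request. This is a sketch under simplifying assumptions: the class name, the job tuples, and the fact that the host description is ignored are all choices made for illustration.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Tuple


class SchedulingServer:
    """Toy scheduler: it never pushes work, it only answers client requests."""

    def __init__(self, jobs: Iterable[Tuple[str, float]]) -> None:
        # Each job is (name, estimated CPU seconds).
        self.queue: Deque[Tuple[str, float]] = deque(jobs)

    def request_work(self, host_description: Dict[str, str],
                     seconds_requested: float) -> List[str]:
        # host_description is accepted but unused in this sketch; a real
        # scheduler would use it to pick suitable application versions.
        assigned: List[str] = []
        total = 0.0
        while self.queue and total < seconds_requested:
            name, est_seconds = self.queue.popleft()
            assigned.append(name)
            total += est_seconds
        return assigned


server = SchedulingServer([(f"result_{i}", 3600.0) for i in range(1, 5)])
host = {"platform": "example_platform"}
print(server.request_work(host, 7000.0))     # two one-hour jobs cover 7000 s
print(server.request_work(host, 100000.0))   # only two jobs remain
print(server.request_work(host, 3600.0))     # queue is empty
```

Since the client initiates every exchange, this style of scheduling works for hosts behind firewalls or with intermittent connections.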

Redundant computing
[Diagram components: work generator; replicator; scheduler; clients; validator; assimilator; canonical result]
● select canonical result
● assign credit
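The "select canonical result, assign credit" step can be sketched as a simple vote among replicated results. The version below assumes that outputs can be compared for exact equality and that a fixed quorum of agreeing hosts is enough; both are simplifications, and the function name and quorum parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional, Tuple


def select_canonical_result(
    outputs: Dict[str, Hashable], quorum: int = 2
) -> Tuple[Optional[Hashable], List[str]]:
    """Pick the output that at least `quorum` hosts agree on.

    `outputs` maps a host id to the output that host returned.
    Returns (canonical output, hosts to credit), or (None, []) when no
    output has reached the quorum yet and more replicas are needed.
    """
    if not outputs:
        return None, []
    value, votes = Counter(outputs.values()).most_common(1)[0]
    if votes < quorum:
        return None, []
    credited = sorted(host for host, out in outputs.items() if out == value)
    return value, credited


print(select_canonical_result({"host_a": "42.0", "host_b": "42.0", "host_c": "41.9"}))
print(select_canonical_result({"host_a": "42.0", "host_b": "41.9"}))
```

When no quorum is reached, the caller would issue more replicas of the same work unit and vote again once they return.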

BOINC core client
● File transfers
– restartable
– concurrent
– user limited
● Program execution
– semi-sandboxed
– graphics control
– checkpoint control
– % done, CPU time
[Diagram labels: core client; app API; shared mem]

User interface
[Diagram labels: screensaver; control panel; core client; control/state RPCs; activate screensaver; app graphics]

Anonymous platform mechanism
● User compiles applications from source, registers them with core client
● Report platform as “anonymous” to scheduler
● Purposes:
– obscure platforms
– security-conscious participants
– performance tuning of applications

Project management tools
● Python scripts for project creation/start/stop
● Remote debugging
– collect/store crash info (stack trace)
– web-based browsing interface
● Strip charts
– record, graph system performance metrics
● Watchdogs
– detect system failures; dial pager

Conclusion
● Public-resource computing is a distinct paradigm from Grid computing
● PRC has tremendous potential for many applications (computing and storage)
● BOINC: enabling technology for PRC