BOINC: An Open Platform for Public-Resource Computing
David P. Anderson
Space Sciences Laboratory, U.C. Berkeley
Public-resource computing
  [Diagram labels: home PCs, business, academic; your computers]
  Advantages:
    –scale free growth
    –public education
    –no policy issues
  Challenges:
    –low BW at client
    –costly BW at server
    –firewall/NAT issues
    –sporadic connection
    –untrustworthy, insecure clients
    –server security
    –heterogeneity
    –need PR, glitzy GUI
SETI@home
  Uses public-resource computing for radio DSP
  Better data analysis:
    –Signal types (Gaussian, spike, pulsed, triplet)
    –Multiple frequency resolutions
    –Multiple drift rates, coherent integration
  Tradeoff: smaller frequency band
    –2.5 MHz vs. 100 MHz
Operations
  [Diagram labels: data recorder, DLT tapes, splitters, WU storage, data server, screensavers, result queue, acct. queue, science DB, user DB, master DB, garbage collector, tape archive/delete, tape backup, redundancy checking, RFI elimination, repeat detection, web site, CGI program, web page generator]
Achievements of SETI@home
  1,000,000 years of CPU time in 3 years
  Sustained 30 TeraFLOPS
  1.5E21 floating-point operations
  3,600,000 users in 226 countries
  40 Terabytes of data processed
  3 billion “events” detected
  Solved scaling, security problems
Extensions of SETI@home
  Broadband pulse search on existing data
  Parkes observatory: Southern sky
  Multi-beam receivers
  Expanded frequency range
  Use KL transform
  Data archival on clients
SETI@home software shortcomings
  Monolithic
  Limited communication model
  Limited computation/data model
  Incoherent, insecure accounting
Goals of a PRC platform
  [Diagram labels: Research lab X, University Y, Public project Z; projects, applications, resource pool]
  Participants install one program, select projects, specify constraints; all else is automatic
  Projects are autonomous
  Advantages of a shared platform:
    –Better long-term resource utilization
    –Better instantaneous resource utilization
    –Faster/cheaper for projects, software is better
    –Easier for projects to get participants
    –Participants learn more
Distributed computing platforms
  Academic and open-source
    –Globus
    –Cosm
    –XtremWeb
    –Jxta
  Commercial
    –Entropia
    –United Devices
    –Parabon
Goals of BOINC (Berkeley Open Infrastructure for Network Computing)
  Public-resource computing/storage
  Multi-project, multi-application
    –Participants can apportion resources
  Handle fairly diverse applications
  Work with legacy apps
  Support many participant platforms
  Small, simple
General structure of BOINC
  [Diagram with two sides labeled Project and Participant. Component labels: Scheduling server (C++), BOINC DB (MySQL), Project work manager, data server (HTTP), data server (HTTP), Web interfaces (PHP), Core agent (C++), App agent]
Project Files
  Attributes:
    –Name
    –URL list
    –Persistent flag
    –Upload-when-present flag
    –executable
    –MD5 checksum
  Files are immutable
  Files may originate in client or in project work manager
  Example: protein_db.12 ftp://x.y/z fw7398h
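A minimal sketch of how a client might check a downloaded file against the MD5 checksum in its attributes. Only the attribute list comes from the slide; the `FileInfo` class, its field names, and the `verify` helper are hypothetical and do not reflect BOINC's actual data structures.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen: mirrors "files are immutable"
class FileInfo:
    """Hypothetical record holding the file attributes listed on the slide."""
    name: str
    urls: Tuple[str, ...]
    md5: str
    persistent: bool = False
    upload_when_present: bool = False
    executable: bool = False


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def verify(info: FileInfo, data: bytes) -> bool:
    """Accept the downloaded bytes only if their digest matches the record."""
    return md5_hex(data) == info.md5.lower()
```

Because a file never changes once named, a single checksum per file is enough to detect a corrupted or tampered download.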
Projects/apps/versions
  Platforms
    –Windows/x86
    –Linux/x86
    –MacOS/PPC
    –…
  Applications (columns: name, Beta version, Production version)
    –Arecibo SETI 129
    –Parkes SETI 115
    –… …
  App versions (columns: version, File infos)
    –11 …
    –13 …
    –… …
Workunits and results
  Workunit fields: name, XML doc, expected resources, cmdline args (examples: wu15, wu16)
  Result fields: name, exit code, CPU time, XML in, XML out, stderr out
  [Diagram labels: Application, Host, input, out123, aed847; example stderr out: "Foo.C, line 124: divide by zero"]
Work sequences (long computations with big footprints)
  Results (or workunits) are linked into “sequences”
  Normally, a result is sent only to the host that handled the previous result
  If a result times out, the sequence is shifted to another host
  Upload state
  Check for abort
Remote file management
  Clients regularly report persistent files
  Scheduling server maintains DB of files on active hosts
  Project work manager can issue requests for particular hosts to upload, download, or delete files
Hosts and scheduling
  Host measurements
    –CPU performance (integer/FP/memory)
    –RAM, cache, disk free/total
    –On/connected statistics
    –Network bandwidth statistics
  Workunit properties
    –RAM/disk/computation requirements
  Scheduling policy
    –feasibility
    –High/low water mark
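One way to read the scheduling policy above, as a rough sketch. The slide names only "feasibility" and "high/low water mark", so the details here are assumptions: feasibility is taken to mean the workunit's RAM and disk requirements fit the host, and the water marks are taken to mean the client asks for work when its queue drops below the low mark and requests enough to reach the high mark. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    ram_bytes: float
    disk_free_bytes: float
    flops: float  # measured floating-point speed, operations per second


@dataclass
class Workunit:
    ram_bytes: float
    disk_bytes: float
    fp_ops: float  # estimated computation, floating-point operations


def feasible(host: Host, wu: Workunit) -> bool:
    """A workunit is feasible if its RAM and disk needs fit the host."""
    return wu.ram_bytes <= host.ram_bytes and wu.disk_bytes <= host.disk_free_bytes


def seconds_to_request(queued_seconds: float, low_water: float, high_water: float) -> float:
    """Ask for nothing until the queue falls below the low mark, then refill to the high mark."""
    if queued_seconds >= low_water:
        return 0.0
    return high_water - queued_seconds


def pick_work(host: Host, candidates, seconds_wanted: float):
    """Hand out feasible workunits until the requested duration is covered."""
    chosen = []
    remaining = seconds_wanted
    for wu in candidates:
        if remaining <= 0:
            break
        if feasible(host, wu):
            chosen.append(wu)
            remaining -= wu.fp_ops / host.flops  # estimated run time on this host
    return chosen
```

The gap between the two marks sets how often a host contacts the server, which matters for hosts with sporadic connections.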
Accounting and result validation
  Standardized unit of credit (CPEuro?)
    –CPU time * (int+FP+mem)
  Result validation (optional):
    –Compare redundant results, flag incorrect results
  Granted credit:
    –Minimum of claimed credit among correct results
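The credit rules above can be sketched in a few lines. The formula and the "minimum among correct results" rule come from the slide; the majority-vote comparison in `validate` is an assumption, since the slide says only that redundant results are compared. Function names are hypothetical.

```python
from collections import Counter


def claimed_credit(cpu_seconds, int_score, fp_score, mem_score):
    """Credit claimed by a host: CPU time * (int + FP + mem benchmark scores)."""
    return cpu_seconds * (int_score + fp_score + mem_score)


def validate(outputs):
    """Flag each redundant result as correct if it matches the most common output.

    Assumption: agreement of at least two results defines "correct".
    """
    canonical, votes = Counter(outputs).most_common(1)[0]
    if votes < 2:
        return [False] * len(outputs)
    return [out == canonical for out in outputs]


def granted_credit(results):
    """results: list of (claimed_credit, is_correct). Grant the minimum correct claim."""
    correct_claims = [claim for claim, ok in results if ok]
    return min(correct_claims) if correct_claims else 0.0
```

Granting the minimum claim means a host gains nothing by inflating its own CPU time or benchmark scores, as long as at least one honest host returns a matching result.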
Participant preferences
  Examples:
    –Work only while user away
    –Confirm before connecting
    –Don’t work if on batteries
    –High, low water marks
    –Limits on disk space, bandwidth
    –Application-specific preferences
    –List of projects + authenticators + % allocation
  Edited via Web interface
  Can define multiple “preference sets”
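As an illustration of how a client could act on these preferences, the sketch below splits available CPU time by each project's % allocation and checks two of the run conditions. The preference keys and function names are hypothetical, not BOINC's actual preference format.

```python
def apportion(seconds_available, shares):
    """Split CPU time among projects in proportion to their % allocation."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive allocation")
    return {project: seconds_available * share / total
            for project, share in shares.items()}


def may_compute(prefs, user_active, on_batteries):
    """Apply the 'work only while user away' and 'don't work if on batteries' rules."""
    if prefs.get("work_only_while_away") and user_active:
        return False
    if prefs.get("no_work_on_batteries") and on_batteries:
        return False
    return True
```

Normalizing by the sum of the shares keeps the split well defined even if the listed percentages do not add up to 100.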
Participation
  Initial project:
    –Create account on project web site
    –Authenticator is emailed
    –Install core client, enter authenticator
  Subsequent projects:
    –Authenticator is emailed
    –Create account on project web site
    –Add project to preferences on home site
Client/server protocol (XML-RPC)
  Request
    –Authentication
    –Host description
    –Persistent file descriptions
    –Result descriptions
    –Duration of work requested
  Reply
    –Application, workunit, result descriptors
    –Result acknowledgements
    –File transfer descriptors
    –New preferences
    –Control messages (redirect, back off, etc.)
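To make the request side concrete, here is a sketch that assembles the five request parts listed above into one XML document. Every element and attribute name (`scheduler_request`, `work_req_seconds`, and so on) is invented for illustration; the slide does not give the actual wire format.

```python
import xml.etree.ElementTree as ET


def build_request(authenticator, host, persistent_files, results, work_seconds):
    """Build a scheduler request with hypothetical element names."""
    root = ET.Element("scheduler_request")
    # Authentication
    ET.SubElement(root, "authenticator").text = authenticator
    # Host description: one child element per measurement
    host_el = ET.SubElement(root, "host")
    for key, value in host.items():
        ET.SubElement(host_el, key).text = str(value)
    # Persistent file descriptions
    for name in persistent_files:
        ET.SubElement(root, "persistent_file", name=name)
    # Result descriptions: (name, exit_code) pairs
    for name, exit_code in results:
        result_el = ET.SubElement(root, "result", name=name)
        ET.SubElement(result_el, "exit_code").text = str(exit_code)
    # Duration of work requested
    ET.SubElement(root, "work_req_seconds").text = str(work_seconds)
    return ET.tostring(root, encoding="unicode")
```

Reporting finished results and asking for new work in the same request keeps the number of connections low, which suits clients that are only sporadically online.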
Core client: goals
  Function
    –Use multiprocessors
    –Concurrent communicate/compute
    –Low-profile file transfer
    –Obey preferences
  Appearance
    –Application, screensaver or service
  Multi-platform
Client file structure
  [Diagram labels: BOINC home directory; boinc.exe, client_state.xml, accounts.xml; Project_1, Project_n; app1.exe, file1332, file3328; CPU_1, CPU_m; app.exe, infile, outfile; links]
Client FSM structure
  [Diagram labels: main loop, poll; file transfers; HTTP transactions; active sockets, select(); running applications, wait(); scheduler requests]
Application Programming
  API is optional
  Checkpoint/restart: MFILE class
  Graphics
    –Application window
    –Screensaver
  Interact w/ core client via XML files
Conclusion
  BOINC status
    –Mostly feature-complete
    –Client runs on Linux, Solaris, Windows, MacOS X
    –Small: client is 5,000 lines, server 2,000
  Projects:
    –Astropulse (later this year)
    –Other (Parkes etc.)
    –Genetic art
    –Climate, oceanography projects