Published by Miles Johnston. Modified over 8 years ago.
1
BOINC: An Open Platform for Public-Resource Computing
David P. Anderson
Space Sciences Laboratory
U.C. Berkeley
2
Public-resource computing
Your computers: home PCs, business, academic
Advantages: scale free growth public education no policy issues
Challenges:
–low BW at client
–costly BW at server
–firewall/NAT issues
–sporadic connection
–untrustworthy, insecure clients
–server security
–heterogeneity
–need PR, glitzy GUI
3
SETI@home
Uses public-resource computing for radio DSP
Better data analysis:
–Signal types (Gaussian, spike, pulsed, triplet)
–Multiple frequency resolutions
–Multiple drift rates, coherent integration
Tradeoff: smaller frequency band
–2.5 MHz vs. 100 MHz
4
SETI@home Operations
[Diagram components: data recorder, DLT tapes, splitters, WU storage, data server, screensavers, result queue, acct. queue, science DB, user DB, master DB, garbage collector, tape archive/delete, tape backup, redundancy checking, RFI elimination, repeat detection, web site, CGI program, web page generator]
5
Achievements of SETI@home
1,000,000 years of CPU time in 3 years
Sustained 30 TeraFLOPs
1.5E21 floating-point operations
3,600,000 users in 226 countries
40 Terabytes of data processed
3 billion “events” detected
Solved scaling, security problems
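As a rough consistency check on the figures above, the following Python sketch derives the average number of CPUs busy at any moment and the time needed to reach 1.5E21 operations at the sustained rate. The variable names are illustrative, and the second calculation assumes the 30 TeraFLOPs rate was constant, which the slide does not state.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# 1,000,000 CPU-years delivered over 3 calendar years.
cpu_years = 1_000_000
wall_years = 3
avg_cpus = cpu_years / wall_years  # average number of CPUs busy at any moment

# Time to accumulate 1.5E21 operations at a constant 30 TeraFLOPs (assumption).
sustained_flops = 30e12
total_ops = 1.5e21
years_at_rate = total_ops / sustained_flops / SECONDS_PER_YEAR

print(f"average concurrent CPUs: {avg_cpus:,.0f}")
print(f"years at sustained rate: {years_at_rate:.2f}")
```

The second figure comes out shorter than 3 years, which would be consistent with the sustained rate having been reached only after an initial growth period; that reading is an inference, not something the slide says.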
6
Extensions of SETI@home
Broadband pulse search on existing data
Parkes observatory: Southern sky
Multi-beam receivers
Expanded frequency range
Use KL transform
Data archival on clients
7
SETI@home software shortcomings
Monolithic
Limited communication model
Limited computation/data model
Incoherent, insecure accounting
8
Goals of a PRC platform
[Diagram: projects (Research lab X, University Y, Public project Z), applications, resource pool]
Participants install one program, select projects, specify constraints; all else is automatic
Projects are autonomous
Advantages of a shared platform:
–Better long-term resource utilization
–Better instantaneous resource utilization
–Faster/cheaper for projects, software is better
–Easier for projects to get participants
–Participants learn more
9
Distributed computing platforms
Academic and open-source
–Globus
–Cosm
–XtremWeb
–Jxta
Commercial
–Entropia
–United Devices
–Parabon
10
Goals of BOINC (Berkeley Open Infrastructure for Network Computing)
Public-resource computing/storage
Multi-project, multi-application
–Participants can apportion resources
Handle fairly diverse applications
Work with legacy apps
Support many participant platforms
Small, simple
11
General structure of BOINC
Project:
–Scheduling server (C++)
–BOINC DB (MySQL)
–Project work manager
–Data servers (HTTP)
–Web interfaces (PHP)
Participant:
–App agent
–Core agent (C++)
12
Project Files
Attributes:
–Name
–URL list
–Persistent flag
–Upload-when-present flag
–Executable
–MD5 checksum
Files are immutable
Files may originate in client or in project work manager
Example: protein_db.12 http://a.b/c ftp://x.y/z fw7398h 4782747
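The file attributes listed above can be sketched as a small immutable record with a checksum check. This is a minimal Python illustration, not BOINC's actual data structure: the class name `FileInfo`, its field names, and the helper functions are hypothetical, and the payload is made up for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen mirrors "files are immutable"
class FileInfo:
    name: str
    urls: Tuple[str, ...]
    md5: str
    persistent: bool = False
    upload_when_present: bool = False
    executable: bool = False


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def verify(info: FileInfo, data: bytes) -> bool:
    """Return True if the downloaded bytes match the recorded checksum."""
    return md5_hex(data) == info.md5


# Made-up contents standing in for a downloaded file.
payload = b"example protein database contents"
info = FileInfo(
    name="protein_db.12",
    urls=("http://a.b/c", "ftp://x.y/z"),
    md5=md5_hex(payload),
    persistent=True,
)
print(verify(info, payload))
print(verify(info, payload + b"corrupted"))
```

Because a file never changes once named, a client can treat a name plus a matching checksum as proof that it already holds the right bytes and skip the download.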
13
Projects/apps/versions
Platforms: Windows/x86, Linux/x86, MacOS/PPC, …
Applications:
name            Beta version    Production version
Arecibo SETI    12              9
Parkes SETI     11              5
…               …               …
App versions:
version    File infos
11         …
13         …
…          …
14
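Workunits and results
Workunit columns: name, XML doc, expected resources, cmdline args (example rows: wu15, wu16)
Result columns: name, exit code, CPU time, XML in, XML out, stderr out
[Diagram: workunits and results linked to an application and a host; the XML documents describe input and output files such as out123 with entries like http://… and 182aed847; a failed result carries stderr output such as "Foo.C, line 124: divide by zero"]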
15
Work sequences (long computations with big footprints)
Results (or workunits) are linked into “sequences”
Normally, a result is sent only to the host that handled the previous result
If a result times out, the sequence is shifted to another host
Upload state
Check for abort
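The host-affinity rule for sequences described above can be sketched as follows. This is a simplified Python model under stated assumptions: `WorkSequence` and `request_work` are hypothetical names, the timed-out result is simply reissued to the new host, and the state upload and abort check mentioned on the slide are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkSequence:
    name: str
    host: Optional[str] = None  # host that handled the previous result
    next_index: int = 0         # index of the next result in the sequence


def request_work(seq: WorkSequence, requesting_host: str,
                 previous_timed_out: bool = False) -> Optional[str]:
    """Return the name of the result to send to requesting_host, or None."""
    if previous_timed_out:
        # Shift the sequence to the requesting host and redo the lost result.
        seq.host = requesting_host
        seq.next_index = max(seq.next_index - 1, 0)
    elif seq.host is None:
        # First result of a new sequence: pin it to this host.
        seq.host = requesting_host
    if seq.host != requesting_host:
        return None  # the sequence is pinned to a different host
    result_name = f"{seq.name}.{seq.next_index}"
    seq.next_index += 1
    return result_name


seq = WorkSequence("climate")
print(request_work(seq, "hostA"))
print(request_work(seq, "hostB"))
print(request_work(seq, "hostA"))
print(request_work(seq, "hostB", previous_timed_out=True))
```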
16
Remote file management
Clients regularly report persistent files
Scheduling server maintains DB of files on active hosts
Project work manager can issue requests for particular hosts to upload, download, or delete files
17
Hosts and scheduling
Host measurements
–CPU performance (integer/FP/memory)
–RAM, cache, disk free/total
–On/connected statistics
–Network bandwidth statistics
Workunit properties
–RAM/disk/computation requirements
Scheduling policy
–Feasibility
–High/low water mark
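One way to read the scheduling policy above is sketched below in Python. The interpretation is an assumption: a workunit is feasible if its RAM and disk requirements fit the host, and a host receives work only when its queue falls below the low water mark, at which point it is refilled toward the high water mark. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    ram_mb: float
    disk_free_mb: float
    flops: float           # measured FP performance, operations per second
    queued_seconds: float  # estimated work already queued on the host


@dataclass
class Workunit:
    name: str
    ram_mb: float
    disk_mb: float
    est_ops: float         # estimated computation, in operations


def feasible(host: Host, wu: Workunit) -> bool:
    """A workunit is feasible if its RAM and disk needs fit the host."""
    return wu.ram_mb <= host.ram_mb and wu.disk_mb <= host.disk_free_mb


def schedule(host: Host, candidates: List[Workunit],
             low_water: float, high_water: float) -> List[Workunit]:
    """Send nothing unless the queue is below low_water seconds;
    otherwise add feasible workunits until high_water is reached."""
    chosen: List[Workunit] = []
    if host.queued_seconds >= low_water:
        return chosen
    queued = host.queued_seconds
    for wu in candidates:
        if queued >= high_water:
            break
        if feasible(host, wu):
            chosen.append(wu)
            queued += wu.est_ops / host.flops
    return chosen


host = Host(ram_mb=512, disk_free_mb=1000, flops=1e9, queued_seconds=3600)
candidates = [
    Workunit("wu15", ram_mb=256, disk_mb=100, est_ops=3.6e13),
    Workunit("wu16", ram_mb=1024, disk_mb=100, est_ops=3.6e13),  # too much RAM
    Workunit("wu17", ram_mb=128, disk_mb=50, est_ops=3.6e13),
]
chosen = schedule(host, candidates, low_water=2 * 3600, high_water=24 * 3600)
print([wu.name for wu in chosen])
```

The gap between the two marks sets how often a sporadically connected client needs to contact the server: a wider gap means fewer, larger requests.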
18
Accounting and result validation
Standardized unit of credit (CPEuro?)
–CPU time * (int+FP+mem)
Result validation (optional):
–Compare redundant results, flag incorrect results
Granted credit:
–Minimum of claimed credit among correct results
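The credit rules above can be sketched in Python. The claimed-credit formula and the minimum-of-correct rule follow the slide; deciding which results are "correct" by strict majority agreement is an assumption made here, since the slide only says that redundant results are compared. Function names and sample values are illustrative.

```python
from collections import Counter
from typing import List, Optional, Tuple


def claimed_credit(cpu_seconds: float, int_bench: float,
                   fp_bench: float, mem_bench: float) -> float:
    """Claimed credit = CPU time * (int + FP + mem) benchmark scores."""
    return cpu_seconds * (int_bench + fp_bench + mem_bench)


def validate_and_grant(
    results: List[Tuple[str, float]]
) -> Tuple[Optional[str], float, List[bool]]:
    """results holds (output, claimed_credit) pairs for one workunit.

    Returns (canonical_output, granted_credit, per-result correctness).
    Assumption: an output is correct if a strict majority of results agree.
    """
    counts = Counter(output for output, _ in results)
    canonical, votes = counts.most_common(1)[0]
    if votes <= len(results) / 2:
        return None, 0.0, [False] * len(results)  # no majority yet
    correct = [output == canonical for output, _ in results]
    # Granted credit is the minimum claim among the correct results.
    granted = min(claim for (_, claim), ok in zip(results, correct) if ok)
    return canonical, granted, correct


results = [("abc", 120.0), ("abc", 100.0), ("xyz", 50.0)]
canonical, granted, correct = validate_and_grant(results)
print(canonical, granted, correct)
```

Granting the minimum rather than the average removes the incentive to inflate a claim: an exaggerated claim can never raise what anyone is granted.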
19
Participant preferences
Examples:
–Work only while user away
–Confirm before connecting
–Don’t work if on batteries
–High, low water marks
–Limits on disk space, bandwidth
–Application-specific preferences
–List of projects + authenticators + % allocation
Edited via Web interface
Can define multiple “preference sets”
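A few of the preferences above can be modeled as a small record plus two helper functions. This Python sketch is illustrative only: the `Preferences` class, its fields, and the way the % allocation is turned into CPU time are assumptions, not the format BOINC uses.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Preferences:
    work_only_when_away: bool = True
    run_on_batteries: bool = False
    # project name -> % allocation (hypothetical representation)
    project_shares: Dict[str, float] = field(default_factory=dict)


def may_compute(prefs: Preferences, user_active: bool, on_batteries: bool) -> bool:
    """Decide whether the client may run work right now."""
    if prefs.work_only_when_away and user_active:
        return False
    if on_batteries and not prefs.run_on_batteries:
        return False
    return True


def cpu_seconds_per_project(prefs: Preferences, total_seconds: float) -> Dict[str, float]:
    """Split a CPU-time budget across projects in proportion to their shares."""
    total_share = sum(prefs.project_shares.values())
    return {
        project: total_seconds * share / total_share
        for project, share in prefs.project_shares.items()
    }


prefs = Preferences(project_shares={"SETI@home": 75, "Astropulse": 25})
print(may_compute(prefs, user_active=False, on_batteries=False))
print(cpu_seconds_per_project(prefs, 3600))
```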
20
Participation
Initial project:
–Create account on project web site
–Authenticator is emailed
–Install core client, enter authenticator
Subsequent projects:
–Create account on project web site
–Authenticator is emailed
–Add project to preferences on home site
21
Client/server protocol (XML-RPC)
Request
–Authentication
–Host description
–Persistent file descriptions
–Result descriptions
–Duration of work requested
Reply
–Application, workunit, result descriptors
–Result acknowledgements
–File transfer descriptors
–New preferences
–Control messages (redirect, back off, etc.)
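To make the request contents above concrete, here is a minimal Python sketch that assembles them into one XML document with the standard library. It shows only the shape of such a message: every element and attribute name is an illustrative assumption, and the document is plain XML rather than a real XML-RPC envelope.

```python
import xml.etree.ElementTree as ET


def build_request(authenticator, host, persistent_files, results, work_seconds):
    """Assemble a scheduler request as an XML string (hypothetical schema)."""
    root = ET.Element("scheduler_request")
    ET.SubElement(root, "authenticator").text = authenticator

    # Host description: one child element per measured property.
    host_el = ET.SubElement(root, "host")
    for key, value in host.items():
        ET.SubElement(host_el, key).text = str(value)

    # Persistent files currently held by the client.
    for name in persistent_files:
        ET.SubElement(root, "persistent_file", name=name)

    # Completed results being reported.
    for r in results:
        res = ET.SubElement(root, "result", name=r["name"])
        ET.SubElement(res, "exit_code").text = str(r["exit_code"])
        ET.SubElement(res, "cpu_time").text = str(r["cpu_time"])

    # Duration of work requested, in seconds.
    ET.SubElement(root, "work_req_seconds").text = str(work_seconds)
    return ET.tostring(root, encoding="unicode")


xml_text = build_request(
    authenticator="abc123",
    host={"os": "Linux", "ram_mb": 512},
    persistent_files=["protein_db.12"],
    results=[{"name": "res123", "exit_code": 0, "cpu_time": 13.2}],
    work_seconds=86400,
)
print(xml_text)
```

Sending everything in a single request suits clients with sporadic connections, since one round trip both reports finished work and fetches new work.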
22
Core client: goals
Function
–Use multiprocessors
–Concurrent communicate/compute
–Low-profile file transfer
–Obey preferences
Appearance
–Application, screensaver or service
Multi-platform
23
Client file structure
BOINC home directory
    boinc.exe
    client_state.xml
    accounts.xml
    Project_1 … Project_n
        app1.exe
        file1332
        file3328
    CPU_1 … CPU_m
        app.exe, infile, outfile (links to files in the project directories)
24
Client FSM structure
Main loop polls:
–Scheduler requests
–File transfers
–HTTP transactions
–Active sockets: select()
–Running applications: wait()
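The structure above, a main loop that polls a set of finite-state machines, can be sketched in Python. This is a toy model: the `FileTransfer` class and its states are hypothetical, and the real client's use of select() on sockets and wait() on application processes is replaced by simple counters.

```python
class FileTransfer:
    """Toy finite-state machine: each poll() advances at most one step."""

    def __init__(self, name, chunks):
        self.name = name
        self.remaining = chunks  # chunks still to transfer (must be >= 1)
        self.state = "CONNECTING"

    def poll(self):
        """Advance without blocking; return True if progress was made."""
        if self.state == "CONNECTING":
            self.state = "TRANSFERRING"
            return True
        if self.state == "TRANSFERRING":
            self.remaining -= 1
            if self.remaining <= 0:
                self.state = "DONE"
            return True
        return False  # DONE: nothing left to do


def main_loop(machines, max_iterations=100):
    """Poll every machine on each pass; stop when none makes progress."""
    iterations = 0
    while iterations < max_iterations:
        iterations += 1
        # Build a list so that every machine is polled on every pass.
        progressed = [m.poll() for m in machines]
        if not any(progressed):
            break
    return iterations


machines = [FileTransfer("a", chunks=2), FileTransfer("b", chunks=1)]
passes = main_loop(machines)
print(passes, [m.state for m in machines])
```

Because no poll() call blocks, a single thread can interleave communication and computation, which matches the "concurrent communicate/compute" goal of the core client.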
25
Application Programming
API is optional
Checkpoint/restart: MFILE class
Graphics
–Application window
–Screensaver
Interact w/ core client via XML files
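Checkpoint/restart, mentioned above, can be illustrated with a generic Python sketch. It does not reproduce the MFILE class or any BOINC API; it only shows the general technique of writing state to a temporary file and renaming it atomically, so that an interrupted run resumes from the last complete checkpoint. The function names and the JSON state layout are assumptions.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then atomically replace the checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename: never a half-written file


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_i": 0, "total": 0}


def run(path, n, stop_after=None):
    """Sum 0..n-1, checkpointing after every step; stop_after simulates a kill."""
    state = load_checkpoint(path)
    steps = 0
    while state["next_i"] < n:
        state["total"] += state["next_i"]
        state["next_i"] += 1
        save_checkpoint(path, state)
        steps += 1
        if stop_after is not None and steps >= stop_after:
            break
    return state["total"]


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "checkpoint.json")
    partial = run(ckpt, 10, stop_after=4)  # interrupted after 4 steps
    final = run(ckpt, 10)                  # resumes from the checkpoint
    print(partial, final)
```

A real application would checkpoint far less often than once per step; the per-step write here just keeps the example short.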
26
Conclusion
BOINC status
–Mostly feature-complete
–Client runs on Linux, Solaris, Windows, MacOS X
–Small: client is 5,000 lines, server 2,000
Projects:
–Astropulse (later this year)
–Other SETI@home (Parkes etc.)
–Genetic art
–Climate, oceanography projects