1
The software infrastructure of SETI@home II
David P. Anderson, Space Sciences Laboratory, U.C. Berkeley
2
Public-resource computing
Home PCs; your computers; academic; business
Challenges: low bandwidth at client; costly bandwidth at server; firewall/NAT issues; sporadic connection; untrustworthy, insecure clients; server security; heterogeneity; must recruit participants
Advantages: scale; free growth; public education; no institutional policy issues
3
Achievements of SETI@home
1,000,000 years of CPU time in 3 years
Sustained 30 TeraFLOPs
1.5E21 floating-point operations
3,600,000 users in 226 countries
40 Terabytes of data processed
3 billion "events" detected
Solved scaling, security problems
4
SETI@home II
Broadband pulse search on existing data
Parkes observatory: Southern sky
Multi-beam receivers
Wider frequency band
Use KL transform
Data archival on clients
5
SETI@home software shortcomings
Monolithic client and server
Limited communication model
Limited computation/data model
Ad hoc accounting model
6
PRC platform goals
Diagram labels: applications; projects (Research lab X, University Y, Public project Z); resource pool
Participants install one program, select projects, specify constraints
Projects are autonomous
Advantages of a shared platform:
Better instantaneous resource utilization
Better long-term resource utilization
Faster/cheaper for projects, software is better
Easier for projects to get participants
Participants learn more
7
Distributed computing platforms
Academic and open-source: Globus, Cosm, XtremWeb, Jxta
Commercial: Entropia, United Devices, Avaki
8
BOINC (Berkeley Open Infrastructure for Network Computing)
Overall structure
Storage model
Computation model
Programming interface
Operational interface
Participant's view
9
Overall structure
Project: BOINC DB (MySQL); project work manager lib; scheduling server (C++); Web interfaces (PHP); data servers (HTTP)
Participant: core agent (C++); app agents
10
Storage model
Files: input, output, executables
Created by client or project
Files are immutable
File transfer by HTTP
File attributes: name; URL list; persistent; upload-when-present; executable; MD5 checksum; digital signature
Example:
    <file_info>
        <name>protein_db.12</name>
        <persistent/>
        <url></url>
        <url>ftp://x.y/z</url>
        <md5_cksum>fw7398h</md5_cksum>
        <nbytes></nbytes>
    </file_info>
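The file attributes above can be sketched as a small parser and integrity check. This is only an illustration under assumptions: the helper names (`build_file_info`, `parse_file_info`, `verify_download`) are hypothetical, the element layout is taken from the slide's example rather than from the BOINC source, the hex-digest MD5 encoding is assumed, and digital signatures are left out.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_file_info(name, url, payload):
    """Build a <file_info> string for a payload (layout assumed from the slide)."""
    return (
        "<file_info>"
        f"<name>{name}</name>"
        "<persistent/>"
        f"<url>{url}</url>"
        f"<md5_cksum>{hashlib.md5(payload).hexdigest()}</md5_cksum>"
        f"<nbytes>{len(payload)}</nbytes>"
        "</file_info>"
    )


def parse_file_info(xml_text):
    """Extract the attributes listed on the slide into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "urls": [u.text for u in root.findall("url")],
        # Flag-style attributes are present as empty elements or absent.
        "persistent": root.find("persistent") is not None,
        "upload_when_present": root.find("upload_when_present") is not None,
        "md5": root.findtext("md5_cksum"),
        "nbytes": int(root.findtext("nbytes")),
    }


def verify_download(info, data):
    """Accept data only if both the size and the MD5 checksum match."""
    return (
        len(data) == info["nbytes"]
        and hashlib.md5(data).hexdigest() == info["md5"]
    )
```

Because files are immutable, a checksum recorded once in the description stays valid for every later transfer of the same file name.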
11
File management
Implicit: executables, input and output files are transferred pursuant to computation
Explicit:
Clients report persistent files
Scheduling server maintains DB of files on active hosts
Project can request upload, download, delete
12
Workunits
A workunit represents the inputs to a computation
Components:
Cmdline args, environment vars
Expected resource usage
Description of input files
Example:
    <file_info>
        <name>out123</name>
        <url></url>
    </file_info>
    <workunit>
        <file_assoc>
            <file_name>out123</file_name>
            <app_name>input</app_name>
        </file_assoc>
    </workunit>
13
Results
A result represents the results of a computation
Components:
Which host did the computation
Exit status
Stderr output
CPU time
Output file description
Template:
    <file_info>
        <name>out123</name>
        <generated_locally/>
        <upload_when_present/>
        <url></url>
    </file_info>
    <result>
        <file_assoc>
            <file_name>out123</file_name>
            <fd>1</fd>
        </file_assoc>
    </result>
Actual:
    <file_info>
        <name>out123</name>
        <url></url>
        <md5_cksum>182aed847</md5_cksum>
    </file_info>
14
Work sequences (long computations with big footprints)
Results can be linked into sequences
Result is sent to host that handled predecessor
If result times out, sequence is shifted to another host
Upload state
Check for abort
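The host-affinity rule for work sequences might look roughly like the sketch below. The `WorkSequence` class, its method names, and the "first other candidate" fallback are all hypothetical; the slide only states that a result goes to the host that handled its predecessor and that a timeout shifts the sequence elsewhere.

```python
class WorkSequence:
    """Hypothetical model of a linked sequence of results pinned to one host."""

    def __init__(self, seq_id, host):
        self.seq_id = seq_id
        self.host = host          # host that handled the predecessor
        self.completed = 0        # number of results finished so far

    def host_for_next_result(self):
        """The next result is sent to the host that handled its predecessor."""
        return self.host

    def on_result_done(self):
        self.completed += 1

    def on_timeout(self, candidate_hosts):
        """Shift the sequence to another host; return it, or None if none exists."""
        for candidate in candidate_hosts:
            if candidate != self.host:
                self.host = candidate
                return candidate
        return None
```

A real scheduler would also have to move the uploaded state to the new host, which is presumably what the "Upload state" item refers to; that part is not modeled here.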
15
Hosts and scheduling
Host measurements: CPU performance (integer/FP/memory); RAM, cache, disk free/total; on/idle/connected statistics; network bandwidth statistics
Workunit properties: RAM/disk/computation requirements
Scheduling policy:
Client: project quotas; high/low water marks
Server: workunit feasibility test; prioritization
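A server-side workunit feasibility test could be sketched as follows. The field names, the use of an "on" fraction to discount CPU speed, and the deadline parameter are assumptions made for illustration; the slide names the inputs (host measurements, workunit requirements) but not the actual rule.

```python
from dataclasses import dataclass


@dataclass
class Host:
    fp_ops_per_sec: float    # measured floating-point performance
    ram_bytes: int
    disk_free_bytes: int
    on_fraction: float       # fraction of time the host is on (0.0 to 1.0)


@dataclass
class Workunit:
    est_fp_ops: float        # expected computation
    ram_bytes: int           # expected RAM usage
    disk_bytes: int          # expected disk usage
    deadline_sec: float      # hypothetical time limit for returning the result


def is_feasible(host, wu):
    """Return True if the host can plausibly finish the workunit in time."""
    if wu.ram_bytes > host.ram_bytes:
        return False
    if wu.disk_bytes > host.disk_free_bytes:
        return False
    # Discount raw speed by how often the host is actually available.
    effective_rate = host.fp_ops_per_sec * host.on_fraction
    if effective_rate <= 0:
        return False
    return wu.est_fp_ops / effective_rate <= wu.deadline_sec
```

For example, a host measured at 1e9 FP operations per second that is on half the time needs about 2000 seconds for a 1e12-operation workunit, so it passes with a one-hour deadline and fails with a 1000-second one.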
16
Accounting and result validation
Standardized unit of credit (CPeUro?): CPU time * (int+FP+mem)
Result validation (optional): compare redundant results, flag incorrect results
Granted credit: minimum of claimed credit among correct results
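The credit rule above can be written out as a short sketch. The claimed-credit formula and the "minimum among correct results" rule come from the slide; deciding which results are correct by strict majority agreement is an assumption, as are the function names and the representation of a result as a `(claimed_credit, output)` pair.

```python
from collections import Counter


def claimed_credit(cpu_time_sec, int_score, fp_score, mem_score):
    """Claimed credit: CPU time * (int + FP + mem benchmark scores)."""
    return cpu_time_sec * (int_score + fp_score + mem_score)


def validate_and_grant(results):
    """Compare redundant results and grant credit.

    results: list of (claimed_credit, output) pairs for one workunit.
    Returns (granted_credit, flagged_indices). Granted credit is None when
    no output is shared by a strict majority (assumed validation rule).
    """
    counts = Counter(output for _, output in results)
    canonical, votes = counts.most_common(1)[0]
    if votes * 2 <= len(results):
        return None, []
    flagged = [i for i, (_, output) in enumerate(results) if output != canonical]
    correct_claims = [claimed for claimed, output in results if output == canonical]
    # Granted credit: minimum of claimed credit among correct results.
    return min(correct_claims), flagged
```

Taking the minimum means a host that inflates its benchmark scores cannot raise the credit granted for a workunit as long as at least one honest host returns a correct result.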
17
Programming interfaces
Application:
May be multi-file; any executable
API for interaction with core client (optional)
Checkpoint/restart: MFILE class
Graphics: render to shared memory
Software development tools:
Version management
Web-based bug tracking
18
Operational interfaces
Operations:
Add/manage app versions
Create workunits/results
Query results
Query client problems
Interfaces:
C++ libraries
Scriptable apps
Web-based
19
Participant preferences
Examples:
Work only while computer idle
Confirm before connecting
Don't work if running on batteries
High, low water marks
Limits on disk space, bandwidth
Application-specific preferences
List of projects + authenticators + % allocation
Edited via Web interface
Can define multiple "preference sets"
20
Participation
Initial project registration:
Create account on project web site
Authenticator is emailed
Install core client, enter authenticator
Subsequent projects:
Add project to preferences on home site
21
Core client
Goals:
Obey user preferences
Application, screensaver or service
Multi-platform; multiprocessor-capable
Concurrent communicate/compute
FSM structure:
Main loop polls file transfers, scheduler requests, HTTP transactions
Running applications: wait()
Active sockets: select()
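The polled finite-state-machine structure can be illustrated with a toy main loop. Everything here is a hypothetical stand-in: the real core client is C++ and drives sockets with select() and applications with wait(), whereas this sketch replaces both with simple counters so that it runs on its own.

```python
class FileTransfer:
    """Toy FSM: INIT -> TRANSFERRING -> DONE, one chunk per poll."""

    def __init__(self, name, chunks):
        self.name = name
        self.chunks_left = chunks
        self.state = "INIT"

    def poll(self):
        """Advance by at most one step; return True if anything changed."""
        if self.state == "INIT":
            self.state = "TRANSFERRING"
            return True
        if self.state == "TRANSFERRING":
            self.chunks_left -= 1
            if self.chunks_left <= 0:
                self.state = "DONE"
            return True
        return False


def run_until_idle(machines, max_iters=1000):
    """Main loop: poll every machine each pass until none makes progress.

    Returns the number of passes in which at least one machine advanced.
    """
    passes = 0
    for _ in range(max_iters):
        # Build the full list first so every machine is polled each pass.
        progressed = [m.poll() for m in machines]
        if not any(progressed):
            break
        passes += 1
    return passes
```

Because each `poll()` does a bounded amount of work and never blocks, several transfers and computations can advance in the same pass, which is one way to get concurrent communication and computation from a single thread.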
22
Conclusion
BOINC features:
Multiproject, multi-app open PRC platform
Simple/small but general
BOINC status:
Mostly feature-complete
Client runs on Linux, Solaris, Windows, Mac OS X
Projects:
Arecibo (later this year)
Other (Parkes etc.)
Climate modeling, other science projects
Genetic art