David P. Anderson Space Sciences Laboratory University of California – Berkeley Public Distributed Computing with BOINC.

1 David P. Anderson Space Sciences Laboratory University of California – Berkeley Public Distributed Computing with BOINC

2 Public-resource computing ● Advantages: scale free growth; public education; no policy issues ● Challenges: low bandwidth at client; costly bandwidth at server; firewall/NAT issues; sporadic connection; untrustworthy, insecure clients; server security; heterogeneity; need PR, glitzy GUI ● [Diagram labels: your computers, academic, business, home PCs]

3 Public-resource computing (cont.) ● 1 billion Internet-connected PCs in 2010 ● 50% privately owned ● If 10% participate: – At least 100 PetaFLOPs, 1 Exabyte (10^18 bytes) storage ● [Figure: public computing, Grid computing, cluster computing, and supercomputing; axis labels: CPU power, storage capacity; cost]
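The estimate on this slide can be checked with a little arithmetic. The sketch below works out the per-PC contribution that the slide's totals imply. The slide does not say whether "10%" refers to all Internet-connected PCs or only to the privately owned half, so both readings are computed; the function name `implied_per_pc` is made up for this illustration.

```python
# Back-of-envelope check of the slide's totals, using integer arithmetic.
TOTAL_PCS = 1_000_000_000          # slide: 1 billion Internet-connected PCs
PRIVATE_PERCENT = 50               # slide: 50% privately owned
PARTICIPATION_PERCENT = 10         # slide: "if 10% participate"
TOTAL_FLOPS = 100 * 10**15         # slide: at least 100 PetaFLOPs
TOTAL_STORAGE_BYTES = 10**18       # slide: 1 Exabyte

def implied_per_pc(base_pcs):
    """Return (FLOPS per PC, bytes per PC) if 10% of base_pcs participate."""
    participants = base_pcs * PARTICIPATION_PERCENT // 100
    return TOTAL_FLOPS // participants, TOTAL_STORAGE_BYTES // participants

# Reading A (assumption): 10% of all Internet-connected PCs.
all_pcs = implied_per_pc(TOTAL_PCS)

# Reading B (assumption): 10% of the privately owned PCs only.
private_pcs = implied_per_pc(TOTAL_PCS * PRIVATE_PERCENT // 100)

for label, (flops, storage) in (("all PCs", all_pcs), ("private PCs", private_pcs)):
    print(f"{label}: {flops / 1e9:.0f} GFLOPS and {storage / 1e9:.0f} GB per PC")
```

Under either reading the totals correspond to roughly 1 to 2 GFLOPS and 10 to 20 GB per participating PC.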

4 SETI@home ● Running since May 1999 ● ~500,000 active participants ● ~60 TeraFLOPs ● Problems with current software – hard to change/add algorithms – can't share participants w/ other projects – inflexible data architecture

5 SETI@home data architecture ● [Diagram comparing the current and ideal architectures; labels: tapes, Berkeley, Stanford, USC, commercial Internet, Internet2, participants, 50 Mbps]

6 BOINC: Berkeley Open Infrastructure for Network Computing ● Goals for computing projects – easy/cheap to create and operate DC projects – wide range of applications possible – no central authority ● Goals for participants – easy to participate in multiple projects – invisible use of disk, CPU, network

7 General structure of BOINC ● Project: – Scheduling server (C++) – BOINC DB (MySQL) – Data servers (HTTP) – Web interfaces (PHP) – Project back end: work generation, retry generation, result validation, result processing, garbage collection ● Participant: – Core agent (C++) – App agent

8 Data model ● Immutable files ● Replication across servers ● Can originate on clients or servers ● Can be retained on clients ● Computations can have multiple input and output files ● Applications can consist of multiple files

9 Computation model ● Redundant computing ● [Diagram labels: work generation, distribution, validation, canonical result, assimilation]
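The idea behind redundant computing can be sketched as follows: the same piece of work goes to several clients, and a result is accepted as canonical only when enough of the returned copies agree. This is a minimal illustration, not BOINC's validator. The function name `pick_canonical`, the quorum of 2, and the use of exact equality to compare results are all assumptions made for this sketch.

```python
from collections import Counter

def pick_canonical(results, quorum=2):
    """Return the result reported by at least `quorum` replicas, else None.

    `results` is a list of hashable result values, one per replica.
    Exact equality is an assumption of this sketch; a real project may
    need a tolerance-based comparison for floating-point output.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None

# Three replicas of one task: two agree, one differs (e.g. a faulty host).
print(pick_canonical(["0x3fa2", "0x3fa2", "0x91c7"]))

# No agreement yet: the task would need to be sent to more clients.
print(pick_canonical(["0x3fa2", "0x91c7"]))
```

When no value reaches the quorum the function returns `None`, which corresponds to generating further copies of the task before a canonical result can be chosen.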

10 Computation model (cont.) ● Scheduling – task resource estimates (disk/mem/CPU) – soft deadlines ● Long-running tasks – trickle messages, preemption ● API – minimal (file I/O, checkpoint, graphics)
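The scheduling bullet can be illustrated with a small feasibility check: compare a task's resource estimates with what a host offers, and estimate whether the task would finish before its deadline. This is a sketch only. The `Task` and `Host` types, their field names, and the `can_run` function are hypothetical and not BOINC's scheduler interface. The slide calls the deadlines soft, while this sketch treats the deadline as a simple filter.

```python
from dataclasses import dataclass

@dataclass
class Task:
    est_disk_bytes: int    # estimated disk usage
    est_mem_bytes: int     # estimated memory usage
    est_flops: float       # estimated total floating-point operations
    deadline_s: float      # time allowed before the result is due, in seconds

@dataclass
class Host:
    free_disk_bytes: int   # disk space the participant allows
    mem_bytes: int         # memory available to the task
    flops_per_s: float     # benchmarked speed of the host

def can_run(task, host):
    """True if the task fits the host and is expected to meet its deadline."""
    est_runtime_s = task.est_flops / host.flops_per_s
    return (task.est_disk_bytes <= host.free_disk_bytes
            and task.est_mem_bytes <= host.mem_bytes
            and est_runtime_s <= task.deadline_s)

task = Task(est_disk_bytes=10**8, est_mem_bytes=2 * 10**8,
            est_flops=3.6e12, deadline_s=86400.0)
fast_host = Host(free_disk_bytes=10**9, mem_bytes=10**9, flops_per_s=1e9)
slow_host = Host(free_disk_bytes=10**9, mem_bytes=10**9, flops_per_s=1e7)

print(can_run(task, fast_host))  # estimated runtime of 1 hour
print(can_run(task, slow_host))  # estimated runtime of 100 hours
```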

11 Participant features ● Can register with multiple projects, control resource allocation ● Preferences – global, per-project – edited via web interface ● Platforms: Windows, Mac OS X, Unix/Linux ● Anonymous platform mechanism ● Views – GUI, screensaver, Windows service

12 Participant Credit ● Goals: – credit for work actually done (CPU, network, storage) – don't know workunit size in advance – cheat-proof ● Integration with redundancy – claimed credit = benchmark * CPU time – granted credit = minimum claimed credit ● Handling graphics coprocessors – project-specific benchmarks
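The two credit formulas on this slide can be written out directly. In the sketch below the function names and the sample numbers are illustrative assumptions, and the slide does not give units for the benchmark, so the values are unitless.

```python
def claimed_credit(benchmark, cpu_time_s):
    """Slide formula: claimed credit = benchmark * CPU time."""
    return benchmark * cpu_time_s

def granted_credit(claims):
    """Slide formula: granted credit = minimum claimed credit among replicas."""
    return min(claims)

# Three hosts ran replicas of the same workunit (numbers are made up).
claims = [
    claimed_credit(benchmark=2.0, cpu_time_s=100.0),
    claimed_credit(benchmark=1.5, cpu_time_s=120.0),
    claimed_credit(benchmark=3.0, cpu_time_s=90.0),
]
print(claims)
print(granted_credit(claims))
```

Because the grant is the minimum over the replicas, a single host that inflates its benchmark or CPU time cannot raise the credit awarded for that workunit, which is how the formula ties into redundancy.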

13 Participant web features ● User profiles ● Forums ● Self-moderating FAQs ● Teams ● XML data export (3rd-party statistics reporting)

14 Projects ● Current (at Space Sciences Lab) – Astropulse (black hole / pulsar search) – SETI@home ● In progress – Folding@home (Stanford) – Climateprediction.net (Oxford) ● Planned – LIGO (physics) – CERN – DIMES (network performance study)

15 Summary and status ● Public distributed computing ● BOINC: a platform for PDC ● BOINC is funded by NSF ● Source code is free for noncommercial use: http://boinc.berkeley.edu

