Published by Miranda Henderson. Modified over 8 years ago.
1
Volunteer Computing with BOINC Dr. David P. Anderson University of California, Berkeley SC10 Nov. 14, 2010
2
Goals ● Learn about volunteer computing ● Learn how to create a volunteer computing project using BOINC Target audience: ● High-throughput computing users ● Technical skills: ● Basic Linux/Apache sysadmin, familiarity with PHP, SQL and XML, C/C++ (optional)
3
Schedule ● Session 1 ● Why use volunteer computing? ● Basic concepts of BOINC ● Developing BOINC applications (15 minute break) ● Session 2 ● Deploying a BOINC server ● Deploying applications ● Processing jobs ● Organizational issues
4
Software to install ● BOINC client ● http://boinc.berkeley.edu ● SSH client ● http://www.putty.org/
5
Part 1: Why volunteer computing?
6
The Consumer Digital Infrastructure ● 1 billion PCs ● current GPUs: 1 TeraFLOPS (1,000 ExaFLOPS total) ● Storage: ~1,000 Exabytes ● Commodity Internet: 10-1,000 Mbps to home ● Consumers pay for ● hardware ● sysadmin ● network costs ● electricity
7
Volunteer computing ● PC owners donate computing resources to projects (e.g., computational science) ● Applications run ● at zero priority while PC in use ● and/or while PC is not in use
8
Examples
Project          Start  Where        Area             Peak # hosts
GIMPS            1994                math             10,000
distributed.net  1995                cryptography     100,000
SETI@home I      1999   UCB          SETI             600,000
Folding@home     1999   Stanford     biology          200,000
United Devices   2002   commercial   biomedicine      200,000
CPDN             2003   Oxford       climate change   150,000
LHC@home         2004   CERN         physics          60,000
Predictor@home   2004   Scripps      biology          100,000
WCG              2004   commercial   biomedicine      200,000
Einstein@home    2005   LIGO         astrophysics     200,000
SETI@home II     2005   UCB          SETI             850,000
Rosetta@home     2005   U. Wash      biology          100,000
SIMAP            2005   T.U. Munich  bioinformatics   10,000
...
9
Current status ● ~50 projects ● 500,000 volunteers ● 800,000 computers ● ~12 PetaFLOPS
10
HPC paradigms [Figure: paradigms placed by # processors (1, 100, 1000, 10K-1M) and by single job (high-performance computing) vs. multiple jobs (high-throughput computing): cluster (MPI), supercomputer, cluster (batch), Grid, commercial cloud, volunteer computing]
11
Volunteer computing is different from other paradigms ● You don’t buy resources; you ask for them ● Resources are: – heterogeneous – sporadically available and connected – untrusted and not private – behind firewalls/NATs/proxies
12
The cost of 10 TeraFLOPS ● Amazon EC2 ● small instance: $.09/hour = $788/year ● 10 TeraFLOPS = 5,000 instances ● $3.94M/year plus network, storage costs ● Build your own cluster ● ~ $1.5M/year ● Volunteer computing ● ~ $0.1M/year
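The EC2 figures on this slide can be re-derived with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the function and variable names are illustrative, and the per-instance speed is simply what the slide's "10 TeraFLOPS = 5,000 instances" implies.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def ec2_yearly_cost(price_per_hour, instances):
    """Yearly cost in dollars of running `instances` instances around the clock."""
    return price_per_hour * HOURS_PER_YEAR * instances


# Figures taken from the slide.
per_instance = ec2_yearly_cost(0.09, 1)    # about $788/year
fleet = ec2_yearly_cost(0.09, 5000)        # about $3.94M/year

# Speed per instance implied by "10 TeraFLOPS = 5,000 instances".
implied_gflops = 10e12 / 5000 / 1e9        # 2 GFLOPS per small instance

# Ratio of the EC2 estimate to the slide's volunteer computing estimate ($0.1M/year).
ratio_vs_volunteer = fleet / 0.1e6

print(f"per instance: ${per_instance:,.0f}/year")
print(f"5,000 instances: ${fleet / 1e6:.2f}M/year")
print(f"EC2 / volunteer cost ratio: about {ratio_vs_volunteer:.0f}x")
```

The same function can be reused for other instance prices; network and storage costs, which the slide mentions separately, are not included.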
13
Part 2: Basic concepts of BOINC
14
About BOINC ● Funded by NSF since 2002 ● Open-source (LGPL) ● Based at UC Berkeley ● Few staff, but lots of volunteers ● software testing ● translation ● documentation ● support (email lists, message boards, Skype)
15
Volunteers and projects [Diagram: volunteers linked to projects (CPDN, LHC@home, WCG) by attachments]
16
BOINC software overview [Diagram: volunteer host (client, apps, graphics apps, screensaver, GUI) communicating over HTTP with the project server (scheduler, data server, web site, daemons, MySQL)]
17
What is a job? ● Conventional systems – job = executable + input files – run once, on specific platform ● BOINC – a job may run on any platform – a job may have multiple instances
18
Applications applications ● Each has – internal name – external name
19
Application versions ● Each has – platform – plan class – version# – list of files (digitally signed) [Diagram: applications → app versions; example app versions: Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X]
20
Jobs ● Each has – list of input files – latency bound – FLOPS estimate, bound – RAM, disk estimates [Diagram: applications → app versions (Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X) and jobs]
21
Job instances ● Created by BOINC ● Each has – list of output files – reference to host – reference to app version [Diagram: applications → app versions and jobs; jobs → instances]
22
BOINC scheduler ● Client request contains – HW, SW description – existing workload – per resource type: # of instances requested, # of seconds requested ● Scheduler reply contains – app version descriptions – job descriptions [Diagram: applications → app versions, jobs → instances]
23
Job validation ● You send a program and input files to a host, and it returns an output file ● How do you know that – the result is correct? – they did any computing at all?
24
Approaches to validation ● None ● App-specific “sanity check” of results – e.g., conservation of energy ● Job replication
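An app-specific sanity check can be very small. The sketch below illustrates the conservation-of-energy example from this slide; the function name, the tolerance, and the idea that the result reports an initial and final total energy are all assumptions made for illustration, not part of BOINC.

```python
def passes_energy_check(initial_energy, final_energy, rel_tol=1e-6):
    """Accept a result only if total energy drifted by at most rel_tol (relative).

    rel_tol is a hypothetical tolerance; a real project would choose it
    from the numerical properties of its own simulation.
    """
    if initial_energy == 0:
        return abs(final_energy) <= rel_tol
    drift = abs(final_energy - initial_energy)
    return drift <= rel_tol * abs(initial_energy)


# A result with a tiny drift is accepted; one that gained 1% energy is rejected.
plausible = passes_energy_check(100.0, 100.00001)
implausible = passes_energy_check(100.0, 101.0)
```

A check like this catches results that are physically impossible, but it cannot tell whether a plausible-looking result was actually computed, which is where replication comes in.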
25
Job replication ● Run two instances, see if they agree – “agree” may be fuzzy ● Homogeneous replication – numerical equivalence of hosts ● Adaptive replication – reduce replication for hosts that seem trustworthy
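The replication ideas above can be sketched in a few functions. This is a simplified illustration, not BOINC's validator: results are assumed to be lists of floats, "agree" is taken to mean element-wise closeness within a hypothetical tolerance, and the adaptive rule (a fixed streak of valid results) is a stand-in for whatever trust policy a project actually uses.

```python
import math


def results_agree(a, b, rel_tol=1e-5):
    """Fuzzy comparison: same length and element-wise close."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def find_canonical(results, quorum=2, rel_tol=1e-5):
    """Return the first result that agrees with at least quorum-1 others, else None."""
    for i, candidate in enumerate(results):
        matches = sum(
            1
            for j, other in enumerate(results)
            if j != i and results_agree(candidate, other, rel_tol)
        )
        if matches + 1 >= quorum:
            return candidate
    return None


def needs_replication(consecutive_valid, threshold=10):
    """Adaptive replication sketch: replicate unless the host has a long valid streak."""
    return consecutive_valid < threshold


# Two instances agree within tolerance, a third disagrees.
canonical = find_canonical([[1.0, 2.0], [1.0000001, 2.0], [5.0, 6.0]])
```

If no quorum is reached, a project would typically create another instance of the job and try again once its result arrives.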
26
Job pipeline (per application) [Diagram: work generator → BOINC → validator → assimilator]
27
The BOINC data model ● App versions, job inputs, and job outputs can each consist of arbitrarily many files ● Each file has a physical name (unique, immutable); each reference to a file has a logical name ● Files have various attributes (e.g., sticky) ● Each file can have one or more URLs; files are transferred via HTTP ● App version files are digitally signed
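The distinction between physical and logical names can be pictured as a per-job lookup table. The sketch below is only an illustration of that idea: the dictionaries and the `resolve_filename` helper are hypothetical (they are not the BOINC API), and apart from `12ja04aa`, `in`, and `out`, which appear elsewhere in these slides, the file names are made up.

```python
def resolve_filename(logical_name, file_refs):
    """Map the logical name an app opens to the physical file name on disk."""
    try:
        return file_refs[logical_name]
    except KeyError:
        raise FileNotFoundError(f"no file reference named {logical_name!r}")


# Two jobs of the same application: the app always opens "in" and "out",
# but each job binds those logical names to different physical files.
job_1_refs = {"in": "12ja04aa", "out": "job_1_result_0"}
job_2_refs = {"in": "12ja04ab", "out": "job_2_result_0"}

physical_1 = resolve_filename("in", job_1_refs)
physical_2 = resolve_filename("in", job_2_refs)
```

Because physical names are unique and immutable, the same application code can run unchanged against any job's files.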
28
What kinds of jobs can BOINC handle? ● Anything you’d run on a Grid ● Bags or streams of tasks (no MPI yet) ● Short/long jobs ● Data intensive, to a point ● Geared towards – Few apps, many jobs (startup overhead per app) – Jobs with high slack time
29
Part 3: Application development for BOINC
30
The BOINC runtime environment [Diagram: processes and files]
31
Native BOINC applications ● boinc_init() – create runtime system thread ● boinc_finish() – write finish file ● boinc_resolve_filename(logical, physical) ● boinc_fraction_done(x)
32
Checkpointing ● bool boinc_time_to_checkpoint() – call when in checkpointable state – if returns true, must write checkpoint ● boinc_checkpoint_done() – call when checkpoint written
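The checkpoint protocol on this slide follows a common pattern: ask whether it is time to checkpoint, write the state, then report that the checkpoint is done. The Python sketch below illustrates only the control flow. `time_to_checkpoint` is a stand-in for `boinc_time_to_checkpoint()` with a made-up policy, the JSON state file is an assumption, and nothing here calls the real BOINC API.

```python
import json
import os
import tempfile

CHECKPOINT_INTERVAL = 1000  # hypothetical policy: checkpoint every 1000 steps


def time_to_checkpoint(step):
    """Stand-in for boinc_time_to_checkpoint(); the real call is decided by the client."""
    return step % CHECKPOINT_INTERVAL == 0


def write_checkpoint(path, state):
    """Write to a temp file, then rename, so a crash never leaves a partial checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def read_checkpoint(path):
    """Resume from the checkpoint if one exists, otherwise start from scratch."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def run(path, n_steps, stop_after=None):
    """Sum 0..n_steps-1 with checkpoints; stop_after simulates the job being killed."""
    state = read_checkpoint(path)
    while state["step"] < n_steps:
        state["total"] += state["step"]
        state["step"] += 1
        if time_to_checkpoint(state["step"]):   # only called in a checkpointable state
            write_checkpoint(path, state)
            # A native app would call boinc_checkpoint_done() here.
        if stop_after is not None and state["step"] >= stop_after:
            return None
    return state["total"]


ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
run(ckpt, 5000, stop_after=2500)     # interrupted; last checkpoint is at step 2000
resumed_total = run(ckpt, 5000)      # resumes from step 2000 and finishes
```

The work between step 2000 and the interruption is redone after the restart, which is why the state must only be saved at points from which the computation can be resumed exactly.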
33
Multithread apps ● boinc_init_parallel() ● Allows suspend/resume of all threads – Unix: fork/exec – Windows: direct thread control
34
BOINC API available for ● C/C++ ● FORTRAN ● Java ● Python
35
The BOINC wrapper ● Can use for legacy apps ● XML input file lists sub-jobs – executable, input files ● What it does: – interfaces to BOINC client – copies files to/from slot directory – runs executables – does checkpointing at sub-job level
36
Building app versions ● Linux – gcc – build on “compatibility VM” ● Windows – Visual Studio – MinGW (gcc) ● Mac OS X – Xcode
37
GPU app versions ● Develop for NVIDIA or ATI, with CUDA, CAL, OpenCL, etc. (BOINC supplies samples) ● Each version has a “plan class” ● For each plan class, supply a scheduler function that determines – can app run on this host? ● hardware, driver version, etc. – what resources will it use? ● #CPUs, #GPUs, GPU RAM, etc.
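The two questions a plan class function answers (can the app run on this host, and what resources will it use) can be sketched as a function that returns either nothing or a resource description. This is a Python illustration of the idea only: the `Host` and `ResourceUsage` types, the field names, and the thresholds are all hypothetical, and the real scheduler function is part of the BOINC server code rather than anything shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    n_cpus: int
    gpu_vendor: Optional[str]   # e.g. "nvidia", "ati", or None
    gpu_count: int
    gpu_ram_mb: int
    driver_version: int


@dataclass
class ResourceUsage:
    cpus: float
    gpus: float
    gpu_ram_mb: int


MIN_DRIVER = 100        # hypothetical minimum driver version
MIN_GPU_RAM_MB = 256    # hypothetical GPU memory requirement


def plan_cuda(host: Host) -> Optional[ResourceUsage]:
    """Return the resources a CUDA app version would use, or None if it cannot run."""
    if host.gpu_vendor != "nvidia" or host.gpu_count < 1:
        return None                       # wrong or missing hardware
    if host.driver_version < MIN_DRIVER:
        return None                       # driver too old
    if host.gpu_ram_mb < MIN_GPU_RAM_MB:
        return None                       # not enough GPU RAM
    # One GPU plus a fraction of a CPU to feed it.
    return ResourceUsage(cpus=0.1, gpus=1.0, gpu_ram_mb=MIN_GPU_RAM_MB)


usable = plan_cuda(Host(4, "nvidia", 1, 512, 200))
no_gpu = plan_cuda(Host(4, None, 0, 0, 0))
```

Returning the resource usage alongside the yes/no decision lets the scheduler account for CPUs, GPUs, and GPU RAM when deciding how many jobs to send.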
38
VM apps ● Develop apps on your favorite OS ● Create a VirtualBox VM image ● App version consists of – “VM wrapper” (supplied by BOINC) ● controls VM execution ● moves files to/from VM – VM image – app executable
39
Part 4: Deploying a BOINC server
40
Demo server ● ssh to maxwell.ssl.berkeley.edu ● login: boincadm ● passwd: 4dq2usYM
41
Hardware options ● Native Linux host – download/compile BOINC software ● BOINC server VM (VMware/Debian) ● BOINC Amazon EC2 image
42
Components of a project ● Master URL ● MySQL database ● Directory hierarchy ● A set of daemon processes and cron jobs
43
Creating a project make_project [options] name ● creates – directory hierarchy – DB – mods for httpd.conf – crontab entry
44
Processes [Diagram: work generator, validator, assimilator, feeder, scheduler, transitioner, file deleter, and DB purger around the MySQL DB; clients talk to the scheduler]
45
Project directory hierarchy
apps/            application files
bin/             daemon programs and tools
cgi-bin/         scheduler and file upload CGI programs
config.xml       configuration file
download/        downloadable files (directory tree)
html/            web site; master URL points here
keys/            keys for code signing, upload auth
log_(hostname)   daemon log files
project.xml      list of platforms and apps
upload/          uploaded files (directory tree)
46
BOINC database platform app app_version user host workunit result...
47
Project configuration and control ● config.xml – scheduling and other options – list of daemons – list of periodic tasks ● project control – bin/start: start daemons, enable scheduler – bin/stop: stop daemons, disable scheduler – bin/status
48
Upgrading server software ● svn update in boinc source dir ● configure; make ● tools/upgrade project_name – updates all daemons, web code etc. – updates DB structure if needed
49
Scaling a BOINC server ● Components can run on different machines sharing an NFS file system ● Each component can be distributed ● MySQL server is typically the bottleneck ● 1 server machine can process ~100K jobs/day; 4 machines can process > 1 million
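To get a feel for these throughput figures, it helps to convert jobs per day into jobs per second. The sketch below is plain arithmetic on the numbers from this slide; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def jobs_per_second(jobs_per_day):
    """Average sustained job rate implied by a daily job count."""
    return jobs_per_day / SECONDS_PER_DAY


single_server = jobs_per_second(100_000)     # roughly 1.2 jobs/second
four_servers = jobs_per_second(1_000_000)    # roughly 11.6 jobs/second

print(f"1 machine:  {single_server:.2f} jobs/s")
print(f"4 machines: {four_servers:.2f} jobs/s")
```

Each job also generates several database operations as it moves through the pipeline, which is consistent with the slide's remark that the MySQL server is typically the bottleneck.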
50
Part 5: Deploying applications
51
Adding an application ● edit project.xml, adding an entry such as <app> <name>multi_thread</name> <user_friendly_name>Test multi-thread apps</user_friendly_name> </app> ● run bin/xadd
52
Adding an application version ● Create application version directory ● Sign files on offline computer ● run bin/update_versions
apps/
  example_app/
    example_app_6.14_windows_intelx86__cuda.exe/
      example_app_6.14_windows_intelx86__cuda.exe
      graphics_app=example_app_graphics_6.14_windows_intelx86.exe
      logo.jpg
      Helvetica.txf
53
Implement job pipeline ● Decide on a validation policy ● Deploy validator, assimilator for app
54
Part 6: Submitting jobs
55
Describing job inputs ● Input template file [XML template; element tags not preserved. Remaining values, in order: 0, 0, in, 1, -cpu_time 60, 446797000000000, 279248000000000]
56
Describing job outputs ● Output template file [XML template; element tags not preserved. Remaining values, in order: 5000000, out]
57
Submitting a job
● Stage input files
  cp test_files/12ja04aa `bin/dir_hier_path 12ja04aa`
● Submit job
  create_work --appname A --wu_name B --wu_template C --result_template D
  or
  int create_work(
      DB_WORKUNIT& wu,
      const char* wu_template,
      const char* result_template_filename,
      const char* result_template_filepath,
      const char** infiles,
      int ninfiles,
      SCHED_CONFIG&,
      const char* command_line = NULL,
      const char* additional_xml = NULL
  );
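The staging step copies each input file into a subdirectory of the download hierarchy chosen from the file name, so that no single directory grows too large. The sketch below illustrates that kind of hashed fan-out in Python. The hashing scheme (MD5 of the name, reduced modulo a fan-out of 1024, written in hex) is an assumption made for illustration, so in practice the path should always come from `bin/dir_hier_path` rather than from code like this.

```python
import hashlib


def hier_path(root, filename, fanout=1024):
    """Place `filename` in one of `fanout` subdirectories of `root`, chosen by hash.

    Assumed scheme, for illustration only: first 8 hex digits of the MD5
    of the file name, modulo fanout, formatted as lowercase hex.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % fanout
    return f"{root}/{bucket:x}/{filename}"


# The same name always maps to the same subdirectory.
staged = hier_path("download", "12ja04aa")
print(staged)
```

Because the subdirectory is a pure function of the file name, any server component can locate a file without consulting the database.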
58
Job processing demo > cd projects/sc10 > bin/demo_submit infile
59
Monitoring a BOINC project ● Operational web interface – http://maxwell.ssl.berkeley.edu/sc10_ops ● Log files – projects/sc10/log_maxwell/*
60
Your project web site ● Customizing – html/project/project.inc – in progress: BOINC/Drupal ● Message boards ● News ● Profiles ● email newsletters
61
Recommended plan ● Prototype/test project – experiment with BOINC here – test, debug applications here – attach your own PCs, open to volunteers if you want ● Public project (later) – pick URL carefully – use code-signing protocol – develop a good web site
62
Part 7: Organizational issues
63
Single-scientist projects ● Need to: ● Port apps ● Deploy, maintain servers ● Publicize ● Interface with public ● Not many research groups have the resources to do all of these ● And it creates a lot of competing “brands”
64
Umbrella projects ● Example: IBM World Community Grid [Diagram: Project: publicity, web development, sysadmin, app porting]
65
The Berkeley@home model A university has – scientists who need HPC – a powerful “brand” – PR resources – IT infrastructure – lots of alumni (UCB: 500,000)
66
Hubs ● nanoHUB: “science portal” for nanoscience – social network + “app store” – sharing of ideas, data, software – computational portal ● HUBzero: generalization to other areas – currently ~20 hubs ● Integration of BOINC with HUBzero – each hub has a volunteer computing project
67
Conclusion ● Volunteer computing is the most cost-effective HPC paradigm ● BOINC solves the technical problems ● Organizational issues are critical ● Other resources – http://boinc.berkeley.edu – email lists (boinc_projects, boinc_dev,...) – me: davea@ssl.berkeley.edu