Volunteer Computing with BOINC Dr. David P. Anderson University of California, Berkeley SC10 Nov. 14, 2010.

1 Volunteer Computing with BOINC Dr. David P. Anderson University of California, Berkeley SC10 Nov. 14, 2010

2 Goals ● Learn about volunteer computing ● Learn how to create a volunteer computing project using BOINC Target audience: ● High-throughput computing users ● Technical skills: ● Basic Linux/Apache sysadmin, familiarity with PHP, SQL and XML, C/C++ (optional)

3 Schedule ● Session 1 ● Why use volunteer computing? ● Basic concepts of BOINC ● Developing BOINC applications (15 minute break) ● Session 2 ● Deploying a BOINC server ● Deploying applications ● Processing jobs ● Organizational issues

4 Software to install ● BOINC client ● http://boinc.berkeley.edu ● SSH client ● http://www.putty.org/

5 Part 1: Why volunteer computing?

6 The Consumer Digital Infrastructure ● 1 billion PCs ● current GPUs: 1 TeraFLOPS (1,000 ExaFLOPS total) ● Storage: ~1,000 Exabytes ● Commodity Internet: 10-1,000 Mbps to home ● Consumers pay for ● hardware ● sysadmin ● network costs ● electricity
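The totals on this slide follow from simple multiplication. A minimal sketch of that arithmetic, where the per-PC storage figure is an assumption inferred from the slide's total rather than a number the slide states:

```python
# Back-of-envelope totals for the consumer digital infrastructure.
PCS = 1_000_000_000          # 1 billion PCs (from the slide)
FLOPS_PER_GPU = 1e12         # 1 TeraFLOPS per current GPU (from the slide)
EXA = 1e18

total_exaflops = PCS * FLOPS_PER_GPU / EXA

# Assumption: about 1 TB of storage per PC, which is what the slide's
# ~1,000 Exabytes total implies for 1 billion machines.
BYTES_PER_PC = 1e12
total_exabytes = PCS * BYTES_PER_PC / EXA

print(f"{total_exaflops:,.0f} ExaFLOPS, {total_exabytes:,.0f} Exabytes")
```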

7 Volunteer computing ● PC owners donate computing resources to projects (e.g., computational science) ● Applications run ● at zero priority while PC in use ● and/or while PC is not in use

8 Examples
Project          Start  Where        Area            Peak #hosts
GIMPS            1994   -            math            10,000
distributed.net  1995   -            cryptography    100,000
SETI@home I      1999   UCB          SETI            600,000
Folding@home     1999   Stanford     biology         200,000
United Devices   2002   commercial   biomedicine     200,000
CPDN             2003   Oxford       climate change  150,000
LHC@home         2004   CERN         physics         60,000
Predictor@home   2004   Scripps      biology         100,000
WCG              2004   commercial   biomedicine     200,000
Einstein@home    2005   LIGO         astrophysics    200,000
SETI@home II     2005   UCB          SETI            850,000
Rosetta@home     2005   U. Wash      biology         100,000
SIMAP            2005   T.U. Munich  bioinformatics  10,000
...

9 Current status ● ~50 projects ● 500,000 volunteers ● 800,000 computers ● ~12 PetaFLOPS

10 HPC paradigms [Figure: paradigms arranged by # processors (1, 100, 1000, 10K-1M) and by workload, from single job (high-performance computing) to multiple jobs (high-throughput computing). Labels: supercomputer, cluster (MPI), cluster (batch), Grid, commercial cloud, volunteer computing.]

11 Volunteer computing is different from other paradigms ● You don’t buy resources; you ask for them ● Resources are: – heterogeneous – sporadically available and connected – untrusted and not private – behind firewalls/NATs/proxies

12 The cost of 10 TeraFLOPS ● Amazon EC2 ● small instance: $.09/hour = $788/year ● 10 TeraFLOPS = 5,000 instances ● $3.94M/year plus network, storage costs ● Build your own cluster ● ~ $1.5M/year ● Volunteer computing ● ~ $0.1M/year
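The EC2 figure can be reproduced from the hourly price. A short sketch of that arithmetic, using only the slide's numbers; the per-instance speed is derived from them, not stated on the slide:

```python
HOURS_PER_YEAR = 24 * 365            # 8,760
PRICE_PER_HOUR = 0.09                # EC2 small instance, USD (from the slide)
INSTANCES = 5_000                    # instances for 10 TeraFLOPS (from the slide)

cost_per_instance_year = PRICE_PER_HOUR * HOURS_PER_YEAR
ec2_cost_year = cost_per_instance_year * INSTANCES

# Implied per-instance speed: 10 TeraFLOPS spread over 5,000 instances.
gflops_per_instance = 10e12 / INSTANCES / 1e9

yearly_cost_usd = {
    "Amazon EC2": ec2_cost_year,
    "Own cluster": 1.5e6,            # slide estimate
    "Volunteer computing": 0.1e6,    # slide estimate
}
for option, cost in yearly_cost_usd.items():
    print(f"{option:20s} ${cost / 1e6:.2f}M/year")
```

The EC2 line excludes the network and storage costs that the slide mentions separately.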

13 Part 2: Basic concepts of BOINC

14 About BOINC ● Funded by NSF since 2002 ● Open-source (LGPL) ● Based at UC Berkeley ● Few staff, but lots of volunteers ● software testing ● translation ● documentation ● support (email lists, message boards, Skype)

15 Volunteers and projects [Diagram: volunteers are linked to projects (e.g., CPDN, LHC@home, WCG) by attachments.]

16 BOINC software overview [Diagram: the volunteer host (client, apps, graphics apps, screensaver, GUI) communicates over HTTP with the project server (scheduler, data server, web site, daemons, MySQL).]

17 What is a job? ● Conventional systems – job = executable + input files – run once, on specific platform ● BOINC – a job may run on any platform – a job may have multiple instances

18 Applications ● Each has – internal name – external name

19 Application versions ● Each has – platform – plan class – version# – list of files (digitally signed) [Diagram: applications → app versions, e.g. Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X]

20 Jobs ● Each has – list of input files – latency bound – FLOPS estimate, bound – RAM, disk estimates [Diagram: applications → app versions (Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X) and jobs]

21 Job instances ● Created by BOINC ● Each has – list of output files – reference to host – reference to app version [Diagram: applications → app versions and jobs → instances]

22 BOINC scheduler ● Request (client → scheduler) – HW, SW description – existing workload – per resource type: # of instances requested, # of seconds requested ● Reply (scheduler → client) – app version descriptions – job descriptions [Diagram: applications → app versions and jobs → instances]
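The request/reply exchange can be pictured as a matching problem: the client asks for some number of seconds of work, and the scheduler picks jobs the host can actually run. The following is a greedy sketch under simplifying assumptions (one resource type, jobs run back to back); the `Job` fields and `pick_jobs` are hypothetical names for illustration, not the BOINC scheduler's actual logic.

```python
from dataclasses import dataclass

@dataclass
class Job:                         # hypothetical job description
    name: str
    flops_est: float               # estimated floating-point operations
    ram_bytes: float               # estimated RAM needed
    latency_bound_s: float         # deadline, seconds from dispatch

def pick_jobs(jobs, host_flops, host_ram_bytes, seconds_requested):
    """Greedy sketch: send feasible jobs until the requested time is covered."""
    chosen, seconds_sent = [], 0.0
    for job in jobs:
        if seconds_sent >= seconds_requested:
            break
        runtime = job.flops_est / host_flops
        if job.ram_bytes > host_ram_bytes:
            continue               # would not fit in memory
        if seconds_sent + runtime > job.latency_bound_s:
            continue               # would likely miss its latency bound
        chosen.append(job.name)
        seconds_sent += runtime
    return chosen, seconds_sent

queue = [
    Job("a", 1e12, 1e8, 1e5),
    Job("b", 1e12, 1e10, 1e5),     # needs more RAM than the host has
    Job("c", 1e12, 1e8, 1500),     # deadline too tight once "a" is queued
    Job("d", 1e12, 1e8, 1e5),
]
sent, seconds = pick_jobs(queue, host_flops=1e9, host_ram_bytes=1e9,
                          seconds_requested=2500)
print(sent, seconds)
```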

23 Job validation ● You send a program and input files to a host, and it returns an output file ● How do you know that – the result is correct? – they did any computing at all?

24 Approaches to validation ● None ● App-specific “sanity check” of results – e.g., conservation of energy ● Job replication
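An app-specific sanity check can be very small. A minimal sketch of the conservation-of-energy example, assuming (hypothetically) that each result reports the total energy at a series of time steps; the function name and tolerance are illustrative:

```python
def passes_energy_check(energies, rel_tol=1e-3):
    """Return True if total energy stays within rel_tol of its initial value."""
    if not energies:
        return False                  # an empty result shows no computing at all
    e0 = energies[0]
    scale = max(abs(e0), 1e-30)       # avoid dividing by zero
    return all(abs(e - e0) / scale <= rel_tol for e in energies)

print(passes_energy_check([100.0, 100.01, 99.99]))   # small drift
print(passes_energy_check([100.0, 120.0]))           # energy not conserved
```

A check like this catches broken or empty results but cannot prove that the rest of the output is correct, which is why replication is listed as a separate approach.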

25 Job replication ● Run two instances, see if they agree – “agree” may be fuzzy ● Homogeneous replication – numerical equivalence of hosts ● Adaptive replication – reduce replication for hosts that seem trustworthy
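The ideas on this slide can be sketched in a few lines: a fuzzy comparison between outputs, a quorum test, and a rule for when to skip replication. All names and thresholds here are hypothetical, and the fixed-threshold trust rule is a deliberate simplification of adaptive replication.

```python
import math

def results_agree(a, b, rel_tol=1e-6):
    """Fuzzy comparison: same length and element-wise close."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )

def find_canonical(outputs, quorum=2):
    """Return the first output that at least `quorum` instances agree on, else None."""
    for candidate in outputs:
        matches = sum(results_agree(candidate, other) for other in outputs)
        if matches >= quorum:
            return candidate
    return None

def needs_replication(consecutive_valid, threshold=10):
    """Adaptive replication: replicate only for hosts without a track record."""
    return consecutive_valid < threshold

outputs = [[1.0, 2.0], [1.0000000001, 2.0], [5.0, 2.0]]
print(find_canonical(outputs))
```

Homogeneous replication sidesteps the fuzziness by sending all instances of a job to numerically equivalent hosts, so that outputs can be compared exactly.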

26 Job pipeline (per application) [Diagram: work generator → BOINC → validator → assimilator]

27 The BOINC data model ● App versions, job inputs, and job outputs can consist of arbitrarily many files ● Each file has a physical name (unique, immutable); each reference to a file has a logical name ● Files have various attributes (e.g., sticky) ● Each file can have one or more URLs and is transferred via HTTP ● App version files are digitally signed

28 What kinds of jobs can BOINC handle? ● Anything you’d run on a Grid ● Bags or streams of tasks (no MPI yet) ● Short/long jobs ● Data intensive, to a point ● Geared towards – Few apps, many jobs (startup overhead per app) – Jobs with high slack time

29 Part 3: Application development for BOINC

30 The BOINC runtime environment [Diagram labels: processes, files]

31 Native BOINC applications ● boinc_init() – create runtime system thread ● boinc_finish() – write finish file ● boinc_resolve_filename(logical, physical) ● boinc_fraction_done(x)
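To make the logical/physical distinction behind boinc_resolve_filename concrete, here is a sketch of one way such a lookup can work. It assumes that the job's working directory holds a small link file under the logical name whose content is `<soft_link>PATH</soft_link>`. That file format, the function `resolve_filename`, and the example path are assumptions of this sketch, not a statement of the BOINC implementation.

```python
import os
import re
import tempfile

SOFT_LINK = re.compile(r"<soft_link>(.*?)</soft_link>")

def resolve_filename(logical_path):
    """Map a logical name to a physical path (hypothetical stand-in)."""
    try:
        with open(logical_path, "r", errors="replace") as f:
            head = f.read(4096)
    except OSError:
        return logical_path          # nothing there yet: use the name as-is
    match = SOFT_LINK.search(head)
    return match.group(1).strip() if match else logical_path

# Usage: a working directory holding a link file named "in".
with tempfile.TemporaryDirectory() as slot:
    link = os.path.join(slot, "in")
    with open(link, "w") as f:
        f.write("<soft_link>../../projects/example/input_12345</soft_link>\n")
    resolved = resolve_filename(link)
    print(resolved)
```

The application keeps opening "in" by its logical name in every job, while the file it actually reads has a unique, immutable physical name.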

32 Checkpointing ● bool boinc_time_to_checkpoint() – call when in checkpointable state – if returns true, must write checkpoint ● boinc_checkpoint_done() – call when checkpoint written
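The checkpointing protocol is a loop pattern: reach a checkpointable state, ask whether it is time, write the state, report that it is written. The sketch below simulates that pattern in plain Python; `time_to_checkpoint` is a caller-supplied stand-in for boinc_time_to_checkpoint(), and the JSON state file is an illustrative choice, not something BOINC prescribes.

```python
import json
import os
import tempfile

def run(total_steps, state_path, time_to_checkpoint, stop_after=None):
    """Sum 0..total_steps-1, resuming from state_path if it exists."""
    step, acc = 0, 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            saved = json.load(f)
        step, acc = saved["step"], saved["acc"]
    while step < total_steps:
        acc += step
        step += 1
        # Checkpointable state: ask the runtime whether to checkpoint now.
        if time_to_checkpoint(step):
            tmp = state_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"step": step, "acc": acc}, f)
            os.replace(tmp, state_path)   # atomic rename: never a half-written file
            # (a native app would call boinc_checkpoint_done() here)
        if stop_after is not None and step >= stop_after:
            return None                   # simulate the host being switched off
    return acc

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "state.json")
    every_10 = lambda step: step % 10 == 0
    first = run(100, path, every_10, stop_after=37)    # interrupted at step 37
    resumed = run(100, path, every_10)                 # resumes from step 30
    print(first, resumed)
```

Work done between the last checkpoint and the interruption (steps 30 to 37 here) is repeated after the restart, so the checkpoint interval trades lost work against I/O.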

33 Multithread apps ● boinc_init_parallel() ● Allows suspend/resume of all threads – Unix: fork/exec – Windows: direct thread control

34 BOINC API available for ● C/C++ ● FORTRAN ● Java ● Python

35 The BOINC wrapper ● Can use for legacy apps ● XML input file lists sub-jobs – executable, input files ● What it does: – interfaces to BOINC client – copies files to/from slot directory – runs executables – does checkpointing at sub-job level

36 Building app versions ● Linux – gcc – build on “compatibility VM” ● Windows – Visual Studio – MinGW (gcc) ● Mac OS X – Xcode

37 GPU app versions ● Develop for NVIDIA or ATI, with CUDA, CAL, OpenCL, etc. (BOINC supplies samples) ● Each version has a “plan class” ● For each plan class, supply a scheduler function that determines – can app run on this host? ● hardware, driver version, etc. – what resources will it use? ● #CPUs, #GPUs, GPU RAM, etc.
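A plan class function answers the two questions on this slide for a given host. The following sketch shows the shape of such a function; the `Host` and `Usage` types, the field names, and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Host:                       # hypothetical host description
    n_cpus: int
    n_nvidia_gpus: int
    gpu_ram_mb: int
    driver_version: int

@dataclass
class Usage:                      # hypothetical resource-usage answer
    cpus: float
    gpus: float
    gpu_ram_mb: int

def plan_class_cuda(host: Host) -> Optional[Usage]:
    """Return the resources the app would use, or None if it cannot run."""
    MIN_DRIVER = 19000            # illustrative thresholds, not real BOINC values
    MIN_GPU_RAM_MB = 256
    if host.n_nvidia_gpus < 1:
        return None
    if host.driver_version < MIN_DRIVER or host.gpu_ram_mb < MIN_GPU_RAM_MB:
        return None
    return Usage(cpus=0.1, gpus=1.0, gpu_ram_mb=MIN_GPU_RAM_MB)

print(plan_class_cuda(Host(n_cpus=4, n_nvidia_gpus=1, gpu_ram_mb=512, driver_version=19500)))
print(plan_class_cuda(Host(n_cpus=4, n_nvidia_gpus=0, gpu_ram_mb=0, driver_version=0)))
```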

38 VM apps ● Develop apps on your favorite OS ● Create a VirtualBox VM image ● App version consists of – “VM wrapper” (supplied by BOINC) ● controls VM execution ● moves files to/from VM – VM image – app executable

39 Part 4: Deploying a BOINC server

40 Demo server ● ssh to maxwell.ssl.berkeley.edu ● login: boincadm ● passwd: 4dq2usYM

41 Hardware options ● Native Linux host – download/compile BOINC software ● BOINC server VM (VMware/Debian) ● BOINC Amazon EC2 image

42 Components of a project ● Master URL ● MySQL database ● Directory hierarchy ● A set of daemon processes and cron jobs

43 Creating a project make_project [options] name ● creates – directory hierarchy – DB – mods for httpd.conf – crontab entry

44 Processes [Diagram: the daemons (work generator, validator, assimilator, feeder, transitioner, file deleter, DB purger) and the scheduler, which serves clients, are arranged around the MySQL DB.]

45 Project directory hierarchy
apps/            application files
bin/             daemon programs and tools
cgi-bin/         scheduler and file upload CGI programs
config.xml       configuration file
download/        downloadable files (directory tree)
html/            web site; master URL points here
keys/            keys for code signing, upload auth
log_(hostname)   daemon log files
project.xml      list of platforms and apps
upload/          uploaded files (directory tree)

46 BOINC database ● Tables: platform, app, app_version, user, host, workunit, result, ...

47 Project configuration and control ● config.xml – scheduling and other options – list of daemons – list of periodic tasks ● project control – bin/start: start daemons, enable scheduler – bin/stop: stop daemons, disable scheduler – bin/status

48 Upgrading server software ● svn update in boinc source dir ● configure; make ● tools/upgrade project_name – updates all daemons, web code etc. – updates DB structure if needed

49 Scaling a BOINC server ● Components can run on different machines sharing an NFS file system ● Each component can be distributed ● MySQL server is typically the bottleneck ● 1 server machine can process ~100K jobs/day; 4 machines can process > 1 million

50 Part 5: Deploying applications

51 Adding an application ● edit project.xml – example entry: internal name multi_thread, external name “Test multi-thread apps” ● run bin/xadd

52 Adding an application version ● Create application version directory ● Sign files on offline computer ● run bin/update_versions
apps/
  example_app/
    example_app_6.14_windows_intelx86__cuda.exe/
      example_app_6.14_windows_intelx86__cuda.exe
      graphics_app=example_app_graphics_6.14_windows_intelx86.exe
      logo.jpg
      Helvetica.txf

53 Implement job pipeline ● Decide on a validation policy ● Deploy validator, assimilator for app

54 Part 6: Submitting jobs

55 Describing job inputs ● Input template file (an XML file; its tags were lost in this transcript, leaving only the values: 0, 0, in, 1, -cpu_time 60, 446797000000000, 279248000000000)

56 Describing job outputs ● Output template file (an XML file; its tags were lost in this transcript, leaving only the values: 5000000, out)

57 Submitting a job
● Stage input files
    cp test_files/12ja04aa `bin/dir_hier_path 12ja04aa`
● Submit job
    create_work --appname A --wu_name B --wu_template C --result_template D
  or
    int create_work(
        DB_WORKUNIT& wu,
        const char* wu_template,
        const char* result_template_filename,
        const char* result_template_filepath,
        const char** infiles,
        int ninfiles,
        SCHED_CONFIG&,
        const char* command_line = NULL,
        const char* additional_xml = NULL
    );
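When many jobs are submitted from a script, it helps to build the create_work command line in one place. A small sketch that only assembles the argument list and runs nothing; the option names come from the slide, while the `bin/` prefix, the template paths, the job name, and the convention that input file names follow the options are assumptions of this example.

```python
import shlex

def create_work_argv(appname, wu_name, wu_template, result_template, infiles=()):
    """Build the argument list for create_work; nothing is executed here."""
    argv = [
        "bin/create_work",
        "--appname", appname,
        "--wu_name", wu_name,
        "--wu_template", wu_template,
        "--result_template", result_template,
    ]
    argv.extend(infiles)           # assumption: input file names follow the options
    return argv

argv = create_work_argv(
    "example_app", "job_0001",
    "templates/example_in", "templates/example_out",   # hypothetical paths
    infiles=["12ja04aa"],
)
print(shlex.join(argv))
```

Passing the list to a process launcher as separate arguments, rather than as one shell string, avoids quoting problems when job or file names contain unusual characters.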

58 Job processing demo
> cd projects/sc10
> bin/demo_submit infile

59 Monitoring a BOINC project ● Operational web interface – http://maxwell.ssl.berkeley.edu/sc10_ops ● Log files – projects/sc10/log_maxwell/*

60 Your project web site ● Customizing – html/project/project.inc – in progress: BOINC/Drupal ● Message boards ● News ● Profiles ● email newsletters

61 Recommended plan ● Prototype/test project – experiment with BOINC here – test, debug applications here – attach your own PCs, open to volunteers if you want ● Public project (later) – pick URL carefully – use code-signing protocol – develop a good web site

62 Part 7: Organizational issues

63 Single-scientist projects ● Need to: ● Port apps ● Deploy, maintain servers ● Publicize ● Interface with public ● Not many research groups have the resources to do all of these ● And it creates a lot of competing “brands”

64 Umbrella projects ● Example: IBM World Community Grid [Diagram labels: project, publicity, web development, sysadmin, app porting]

65 The Berkeley@home model A university has – scientists who need HPC – a powerful “brand” – PR resources – IT infrastructure – lots of alumni (UCB: 500,000)

66 Hubs ● nanoHUB: “science portal” for nanoscience – social network + “app store” – sharing of ideas, data, software – computational portal ● HUBzero: generalization to other areas – currently ~20 hubs ● Integration of BOINC with HUBzero – each hub has a volunteer computing project

67 Conclusion ● Volunteer computing is the most cost-effective HPC paradigm ● BOINC solves the technical problems ● Organizational issues are critical ● Other resources – http://boinc.berkeley.edu – email lists (boinc_projects, boinc_dev,...) – me: davea@ssl.berkeley.edu

