
Scientific Computing in the Consumer Digital Infrastructure David P. Anderson Space Sciences Lab University of California, Berkeley The Austin Forum November.


1 Scientific Computing in the Consumer Digital Infrastructure David P. Anderson Space Sciences Lab University of California, Berkeley The Austin Forum November 7, 2013

2 Science needs computing power ● High-performance computing ● High-throughput computing – Thousands or millions of independent jobs – What matters is the rate of job completion, not the turnaround time of individual jobs

3 High-throughput computing applications ● Physical simulation – particle collision – atomic/molecular (bio, nano) – Earth climate system ● Compute-intensive data analysis – particle physics (LHC) – Astrophysics (radio, gravitational) – genomics ● Bio-inspired optimization – genetic algorithms, flocking, ant colony etc.

4 Approaches to HTC ● Cluster computing – lots of commodity or rack-mounted PCs in a room ● Grid computing – share clusters between organizations ● Cloud computing – rent cluster nodes, e.g. Amazon EC2 ● Volunteer computing – use computers owned by consumers

5 The Consumer Digital Infrastructure ● Computing devices – Desktop and laptop computers – Mobile devices: tablets, smart phones – Game consoles – Set-top boxes, DVRs – Appliances ● Commodity Internet – Cable, DSL, fiber to the home, cell networks

6 Measures of computing speed ● Floating-point operation (FLOP) ● GigaFLOPS (10^9/sec): 1 Central Processing Unit (CPU) ● TeraFLOPS (10^12/sec): 1 Graphics Processing Unit (GPU) ● PetaFLOPS (10^15/sec): 1 supercomputer ● ExaFLOPS (10^18/sec): current Holy Grail

7 CDI performance potential ● 1 billion Desktop/laptop PCs – CPUs: 10 ExaFLOPS – GPUs: 1,000 ExaFLOPS ● 2.5 billion smartphones – CPUs: 10 ExaFLOPS
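
The totals on this slide are device counts multiplied by a per-device speed. The sketch below reproduces that arithmetic. The per-device speeds (10 GigaFLOPS per PC CPU, 1 TeraFLOPS per PC GPU, 4 GigaFLOPS per smartphone CPU) are not stated in the talk; they are assumptions back-calculated from the slide's totals.

```python
# Back-of-envelope check of the CDI totals on slide 7.
# Per-device speeds are assumptions inferred from the slide's totals.
GIGA, TERA, EXA = 1e9, 1e12, 1e18

def aggregate_exaflops(device_count, flops_per_device):
    """Combined speed of a device population, in ExaFLOPS."""
    return device_count * flops_per_device / EXA

pc_cpu = aggregate_exaflops(1e9, 10 * GIGA)      # 1 billion PCs, CPU only
pc_gpu = aggregate_exaflops(1e9, 1 * TERA)       # 1 billion PCs, GPU only
phone_cpu = aggregate_exaflops(2.5e9, 4 * GIGA)  # 2.5 billion smartphones

print(f"PC CPUs:    {pc_cpu:.0f} ExaFLOPS")
print(f"PC GPUs:    {pc_gpu:.0f} ExaFLOPS")
print(f"Phone CPUs: {phone_cpu:.0f} ExaFLOPS")
```

These are peak figures for the whole installed base; they say nothing yet about how many owners would participate or how often their devices are available.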

8 Volunteer computing ● Consumers donate computing capacity to – support science – be in a community – compete ● History – 1997: GIMPS, distributed.net – 1999: SETI@home, Folding@home – 2003: BOINC

9 Limiting factors ● Volunteership – Study of college students [Toth 2006] ● 5% would “definitely participate” ● 10% would “possibly participate” ● PC availability – 65% average availability [Kondo 2008] – 35% of PCs are available 24/7
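
To see how these factors shrink the theoretical potential, they can be applied as simple multipliers. This is a rough sketch, not a calculation from the talk: it assumes the college-student participation rates carry over to all PC owners and that participation and availability are independent.

```python
# Rough estimate: potential capacity scaled by participation and availability.
# Assumes the survey rates generalize and the two factors are independent.
def effective_exaflops(potential_exaflops, participation, availability):
    """Capacity left after applying both limiting factors."""
    return potential_exaflops * participation * availability

PC_CPU_POTENTIAL = 10.0   # ExaFLOPS, from slide 7
AVAILABILITY = 0.65       # average availability [Kondo 2008]

low = effective_exaflops(PC_CPU_POTENTIAL, 0.05, AVAILABILITY)   # "definitely"
high = effective_exaflops(PC_CPU_POTENTIAL, 0.15, AVAILABILITY)  # plus "possibly"

print(f"PC CPU capacity: {low:.2f} to {high:.2f} ExaFLOPS")
```

Even the low end of this range is far above the 15 PetaFLOPS reported later in the talk, which suggests that actual participation is well below the surveyed willingness.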

10 Other limiting factors ● Network bandwidth (client, server) – Commodity Internet ● Memory, disk usage – new PCs average 6 GB RAM

11 BOINC: middleware for volunteer computing ● Supported by NSF since 2002 ● Open source (LGPL) ● Based at University of California, Berkeley ● http://boinc.berkeley.edu

12 Volunteer computing with BOINC [Diagram: volunteers connect to projects (CPDN, LHC@home, WCG) through attachments]

13 How to volunteer

14 Choose projects

15 Configure

16 Community

17 Creating a BOINC project ● Install BOINC server software on a Linux box ● Compile apps for Windows/Mac/Linux ● Attract volunteers – develop web site – generate publicity – communicate with volunteers

18 Volunteer computing today ● 500,000 active computers ● 50 projects ● 15 PetaFLOPS average
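
Dividing the aggregate speed by the number of active computers gives the average contribution per machine. A minimal sketch of that arithmetic, using only the two figures on this slide:

```python
# Average speed per active computer implied by the slide 18 figures.
PETA, GIGA = 1e15, 1e9

def average_gigaflops_per_computer(total_petaflops, active_computers):
    """Mean per-computer speed in GigaFLOPS."""
    return total_petaflops * PETA / active_computers / GIGA

avg = average_gigaflops_per_computer(15, 500_000)
print(f"{avg:.0f} GigaFLOPS per active computer")
```

The result is about 30 GigaFLOPS per computer. That is well above the "1 GigaFLOPS per CPU" rule of thumb on slide 6, presumably because of multi-core CPUs and GPUs, though that reading is an interpretation rather than a claim from the talk.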

19 Some BOINC-based projects ● IBM World Community Grid ● Einstein@home ● Climateprediction.net ● LHC@home ● Rosetta@home

20 Cost The cost of 10 TeraFLOPS for 1 year: ● CPU cluster: $1.5M ● Amazon EC2: $4M – 5,000 small instances ● Volunteer: ~ $0.1M
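
The comparison is easier to read when normalized. The sketch below derives cost per TeraFLOPS-year, each option's cost relative to volunteer computing, and the hourly EC2 price implied by the slide. Only the slide's own figures are used; the 8,760-hour year is the one added assumption.

```python
# Normalizing the slide 20 cost figures (USD for 10 TeraFLOPS over one year).
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

COST_USD = {"CPU cluster": 1.5e6, "Amazon EC2": 4.0e6, "Volunteer": 0.1e6}
CAPACITY_TFLOPS = 10
EC2_INSTANCES = 5000

def usd_per_tflops_year(total_usd):
    """Cost of one TeraFLOPS sustained for a year."""
    return total_usd / CAPACITY_TFLOPS

def ratio_to_volunteer(option):
    """How many times more expensive an option is than volunteer computing."""
    return COST_USD[option] / COST_USD["Volunteer"]

def ec2_usd_per_instance_hour():
    """Hourly price per small instance implied by the EC2 total."""
    return COST_USD["Amazon EC2"] / EC2_INSTANCES / HOURS_PER_YEAR

for option, usd in COST_USD.items():
    print(f"{option}: ${usd_per_tflops_year(usd):,.0f} per TeraFLOPS-year, "
          f"{ratio_to_volunteer(option):.0f}x volunteer")
print(f"Implied EC2 price: ${ec2_usd_per_instance_hour():.3f} per instance-hour")
```

On these numbers the cluster costs about 15 times and EC2 about 40 times as much as the volunteer option, and each small instance is implicitly rated at 2 GigaFLOPS (10 TeraFLOPS spread over 5,000 instances).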

21 How BOINC works [Diagram: the BOINC client on a home PC talks over HTTP to the project's BOINC server in a cycle: get jobs, download data and executables, compute, upload outputs]
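
The cycle on this slide (get jobs, download, compute, upload) can be illustrated with a toy simulation. This is not BOINC code and does not use any BOINC API: `ToyServer`, `Job`, and `run_client` are hypothetical names, and in-process method calls stand in for the HTTP requests a real client would make.

```python
# Toy simulation of the client work cycle. All names are hypothetical;
# method calls replace the HTTP exchanges of a real deployment.
from collections import deque
from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    input_data: list

class ToyServer:
    """In-memory stand-in for a project server."""
    def __init__(self, jobs):
        self.queue = deque(jobs)
        self.results = {}

    def get_jobs(self, max_jobs):
        """Hand out up to max_jobs queued jobs (input data included)."""
        batch = []
        while self.queue and len(batch) < max_jobs:
            batch.append(self.queue.popleft())
        return batch

    def upload(self, job_id, output):
        """Store the output reported for a job."""
        self.results[job_id] = output

def run_client(server, compute, batch_size=2):
    """Repeat get jobs -> compute -> upload until the server has no work."""
    while True:
        jobs = server.get_jobs(batch_size)
        if not jobs:
            break
        for job in jobs:
            server.upload(job.job_id, compute(job.input_data))

server = ToyServer([Job(i, list(range(i + 1))) for i in range(5)])
run_client(server, compute=sum)
print(server.results)
```

The point of the sketch is the direction of control: the client initiates every exchange, as the "get jobs" step on the slide suggests, so the server never needs to reach into a volunteer's home network.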

22 Issues handled by BOINC ● Heterogeneous computers ● Untrusted, anonymous computers – Result validation ● replication, adaptive replication ● Credit: amount of work done ● Consumer-friendly client
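
Replication-based validation can be sketched as a quorum vote: the same job is sent to several computers, and a result is accepted only when enough of them agree. The code below is a simplified illustration, not BOINC's validator. It assumes results can be compared for exact equality, and it models adaptive replication with a hypothetical rule (hosts with a long streak of valid results are not replicated) that is a stand-in for whatever policy a real project uses.

```python
# Simplified quorum validation and adaptive replication.
# The equality comparison and the streak-based trust rule are assumptions.
from collections import Counter

def validate(results, quorum):
    """Return the agreed result if at least `quorum` replicas match, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None

def replicas_needed(host_valid_streak, trust_threshold=10, full_replication=2):
    """Send one copy to hosts with a long valid streak, otherwise replicate."""
    return 1 if host_valid_streak >= trust_threshold else full_replication

print(validate([42, 42, 7], quorum=2))   # two replicas agree on 42
print(validate([1, 2], quorum=2))        # no agreement yet: more replicas needed
print(replicas_needed(host_valid_streak=0))
print(replicas_needed(host_valid_streak=25))
```

Replication multiplies the computing cost of every job, which is why an adaptive scheme is attractive: most of the throughput lost to double-checking is recovered once hosts have proven reliable.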

23 Using GPUs ● BOINC detects and schedules GPUs – NVIDIA, AMD, Intel – multiple/mixed GPUs – various language systems (CUDA, OpenCL, CAL) ● Issues – non-preemptive GPU scheduling – no paging of GPU memory

24 Multicore apps ● Next-generation PCs may have 100 cores ● BOINC supports multi-core apps – OpenMP, MPI – OpenCL CPU apps

25 Using VM technology ● CDI platforms: – 85% Windows – 7% Linux – 7% Mac OS X ● Developing and maintaining versions for different platforms is hard ● Even making a portable Linux executable is hard

26 Virtual machines [Diagram: an application runs on a guest operating system, which runs on the host operating system]

27 Virtual machines [Diagram: example stack with Windows 7 as host, Debian Linux 2.6 as guest, and the application on top]

28 BOINC VM support ● Create a VM image for your favorite environment ● Create executables for that environment [Diagram labels: BOINC client, VirtualBox executive, Vbox wrapper, VM instance, shared directory holding the executable and the input and output files]

29 VM advantages ● Develop in your favorite environment – No need for multiple versions ● A VM is a strong “sandbox” – Can run untrusted applications ● Free “checkpointing”

30 BOINC on Android ● New GUI ● Battery-related issues ● Released July 2013 – Google, Amazon App Stores – ~50K active devices

31 Why hasn’t volunteer computing gained traction? ● “Ecosystem of projects” model – Lots of competing projects ● Problems with this model – Creating/operating a project is too hard and risky – Volunteers need simplicity – No coherent PR; too many brands

32 Umbrella projects ● One project serves many scientists ● Examples – CAS@home (Chinese Academy of Sciences) – World Community Grid (IBM) – U. of Westminster (desktop grid) – Ibercivis (Spanish consortium)

33 Integrating BOINC ● HTCondor (U. of Wisconsin) – Goal: BOINC-based back end for Open Science Grid or any Condor pool [Diagram labels: HTCondor node, job submission, grid manager, BOINC GAHP, BOINC servers]

34 Integrating BOINC ● HUBzero (Purdue) – Goal: BOINC-based back end for science portals such as nanoHUB [Diagram labels: Hub, BOINC servers, projects, PCs]

35 Proposal: Science@home ● Single “brand” for volunteer computing ● Volunteers register for science areas rather than projects ● How to allocate computing power? – Involve the HPC, scientific funding communities

36 Implementing Science@home ● BOINC “account manager” architecture [Diagram labels: Science@home, BOINC clients, projects]
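
The idea of registering for science areas rather than projects can be sketched as a lookup that an account manager performs on the volunteer's behalf. Everything here is illustrative: the area names, the assignment of projects to areas, and the `projects_for` function are assumptions made for this example, not part of the proposal or of BOINC's account manager interface.

```python
# Illustrative mapping from science areas to projects.
# Area names and assignments are assumptions made for this sketch.
AREA_TO_PROJECTS = {
    "astrophysics": ["Einstein@home"],
    "climate": ["Climateprediction.net"],
    "particle physics": ["LHC@home"],
    "biology": ["Rosetta@home"],
}

def projects_for(areas):
    """Projects a client should attach to, given the volunteer's chosen areas."""
    selected = set()
    for area in areas:
        selected.update(AREA_TO_PROJECTS.get(area, []))
    return sorted(selected)

print(projects_for(["climate", "biology"]))
```

With this indirection, the volunteer sees a single brand and a short list of areas, while the mapping behind it can change as projects start and finish without the volunteer doing anything.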

37 Summary ● Volunteer computing is – Usable for most HTC applications – A path to ExaFLOPS computing – A way to popularize science ● BOINC provides the software infrastructure ● Barriers are largely organizational

38 Contacts ● http://boinc.berkeley.edu ● davea@ssl.berkeley.edu

