The 9th Annual BOINC Workshop INRIA, Grenoble, France 25-27 Sept. 2013 http://boinc.berkeley.edu/trac/wiki/WorkShop13
The BOINC community
- Projects
- PC volunteers (300,000)
- UC Berkeley developers (2.5)
- Other volunteers: testing, translation, support
- Computer scientists
Workshop goals
- Learn what everyone else is doing
- Form collaborations
- Steer BOINC development: tell us what you want
Hackfest (tomorrow)
Goal: get something concrete done
- Improve docs
- Design and/or implement software
- Learn and use a new feature
The state of volunteer computing
- Volunteers: stagnant (BOINC: 290K people, 450K computers)
- Science projects: stagnant
- Computer science research: stagnant
- Let's keep trying anyway
Requests to projects
- Do outreach: notices, automated emails, mass emails, message boards, mass media
- Use current server code
To developers/researchers
- Talk with me before starting anything, especially if it's of general utility: davea@ssl.berkeley.edu
What's new in BOINC?
- Storage and data-intensive computing
- Virtual machine apps
- GPU apps
- Scheduling
- Remote job submission
- Other
Storage and data-intensive computing
- Disk space: average 50 GB available per client; 35 petabytes total
- Trends: disk sizes increasing exponentially, faster than processors
- 1 TB * 1M clients = 1 exabyte
BOINC storage architecture
- Applications: locality scheduling, data archival, dataset storage, result archival
- All built on the BOINC storage infrastructure
BOINC storage infrastructure: managing client space
- Volunteer prefs determine BOINC's allocation (client disk is divided into non-BOINC, free, and BOINC space)
- Allocation to projects is based on resource share
BOINC storage infrastructure: RPC/server structure
- "Sticky file" mechanism
- Scheduler RPC from the client reports project disk usage, project disk share, and the list of sticky files
- Scheduler reply, driven by application-specific logic on the server, tells the client the desired space and which files to delete, upload, or download
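A minimal sketch of how the client's BOINC disk allocation could be split among projects by resource share, as described above; the types and function names are illustrative, not the actual client code.

    // Sketch: dividing the volunteer's BOINC disk allocation among projects
    // in proportion to resource share. Illustrative only.
    #include <vector>

    struct Project {
        double resource_share;   // set by the volunteer
        double disk_share;       // bytes this project may use on this host
    };

    void compute_disk_shares(std::vector<Project>& projects, double boinc_allowed_bytes) {
        double total = 0;
        for (const Project& p : projects) total += p.resource_share;
        if (total <= 0) return;
        for (Project& p : projects) {
            // each project's share of the BOINC allocation is proportional
            // to its resource share
            p.disk_share = boinc_allowed_bytes * p.resource_share / total;
        }
    }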
Volunteer data archival
- Files originate on the server
- Chunks of files are stored on clients
- Files can be reconstructed on the server (with high latency)
- Goals: arbitrarily high reliability (e.g. 99.999%), support for large files
Replication
- Divide the file into N chunks; store each chunk on M clients
- If a client fails: upload another replica to the server, then download it to a new client
- Problem: high space overhead
Erasure coding
- A way of dividing a file into N+K chunks such that the original file can be reconstructed from any N of them
- Example: N=40, K=20 can tolerate the simultaneous failure of 20 clients; space overhead is only 50%
- (Diagram: small example with N = 4, K = 2)
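As a quick check of the figures above, the arithmetic for the N=40, K=20 example; a standalone illustration, not project code.

    // Space overhead and fault tolerance of (N, K) erasure coding:
    // the file becomes N+K chunks, any N of which suffice to rebuild it.
    #include <cstdio>

    int main() {
        int N = 40, K = 20;
        double overhead = 100.0 * K / N;          // extra space beyond the original file
        printf("tolerates %d simultaneous failures, %.0f%% space overhead\n",
               K, overhead);                       // 20 failures, 50% overhead
        return 0;
    }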
Problems with erasure coding
- When any chunk fails, all the other chunks must be uploaded to the server to reconstruct it
- High network load at the server
- High transient disk usage at the server
Two-level coding
- Apply erasure coding twice: first to the file, then to each resulting chunk
- Can tolerate K^2 client failures
- Space overhead: 125%
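Assuming two-level coding means applying (N, K) coding to the file and again to each chunk, the overhead compounds; a small sketch with N=4, K=2 at both levels reproduces the 125% figure.

    // Two-level (N, K) coding: total stored data is ((N+K)/N)^2 times the
    // original, so overhead is that factor minus 1. Illustrative arithmetic only.
    #include <cstdio>

    int main() {
        double N = 4, K = 2;
        double factor = (N + K) / N;               // 1.5x per coding level
        double overhead = factor * factor - 1.0;   // 1.25 -> 125%
        printf("two-level overhead: %.0f%%\n", overhead * 100.0);
        return 0;
    }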
Two-level coding + replication (M = 2)
- Most recoveries involve only 1 chunk
- Space overhead: 250%
VDA implementation
- DB tables: vda_file, vda_chunk_host
- Scheduler plugin: handles transfers and the sticky file list
- VDA daemon: processes files needing update and dead hosts
- Emulator: computes performance metrics
Support for large files
- Restartable download of compressed files: include <gzip/> in <file_info> (currently only for app version files)
- Combined uncompress and verify
- Asynchronous file copy and uncompress/verify (10 MB threshold)
- Handle > 2 GB files; use stat64()
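A minimal sketch of getting the size of a possibly >2 GB file through the 64-bit stat interface mentioned above; the Linux/glibc spelling is shown, other platforms use different names (e.g. _stat64 on Windows).

    // Sketch: file size via the 64-bit stat interface, so files over 2 GB
    // don't overflow a 32-bit off_t.
    #define _LARGEFILE64_SOURCE
    #include <sys/stat.h>

    long long file_size(const char* path) {
        struct stat64 sb;
        if (stat64(path, &sb)) return -1;     // error (e.g. file missing)
        return (long long) sb.st_size;        // st_size is 64-bit here
    }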
VM app support
- Architecture: BOINC client -> vbox wrapper -> VirtualBox service -> virtual machine, with a shared directory between wrapper and VM
VM app support
- Use the VirtualBox "snapshot" mechanism for checkpointing
- Report the non-ancestral PID (the VM's) to the client
- Report network traffic to the client
- Use Remote Desktop Protocol to let the user view the VM console
- CPU throttling
- Multicore
GPU app support
- Device type and number are passed in init_data.xml
- OpenCL initialization: boinc_get_opencl_ids()
- Plan classes configurable in an XML file
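A minimal sketch of the OpenCL startup path in a BOINC GPU app, assuming the simple two-argument form of boinc_get_opencl_ids(); exact signatures and headers may differ between BOINC versions.

    // Sketch of OpenCL initialization in a BOINC app: the client passes the
    // assigned device type/number via init_data.xml, and boinc_get_opencl_ids()
    // maps that to an OpenCL platform and device.
    #include "boinc_api.h"
    #include "boinc_opencl.h"

    int main(int argc, char** argv) {
        boinc_init();

        cl_platform_id platform;
        cl_device_id device;
        int retval = boinc_get_opencl_ids(&device, &platform);  // assumed 2-arg form
        if (retval) {
            boinc_finish(retval);      // no usable GPU assigned; exit with error
        }

        // ... create an OpenCL context and command queue on (platform, device),
        //     run kernels, checkpoint via the usual BOINC API calls ...

        boinc_finish(0);
    }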
Scheduling: batch-level (proposed)
Policy: feeder enumeration order
Goals:
- Give short batches priority over long batches
- But don't let a stream of short batches starve long batches
- Enforce user quotas over the long term
Scheduling: batch-level
- Each user U has a "logical start time" LST(U); when they submit a batch, it is incremented by expected runtime / share(U)
- Each batch B has a "logical end time" LET(B), set to LST(U) + expected runtime
- Give priority to the batch whose LET(B) is least
- (Diagram: batches of two users laid out on a logical-time axis)
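A hedged sketch of the bookkeeping just described; the names are illustrative and the exact order of the two updates is an assumption.

    // Batch prioritization via logical start/end times.
    // LET(B) = LST(U) + expected runtime; LST(U) then advances by
    // expected runtime / share(U), so heavy submitters fall behind over time.
    struct User {
        double lst;     // logical start time LST(U)
        double share;   // user's share of the project
    };

    struct Batch {
        double let;     // logical end time LET(B); smallest LET gets priority
    };

    void on_batch_submitted(User& u, Batch& b, double expected_runtime) {
        b.let = u.lst + expected_runtime;
        u.lst += expected_runtime / u.share;
    }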
Scheduling: job-level
Policies:
- Feeder enumeration order
- Job selection from the shared-memory cache
- Choice of app version
- Deadline assignment
QoS types:
- Non-batch, throughput-oriented
- Long-deadline batches
- As-fast-as-possible (AFAP) batches
- Short-deadline batches
Scheduling: job-level
Goals:
- Accelerate batch completion
- Avoid tight job deadlines
- Avoid long delays between instances
- Minimize server configuration
Scheduling: job-level
- For each (host, app version), maintain a percentile incorporating average turnaround time and consecutive valid results
- Dynamic batch completion estimation, based on completed and validated jobs
Scheduling: job-level
- Feeder enumeration order: LET(J) ascending, # retries descending
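The enumeration order above, written as a comparator; a sketch, not the feeder's actual code.

    // Feeder enumeration order: LET ascending, then number of retries descending.
    struct JobEntry {
        double let;     // logical end time of the job's batch
        int retries;
    };

    bool feeder_before(const JobEntry& a, const JobEntry& b) {
        if (a.let != b.let) return a.let < b.let;   // earlier LET first
        return a.retries > b.retries;               // more retries first
    }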
Scheduling: job-level
Let x = estimated completion time of the job on this host (now + host busy time + estimated runtime)
For each job, for each usable app version AV:
- If x < est_completion(B): send the job using AV, with deadline est_completion(B)
- Else if percentile(H, AV) > 90%: send the job using AV, with deadline x
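The same rule as straight-line code; est_completion, percentile, and the parameter names are placeholders for whatever the scheduler actually maintains.

    // Job-level deadline assignment for batch jobs, per the rule above.
    //   x: estimated completion time of this job on this host
    //   est_completion_B: current estimate of when batch B will finish
    //   percentile: reliability percentile of this (host, app version), 0-100
    // Returns true if the job should be sent, and sets its deadline.
    bool assign_deadline(double x, double est_completion_B,
                         double percentile, double& deadline) {
        if (x < est_completion_B) {
            deadline = est_completion_B;   // fast enough: loose deadline
            return true;
        }
        if (percentile > 90.0) {
            deadline = x;                  // reliable host: send anyway, tight deadline
            return true;
        }
        return false;                      // don't send with this app version
    }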
Locality scheduling
- Have a large dataset; each file in the dataset is input for a large number of jobs
- Goal: process the dataset using the least network traffic
- Example: Einstein@home analysis of LIGO gravity-wave detector data
Locality scheduling
- Processing jobs sequentially is pessimal: every file gets sent to every client
Locality scheduling: ideal
- Each file is downloaded to 1 host
- Problems: typically need job replication; widely variable host throughput
Locality scheduling: proposed
- (Diagram: hosts grouped into teams, with the jobs for each file assigned to a team)
Locality Scheduling Lite
- Optional feature of the existing scheduler
- Use when # files < # shared-memory slots
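A conceptual sketch of locality-based job selection: prefer jobs whose input file the host already has. The data structures are illustrative, not the scheduler's shared-memory layout.

    // Pick a job for a host, preferring jobs whose input file is already
    // on the host, so no new download is needed.
    #include <set>
    #include <string>
    #include <vector>

    struct Job {
        int id;
        std::string input_file;
    };

    const Job* pick_job(const std::vector<Job>& cache,              // shared-mem job cache
                        const std::set<std::string>& host_files) {  // files the host reported
        for (const Job& j : cache) {
            if (host_files.count(j.input_file)) return &j;   // locality hit
        }
        return cache.empty() ? nullptr : &cache.front();     // fall back to any job
    }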
Remote job submission
- Interfaces: a portal web site and cmdline tools talk to the BOINC server via HTTPS/XML; PHP API
- Operations: estimate, submit, query, abort, get result files, retire
Remote job submission: input file options
- local: file already exists on the server
- inline: file is passed in the request XML
- semilocal: file is accessible via HTTP from the server; the server fetches and serves it
- remote: file is on a server accessible to clients; must supply size and MD5
Broadcast and targeted jobs
Broadcast jobs:
- Run once on all hosts, present and future; can limit to a user or team
- Not handled by validator or assimilator
Targeted jobs:
- Targeted to a host, user, or team
- Handled by validator and assimilator
- Can be targeted when the job is created, or dynamically
Git migration
Branches:
- master (development): new code goes here
- server_stable: hot fixes may go here
- client_release_X_Y
Other things for CERN T4T
- Web-based app graphics: the app implements an HTTP server; the port is conveyed to the Manager; "app graphics" opens a browser window
- "need network" app version flag: don't run the app if the network is not available
New OS support
- Windows 8
- Mac OS X 10.8 (Xcode 4.5)
- Debian 6.0
- Android
Large DB IDs
- SETI@home has done > 2B jobs
- Made IDs unsigned (31 -> 32 bits)
- Eventually will need to move to 64-bit IDs
Validator
- Runtime outlier flag: don't use this job in runtime and credit statistics
- Test harness: validator_test file1 file2
BOINC in app stores
- App stores are operated by OS vendors (Apple, MS, Google); the vendor screens apps and takes a cut
- Goal: package BOINC for app stores, and maybe project-specific versions
Didn't get done
- OpenID support
- Remodel computing preferences