The 9th Annual BOINC Workshop, 25-27 September 2013, INRIA, Grenoble, France



The BOINC Community ● UC Berkeley developers (2.5) ● Projects ● PC volunteers (240,000) ● Other volunteers: testing, translation, support ● Computer scientists

Workshop goals ● Learn what everyone else is doing ● Form collaborations ● Get ideas ● Steer BOINC development

Hackfest (Thu/Fri) ● Goal: get something done – design and/or implement software – improve docs – learn and use a new feature ● Bring your ideas

The state of volunteer computing ● Volunteership: stagnant – 240K people (down from 290K) – 350K computers ● Science projects: stagnant ● Computer Science research: a little ● Let’s keep trying anyway

Requests to projects ● Do public outreach – Notices (with pictures) – Automated reminder emails – News emails – Message boards – Mass media ● Use current server code – Avoid code divergence

To developers/researchers ● Talk with me before starting anything – especially if it’s of general utility ● Let me know if you need data

What’s new in BOINC? ● Funding ● Integration projects ● Remote job and file management ● Android ● Scheduler ● GPU and multicore apps ● Client ● Plans

Funding ● Current NSF grant runs another 18 months ● Not clear if current model will continue ● Collaborations are important for future funding ● Projects may need to help fund BOINC directly

Integration projects ● HTCondor (U. of Wisconsin) – Goal: BOINC-based back end for the Open Science Grid or any Condor pool [Diagram: a Condor node runs a grid manager and a BOINC GAHP, which performs remote operations on a BOINC server]

Integration projects ● HUBzero (Purdue U.) – Goal: BOINC-based back end for science portals such as nanoHUB [Diagram: a hub submits work to a BOINC server]

Integration projects ● Texas Advanced Computing Center (TACC) – Android/iOS app – They supply ● Interfaces, visualization, support for scientists ● Storage ● BOINC server

Remote input file management ● Issues – Naming/immutability – Efficiency – Garbage collection ● User file sandbox (web-based) used by CAS

Content-based file management ● Server file names based on MD5 ● DB table for file/batch association; garbage collection ● Web RPCs to query lists of files, upload files [Diagram: submit hosts upload input.dat and foo.dat; the server stores them under the content-based names jf_ec3056e9ed14c837e3e68c80bb14871f and jf_dac0160fd3d7f910bae550ec26a164a8]
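The mapping from file content to an immutable server-side name can be sketched as follows. This is an illustration of the idea, not BOINC's actual code; the `jf_` prefix mirrors the names shown on the slide.

```python
import hashlib

def content_name(path, prefix="jf_"):
    """Derive a server-side file name from the file's MD5 digest,
    so identical content always maps to the same immutable name."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return prefix + h.hexdigest()
```

Because the name is a function of the content, re-uploading an unchanged file is a no-op, and two submitters who upload the same input share one copy on the server.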

Remote job submission ● Web RPCs – Batch: estimate, create, query, abort, retire – Batch expire time – Job: query, abort – App: get templates ● Input file modes – Local, local-staged, semilocal, remote, inline ● C++, PHP bindings
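A batch-submission request to such a Web RPC might be assembled like this. The element names (`submit_batch`, `authenticator`, `app_name`, and so on) are illustrative stand-ins, not BOINC's exact RPC schema; the real C++ and PHP bindings hide this serialization.

```python
import xml.etree.ElementTree as ET

def make_submit_batch_request(auth, app, jobs):
    """Build an XML body for a hypothetical submit_batch Web RPC.
    jobs is a list of dicts with 'command_line' and 'input_files'."""
    root = ET.Element("submit_batch")
    ET.SubElement(root, "authenticator").text = auth
    batch = ET.SubElement(root, "batch")
    ET.SubElement(batch, "app_name").text = app
    for j in jobs:
        job = ET.SubElement(batch, "job")
        ET.SubElement(job, "command_line").text = j["command_line"]
        for fname in j["input_files"]:
            f = ET.SubElement(job, "input_file")
            ET.SubElement(f, "name").text = fname
    return ET.tostring(root, encoding="unicode")
```

The same pattern covers the other batch operations (estimate, query, abort, retire): one XML request per operation, posted to the project server.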

Output retrieval ● Web RPCs to – Get specific output files – Get zip of job’s outputs – Get zip of batch’s outputs

BOINC on Android ● New GUI ● Battery-related issues ● Device naming ● Released July 22 – Google Play Store, Amazon App Store – ~30K active devices

Job size matching ● Problem: 1000X speed difference GPU vs Android ● An app can have jobs of N “size classes” ● “size_census.php”: computes quantiles of effective speed for each app ● Scheduler tries to send jobs of size class i to devices in quantile i ● “size regulator” makes sure jobs of all size classes are available to send
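The quantile logic behind size matching can be sketched as below: compute speed quantile boundaries over the device population (what size_census.php does for each app), then map a device's effective speed to a size class. Function names and details are illustrative, not the real PHP code.

```python
def quantile_bounds(speeds, n_classes):
    """Compute n_classes-1 quantile boundaries from observed device speeds."""
    s = sorted(speeds)
    return [s[len(s) * k // n_classes] for k in range(1, n_classes)]

def size_class(speed, bounds):
    """Map a device's effective speed to a size class index:
    the scheduler then prefers jobs of that size class for the device."""
    for i, b in enumerate(bounds):
        if speed <= b:
            return i
    return len(bounds)
```

With two classes, a slow Android phone lands in class 0 and a fast GPU host in class 1, so each gets jobs scaled to finish in a comparable wall-clock time.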

New score-based scheduler ● For each resource type (starting with GPUs): – Scan the job array starting at a random point – Make a list of jobs with an app version for that resource – Assign each job a score (including a job-size term) – Sort the list – For each job in the list: do quick checks; lock the array entry and do slow checks; send the job – If the request is satisfied, break
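The scan-score-sort-send loop above can be sketched in a few lines. The `Job` structure, the toy scoring function, and the lock-as-a-flag are simplifications standing in for the real shared-memory job array and its checks.

```python
import random

class Job:
    def __init__(self, jid, size_class, resources):
        self.jid = jid
        self.size_class = size_class
        self.resources = resources  # resource types with an app version
        self.locked = False

def score(job, device_class):
    # Toy score: prefer jobs whose size class matches the device's class.
    return -abs(job.size_class - device_class)

def schedule(job_array, resource_types, device_class, want):
    """Sketch of the score-based scheduler loop (names are illustrative)."""
    sent = []
    for rtype in resource_types:
        start = random.randrange(len(job_array))
        # Scan the array from a random point, wrapping around.
        candidates = [job_array[(start + i) % len(job_array)]
                      for i in range(len(job_array))]
        candidates = [j for j in candidates if rtype in j.resources]
        candidates.sort(key=lambda j: score(j, device_class), reverse=True)
        for job in candidates:
            if job.locked:        # quick check
                continue
            job.locked = True     # lock array entry; slow checks go here
            sent.append(job)
            if len(sent) >= want:  # request satisfied
                return sent
    return sent
```

Starting the scan at a random point spreads contention when many scheduler instances run concurrently against the same job array.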

BOINC client ● New work-fetch, job scheduling – Handle GPU exclusions ● “App config” mechanism – User can set device usage parameters, limit # of concurrent jobs per app ● Maintain/report current, previous uptime ● Maintain list of completed jobs ● Sub-second CPU throttling
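Sub-second throttling means enforcing the CPU-usage preference within each short period rather than over multi-second intervals. A minimal sketch of the duty-cycle idea, with the suspend/resume of real worker processes replaced by a busy-wait and a sleep:

```python
import time

def throttle_loop(cpu_fraction, period=0.1, cycles=5):
    """Run for cpu_fraction of each short period, idle for the rest.
    In the real client, 'run' and 'idle' are resume/suspend of apps."""
    for _ in range(cycles):
        run_until = time.monotonic() + period * cpu_fraction
        while time.monotonic() < run_until:
            pass                          # apps compute here
        time.sleep(period * (1 - cpu_fraction))
```

A short period keeps average CPU usage at the target while avoiding the multi-second bursts of full load that coarser throttling produces.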

GPU and multicore apps ● Support Intel GPUs ● Support OpenCL CPU apps – Detect, advertise multiple OpenCL libraries ● Develop OpenCL example app ● Detect GPUs in a separate process – Mac notebooks: allow system to use low-power GPU

BOINC runtime system ● Replace heartbeat with PID check – Not on Win2K: PID reuse ● Support apps that are in a critical section most of the time (e.g. GPU apps)
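On POSIX systems the heartbeat can be replaced by the classic signal-0 existence check sketched below. As the slide notes, this is unsafe where PIDs are recycled aggressively (Win2K), since a reused PID makes a dead process look alive.

```python
import os

def process_alive(pid):
    """Check whether a process exists by sending signal 0 (POSIX).
    Does not actually deliver a signal; only checks existence."""
    try:
        os.kill(pid, 0)
        return True
    except ProcessLookupError:
        return False
    except PermissionError:
        return True   # exists, but owned by another user
```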

Volunteer storage ● Finished data archival system – Store large files for long periods – Multi-level erasure coding ● Developed simulator for testing, performance study
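The core idea of erasure coding can be shown with the simplest case, a single XOR parity block, which lets you rebuild any one lost data block. BOINC's archival system layers stronger codes at multiple levels; this sketch shows only the principle.

```python
def xor_parity(blocks):
    """Compute one XOR parity block over equal-length data blocks."""
    parity = bytes(len(blocks[0]))
    for b in blocks:
        parity = bytes(x ^ y for x, y in zip(parity, b))
    return parity

def recover(blocks_with_one_none, parity):
    """Rebuild the single missing block (marked None) from the parity:
    XOR of the parity with all surviving blocks yields the lost one."""
    acc = bytearray(parity)
    missing_idx = blocks_with_one_none.index(None)
    for b in blocks_with_one_none:
        if b is not None:
            acc = bytearray(x ^ y for x, y in zip(acc, b))
    return missing_idx, bytes(acc)
```

In a volunteer setting this matters because hosts leave without warning: redundancy must tolerate block loss without re-uploading the original file.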

Software engineering ● Finished SVN → git migration ● Automated translation process – build_po → Pootle → commit → deploy ● Code hardening – strcpy() → strlcpy() – MAXPATHLEN

Didn’t start ● OpenID/OpenAuth support ● Remodel computing preferences ● BOINC in app stores (Windows, Apple)

Planned ● Automated build/test using Jenkins – Server code release management ● Accelerated batch completion ● Apple iOS client

My wish list: new GPU design ● Current: all GPUs of a given vendor are treated as equivalent – Scheduler requests ask for NVIDIA jobs, not jobs for a specific NVIDIA GPU – This doesn’t work well for machines with heterogeneous GPUs – Work-arounds (GPU exclusions) cause problems ● Proposed: treat each GPU as a separate resource

My wish list: fully embrace latency-oriented scheduling ● Types of workload – Throughput-oriented – Small/fast batches – Large/slow batches ● Suppose a project has all three? – Goal: client requests and processes short jobs even if fast jobs are in progress – Requires complete redesign of scheduling policies

● The “project ecosystem” hasn’t materialized – Creating a project is too difficult, too risky – Volunteers tend to be passive – Marketing and PR: too many brands ● Umbrella projects: good, but not enough

● Single “brand” for volunteer computing ● Register for science areas rather than projects ● Facebook/Google login ● Use account-manager architecture ● How to allocate computing power? – Involve the HPC, scientific funding communities