A new model for volunteer computing

Slides:

Advertisements

Similar presentations

Service Manager for MSPs

Advertisements

May 12, 2015 XSEDE New User Tutorials and User Support: Lessons Learned Marcela Madrid.

Campus High Throughput Computing (HTC) Infrastructures (aka Campus Grids) Dan Fraser OSG Production Coordinator Campus Grids Lead.

BOINC The Year in Review David P. Anderson Space Sciences Laboratory U.C. Berkeley 22 Oct 2009.

Federated Searching Pre-Conference Workshop - The federated searching cookbook Qin Zhu HP Labs Research Library February 18, 2007.

The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.

The 9 th Annual Workshop September 2013 INRIA, Grenoble, France

Volunteer Computing and Hubs David P. Anderson Space Sciences Lab University of California, Berkeley HUBbub September 26, 2013.

A Guided Tour of BOINC David P. Anderson Space Sciences Lab University of California, Berkeley TACC November 8, 2013.

Exa-Scale Volunteer Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.

Introduction to the BOINC software David P. Anderson Space Sciences Laboratory University of California, Berkeley.

Wenjing Wu Computer Center, Institute of High Energy Physics Chinese Academy of Sciences, Beijing BOINC workshop 2013.

Module 10 Administering and Configuring SharePoint Search.

The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.

Using Biological Cyberinfrastructure Scaling Science and People: Applications in Data Storage, HPC, Cloud Analysis, and Bioinformatics Training Scaling.

Institute For Digital Research and Education Implementation of the UCLA Grid Using the Globus Toolkit Grid Center’s 2005 Community Workshop University.

Obtaining Computer Allocations and Monitoring Use SCD User Meeting at AMS January 11, 2005 Ginger Caldwell, SCD.

The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.

BOINC: Progress and Plans David P. Anderson Space Sciences Lab University of California, Berkeley BOINC:FAST August 2013.

Structural Patterns1 Nour El Kadri SEG 3202 Software Design and Architecture Notes based on U of T Design Patterns class.

Leveraging the InCommon Federation to access the NSF TeraGrid Jim Basney Senior Research Scientist National Center for Supercomputing Applications University.

ELandings Data Extract Mechanisms. Data Extract Options:

TEMPLATE DESIGN © BOINC: Middleware for Volunteer Computing David P. Anderson Space Sciences Laboratory University of.

Administering Groups Chapter Eight. Exam Objectives In this Chapter:  Plan a security group hierarchy based upon delegation requirements  Plan a security.

BOINC: An Open Platform for Public-Resource Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Public Distributed Computing with BOINC.

A worldwide e-Infrastructure and Virtual Research Community for NMR and structural biology Alexandre M.J.J. Bonvin Project coordinator Bijvoet Center for.

CyVerse Workshop Discovery Environment Overview. Welcome to the Discovery Environment A Simple Interface to Hundreds of Bioinformatics Apps, Powerful.

INTRODUCTION TO XSEDE. INTRODUCTION  Extreme Science and Engineering Discovery Environment (XSEDE)  “most advanced, powerful, and robust collection.

CernVM and Volunteer Computing Ivan D Reid Brunel University London Laurence Field CERN.

Frontiers of Volunteer Computing David Anderson Space Sciences Lab UC Berkeley 30 Dec

The Future of Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab UH CS Dept. March 22, 2007.

The 6 th Annual Pangalactic BOINC Workshop. BOINC: The Year in Review David Anderson 31 Aug 2010.

Emulating Volunteer Computing Scheduling Policies Dr. David P. Anderson University of California, Berkeley May 20, 2011.

Volunteer Computing: Involving the World in Science David P. Anderson U.C. Berkeley Space Sciences Lab February 16, 2007.

A Brief History of (CPU) Time -or- Ten Years of Multitude David P. Anderson Spaces Sciences Lab University of California, Berkeley 2 Sept 2010.

David P. Anderson Space Sciences Laboratory University of California – Berkeley Supercomputing with Personal Computers.

The Limits of Volunteer Computing Dr. David P. Anderson University of California, Berkeley March 20, 2011.

Local Scheduling for Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab John McLeod VII Sybase March 30, 2007.

Using volunteered resources for data-intensive computing and storage David Anderson Space Sciences Lab UC Berkeley 10 April 2012.

Volunteer Computing with BOINC: a Tutorial David P. Anderson Space Sciences Laboratory University of California – Berkeley May 16, 2006.

Frontiers of Volunteer Computing David Anderson Space Sciences Lab UC Berkeley 28 Nov

BOINC current work David Anderson 11 July Where we're at ● We've come a long way – Some successful projects – Progress on software ● The long road.

An Overview of Volunteer Computing

Architecture Review 10/11/2004

Volunteer Computing and BOINC

The Future of Volunteer Computing

Internet Made Easy! Make sure all your information is always up to date and instantly available to all your clients.

University of California, Berkeley

Volunteer Computing: SETI and Beyond David P

Investigation authentication using AAF for the CVL on NeCTAR

Volunteer Computing for Science Gateways

Joslynn Lee – Data Science Educator

Designing a Runtime System for Volunteer Computing David P

Select Survey Invitations

CyVerse Discovery Environment

A Roadmap for Volunteer Computing in the U.S.

Exa-Scale Volunteer Computing

David P. Anderson Space Sciences Lab UC Berkeley LASER

David Cameron ATLAS Site Jamboree, 20 Jan 2017

USAJOBS – Application Manager

Grid Portal Services IeSE (the Integrated e-Science Environment)

Software Testing With Testopia

Artem Trunov and EKP team EPK – Uni Karlsruhe

An easier path? Customizing a “Global Solution”

Development of the Nanoconfinement Science Gateway

Facebook and Flickr Tips for beginners Steph Swalwell ~ March 2017.

University of California, Berkeley

with Pearson’s MyITLab for Office 2013

with Pearson’s MyITLab for Office 2010

Presentation transcript:

A new model for volunteer computing David P. Anderson Space Sciences Lab University of California, Berkeley Sept. 2017

The original BOINC model Scientists lots of them (100s or 1000s) will create BOINC projects They’ll compete for volunteers by doing PR and making great web sites describing their research

The original model Volunteers periodically survey the project web sites rank projects based on the importance of the science area to them the credentials of the scientists dynamically choose projects based on rankings

The original model Dynamic ecosystem of competing projects Project get computing power in proportion to their (apparent) merit otherwise infeasible research gets done high-risk research gets done The public learns about science A big fraction of global computing power does science

The original model We’ve had considerable success, but: The set of projects has been small and static, and includes little mainstream U.S. science The set of volunteers small, narrow, declining project lock-in motivated largely by credit

Why so few projects? High risk/reward Creating a BOINC project requires resources that few research groups have Few scientists know about volunteer computing

Why few/static volunteers? Evaluating projects is hard BOINC looks too complex in general Marketing volunteer computing is hard too many brands

Other issues The HPC world views VC as a gimmick, and ignores it The computer science world ignores VC It’s hard (for me at least) to get funding

A new model Goals Serve more scientists Get more volunteers Move VC toward the mainstream of computational science, HPC, computer science get more smart people studying VC, developing BOINC, integrating BOINC

A new model Scientists Volunteers add VC to existing HPC facilities like supercomputer centers and portals Volunteers create an interface to VC based on science goals rather than projects create a unified brand (not ‘BOINC’)

Adding VC to HPC Texas Advanced Computing Center (TACC) nanoHUB ~20% of jobs can run on VC “launcher”: cmdline tool for running job batches nanoHUB portal for nanoscience web interface to ~30 standard apps uncertainty quantification: lots of jobs currently use small cluster and AWS

Adding VC to HPC BOINC “universal app”: vboxwrapper + Docker- player VM image Docker image and input files are in workunit TACC and nanoHUB already package apps as Docker images We can support 1000s of apps and scientists with no incremental work!

Adding VC to HPC Remote job submission Remote file management each job in a batch can have its own templates Remote file management Identity mapping

Keywords Attributes of a job What area of research does it contribute to? Where (geographically and institutionally) are the researchers? For projects like TACC, these attributes are per-job, not per app or project

Keywords What can we use these attributes for show project attributes in list show user info about jobs they’re running let users filter work from a given project do accounting of computing power per science area let users express preferences at the top level

Keyword architecture Keywords have: short and long names (dynamic) integer ID and symbol (static) hierarchy level and parent ID (dynamic) category science area location

Keywords There must be a single, authoritative keyword list boinc/doc/keywords.inc community-based selection of keywords and hierarchy

Keywords for job/project selection Each user can express a yes/maybe/no preference per keyword (e.g. in TBD or elsewhere) Each project has a list of (keyword, work fraction) and an optional list of keywords per job Don’t attach user to project with a “no” keyword with work fraction 1. Don’t send jobs with “no”keywords

TBD A volunteer interface based on keywords rather than projects. A resource allocation mechanism A brand for marketing volunteer computing

TBD Architecture TBD Web site DB AM RPC Projects

TBD: accounting TBD maintains daily history for Tracked quantities total project user Tracked quantities REC (CPU, GPU) runtime (CPU, GPU) jobs (success, fail)

TBD: Allocation Linear allocation model supports both continuous and bursty demands possibly supports QoS guarantees

TBD: Allocation Goals Respect user preferences Respect project allocations Maximize total throughput Minimize projects per host

TBD: Project allocation Might involve National Science Foundation (XSEDE) EU commission others as appropriate Possible criteria # scientists served merit of research

TBD… is a unified brand for marketing for co-promotion endorsed by trusted government agencies lets grant proposals include no-risk VC component supports sporadic or 1-time computing needs

Names Not ‘BOINC’ Sigma / Scigma ∑ Sciphon Sciborg SCI N nboard

One-click install New user scenario: Fill out account form (email, science prefs) Click “Join” redirects to boinc.berkeley.edu/concierge.php downloads appropriate client installer encodes AM/account info in installer filename BOINC client finds this welcome dialog, no login dialog This mechanism can be used by projects too!

TBD Status Funded by NSF (UCB/Purdue/TACC) through 6/20 Most of the technology has been developed Github: davidpanderson/scienceunited UCB server/URL (scion.berkeley.edu)