Emulating Volunteer Computing Scheduling Policies Dr. David P. Anderson University of California, Berkeley May 20, 2011.

Volunteer computing
● 800K computers
● 85% Win, 7% Mac, 7% Linux
● 2.4 cores/computer
● 41% have a modern GPU
● 65% average availability
● 90% behind firewall or NAT
● 12 PetaFLOPS
● worth $5 billion/year on Amazon EC2

[Diagram: a volunteer's computer, whose CPUs and GPUs run jobs under the BOINC client; the client communicates with each project's scheduler and DB via scheduler RPC.]

[Diagram: one BOINC client attached to several projects, including IBM World Community Grid and Climateprediction.net, with labels 50%, 20%, and 30%.]

Client scheduling policies
● Job scheduling policy
  ● what jobs to run
  ● whether to leave suspended jobs in memory
● Work fetch policy
  ● when to get more jobs
  ● what project to get them from
  ● how much to request
These policies have a big impact on system performance. They must work in a large space of scenarios.
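
The two policy families above can be pictured as two small decision functions. The sketch below is illustrative only: the `Job` fields, the function names, the earliest-deadline-first ordering, and the "fill the buffer" request rule are all assumptions made for this example, not the BOINC client's actual code.

```python
from dataclasses import dataclass

@dataclass
class Job:
    project: str      # hypothetical project name
    deadline: float   # seconds from now until the result is due
    remaining: float  # estimated remaining runtime, in seconds

def choose_jobs_to_run(jobs, n_cpus):
    """Job scheduling: run up to n_cpus jobs, earliest deadline first (assumed rule)."""
    return sorted(jobs, key=lambda j: j.deadline)[:n_cpus]

def work_request_seconds(jobs, n_cpus, buffer_seconds):
    """Work fetch: CPU-seconds to request so n_cpus stay busy for buffer_seconds."""
    queued = sum(j.remaining for j in jobs)
    return max(0.0, n_cpus * buffer_seconds - queued)

jobs = [Job("A", 3600, 1800), Job("B", 600, 300), Job("C", 7200, 900)]
running = choose_jobs_to_run(jobs, n_cpus=2)
request = work_request_seconds(jobs, n_cpus=2, buffer_seconds=3600)
print([j.project for j in running], request)
```

Keeping the two decisions in separate functions mirrors the split on the slide: one decides what to run from the jobs already on hand, the other decides how much more to ask for.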

Scenarios
● Preferences
● Hardware
● Availability (computing, network)
● # of projects
● For each project/application
  ● distribution of job size
  ● accuracy of runtime estimate
  ● latency bound
  ● resource usage
  ● project availability

The BOINC project
● 2.5 developers
● A couple of computers each
● We can’t reproduce most scenarios

So...
● How can we design good scheduling policies?
● How can we debug the BOINC client?
● How can we plan for the future?
  ● many cores
  ● faster GPUs
  ● tight latency bounds
  ● large-RAM applications

Early days
● Design plausible policies, test on our computers
● Release software to the public
● Monitor message boards

Using volunteer testers
● ~100 volunteers run a suite of tests on pre-release software
● Report results via web interface

What if there are problems?
● BOINC client can be configured to generate detailed log messages in ~20 areas
● Ask volunteer to send us message log
● Debug by looking at log, code
● Ask volunteer to install new version
● Repeat as needed

The BOINC Client Emulator
[Diagram: the emulator's components (main logic, scheduling policies, availability, job execution, scheduler RPC), each marked as either emulated (same source code as the client) or simulated.]

Inputs
● Client state file
  ● describes hardware, availability, projects and their characteristics
● Preferences, configuration files
● Scheduling policy choices
  ● current client implements 2 of each
● Duration, time step of simulation

Emulator outputs
● Figures of merit
  ● idle fraction
  ● wasted fraction
  ● resource share violation
  ● monotony
  ● RPCs per job
● Timeline
  ● message log
  ● graphs of scheduling data
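
The slide names the figures of merit but does not define them. As a rough sketch, four of them could be computed from a run summary as below. Every definition here is an assumption made for illustration: idle fraction as unused capacity over total capacity, wasted fraction as capacity spent on jobs that missed their deadline, resource share violation as the RMS gap between target and actual shares, and RPCs per job as a plain ratio. Monotony is left out because the slide gives no basis for a formula. The function name and arguments are hypothetical.

```python
import math

def figures_of_merit(capacity, used_by_project, wasted, shares, n_rpcs, n_jobs):
    """Summarize one emulated run. All definitions here are assumptions."""
    used = sum(used_by_project.values())
    total_share = sum(shares.values())
    squared_error = 0.0
    for project, share in shares.items():
        target = share / total_share
        actual = used_by_project.get(project, 0.0) / used if used else 0.0
        squared_error += (target - actual) ** 2
    return {
        "idle_fraction": (capacity - used) / capacity,
        "wasted_fraction": wasted / capacity,
        "resource_share_violation": math.sqrt(squared_error / len(shares)),
        "rpcs_per_job": n_rpcs / n_jobs,
    }

# Made-up numbers: 1000 CPU-seconds available, two projects with equal shares.
merit = figures_of_merit(
    capacity=1000.0,
    used_by_project={"A": 600.0, "B": 200.0},
    wasted=100.0,
    shares={"A": 50, "B": 50},
    n_rpcs=12,
    n_jobs=40,
)
print(merit)
```

Lower is better for each of these values, which makes them convenient for comparing two policies on the same scenario.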

Emulator interfaces
● Web-based
  ● volunteers upload scenarios
  ● can see all scenarios, run simulations against them, comment on them
● Scripted
  ● sweep an input parameter
  ● compare 2 policies across a set of scenarios
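
The two scripted modes amount to simple loops around an emulator run. The sketch below shows only the shape of such a script. It does not call the real emulator: `toy_run` is an invented stand-in with an arbitrary formula, and the scenario keys, policy names, and `buffer_hours` parameter are hypothetical.

```python
def toy_run(scenario, policy, buffer_hours):
    """Invented stand-in for one emulator run; returns a made-up idle fraction."""
    factor = 1.0 if policy == "policy_a" else 0.5
    return scenario["project_downtime"] * factor / (1.0 + buffer_hours)

def sweep(run, scenario, policy, name, values):
    """Sweep one input parameter, holding the scenario and policy fixed."""
    return [(v, run(scenario, policy, **{name: v})) for v in values]

def compare(run, scenarios, policies, **params):
    """Run each policy against each scenario and tabulate the results."""
    return {
        label: {policy: run(scenario, policy, **params) for policy in policies}
        for label, scenario in scenarios.items()
    }

scenarios = {
    "laptop": {"project_downtime": 0.2},
    "desktop": {"project_downtime": 0.05},
}
swept = sweep(toy_run, scenarios["laptop"], "policy_a", "buffer_hours", [0, 1, 3])
table = compare(toy_run, scenarios, ["policy_a", "policy_b"], buffer_hours=1)
print(swept)
print(table)
```

Passing the run function in as an argument means the same harness would work unchanged if the stand-in were replaced by a wrapper around an actual emulator invocation.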

Usage
● By volunteer testers
  ● see a problem
  ● upload scenario, reproduce problem in emulator
  ● developers can study problem under debugger
● By developers
  ● assemble a library of scenarios
  ● develop/debug using the emulator

Case studies
● Resource share enforcement
  ● Old: per resource type
  ● New: across all resource types
● Job fetch policy
  ● Old: keep buffer full; often fetch a single job
  ● New: hysteresis
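
To see why hysteresis reduces scheduler traffic in the job fetch case, consider a toy model in which the work buffer drains at a constant rate and each RPC tops it back up with fixed-length jobs. The thresholds `lo` and `hi`, the job length, and the time step below are assumptions chosen for illustration and are not taken from the BOINC client.

```python
def count_rpcs(policy, steps, job_len=600, lo=3600, hi=7200, dt=60):
    """Toy model: return (number of RPCs, number of jobs fetched) over `steps` steps."""
    buffered = 0.0  # seconds of queued work
    rpcs = jobs = 0
    for _ in range(steps):
        if policy == "keep_full":
            fetch = buffered < hi   # ask as soon as the buffer is not full
        else:                       # "hysteresis"
            fetch = buffered < lo   # wait until the buffer falls below the low mark
        if fetch:
            rpcs += 1
            while buffered < hi:    # one RPC refills the buffer up to the high mark
                buffered += job_len
                jobs += 1
        buffered = max(0.0, buffered - dt)  # the client consumes dt seconds of work
    return rpcs, jobs

old_rpcs, old_jobs = count_rpcs("keep_full", steps=600)
new_rpcs, new_jobs = count_rpcs("hysteresis", steps=600)
print(old_rpcs, old_jobs, new_rpcs, new_jobs)
```

In this model the keep-full policy issues an RPC each time one job's worth of work has drained, so most RPCs return a single job, while the hysteresis policy issues fewer RPCs that each return several jobs.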

Future work
● Characterize the scenario population
  ● Monte-Carlo sampling
● Study new policies
  ● e.g. alternatives to EDF
● More features in emulator
  ● memory usage
  ● file transfer time
  ● application checkpointing behavior
● Better model of scheduler behavior
  ● maybe emulate it also (EmBOINC)
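
Monte-Carlo sampling of the scenario population could look roughly like the sketch below: draw random scenarios from assumed distributions, evaluate each one, and average the result. The distributions are invented for illustration. Only the 41% GPU rate and the 65% mean availability echo the figures quoted at the start of the talk. `toy_idle_fraction` is a placeholder for an emulator run, not a real model.

```python
import random

def sample_scenario(rng):
    """Draw one hypothetical scenario; every distribution here is an assumption."""
    return {
        "n_cpus": rng.choice([1, 2, 4, 8]),
        "has_gpu": rng.random() < 0.41,             # 41% of hosts have a modern GPU
        "availability": rng.betavariate(6.5, 3.5),  # beta distribution with mean 0.65
        "n_projects": rng.randint(1, 5),
    }

def monte_carlo(evaluate, n_samples, seed=0):
    """Average a figure of merit over randomly sampled scenarios."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    values = [evaluate(sample_scenario(rng)) for _ in range(n_samples)]
    return sum(values) / len(values)

def toy_idle_fraction(scenario):
    """Placeholder for an emulator run: idle whenever the host is unavailable."""
    return 1.0 - scenario["availability"]

estimate = monte_carlo(toy_idle_fraction, n_samples=2000)
print(round(estimate, 3))
```

Characterizing the real scenario population would replace these guessed distributions with ones fitted to measured host data.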