Local Scheduling for Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab John McLeod VII Sybase March 30, 2007.

Slides:



Advertisements
Similar presentations
BOINC: A System for Public-Resource Computing and Storage David P. Anderson University of California, Berkeley.
Advertisements

Problem 12: Large Distributed Systems ECU1 BUS CC1 ECU2CC2 ECU3CC3 S1 S2 S3 S4 S5 5 Real-Time Input Streams - with jitter - with bursts - deadline > period.
© Ibrahim Korpeoglu Bilkent University
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
BOINC The Year in Review David P. Anderson Space Sciences Laboratory U.C. Berkeley 22 Oct 2009.
Performance Evaluation
Wk 2 – Scheduling 1 CS502 Spring 2006 Scheduling The art and science of allocating the CPU and other resources to processes.
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
Grid Computing Exposing the myths of desktop scavenging grids John Easton – IBM Grid computing
1Chapter 05, Fall 2008 CPU Scheduling The CPU scheduler (sometimes called the dispatcher or short-term scheduler): Selects a process from the ready queue.
Rensselaer Polytechnic Institute CSCI-4210 – Operating Systems David Goldschmidt, Ph.D.
Volunteer Computing and Hubs David P. Anderson Space Sciences Lab University of California, Berkeley HUBbub September 26, 2013.
Integrated Risk Analysis for a Commercial Computing Service Chee Shin Yeo and Rajkumar Buyya Grid Computing and Distributed Systems (GRIDS) Lab. Dept.
CPU Scheduling Chapter 6 Chapter 6.
CSC 360- Instructor: K. Wu CPU Scheduling. CSC 360- Instructor: K. Wu Agenda 1.What is CPU scheduling? 2.CPU burst distribution 3.CPU scheduler and dispatcher.
CS3530 OPERATING SYSTEMS Summer 2014 Processor Scheduling Chapter 5.
1 Distributed Operating Systems and Process Scheduling Brett O’Neill CSE 8343 – Group A6.
Chapter 6 Scheduling. Basic concepts Goal is maximum utilization –what does this mean? –cpu pegged at 100% ?? Most programs are I/O bound Thus some other.
20 October 2006Workflow Optimization in Distributed Environments Dynamic Workflow Management Using Performance Data David W. Walker, Yan Huang, Omer F.
Exa-Scale Volunteer Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.
CS Spring 2011 CS 414 – Multimedia Systems Design Lecture 31 – Multimedia OS (Part 1) Klara Nahrstedt Spring 2011.
BOINC Workshop 10 Hien Nguyen, Eshwar Rohit University of Houston Supervisors: Dr. Jaspal Subhlok University of Houston Dr. David P. Anderson SSL – U.C,
A Method for Transparent Admission Control and Request Scheduling in E-Commerce Web Sites S. Elnikety, E. Nahum, J. Tracey and W. Zwaenpoel Presented By.
Lecture 7: Scheduling preemptive/non-preemptive scheduler CPU bursts
Volunteer Computing with GPUs David P. Anderson Space Sciences Laboratory U.C. Berkeley.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
and Citizen Cyber-Science David P. Anderson Space Sciences Laboratory U.C. Berkeley.
BOINC: Progress and Plans David P. Anderson Space Sciences Lab University of California, Berkeley BOINC:FAST August 2013.
David P. Anderson Space Sciences Laboratory University of California – Berkeley Designing Middleware for Volunteer Computing.
6.1 CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation.
CS Spring 2009 CS 414 – Multimedia Systems Design Lecture 31 – Process Management (Part 1) Klara Nahrstedt Spring 2009.
BOINC: An Open Platform for Public-Resource Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley.
Managing Network Resources in Condor Jim Basney Computer Sciences Department University of Wisconsin-Madison
Celebrating Diversity in Volunteer Computing David P. Anderson Space Sciences Lab U.C. Berkeley Sept. 1, 2008.
David P. Anderson Space Sciences Laboratory University of California – Berkeley Public Distributed Computing with BOINC.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 32 – Multimedia OS Klara Nahrstedt Spring 2010.
Capacity Planning in a Virtual Environment Chris Chesley, Sr. Systems Engineer
CPU Scheduling G.Anuradha Reference : Galvin. CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time.
18 May 2006CCGrid2006 Dynamic Workflow Management Using Performance Data Lican Huang, David W. Walker, Yan Huang, and Omer F. Rana Cardiff School of Computer.
Frontiers of Volunteer Computing David Anderson Space Sciences Lab UC Berkeley 30 Dec
Emulating Volunteer Computing Scheduling Policies Dr. David P. Anderson University of California, Berkeley May 20, 2011.
Volunteer Computing: Involving the World in Science David P. Anderson U.C. Berkeley Space Sciences Lab February 16, 2007.
David P. Anderson Space Sciences Laboratory University of California – Berkeley Supercomputing with Personal Computers.
The Limits of Volunteer Computing Dr. David P. Anderson University of California, Berkeley March 20, 2011.
David P. Anderson UC Berkeley Gilles Fedak INRIA The Computational and Storage Potential of Volunteer Computing.
Volunteer Computing and Large-Scale Simulation David P. Anderson U.C. Berkeley Space Sciences Lab February 3, 2007.
Using volunteered resources for data-intensive computing and storage David Anderson Space Sciences Lab UC Berkeley 10 April 2012.
Volunteer Computing with BOINC: a Tutorial David P. Anderson Space Sciences Laboratory University of California – Berkeley May 16, 2006.
Frontiers of Volunteer Computing David Anderson Space Sciences Lab UC Berkeley 28 Nov
Uniprocessor Scheduling Chapter 9 Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community College, Venice,
BOINC current work David Anderson 11 July Where we're at ● We've come a long way – Some successful projects – Progress on software ● The long road.
Volunteer Computing and BOINC
A new model for volunteer computing
University of California, Berkeley
OpenPBS – Distributed Workload Management System
Volunteer Computing for Science Gateways
EEE Embedded Systems Design Process in Operating Systems 서강대학교 전자공학과
Designing a Runtime System for Volunteer Computing David P
Job Scheduling in a Grid Computing Environment
Condor – A Hunter of Idle Workstation
David Cameron ATLAS Site Jamboree, 20 Jan 2017
Grid Computing Colton Lewis.
Cloud-Assisted VR.
Lecture 24: Process Scheduling Examples and for Real-time Systems
CPU Scheduling G.Anuradha
Lecture 21: Introduction to Process Scheduling
University of California, Berkeley
Lecture 2 Part 3 CPU Scheduling
Lecture 21: Introduction to Process Scheduling
Exploring Multi-Core on
Presentation transcript:

Local Scheduling for Volunteer Computing David P. Anderson U.C. Berkeley Space Sciences Lab John McLeod VII Sybase March 30, 2007

BOINC projects and volunteers ● Projects – CPDN, etc. – independent, separate servers – may be down or not have work ● Volunteers – run BOINC client on PC – “attach” client to projects > 1 project recommended – client queues jobs ● Scheduler RPCs – initiated by client – request X seconds of work

Local scheduling policies ● CPU scheduling – what job(s) to run ● Work fetch – when to get more jobs – what project to get them from – how much work to ask for

Inputs to the policies: host ● Host hardware – # CPUs, FP/int benchmarks – RAM size ● Host availability – % time BOINC is running and allowed to compute – % time network connected

Inputs: user ● User preferences – project resource shares – whether to compute when user active – time-of-day limits – CPU throttling – RAM usage limits while active/idle – Network connection interval – SchedulingInterval (~1 hour) ● User controls – suspend/resume all computation – suspend/resume/attach/detach project – suspend/resume/abort job

Inputs: static job parameters ● Each job has: – deadline – estimate of FP ops – limits on disk/RAM/FP ops ● Redundant computing and deadlines: – if a job is not reported by its deadline, it’s increasingly likely that it will get no credit, and have no value to the project.

Inputs: dynamic job parameters ● Checkpointing – requested periodically by BOINC client – reported by application ● Fraction done – reported periodically by application

Goals of local scheduling policies ● Maximize rate of granted credit – utilize CPU – avoid missed deadlines ● Enforce resource shares over long term – don’t count long unavailable periods ● Maximize project variety – avoid running 1 project for weeks

Scheduling scenarios ConnInterval:12 hrs MaxCPUS:1 ProjectsP1 P2P3 Resource share Job length100 hrs6 hrs3 hrs Job deadline200 hrs12 hrs5 hrs

Early scheduling policies ● CPU scheduling: round-robin, weighted by resource share ● Work fetch: maintain at least ConnInterval work, proportional to resource share, at least one job per project ● This fail disastrously for the given scenario (all deadlines missed) ● Earliest deadline first: a little better, but most deadlines missed

Current scheduling policies ● Job completion estimation – maintain “duration correction factor” estimate estimate: FA + (1-F)B where F is fraction done, A is estimate based on CPU time and F, B is estimate based on FP count and benchmarks ● Debt (per project) – short-term: used for CPU scheduling – long-term: used for work fetch

Round-robin simulation ● Computes: deadline misses; per-project CPU shortfall; total CPU shortfall ● Example: 2 CPUs; project A has share 2 and jobs A1, A2; project B has share 1 and job B1

CPU scheduling policy ● Do round-robin simulation ● EDF among projects with deadline misses ● weighted round-robin among others ● avoid preempting jobs that haven’t checkpointed recently

Work fetch policy ● A project is “overworked” is LongTermDebt(P) < -SchedulingInterval (this happens if P’s jobs have tight deadlines and are usually run in EDF mode) ● Policy: find a project P that maximizes LongTermDebt(P) + Shortfall(P) and is not overworked, and has no deadline misses ● Ask it for max(Shortfall(P), TotalShortfall/share) work

Memory-aware scheduling ● Periodically measure working-set size of running applications, maintain smoothed average ● Run only a set of jobs that fits in allowed RAM

Future work ● More intelligence in server – scheduler RPC includes list of queued jobs, including deadlines and fraction done – scheduler runs EDF simulation to see if each prospective job will jeopardize deadlines ● Maintain better statistical info on – job run-times – availability, network connectivity info ● Develop simulator to evaluate policies