P. (Saday) Sadayappan Ohio State University

Job Scheduling P. (Saday) Sadayappan Ohio State University

Problem Statement
Given a stream of parallel jobs and a set of computing resources, determine when and where to execute each job.
In the form in which the job scheduling problem is addressed at most supercomputer centers:
  Homogeneous set of processors
  Each job asks for a specific, fixed number of processors

Job Scheduling Today
Earliest job schedulers (Intel iPSC) used a simple FCFS strategy; low utilization (50%).
Back-filling was implemented at Argonne:
  Give an earliest-possible reservation to the job at the head of the queue, but allow a later-arriving job to bypass it if the reservation is not violated.
  Utilization improves to ~90%.
Used at most production facilities today.
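The back-filling rule above can be sketched as a small simulation. This is a minimal illustration, not the Argonne scheduler: it assumes each job's run time is known exactly in advance, and the names `Job`, `simulate`, `shadow`, and `extra` are hypothetical. The head of the queue gets a reservation at the earliest time enough processors will be free (`shadow`); a later job may start now only if it finishes by then or uses processors the head job will not need (`extra`).

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    arrival: float
    procs: int
    runtime: float  # assumption: known exactly in advance


def simulate(jobs, total_procs, backfill=True):
    """Return {job name: start time} under FCFS, optionally with back-filling."""
    for job in jobs:
        if job.procs > total_procs:
            raise ValueError(f"{job.name} asks for more processors than exist")
    pending = sorted(jobs, key=lambda j: j.arrival)
    queue = []    # waiting jobs, in arrival (FCFS) order
    running = []  # min-heap of (end_time, procs)
    free, now, nxt, start = total_procs, 0.0, 0, {}

    def launch(job):
        nonlocal free
        queue.remove(job)
        free -= job.procs
        heapq.heappush(running, (now + job.runtime, job.procs))
        start[job.name] = now

    while nxt < len(pending) or queue:
        # Release processors of finished jobs, then admit new arrivals.
        while running and running[0][0] <= now:
            free += heapq.heappop(running)[1]
        while nxt < len(pending) and pending[nxt].arrival <= now:
            queue.append(pending[nxt])
            nxt += 1
        # Plain FCFS: start jobs from the head while they fit.
        while queue and queue[0].procs <= free:
            launch(queue[0])
        if backfill and queue:
            # Reservation for the head job: earliest time enough processors free up.
            head, avail = queue[0], free
            for end, procs in sorted(running):
                avail += procs
                if avail >= head.procs:
                    shadow = end
                    break
            extra = avail - head.procs  # processors the head job will not need
            for job in queue[1:]:
                ends_in_time = now + job.runtime <= shadow
                if job.procs <= free and (ends_in_time or job.procs <= extra):
                    launch(job)
                    if not ends_in_time:
                        extra -= job.procs
        # Advance the clock to the next completion or arrival.
        events = [end for end, _ in running[:1]]
        if nxt < len(pending):
            events.append(pending[nxt].arrival)
        if events:
            now = min(events)
    return start
```

With three made-up jobs on 4 processors, `A` (3 processors, 10 time units, arrives at 0), `B` (4 processors, 5 units, arrives at 1), and `C` (1 processor, 5 units, arrives at 2), plain FCFS holds `C` behind `B`, while back-filling lets `C` run on the idle processor without delaying `B`.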

Can Performance be Improved?
Metrics:
  System metric: Utilization
  User metrics: Response time (wait time + run time), Slowdown (response time / run time)
Over a hundred papers published:
  Focus mainly on improving user metrics: much greater potential for improvement there than in utilization
Question: How important is it to squeeze an additional 5-10% utilization out of a system that is already achieving over 85% utilization?
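The metric definitions above translate directly into code. The sketch below follows the slide for response time and slowdown; the utilization formula (busy processor-time divided by available processor-time over a span) is a common definition assumed here, not one stated in the slides, and the function names are hypothetical.

```python
def response_time(wait, run):
    """Response time = wait time + run time."""
    return wait + run


def slowdown(wait, run):
    """Slowdown = response time / run time (1.0 means the job never waited)."""
    return response_time(wait, run) / run


def utilization(jobs, total_procs, span):
    """Fraction of processor-time in use over `span` time units.

    jobs: iterable of (procs, run) pairs for jobs executed within the span.
    Assumed definition; the slides do not spell one out.
    """
    busy = sum(procs * run for procs, run in jobs)
    return busy / (total_procs * span)
```

For example, a job that waits 2 hours and runs for 1 hour has a response time of 3 hours and a slowdown of 3.0, which shows why short jobs dominate slowdown averages.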

Improving Response Time
Question: How important is it to evaluate alternatives to standard back-fill scheduling, with a goal of improved user response time? Many studies have reported simulation results showing significant improvement in slowdown or response time with new schemes, but most production schedulers simply use aggressive back-fill. Why?

Possible Reasons for Non-Adoption
Academic studies do not model specific policy issues of a center, e.g. “good citizen rules,” multiple queues, etc.
Most results are based on job log traces at Feitelson’s archive, with many logs from academic centers exhibiting low system utilization (< 70%).
Most studies report overall averages over the entire trace, which is insufficient to assess the impact of a change:
  E.g., using a Shortest-Job-First queue policy instead of the usual FCFS policy improves overall average slowdown by a factor of 4, but increases response time for 24-hour jobs to 50 hours instead of 26 hours.
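A toy calculation shows how an overall average can hide the cost to one job class. The sketch below is deliberately simplified and does not reproduce the figures quoted above: it assumes a single machine that runs one job at a time, all jobs present at time 0, and made-up run times (one 24-hour job queued ahead of four 1-hour jobs). The function names are hypothetical.

```python
def responses(runtimes, order):
    """Response time of each job when run back to back in `order`.

    All jobs are assumed to arrive at time 0, so response time equals
    completion time.
    """
    clock = 0.0
    result = [0.0] * len(runtimes)
    for idx in order:
        clock += runtimes[idx]
        result[idx] = clock
    return result


def average_slowdown(runtimes, order):
    resp = responses(runtimes, order)
    return sum(r / t for r, t in zip(resp, runtimes)) / len(runtimes)


runtimes = [24, 1, 1, 1, 1]  # hours; made-up workload
fcfs = list(range(len(runtimes)))
sjf = sorted(fcfs, key=lambda i: runtimes[i])  # stable sort keeps ties in FCFS order

fcfs_avg = average_slowdown(runtimes, fcfs)
sjf_avg = average_slowdown(runtimes, sjf)
long_job_fcfs = responses(runtimes, fcfs)[0]
long_job_sjf = responses(runtimes, sjf)[0]
```

In this toy case the average slowdown drops from 21.4 under FCFS to about 2.23 under Shortest-Job-First, while the 24-hour job's response time rises from 24 to 28 hours. Reporting per-class numbers alongside the overall average exposes that trade-off.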

QoS for Job Scheduling
Job schedulers do not provide QoS:
  No response time guarantees
  No equitable way of offering different service for urgent versus non-urgent jobs
Technical and accounting issues:
  Develop job schedulers that can do deadline-based scheduling
  Develop accounting models to charge based on urgency of job: Charge = f1(resource-usage) + f2(wait-time-limit)
Question: How desirable is it to develop job schedulers with QoS functionality?
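One possible shape for the charging model Charge = f1(resource-usage) + f2(wait-time-limit) is sketched below. The slides leave f1 and f2 unspecified, so everything here is an assumption made for illustration: f1 is taken as linear in processor-hours, f2 as a premium that shrinks as the user accepts a longer wait-time limit, and the rates and function names are hypothetical.

```python
def f1(procs, hours, rate_per_proc_hour=0.10):
    """Resource-usage term: assumed linear in processor-hours."""
    return rate_per_proc_hour * procs * hours


def f2(wait_limit_hours, urgency_premium=50.0):
    """Urgency term: assumed inversely proportional to the wait-time limit."""
    if wait_limit_hours <= 0:
        raise ValueError("wait-time limit must be positive")
    return urgency_premium / wait_limit_hours


def charge(procs, hours, wait_limit_hours):
    """Charge = f1(resource-usage) + f2(wait-time-limit)."""
    return f1(procs, hours) + f2(wait_limit_hours)


urgent = charge(procs=64, hours=2, wait_limit_hours=1)    # must start within 1 hour
relaxed = charge(procs=64, hours=2, wait_limit_hours=24)  # may wait up to a day
```

Under these assumed functions the same 64-processor, 2-hour job costs more when it demands a 1-hour wait limit than when it tolerates a 24-hour one, which is the kind of urgency-based differentiation the slide asks for.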

Questions?
