The Forgotten Factor: FACTS on Performance Evaluation and its Dependence on Workloads Dror Feitelson Hebrew University.


Performance Evaluation
- In system design
  - Selection of algorithms
  - Setting parameter values
- In procurement decisions
  - Value for money
  - Meeting usage goals
- For capacity planning

The Good Old Days… The skies were blue The simulation results were conclusive Our scheme was better than theirs Feitelson & Jette, JSSPP 1997

But in their papers, Their scheme was better than ours!

How could they be so wrong?

Performance evaluation depends on:
- The system’s design (what we teach in algorithms and data structures)
- Its implementation (what we teach in programming courses)
- The workload to which it is subjected
- The metric used in the evaluation
- Interactions between these factors


Outline for Today
- Three examples of how workloads affect performance evaluation
- Workload modeling
- Research agenda
All in the context of parallel job scheduling.

Example #1 Gang Scheduling and Job Size Distribution

Gang What?!?
- Time slicing parallel jobs with coordinated context switching
- The Ousterhout matrix
- Optimization: alternative scheduling
Ousterhout, ICDCS 1982
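The matrix idea can be made concrete with a toy sketch (hypothetical job sizes and a greedy first-fit packing, not Ousterhout's algorithm): rows are time slots, columns are processors, and all processors switch rows together, so every job always runs with all of its threads co-scheduled.

```python
# Toy Ousterhout matrix: rows are time slots, columns are processors.
# All processors in a row context-switch together, so each parallel job
# is always gang-scheduled. Job names and sizes are illustrative.

NUM_PROCS = 8

def build_matrix(jobs, num_procs=NUM_PROCS):
    """Greedy first-fit placement of (name, size) jobs into time slots."""
    matrix = []  # each row: one entry per processor (job name or None)
    for name, size in jobs:
        for row in matrix:
            if row.count(None) >= size:          # enough free processors
                placed = 0
                for i, cell in enumerate(row):
                    if cell is None and placed < size:
                        row[i] = name
                        placed += 1
                break
        else:                                    # no room: open a new time slot
            row = [name] * size + [None] * (num_procs - size)
            matrix.append(row)
    return matrix

jobs = [("A", 4), ("B", 4), ("C", 8), ("D", 2)]
for slot, row in enumerate(build_matrix(jobs)):
    print(slot, row)
```

Note how slot 2 is mostly idle: holes like this are what the alternative-scheduling optimization and careful packing try to exploit.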

Packing Jobs Use a buddy system for allocating processors Feitelson & Rudolph, Computer 1990
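The buddy idea can be sketched in a few lines (a simplified toy, not the Feitelson & Rudolph implementation): each request is rounded up to a power of two and carved out of recursively halved blocks, so allocations always land on aligned, predefined processor groups.

```python
# Minimal buddy allocator over processors (assumed machine size of 16).
# Illustrative sketch only: requests are rounded up to the next power
# of two and satisfied from recursively split blocks; freed blocks
# merge with their "buddy" when possible.

def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

class BuddyAllocator:
    def __init__(self, size=16):
        self.free = {size: [0]}   # block size -> list of start offsets

    def alloc(self, n):
        want = next_pow2(n)
        # smallest free block that can hold the request
        sizes = sorted(s for s, lst in self.free.items() if lst and s >= want)
        if not sizes:
            return None
        size = sizes[0]
        start = self.free[size].pop()
        while size > want:        # split, keeping the upper buddy free
            size //= 2
            self.free.setdefault(size, []).append(start + size)
        return (start, want)

    def release(self, start, size):
        while True:
            buddy = start ^ size  # buddy address differs in exactly one bit
            lst = self.free.setdefault(size, [])
            if buddy in lst:      # coalesce with a free buddy
                lst.remove(buddy)
                start = min(start, buddy)
                size *= 2
            else:
                lst.append(start)
                return

a = BuddyAllocator(16)
print(a.alloc(3))   # rounded up to 4 processors -> (0, 4)
print(a.alloc(5))   # rounded up to 8 processors -> (8, 8)
a.release(0, 4)
print(a.alloc(2))   # reuses part of the freed block -> (0, 2)
```

The rounding is exactly where the internal fragmentation on the next slide comes from, while the aligned groups are what make alternative scheduling easier.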


The Question:
- The buddy system leads to internal fragmentation
- But it also improves the chances of alternative scheduling, because processors are allocated in predefined groups
- Which effect dominates the other?

The Answer (part 1): Feitelson & Rudolph, JPDC 1996

Proof of Utilization Bound A uniform distribution:

Proof of Utilization Bound Round up to next power of 2:

Proof of Utilization Bound Recover some fragmented space using selective disabling:
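The flavor of the round-up step can be checked numerically (a back-of-the-envelope sketch, not the paper's proof): with job sizes uniform on 1..P and each size rounded up to the next power of two, the average fraction of allocated processors actually used comes out around three quarters, before any space is recovered by selective disabling.

```python
# Back-of-the-envelope fragmentation check for the uniform case:
# how much of the allocated space is actually used on average when
# every job size is rounded up to the next power of two?

def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

P = 128                                  # illustrative machine size
sizes = range(1, P + 1)                  # uniform job-size distribution
used = sum(sizes)
allocated = sum(next_pow2(s) for s in sizes)
print(f"average utilization of allocated processors: {used / allocated:.3f}")
```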

The Answer (part 2):

- Many small jobs
- Many sequential jobs
- Many power-of-two jobs
- Practically no jobs use the full machine
Conclusion: the buddy system should work well

Verification Feitelson, JSSPP 1996

Example #2 Parallel Job Scheduling and Job Scaling

Variable Partitioning
- Each job gets a dedicated partition for the duration of its execution
- Resembles 2D bin packing
- Packing large jobs first should lead to better performance
- But what about the correlation of size and runtime?

“Scan” Algorithm
- Keep jobs in separate queues according to size (sizes are powers of 2)
- Serve the queues round-robin, scheduling all jobs from each queue (they pack perfectly)
- Assuming the constant-work model, large jobs only block the machine for a short time
Krueger et al., IEEE TPDS 1994
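One pass of the scan can be sketched as follows (a simplification: the real algorithm cycles over the queues continuously and handles new arrivals, while this toy orders a fixed batch):

```python
# Sketch of one "Scan" pass: jobs are queued by power-of-two size
# class, and the size classes are served in turn, draining each queue
# before moving on, since same-size jobs pack the machine perfectly.

from collections import defaultdict, deque

def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def scan_order(jobs):
    """Return the order in which a batch of (name, size) jobs runs."""
    queues = defaultdict(deque)
    for name, size in jobs:
        queues[next_pow2(size)].append(name)
    order = []
    for size in sorted(queues):          # one full scan over size classes
        while queues[size]:
            order.append(queues[size].popleft())
    return order

print(scan_order([("A", 3), ("B", 8), ("C", 4), ("D", 1), ("E", 7)]))
```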

Scaling Models
- Constant work
  - Parallelism for speedup: Amdahl’s Law
  - Large first → SJF
- Constant time
  - Size and runtime are uncorrelated
- Memory bound
  - Large first → LJF
  - Full-size jobs lead to blockout
Worley, SIAM JSSC 1990
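The three models can be contrasted with toy runtime formulas (arbitrary units; the serial fraction and the memory-bound growth law are illustrative choices, not Worley's):

```python
# How one job's runtime varies with partition size n under the three
# scaling models. Parameters are illustrative.

def constant_work(n, t1=100.0, serial_frac=0.05):
    # Fixed total work; Amdahl's Law, so bigger partitions finish sooner.
    return t1 * (serial_frac + (1 - serial_frac) / n)

def constant_time(n, t=100.0):
    # Size and runtime uncorrelated: runtime independent of n.
    return t

def memory_bound(n, t1=100.0):
    # Problem grows to fill per-node memory, so bigger jobs run longer.
    # sqrt growth is just a toy choice of growth law.
    return t1 * n ** 0.5

for n in (1, 4, 16, 64):
    print(f"n={n:3d}  work={constant_work(n):7.2f}  "
          f"time={constant_time(n):6.1f}  mem={memory_bound(n):7.1f}")
```

Under constant work, the largest jobs are the shortest (hence large-first approximates SJF); under the memory-bound model they are the longest (large-first approximates LJF), which is why the choice of model flips the evaluation.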

The Data Data: SDSC Paragon, 1995/6


Conclusion
- Parallelism is used for better results, not for faster results
- The constant-work model is unrealistic
- The memory-bound model is reasonable
- The Scan algorithm will probably not perform well in practice

Example #3 Backfilling and User Runtime Estimation

Backfilling
- Variable partitioning can suffer from external fragmentation
- Backfilling optimization: move jobs forward to fill holes in the schedule
- Requires knowledge of expected job runtimes

Variants
- EASY backfilling: make a reservation for the first queued job
- Conservative backfilling: make reservations for all queued jobs
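The difference between the variants boils down to how many reservations a backfill candidate must respect. A minimal sketch of that test (greatly simplified: a single static free-processor count, whereas real schedulers track availability over time):

```python
# A candidate may be backfilled only if it does not delay any job that
# already holds a reservation. EASY passes one reservation (the head of
# the queue); conservative passes one per queued job.

def can_backfill(candidate, free_now, reservations, now):
    """candidate is (size, estimated_runtime); reservations is a list
    of (start_time, size) commitments that must not be delayed."""
    size, est = candidate
    if size > free_now:
        return False
    for start, reserved in reservations:
        # would the candidate still hold processors that a reserved
        # job needs at its promised start time?
        if now + est > start and free_now - size < reserved:
            return False
    return True

job = (2, 15)                        # needs 2 processors for ~15 time units
easy = [(30, 4)]                     # only the head job is protected
conservative = [(30, 4), (10, 2)]    # every queued job is protected
print(can_backfill(job, 2, easy, 0))          # True: done before t=30
print(can_backfill(job, 2, conservative, 0))  # False: would delay the t=10 job
```

The same candidate backfills under EASY but not under conservative, which is the mechanism behind the conflicting results discussed below.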

User Runtime Estimates
- Lower estimates improve the chance of backfilling, and hence response time
- Estimates that are too low risk having the job killed
- So estimates should be accurate, right?

They Aren’t Mu’alem & Feitelson, IEEE TPDS 2001

Surprising Consequences
- Inaccurate estimates actually lead to improved performance
- Performance evaluation results may depend on the accuracy of runtime estimates
  - Example: EASY vs. conservative
  - Using different workloads
  - And different metrics

EASY vs. Conservative Using CTC SP2 workload

EASY vs. Conservative Using Jann workload model

EASY vs. Conservative Using Feitelson workload model

Conflicting Results Explained
- The Jann model uses accurate runtime estimates
- This leads to a tighter schedule
- EASY is not affected much
- Conservative manages less backfilling of long jobs, because it respects more reservations

Conservative is bad for the long jobs, but good for the short ones whose reservations are respected.

Conflicting Results Explained
- Response time is sensitive to long jobs, which favor EASY
- Slowdown is sensitive to short jobs, which favor conservative
- None of this happens at CTC, because estimates there are so loose that backfilling can occur even under conservative

Verification Run CTC workload with accurate estimates

But What About My Model? It simply does not have such small long jobs

Workload Modeling

No Data
- Innovative, unprecedented systems
  - Wireless
  - Hand-held
- Use an educated guess
  - Self-similarity
  - Heavy tails
  - Zipf distribution
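When guessing, the qualitative difference between a heavy-tailed distribution and a "well-behaved" one is easy to demonstrate (illustrative parameters; `paretovariate` is the standard-library Pareto sampler):

```python
# Sample a heavy-tailed (Pareto) distribution and look at its tail:
# unlike an exponential, a noticeable fraction of samples lands far
# above the mean, which is what makes such guesses matter.

import random

random.seed(1)

alpha = 2.5  # shape parameter; smaller alpha = heavier tail (toy choice)
samples = [random.paretovariate(alpha) for _ in range(100_000)]
mean = sum(samples) / len(samples)
frac_over_10x = sum(s > 10 * mean for s in samples) / len(samples)
print(f"sample mean: {mean:.2f} (theory: {alpha / (alpha - 1):.2f})")
print(f"fraction of samples above 10x the mean: {frac_over_10x:.5f}")
```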

Serendipitous Data
- Data may be collected for various reasons
  - Accounting logs
  - Audit logs
  - Debugging logs
  - Just-so logs
- Can lead to a wealth of information

NASA Ames iPSC/860 log: jobs from Oct–Dec 1993
Columns: user, job, nodes, runtime, date, time
Sample entries (all dated …/10/93, 10:13–10:31) from user4, user41, user42, and sysadmin, running cmd, nqs, and pwd jobs; the node counts and runtimes were lost in transcription
Feitelson & Nitzberg, JSSPP 1995

Distribution of Job Sizes

Distribution of Resource Use

Degree of Multiprogramming

System Utilization

Job Arrivals

Arriving Job Sizes

Distribution of Interarrival Times

Distribution of Runtimes

Job Scaling

User Activity

Repeated Execution

Application Moldability

Distribution of Run Lengths

Predictability in Repeated Runs

Research Agenda

The Needs
- New systems tend to be more complex
- Differences tend to be finer
- Evaluations require more detailed data
- Getting more data requires more work
- Important areas:
  - Internal structure of applications
  - User behavior

Generic Application Model
Iterations of:
- Compute: granularity, memory working set / locality
- I/O: interprocess locality
- Communicate: pattern, volume
Option of phases with different patterns of iterations
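One way to encode such a model is as a sequence of phases, each repeating an iteration template with its own compute, I/O, and communication parameters (all field names and numbers here are illustrative, not from the talk):

```python
# Sketch of the generic application model: phases of repeated
# compute / I/O / communicate iterations, each with its own parameters.

from dataclasses import dataclass

@dataclass
class Iteration:
    compute_granularity_ms: float   # work between synchronizations
    working_set_mb: float           # memory touched per iteration
    io_mb: float                    # data read/written per iteration
    comm_pattern: str               # e.g. "nearest-neighbor", "all-to-all"
    comm_volume_mb: float           # data communicated per iteration

@dataclass
class Phase:
    iterations: int
    template: Iteration

def total_comm_volume(phases):
    """Aggregate one attribute across the whole application."""
    return sum(p.iterations * p.template.comm_volume_mb for p in phases)

app = [
    Phase(100, Iteration(5.0, 64.0, 0.0, "nearest-neighbor", 0.5)),
    Phase(10, Iteration(50.0, 256.0, 8.0, "all-to-all", 4.0)),
]
print(total_comm_volume(app))   # 100*0.5 + 10*4.0 = 90.0
```

A scheduler simulation could then consume such descriptions instead of opaque (size, runtime) pairs, exposing the multi-resource interactions the next slide discusses.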

Consequences
- Model the interaction of the application with the system
  - Support for the communication pattern
  - Availability of memory
- Application attributes depend on the system
- Effect of multi-resource schedulers

Missing Data
- There has been some work on the characterization of specific applications
- There has been no work on the distribution of application types in a complete workload
  - Distribution of granularities
  - Distribution of working-set sizes
  - Distribution of communication patterns

Effect of Users
- Workload is generated by users
- Human users do not behave like a random sampling process
  - Feedback based on system performance
  - Repetitive working patterns

Feedback
- The user population is finite
- Users back off when performance is inadequate
- This negative feedback leads to better system stability
- Need to explicitly model this behavior
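A toy closed-loop model captures this feedback (illustrative parameters): with a finite population where each user waits for the previous job to finish and then "thinks" before submitting the next, the offered load adapts to system performance instead of arriving as an open stream.

```python
# Closed-loop user model: each of a finite set of users cycles through
# think time + service time (queueing ignored for simplicity), so the
# submission rate falls as the system slows down -- negative feedback.

def closed_loop_throughput(users, service_time, think_time):
    """Jobs submitted per unit time by a finite, feedback-driven population."""
    return users / (think_time + service_time)

for service in (1.0, 10.0, 100.0):
    rate = closed_loop_throughput(10, service, 60.0)
    print(f"service={service:6.1f}  offered load={rate:.3f} jobs/unit time")
```

An open arrival process would keep submitting at the same rate regardless of service time and drive the simulated system into overload, which is exactly the behavior real users do not exhibit.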

Locality of Sampling
- Users display different levels of activity at different times
- At any given time, only a small subset of users is active
- These users repeatedly do the same thing
- The workload observed by the system is not a random sample from the long-term distribution

Final Words…

We like to think that we design systems based on solid foundations…

But beware: the foundations might be unfounded assumptions!

Computer Systems are Complex
We should have more “science” in computer science:
- Run experiments under different conditions
- Make measurements and observations
- Make predictions and verify them

Acknowledgements
- Students: Ahuva Mu’alem, David Talby, Uri Lublin
- Larry Rudolph / MIT
- Data in the Parallel Workloads Archive:
  - Joefon Jann / IBM
  - CTC SP2 log
  - SDSC Paragon log
  - SDSC SP2 log
  - NASA iPSC/860 log