Understanding Performance in Operating Systems

Understanding Performance in Operating Systems Andy Wang COP 5611 Advanced Operating Systems

Outline Importance of operating systems performance Major issues in understanding operating systems performance Issues in experiment design

Importance of OS Performance Performance is almost always a key issue in operating systems File system research OS tools for multimedia Practically any OS area Since everyone uses the OS (sometimes heavily), everyone is impacted by its performance A solution that doesn’t perform well isn’t a solution at all

Importance of Understanding OS Performance Great, so we work on improving OS performance How do we tell if we succeeded? Successful research must prove its performance characteristics to a skeptical community

So What? Proper performance evaluation is difficult Knowing what to study is tricky Performance evaluations take a lot of careful work Understanding the results is hard Presenting them effectively is challenging

For Example, An idea: save a portable computer’s battery power by using its wireless card to execute tasks remotely Maybe that’s a good idea, maybe it isn’t How do we tell? Performance experiments to validate the concept
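
To see why measurement is needed, consider a back-of-envelope version of the trade-off. Every number in the sketch below is a hypothetical placeholder; the point is that whether remote execution wins depends entirely on parameters that only experiments can supply.

```python
# Back-of-envelope sketch (illustrative only): all power and timing figures
# below are hypothetical placeholders, not measurements.
LOCAL_CPU_POWER_W = 3.0   # hypothetical: watts drawn while computing locally
LOCAL_RUNTIME_S   = 10.0  # hypothetical: seconds to run the task locally
RADIO_POWER_W     = 1.5   # hypothetical: watts drawn while transmitting/receiving
TRANSFER_TIME_S   = 2.0   # hypothetical: seconds to ship inputs and results
IDLE_POWER_W      = 0.5   # hypothetical: watts drawn while waiting for the server
SERVER_RUNTIME_S  = 4.0   # hypothetical: seconds the remote server needs

local_energy_j = LOCAL_CPU_POWER_W * LOCAL_RUNTIME_S
remote_energy_j = (RADIO_POWER_W * TRANSFER_TIME_S
                   + IDLE_POWER_W * SERVER_RUNTIME_S)

print(f"local:  {local_energy_j:.1f} J")
print(f"remote: {remote_energy_j:.1f} J")
print("remote execution wins" if remote_energy_j < local_energy_j
      else "local execution wins")
```

Change any one of these parameters and the answer flips, which is exactly why the slides that follow ask which experiments to run.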

But What Experiments? What tasks should we check? What should be the conditions of the portable computer? What should be the conditions of the network? What should be the conditions of the server? How do I tell if my result is statistically valid?

Issues in Understanding OS Performance Techniques for understanding OS performance Elements of performance evaluation Common mistakes in performance evaluation Choosing proper performance metrics Workload design/selection Monitors Software measurement tools

Techniques for Understanding OS Performance Analytic modeling Simulation Measurement Which technique is right for a given situation?

Analytic Modeling Sometimes relatively quick Within limitations of model, testing alternatives usually easy Mathematical tractability may require simplifications Not everything models well Question of validity of model

Simulation Great flexibility Can capture an arbitrary level of detail Often a tremendous amount of work to write and run Testing a new alternative often requires repeating a lot of work Question of validity of simulation

Experimentation Fewer validity problems Sometimes easy to get started Can be very labor-intensive Often hard to perform measurement Sometimes hard to separate out effects you want to study Sometimes impossible to generate cases you need to study

Elements of Performance Evaluation Performance metrics Workloads Proper measurement technique Proper statistical techniques Minimization of effort Proper data presentation techniques

Performance Metrics The criteria used to evaluate the performance of a system E.g., response time, cache hit ratio, bandwidth delivered, etc. Choosing the proper metrics is key to a real understanding of system performance

Workloads The requests users make on a system If you don’t evaluate with a proper workload, you aren’t measuring what real users will experience Typical workloads - Stream of file system requests Set of jobs performed by users List of URLs submitted to a Web server

Proper Performance Measurement Techniques You need at least two components to measure performance 1. A load generator To apply a workload to the system 2. A monitor To find out what happened
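
A minimal sketch of the two components working together: a load generator that issues requests with a fixed think time, and a monitor that records each latency. handle_request is a hypothetical stand-in for the system under test; all numbers are illustrative.

```python
import time

def handle_request(i):
    time.sleep(0.002)          # pretend the system takes ~2 ms per request

def run(n_requests=50, think_time_s=0.01):
    latencies = []                         # the "monitor": record each latency
    for i in range(n_requests):
        start = time.perf_counter()
        handle_request(i)                  # the "load generator": issue a request
        latencies.append(time.perf_counter() - start)
        time.sleep(think_time_s)
    return latencies

latencies = run()
print(f"{len(latencies)} requests, "
      f"mean latency {sum(latencies) / len(latencies) * 1000:.1f} ms")
```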

Proper Statistical Techniques Computer performance measurements generally not purely deterministic Most performance evaluations weigh the effects of different alternatives How to separate meaningless variations from vital data in measurements? Requires proper statistical techniques

Minimizing Your Work Unless you design carefully, you’ll measure a lot more than you need to A careful design can save you from doing lots of measurements Should identify critical factors And determine the smallest number of experiments that gives a sufficiently accurate answer

Proper Data Presentation Techniques You’ve got pertinent, statistically accurate data that describes your system Now what? How to present it - Honestly Clearly Convincingly

Why Is Performance Analysis Difficult? Because it’s an art - it’s not mechanical You can’t just apply a handful of principles and expect good results You’ve got to understand your system You’ve got to select your measurement techniques and tools properly You’ve got to be careful and honest

Some Common Mistakes in Performance Evaluation No goals Biased goals Unsystematic approach Analysis without understanding Incorrect performance metrics Unrepresentative workload Wrong evaluation technique

More Common Performance Evaluation Mistakes Overlooking important parameters Ignoring significant factors Inappropriate experiment design No analysis Erroneous analysis No sensitivity analysis

Yet More Common Mistakes Ignoring input errors Improper treatment of outliers Assuming static systems Ignoring variability Overly complex analysis Improper presentation of results Ignoring social aspects Omitting assumptions/limitations

Choosing Proper Performance Metrics Three types of common metrics: Time (responsiveness) Processing rate (productivity) Resource consumption (utilization) Can also measure various error parameters

Response Time How quickly does system produce results? Critical for applications such as: Time sharing/interactive systems Real-time systems Parallel computing

Processing Rate How much work is done per unit time? Important for: Determining feasibility of hardware Comparing different configurations Multimedia

Resource Consumption How much does the work cost? Used in: Capacity planning Identifying bottlenecks Also helps to identify the “next” bottleneck
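
A minimal sketch of how all three metric families fall out of the same trace of requests. The timestamps and observation period below are made-up illustrative values.

```python
# Each record is (arrival_time, start_time, finish_time) in seconds.
requests = [
    (0.0, 0.0, 0.4),
    (0.5, 0.5, 1.1),
    (0.9, 1.1, 1.6),
    (1.2, 1.6, 2.0),
]
observation_period = 2.0  # seconds the system was observed (illustrative)

# Time: mean response time (finish - arrival)
response_times = [finish - arrival for arrival, _, finish in requests]
mean_response = sum(response_times) / len(response_times)

# Processing rate: throughput in requests completed per second
throughput = len(requests) / observation_period

# Resource consumption: utilization = busy time / observed time
busy_time = sum(finish - start for _, start, finish in requests)
utilization = busy_time / observation_period

print(f"mean response time: {mean_response:.2f} s")
print(f"throughput:         {throughput:.2f} req/s")
print(f"utilization:        {utilization:.0%}")
```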

Typical Error Metrics Successful service (speed) Incorrect service (reliability) No service (availability)

Characterizing Metrics Usually necessary to summarize Sometimes means are enough Variability is usually critical

Essentials of Statistical Evaluation Choose an appropriate summary Mean, median, and/or mode Report measures of variation Standard deviation, range, etc. Provide confidence intervals (≥95%) Use confidence intervals to compare means
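
A minimal sketch of these steps using only the standard library. The samples are illustrative, and the interval uses the normal approximation (1.96); a t-distribution multiplier would be more accurate for very small samples.

```python
import statistics

# Replicated measurements of one alternative (e.g., response times in ms).
samples = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4, 12.2, 12.0]

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)         # sample standard deviation
stderr = stdev / len(samples) ** 0.5      # standard error of the mean

half_width = 1.96 * stderr                # normal-approximation 95% interval
ci = (mean - half_width, mean + half_width)

print(f"mean = {mean:.2f}, stdev = {stdev:.2f}")
print(f"95% CI for the mean: [{ci[0]:.2f}, {ci[1]:.2f}]")

# To compare two alternatives, compute an interval for each: if the
# intervals do not overlap, the difference in means is unlikely to be noise.
```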

Choosing What to Measure Pick metrics based on: Completeness (Non-)redundancy Variability

Designing Workloads What is a workload? Synthetic workloads Real-World benchmarks Application benchmarks “Standard” benchmarks Exercisers and drivers

What is a Workload? A workload is anything a computer is asked to do Test workload: any workload used to analyze performance Real workload: any workload observed during normal operations Synthetic workload: any workload created for controlled testing

Real Workloads They represent reality Uncontrolled Can’t be repeated Can’t be described simply Difficult to analyze Nevertheless, often useful for “final analysis” papers

Synthetic Workloads Controllable Repeatable Portable to other systems Easily modified Can never be sure real world will be the same

What Are Synthetic Workloads? Complete programs designed specifically for measurement May do real or “fake” work May be adjustable (parameterized) Two major classes: Benchmarks Exercisers
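
A minimal sketch of a parameterized synthetic workload generator. The operation mix, sizes, and names are illustrative; the fixed seed is what makes the workload repeatable across runs and systems.

```python
import random

def synthetic_workload(n_requests, read_fraction=0.8,
                       max_size=64 * 1024, seed=42):
    """Yield a repeatable stream of file-system-like requests."""
    rng = random.Random(seed)              # fixed seed => repeatable workload
    for _ in range(n_requests):
        op = "read" if rng.random() < read_fraction else "write"
        size = rng.randint(1, max_size)    # request size in bytes
        offset = rng.randint(0, 2**20)     # offset within a 1 MiB file
        yield (op, offset, size)

for request in synthetic_workload(5):
    print(request)
```

Changing read_fraction or max_size gives a different but equally repeatable workload, which is exactly the controllability that real workloads lack.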

Real-World Benchmarks Pick a representative application and sample data Run it on system to be tested Modified Andrew Benchmark, MAB, is a real-world benchmark Easy to do, accurate for that sample application and data Doesn’t consider other applications and data

Application Benchmarks Variation on real-world benchmarks Choose most important subset of functions Write benchmark to test those functions Tests what computer will be used for Need to be sure it captures all important characteristics

“Standard” Benchmarks Often need to compare general-purpose systems for general-purpose use Should I buy a Compaq or a Dell PC? Tougher: Mac or PC? Need an easy, comprehensive answer People writing articles often need to compare tens of machines

“Standard” Benchmarks (cont’d) Often need comparisons over time How much faster is this year’s Pentium Pro than last year’s Pentium? Writing new benchmark undesirable Could be buggy or not representative Want to compare many people’s results

Exercisers and Drivers For I/O, network, non-CPU measurements Generate a workload, feed to internal or external measured system I/O on local OS Network Sometimes uses dedicated system, interface hardware

Advantages and Disadvantages of Exercisers Easy to develop, port Incorporates measurement Easy to parameterize, adjust High cost if external Often too small compared to real workloads
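
A minimal sketch of an I/O exerciser: write and re-read a scratch file and report apparent throughput. It assumes an fsync is enough to push writes to the device; a real exerciser would also control caching, file sizes, and access patterns far more carefully.

```python
import os
import tempfile
import time

BLOCK = b"x" * (1 << 20)      # 1 MiB block
N_BLOCKS = 64                 # 64 MiB total (illustrative size)

with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    start = time.perf_counter()
    for _ in range(N_BLOCKS):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())      # force data to the device so we time real I/O
    write_secs = time.perf_counter() - start

start = time.perf_counter()
with open(path, "rb") as f:
    while f.read(1 << 20):    # read the file back in 1 MiB chunks
        pass
read_secs = time.perf_counter() - start
os.unlink(path)

print(f"write: {N_BLOCKS / write_secs:.1f} MiB/s")
print(f"read:  {N_BLOCKS / read_secs:.1f} MiB/s")
```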

Workload Selection Services exercised Completeness Level of detail Representativeness Timeliness Other considerations

Services Exercised What services does system actually use? Speeding up response to keystrokes won’t help a file server What metrics measure these services?

Completeness Computer systems are complex Effect of interactions hard to predict So must be sure to test entire system Important to understand balance between components

Level of Detail Detail trades off accuracy vs. cost Highest detail is complete trace Lowest is one request, usually the most common request Intermediate approach: weight by frequency

Representativeness Obviously, workload should represent desired application Again, accuracy and cost trade off Need to understand whether detail matters

Timeliness Usage patterns change over time File size grows to match disk size If using “old” workloads, must be sure user behavior hasn’t changed Even worse, behavior may change after test, as result of installing new system “Latent demand” phenomenon

Other Considerations Loading levels: full capacity, beyond capacity, actual usage Repeatability of workload

Monitors A monitor is a tool used to observe system activity Proper use of monitors is key to performance analysis Also useful for other system observation purposes

Event-Driven Vs. Sampling Monitors Event-driven monitors notice every time a particular type of event occurs Ideal for rare events Require low per-invocation overheads Sampling monitors check the state of the system periodically Good for frequent events Can afford higher overheads
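
A minimal sketch of a sampling monitor: instead of tracing every event, it records a state indicator at a fixed interval. This particular probe reads /proc/loadavg, so it is Linux-specific; the interval and sample count are illustrative.

```python
import time

def sample_load(interval_s=1.0, n_samples=5):
    """Periodically sample the 1-minute load average."""
    samples = []
    for _ in range(n_samples):
        with open("/proc/loadavg") as f:
            one_minute_load = float(f.read().split()[0])
        samples.append((time.time(), one_minute_load))
        time.sleep(interval_s)
    return samples

for timestamp, load in sample_load():
    print(f"{timestamp:.0f}  load={load:.2f}")
```

An event-driven monitor would instead hook the event itself (a trap or callback), which is why its per-invocation cost matters so much more.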

On-Line Vs. Batch Monitors On-line monitors can display their information continuously Or, at least, frequently Batch monitors save it for later Usually using separate analysis procedures

Issues in Monitor Design Activation mechanism Buffer issues Data compression/analysis Priority issues Abnormal events monitoring Distributed systems

Activation Mechanism When do you collect the data? Several possibilities: When an interesting event occurs, trap to data collection routine Analyze every step taken by system Go to data collection routine when timer expires

Buffer Issues Buffer size should be big enough to avoid frequent disk writes But small enough to make disk writes cheap Use at least two buffers, typically One to fill up, one to record Must think about buffer overflow
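
A minimal sketch of the two-buffer idea: events go into the active buffer, and when it fills, the buffers swap roles and the full one is flushed. The buffer size and event format are illustrative; a real monitor would flush in a separate thread and handle a final flush at shutdown.

```python
BUFFER_SIZE = 4  # tiny for illustration; real monitors use much larger buffers

class DoubleBufferedLog:
    def __init__(self, path):
        self.path = path
        self.active, self.spare = [], []

    def record(self, event):
        self.active.append(event)
        if len(self.active) >= BUFFER_SIZE:
            # Swap buffers so recording can continue, then flush the full one.
            self.active, self.spare = self.spare, self.active
            self._flush(self.spare)

    def _flush(self, buf):
        with open(self.path, "a") as f:
            f.writelines(f"{e}\n" for e in buf)
        buf.clear()

log = DoubleBufferedLog("trace.log")
for i in range(10):
    log.record(f"event {i}")
# Note: the last partial buffer is still in memory here, which is exactly
# the overflow/shutdown case the slide warns about.
```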

Data Compression or Analysis Data can be literally compressed Or can be reduced to a summary form Both methods save space But at the cost of extra overhead Sometimes can use idle time for this But idle time might be better spent dumping data to disk

Priority of Monitor How high a priority should the monitor’s operations have? Again, trading off performance impact against timely and complete data gathering Not always a simple question

Monitoring Abnormal Events Often, knowing about failures and errors more important than knowing about normal operation Sometimes requires special attention System may not be operating very well at the time of the failure

Monitoring Distributed Systems Monitoring a distributed system is not dissimilar to designing a distributed system Must deal with: Distributed state Unsynchronized clocks Partial failures

Tools For Software Measurement Code instrumentation Tracing packages System-provided metrics and utilities Profiling

Code Instrumentation Adding monitoring code to the system under study Usually most direct way to gather data Complete flexibility Strong control over costs of monitoring Requires access to the source Requires strong knowledge of code Strong potential to affect performance

Typical Types of Instrumentation Counters Cheap and fast But low level of detail Logs More detail But more costly Require occasional dumping or digesting Timers Measure durations between events
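
A minimal sketch combining a counter and a timer: a decorator that counts calls to a function and accumulates the time spent in it. The decorated function is a hypothetical stand-in for code in the system under study.

```python
import time
from collections import defaultdict
from functools import wraps

call_counts = defaultdict(int)      # counter: how many times each function ran
elapsed_time = defaultdict(float)   # timer: total seconds spent in each function

def instrumented(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            call_counts[func.__name__] += 1
            elapsed_time[func.__name__] += time.perf_counter() - start
    return wrapper

@instrumented
def lookup(key):
    time.sleep(0.001)               # stand-in for real work
    return key

for k in range(100):
    lookup(k)
print(call_counts["lookup"], f"calls, {elapsed_time['lookup']:.3f} s total")
```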

Tracing Packages Allow dynamic monitoring of code that doesn’t have built-in monitors Akin to debuggers Allows arbitrary insertion of code No recompilation required Tremendous flexibility No overhead when not in use Somewhat higher overheads while tracing is active Effective use requires access to source

System-Provided Metrics and Utilities Many operating systems provide users access to some metrics Most operating systems also keep some form of accounting logs Lots of information can be gathered this way

Profiling Many compilers provide easy facilities for profiling code Easy to use Low impact on system Requires recompilation Provides very limited information
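
The slide has compiled-language profilers in mind (e.g., gcc’s -pg output read with gprof, which is where the recompilation requirement comes from). The same idea is available without recompilation in Python’s standard library; a minimal sketch with an illustrative workload function:

```python
import cProfile
import pstats

def workload():
    total = 0
    for i in range(200_000):
        total += i * i
    return total

# Record per-function call counts and times, then print the top entries.
cProfile.run("workload()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(5)   # top 5 functions by time
```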

Introduction To Experiment Design You know your metrics You know your factors You’ve got your instrumentation and test loads Now what?

Goals in Experiment Design Obtain maximum information with minimum work Typically meaning minimum number of experiments More experiments aren’t better if you have to perform them Well-designed experiments are also easier to analyze

Experimental Replications A run of the experiment with a particular set of levels and other inputs is a replication Often, you need to do multiple replications with a single set of levels and other inputs For statistical validation

Interacting Factors Some factors have effects completely independent of each other Double the factor’s level, halve the response, regardless of other factors But the effects of some factors depend on the values of other factors Interacting factors Presence of interacting factors complicates experimental design

Basic Problem in Designing Experiments Your chosen factors may or may not interact How can you design an experiment that captures the full range of factor levels? With a minimum amount of work
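
One standard starting point is a full-factorial design: run every combination of factor levels, so every possible interaction can be observed, at the cost of a number of runs that grows exponentially with the number of factors (fractional designs trade some of that coverage for fewer runs). A minimal sketch with illustrative factor names and levels:

```python
from itertools import product

factors = {
    "cache_size_mb": [64, 256],
    "scheduler":     ["fifo", "round_robin"],
    "workload":      ["read_heavy", "write_heavy"],
}

# Every combination of levels: a full 2^k design for k two-level factors.
experiments = [dict(zip(factors, levels))
               for levels in product(*factors.values())]

for i, config in enumerate(experiments, 1):
    print(f"run {i}: {config}")
print(f"{len(experiments)} runs for a full 2^{len(factors)} design")
```

Each configuration would then be replicated several times, as the earlier slide on replications describes, before any statistics are computed.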

Common Mistakes in Experimentation Ignoring experimental error Uncontrolled parameters Not isolating effects of different factors One-factor-at-a-time experiment designs Interactions ignored Designs require too many experiments