Test Loads Andy Wang CIS 5930-03 Computer Systems Performance Analysis.


Test Loads Andy Wang CIS 5930 Computer Systems Performance Analysis

2 Applying Test Loads to Systems Designing test loads Tools for applying test loads

3 Test Load Design Most experiments require applying test loads to system General characteristics of test loads already discussed How do we design test loads?

4 Types of Test Loads Real users Traces Load-generation programs

5 Loads Caused by Real Users Put real people in front of your system Two choices: –Have them run pre-arranged set of tasks –Have them do what they’d normally do Always a difficult approach –Labor-intensive –Impossible to reproduce given load –Load is subject to many external influences But highly realistic

6 Traces Collect set of commands/accesses issued to system under test (or similar system) Replay against your system Some traces of common activities available from others (e.g., file accesses) –But often don’t contain everything you need

7 Issues in Using Traces May be hard to alter or extend Accuracy of trace may depend on behavior of system –If it takes twice as long in your system as in traced system, maybe traced execution would have been different –Only truly representative of traced system and execution

8 Running Traces Need process that reads trace, keeps track of progress, and issues commands from trace when appropriate Process must be reasonably accurate in timing –But must also have little performance impact If trace is large, can’t keep it all in main memory –But be careful of disk overheads

9 Load-Generation Programs Create model for load you want to apply Write program implementing that model Program issues commands & requests synthesized from model –E.g., if model says open file, program builds appropriate open() command

10 Building the Model Tradeoff between ease of creation and use of model vs. its accuracy Base model on everything you can find out about the real system behavior –Which may include examining traces Consider whether model can be memoryless, or requires keeping track of what’s already happened (Markov)
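A Markov model like the one mentioned above can be sketched as a transition table plus a generator. The states and probabilities here are made-up placeholders; a real model would be fitted from traces of the system under study:

```python
import random

# Next request depends only on the current state (Markov property).
# Probabilities below are illustrative, not measured.
TRANSITIONS = {
    "open":  [("read", 0.7), ("close", 0.3)],
    "read":  [("read", 0.6), ("write", 0.2), ("close", 0.2)],
    "write": [("read", 0.3), ("close", 0.7)],
    "close": [("open", 1.0)],
}

def next_state(state, rng=random):
    """Sample the successor of `state` from its transition row."""
    r = rng.random()
    for nxt, p in TRANSITIONS[state]:
        r -= p
        if r <= 0:
            return nxt
    return TRANSITIONS[state][-1][0]    # guard against rounding

def generate(n, start="open", rng=random):
    """Yield a synthetic request stream of length n."""
    state = start
    for _ in range(n):
        yield state
        state = next_state(state, rng)
```

Passing a seeded `random.Random` as `rng` makes a generated load reproducible across runs, which matters for the repeated-measurement discussion later.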

11 Using the Model May require creation of test files or processes or network connections –Model should include how they should be created Shoot for minimum performance impact on system under test of program that implements model

12 Applying Test Loads Most experiments will need multiple repetitions –Details covered later in course Results most accurate if each repetition runs in identical conditions → Test software should work hard to duplicate conditions on each run

13 Example of Applying Test Loads Using Ficus experiments discussed earlier, want performance impact of update propagation for multiple replicas Test load is set of benchmarks involving file access & other activities Must apply test load for varying numbers of replicas

14 Factors in Designing This Experiment Setting up volumes and replicas Network traffic Other load on test machines Caching effects Automation of experiment –Very painful to start each run by hand

15 Experiment Setup Need volumes to read and write, and replicas of each volume on various machines Must be certain that setup completes before we start running experiment

16 Network Traffic Issues If experiment is distributed (like ours), how is it affected by other traffic on network? Is traffic seen on network used in test similar to traffic expected on network you would actually use? If not, do you need to run on isolated network? And/or generate appropriate background network load?

17 Controlling Other Load Generally, want to have as much control as possible over other processes running on test machines Ideally, use dedicated machines But also be careful about background and periodic jobs –In Unix context, check carefully on cron and network-related daemons

18 Caching Effects Many types of jobs run much faster if things are in cache –Other things also change Is caching effect part of what you’re measuring? –If not, do something to clean out caches between runs –Or arrange experiment so caching doesn’t help But sometimes you should measure caching
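One way to clean out OS caches between runs, sketched for Linux only (writing `/proc/sys/vm/drop_caches` requires root, and this addresses the kernel's page/dentry/inode caches, not application-level or hardware caches):

```python
import subprocess

# Linux-specific: flush dirty pages with sync, then ask the kernel
# to drop the page cache plus dentries and inodes ("3") so each
# repetition starts cold.  Requires root privileges.
COLD_START_CMDS = [
    ["sync"],
    ["sh", "-c", "echo 3 > /proc/sys/vm/drop_caches"],
]

def drop_caches(runner=subprocess.run):
    """Run the cache-clearing commands in order.  `runner` is
    injectable so the sequence can be exercised without root."""
    for cmd in COLD_START_CMDS:
        runner(cmd, check=True)
```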

19 Automating Experiments For all but very small experiments, it pays to automate Don’t want to start each run by hand Automation must be done with care –Make sure previous run is really complete –Make sure you completely reset your state –Make sure the data is really collected!

20 Common Mistakes in Benchmarking Many people have made these You will make some of them, too But watch for them, so you don’t make too many

21 Only Testing Average Behavior Test workload should usually include divergence from average workload –Few workloads always remain at their average –Behavior at extreme points is often very different Particularly bad if only average behavior is used

22 Ignoring Skewness of Device Demands More generally, not including skewness of any component –E.g., distribution of file accesses among set of users Leads to unrealistic conclusions about how system behaves

23 Loading Levels Controlled Inappropriately Not all methods of controlling load are equivalent –More users (more cache misses) –Less think time (fewer disk flushes) –Increased resource demands per user Choose methods that capture effect you are testing for Prefer methods allowing more flexibility in control over those allowing less

24 Caching Effects Ignored Caching occurs many places in modern systems Performance on given request usually very different depending on cache hit or miss Must understand how cache works Must design experiment to use it realistically

25 Inappropriate Buffer Sizes Slight changes in buffer sizes can greatly affect performance in many systems Match reality

26 Sampling Inaccuracies Ignored Remember that your samples are random events Use statistical methods to analyze them Beware of sampling techniques whose periodicity interacts with what you’re looking for –Best to randomize experiment order
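Randomizing experiment order is a one-liner worth doing explicitly. A small sketch; seeding keeps the shuffled order itself reproducible:

```python
import random

def randomized_order(configs, reps, seed=None):
    """Return a shuffled list of (config, rep) trials so periodic
    background activity (cron jobs, cache drift) cannot line up
    systematically with any one configuration."""
    trials = [(c, r) for c in configs for r in range(reps)]
    random.Random(seed).shuffle(trials)
    return trials
```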

27 Ignoring Monitoring Overhead Primarily important in design phase –If possible, must minimize overhead to point where it is not relevant But also important to consider it in analysis

28 Not Validating Measurements Just because your measurement says something is so, it isn’t necessarily true Extremely easy to make mistakes in experimentation Check whatever you can Treat surprising measurements especially carefully

29 Not Ensuring Constant Initial Conditions Repeated runs are only comparable if initial conditions are the same Not always easy to undo everything previous run did –E.g., same state of disk fragmentation as before But do your best –Understand where you don’t have control in important cases

30 Not Measuring Transient Performance Many systems behave differently at steady state than at startup (or shutdown) That’s not always everything we care about Understand whether you should care If you should, measure transients too Not all transients due to startup/shutdown

31 Performance Comparison Using Device Utilizations Sometimes this is right thing to do –But only if device utilization is metric of interest Remember that faster processors will have lower utilization on same load –Which isn’t bad

32 Lots of Data, Little Analysis The data isn’t the product! The analysis is! So design experiment to leave time for sufficient analysis If things go wrong, alter experiments to still leave analysis time
