Published by Willie Darling. Modified over 9 years ago.
2
Analyzing Performance test data (or how to convert your numbers to information) Carles Roch-Cunill Test Lead for System Performance McKesson Medical Imaging Group carles.roch-cunill@mckesson.com
3
6/1/2009 Agenda
- Performance testing as an experimental activity
- Very fast review of Scientific Method
- Errors, forget them at your own risk
- About the meaning of data
- Some statistical concepts
- Analyzing data
- Adjusting your data to a model
- Summary
4
Performance testing as an experimental activity
There are two approaches to testing:
a) Without added value
- This feature does not work
- This requirement is not met
b) With added value
- This feature does not work, and this module/component/software artifact is the culprit
- This requirement is not met, and it fails for this reason
Usually things are not so clear, and testers' statements fall somewhere in between. Because performance testing gathers data that can be analyzed, the performance tester is well positioned to provide added-value information to the team.
5
Performance testing as an experimental activity
If you want to provide added value and explain why the requirement is not met, you will:
- Formulate a hypothesis: “My performance degrades due to component X”
- Test the hypothesis by developing an appropriate test environment
- Gather results
- Analyze the results to see if they confirm or reject your hypothesis
If you are lucky and your guess (the hypothesis) was good, you will have explained at least part of the performance behaviour. However, other factors usually influence performance as well, so you may have picked only one low-hanging fruit.
6
Performance testing as an experimental activity
You can create different tests that put more emphasis on one component of the system. For example, you may want to measure specifically the performance of the data repository tier, or the network, or only the UI. Depending on where your focus is, your methodology and your tools will change. In all cases, you need to fix all the parameters but one. For example, if you want to study the influence of the network on your system, you need to do the following:
- Determine the parameters that characterize the network (latency, bandwidth, utilization…)
- Identify whether they are independent or not (utilization and latency may not be independent)
- Modify one parameter at a time while keeping the others constant
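A minimal sketch of the one-parameter-at-a-time approach described above. The helper `one_factor_at_a_time`, the parameter names, and all values are hypothetical and chosen only for illustration.

```python
def one_factor_at_a_time(baseline, sweeps):
    """Build test configurations that change one parameter at a time.

    baseline: dict of parameter name -> fixed reference value.
    sweeps:   dict of parameter name -> list of values to try.
    """
    plan = []
    for name, values in sweeps.items():
        for value in values:
            config = dict(baseline)  # copy, so every other parameter stays fixed
            config[name] = value     # vary only this one parameter
            plan.append(config)
    return plan


# Hypothetical network parameters and values (not taken from the slides).
baseline = {"latency_ms": 10, "bandwidth_mbps": 100, "utilization_pct": 20}
sweeps = {"latency_ms": [10, 50, 100], "bandwidth_mbps": [10, 100]}

plan = one_factor_at_a_time(baseline, sweeps)
for config in plan:
    print(config)
```

If two parameters are not independent, as the slide warns for utilization and latency, a plan like this cannot truly hold one constant while changing the other, so the results for those parameters need to be read together.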
7
Very fast review of Scientific Method
- An effect has been observed. Example: performance degradation in your application
- You try to reproduce it and learn the conditions needed to reproduce it at will
- You may gather some data through testing
- To explain the data, you formulate a model (hypothesis)
- You refine your testing and tailor it around your model
- You analyze the new data and check whether your model fits the data
- If the model fits the data, you are on a good footing
- If the model partially fits the data, you either refine your model or discard it
- If the model does not fit the data, you formulate another model
- In all cases, new data obtained from other tests may force you to modify, rethink or even dump your model
- Once your data fits the model, you draw conclusions based on the framework provided by the model
8
Very fast review of Scientific Method
Unstated principles:
- Simpler is better
- With the same procedure and system, you get the same results
- A model should not introduce more questions than it answers
- Usually, newer models include the older models as particular cases
- Models are dynamic
9
Errors, forget them at your own risk
Errors happen… so take them into account. There are two main kinds of errors:
- Human errors: stopping the watch at the wrong moment, confusing digits…
- Instrument errors: your watch is not precise, has a mechanical defect…
10
Errors, forget them at your own risk
[Figure: graph of measured values with error bars]
In the graph, if your error bar is ±1, we can say the trend is toward a larger value. However, if the error bar is ±3, we cannot say anything about the trend of this data.
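A rough sketch of the reasoning about error bars, under a deliberately crude assumption: a rise counts as real only when the error intervals of the first and last points do not overlap. The series is hypothetical, since the values in the slide's graph are not available, and `clearly_increasing` is an illustrative name.

```python
def clearly_increasing(values, error):
    """True if the last value exceeds the first by more than both error bars."""
    first, last = values[0], values[-1]
    return (last - error) > (first + error)


# Hypothetical measurements (the original graph's values are not available).
series = [10, 11, 12, 13, 14]

print(clearly_increasing(series, 1))  # True: [9, 11] and [13, 15] do not overlap
print(clearly_increasing(series, 3))  # False: [7, 13] and [11, 17] overlap
```

A proper trend test would fit a line and examine the uncertainty of its slope; this comparison only illustrates why the size of the error bar decides what can be claimed.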
11
About the meaning of data
Performance testing generates a lot of data. But what does all the data mean? To explain this data you need to take into account:
- Hardware
- Network characteristics
- Network topology
- Physical support for the data tier (storage, database…)
- The architecture of your application
- How your application is coded
- …
12
About the meaning of data
In addition, you need to analyze the results in the context of the requirement or the question you are trying to answer. For example: “Event A should not take more than x seconds”. In most circumstances involving computer systems, you will have a stochastic component in your distribution. Assuming a normal one, you will have something like:
[Figure: normal distribution of the times measured for Event A]
13
About the meaning of data
But what exactly does the requirement mean? Strictly, it means:
[Figure: the strict interpretation of the requirement, shown on the distribution]
14
About the meaning of data
However, the requirement is usually interpreted as:
[Figure: the usual interpretation of the requirement, shown on the distribution]
From a formal point of view, the requirement “Event A should not take more than x seconds” would have failed with the above distribution. However, the statement “The average of Event A should not take more than x seconds” would pass.
15
About the meaning of data
The requirement can also be expressed as a percentile. In this case the requirement will be stated as “Event A should not take more than X seconds 50% of the time”.
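The three readings of the requirement (strict, average, percentile) can give different verdicts on the same data. The sketch below makes this concrete; the function name `check_requirement`, the sample times, and the limit are hypothetical.

```python
import statistics


def check_requirement(samples, limit, fraction=0.5):
    """Evaluate one time limit under three readings of the requirement."""
    within = sum(1 for s in samples if s <= limit)
    return {
        # Strict reading: no single event may exceed the limit.
        "strict": max(samples) <= limit,
        # Average reading: only the mean has to stay under the limit.
        "average": statistics.mean(samples) <= limit,
        # Percentile reading: at least `fraction` of the events meet the limit.
        "percentile": within / len(samples) >= fraction,
    }


# Hypothetical response times in seconds and a hypothetical limit x = 4.0.
samples = [2.0, 2.5, 3.0, 3.5, 4.5]
result = check_requirement(samples, limit=4.0)
print(result)
```

With these numbers the strict reading fails because of the single 4.5 s event, while the average and the 50% readings pass, which is why the wording of the requirement needs to be agreed before testing.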
16
Some statistical concepts
Once we have defined the question, we can provide the answer. The answer will be obtained through measurements (either manual or automated). The more measurements you take, the better your statistics and the better your answers will be. However, the measurements need to be statistically significant, meaning that each measurement is good enough to be included in your statistics. All the measurements that are included in your statistics need to be statistically equivalent.
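One common way to quantify "more measurements give better statistics" is the standard error of the mean, which is the sample standard deviation divided by the square root of the number of measurements. This is a standard statistic and is not named in the slides. The sketch below uses hypothetical measurements.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical timings in seconds: a short run and a longer run of the same test.
few = [2.1, 1.9, 2.3, 2.0]
many = [2.1, 1.9, 2.3, 2.0, 2.2, 1.8, 2.1, 2.0,
        2.2, 1.9, 2.1, 2.3, 2.0, 1.9, 2.2, 2.0]

print(f"n={len(few)}:  standard error = {standard_error(few):.3f} s")
print(f"n={len(many)}: standard error = {standard_error(many):.3f} s")
```

The calculation assumes the measurements are independent and come from the same phenomenon, which is the "statistically equivalent" condition above.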
17
Some statistical concepts
How do you determine whether your data is statistically equivalent? You can apply some complex mathematical analysis or apply common sense. Some rules of thumb:
- If, in a single set of measurements, 20% of your data is very different, you either have a problem in your test system or you are observing different phenomena.
- If you have done several runs, and the 90th percentile of a new test is bigger (smaller) than the maximum (minimum) of the previous tests, then the new data is not statistically similar and has no statistical significance for your results.
- If you are expecting a specific distribution and you are not getting it, the current set cannot be compared (is not statistically equivalent) to the data you were expecting.
- Outliers are not statistically equivalent to the rest of the set.
18
Some statistical concepts
Example of the 90th percentile for Test 3 being bigger than the maximum of the other sets of measurements. In this context, Test 3 is not statistically equivalent and will be rejected.
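The 90th-percentile rule of thumb can be sketched as follows. The run data are hypothetical because the values behind the slide's chart are not available. The nearest-rank percentile is only one of several common definitions, and `is_equivalent` is an illustrative name.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def is_equivalent(new_run, previous_runs, p=90):
    """Rule of thumb: reject the new run if its p-th percentile lies above
    the maximum, or below the minimum, of all previous measurements."""
    previous = [value for run in previous_runs for value in run]
    value = percentile(new_run, p)
    return min(previous) <= value <= max(previous)


# Hypothetical timings in seconds for three runs.
test1 = [2.0, 2.1, 2.2, 2.3, 2.4, 2.1, 2.2, 2.0, 2.3, 2.5]
test2 = [2.1, 2.2, 2.0, 2.4, 2.3, 2.2, 2.1, 2.6, 2.2, 2.3]
test3 = [3.0, 3.1, 2.9, 3.2, 3.3, 3.1, 3.0, 3.4, 3.2, 2.4]

print(is_equivalent(test2, [test1]))         # True: 2.4 lies within [2.0, 2.5]
print(is_equivalent(test3, [test1, test2]))  # False: 3.3 exceeds the maximum 2.6
```

A rejected run is not useless: as the first rule of thumb notes, it may point to a problem in the test system or to a different phenomenon.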
19
Some statistical concepts
Outliers are usually defined as:
- A measurement outside the overall pattern of a distribution (Moore and McCabe 1999).
- More precisely, a point that lies more than 1.5 times the interquartile range above the third quartile or below the first quartile.
Usually, the presence of an outlier indicates either an error in the measurement or an incomplete model.
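A short sketch of the interquartile-range definition, using Python's standard `statistics.quantiles` with its default quartile method. Other quartile conventions can move the fences slightly. The data and the name `find_outliers` are hypothetical.

```python
import statistics


def find_outliers(samples, k=1.5):
    """Return the points more than k * IQR above Q3 or below Q1."""
    q1, _median, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if s < lower or s > upper]


# Hypothetical timings in seconds with one suspicious measurement.
timings = [2.1, 2.2, 2.0, 2.3, 2.2, 2.1, 2.4, 2.2, 9.5]
print(find_outliers(timings))  # [9.5]
```

Flagging a point this way is only the first step. As the slide says, the follow-up question is whether the outlier comes from a measurement error or from something the model does not yet explain.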
20
Analyzing data
- While testing a non-deterministic system, you will always get a distribution of values, all of them valid in principle.
- For example, if your average for a measure is 3 and you sample again and get 6, this ‘6’ is also correct and you cannot discard it (unless you determine that this point is an outlier).
- The good news is that you can extract information from this succession of different numbers.
21
Analyzing data
For example, we may have the following collection of raw data, in seconds, for a measure that we will generically describe as “query database”:
4.18; 2.1; 1.9; 2.23; 4.5; 4.2; 2.19; 2.21; 4.24; 2.23; 1.99; 2.01; 2.39; 4.19; 2.42; 2.08; 2.27; 3.98; 2.21; 2.45; 4.32
Average: 2.9
These results seem to be a mix of two series:
2.1; 1.9; 2.23; 2.19; 2.21; 2.23; 1.99; 2.01; 2.39; 2.42; 2.08; 2.27; 2.21; 2.45
Average: 2.2
and
4.18; 4.24; 4.19; 3.98; 4.32; 4.5; 4.2
Average: 4.2
22
Analyzing data
What is the previous slide telling us? Averaging all the results tells us nothing. The results point to a hidden effect: the system executes the query in different ways. One possible cause could be that one query joins more tables and thus takes more time to return the results. So, if you want to answer the question “What is the time to execute this query?”, you need to be more nuanced, or you need to know the frequency of each kind of query so that you can compute a weighted average.
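The split and the weighted average can be reproduced from the raw “query database” data. In this sketch the 3.0 s threshold is an assumption chosen by eye, and the 90%/10% production mix is hypothetical. Only the measurements themselves come from the slides.

```python
import statistics

# Raw "query database" timings from the slides, in seconds.
raw = [4.18, 2.1, 1.9, 2.23, 4.5, 4.2, 2.19, 2.21, 4.24, 2.23, 1.99,
       2.01, 2.39, 4.19, 2.42, 2.08, 2.27, 3.98, 2.21, 2.45, 4.32]

THRESHOLD = 3.0  # assumed cut between the two groups, chosen by eye

fast = [t for t in raw if t < THRESHOLD]
slow = [t for t in raw if t >= THRESHOLD]


def weighted_average(fast_mean, slow_mean, fast_share):
    """Expected time when a share `fast_share` of the queries is fast."""
    return fast_share * fast_mean + (1 - fast_share) * slow_mean


fast_mean = statistics.mean(fast)
slow_mean = statistics.mean(slow)

print(f"all:  n={len(raw)},  mean={statistics.mean(raw):.1f}")
print(f"fast: n={len(fast)}, mean={fast_mean:.1f}")
print(f"slow: n={len(slow)},  mean={slow_mean:.1f}")

# Hypothetical production mix: 90% fast queries, 10% slow queries.
mix = weighted_average(fast_mean, slow_mean, fast_share=0.9)
print(f"weighted average for a 90/10 mix: {mix:.1f}")
```

The overall mean of 2.9 s is itself a weighted average for the 14:7 mix that happened to occur in this test. With the assumed 90/10 mix the expected time drops to about 2.4 s, which is why the frequency of each kind of query matters.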
23
Adjusting your data to a model
The most common model is the usual Gaussian or normal distribution, where σ is the standard deviation and μ is the average. The importance of this distribution lies in the Central Limit Theorem, which states that the sum (or average) of a large number of independent random variables tends toward a normal distribution. Example: if we assume that the latency experienced by users in a wireless network depends only on the distance to the hub, μ can be interpreted as the average distance of the users to the hub and σ will indicate how spread out the users are around the hub.
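The formula on the original slide did not survive as text. For reference, the standard probability density of the normal distribution, with μ and σ as defined above, is:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right)
```

About 68% of the values fall within one σ of μ and about 95% within two σ, which gives a quick check of whether measured data plausibly follows this model.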
24
Adjusting your data to a model
Another example of analysis: the chi distribution. To a first approximation it resembles the Gaussian distribution; however, it applies when a phenomenon depends on k independent parameters, each of which individually would follow a Gaussian distribution. Example: the observed latency in a city-wide ADSL network may depend on the network utilization and on the latency induced by the distance to the nearest hub. If we want to improve the performance of the system, then we need to tackle both problems.
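For reference, the textbook definition is more specific than the informal description above. If Z_1, …, Z_k are independent standard normal variables, the chi distribution with k degrees of freedom describes the square root of the sum of their squares. Whether the slide intended exactly this form is an assumption. Its density is:

```latex
X = \sqrt{\sum_{i=1}^{k} Z_i^{2}}, \qquad
f(x; k) = \frac{x^{\,k-1}\, e^{-x^{2}/2}}{2^{\,k/2-1}\,\Gamma(k/2)}, \quad x \ge 0
```

Unlike the Gaussian, this density is zero for negative x and is skewed to the right for small k, so a long right tail in measured latencies is one hint that more than one contributing factor is at work.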
25
Adjusting your data to a model
This would be an example of two uniform distributions:
[Figure: two uniform distributions]
26
Adjusting your data to a model
- If your model cannot explain the results well, you need to change or improve the model
- A useful model should have predictive capabilities, so you can design new tests to prove or disprove the model
- Negative results (model disproved) can be as useful as positive results
- The analysis of the performance data can help to prevent future bottlenecks and problems
- The analyzed results will have a range of validity. Do not force too many consequences from them
27
Summary
- Performance testers provide information beyond requirement compliance
- Performance testing should be treated like an experimental activity
- As an experimental activity, the scientific method is the most appropriate method of enquiry
- In tune with the scientific method, you need to make assumptions, design your experiment accordingly and reduce the error bars
- Data should be subject to a statistical analysis
- After the analysis, you should try to explain your data with a model
- If the model does not do a good job of explaining your data, you should change or refine the model
- Your analysis should help to make the software better
28
Analyzing Performance test data
Questions?