
1 Software Performance Testing Based on Workload Characterization
Elaine Weyuker, Alberto Avritzer, Joe Kondek, Danielle Liu
AT&T Labs

2 Workload Characterization
– A probability distribution associated with the input domain that describes how the system is used when it is operational in the field. Also called an operational profile or operational distribution.
– It is derived by monitoring field usage.
– We have used it to select (correctness) test cases, predict risk, assess software reliability, predict scalability, and select performance test cases, among other things.
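
For illustration, a minimal sketch of how such a profile can drive test-case selection, here using the agent request distribution from slide 9 as the profile (the function and class names are ours, not the paper's):

```python
import random

# Operational profile: probability that a field request falls into each
# input class (agent request percentages from slide 9).
operational_profile = {
    "static_page": 0.50,
    "error_code": 0.10,
    "search_form": 0.07,
    "search_result": 0.08,
    "other_pages": 0.25,
}

def select_test_cases(profile, n):
    """Sample n test inputs so the test mix mirrors field usage."""
    classes = list(profile)
    weights = [profile[c] for c in classes]
    return random.choices(classes, weights=weights, k=n)

print(select_test_cases(operational_profile, 10))
```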

3 Steps in Characterizing the Workload
– Model the software system.
  – Identify key parameters that characterize the system's behavior.
  – Get the granularity right.
– Collect data while the system is operational, or from a related operational system.
– Analyze the data and determine the probability distribution.

4 System Description
– Automated customer care system, built by another company, that can be accessed by both customer care agents and customers.
– It contains a large database with a web browser front-end and a cache facility.
– For this system, data was collected for 2½ months, and page hits were analyzed at 15-minute intervals.
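
The paper does not show its collection tooling; below is a sketch of the 15-minute aggregation step under an assumed "timestamp,page" log format (the format and names are illustrative):

```python
from collections import Counter
from datetime import datetime

def bucket_hits(log_lines, interval_minutes=15):
    """Count page hits per fixed-width interval from 'timestamp,page' records."""
    buckets = Counter()
    for line in log_lines:
        ts_str, _page = line.split(",", 1)
        ts = datetime.fromisoformat(ts_str)
        # Round the timestamp down to the start of its 15-minute bucket.
        minute = (ts.minute // interval_minutes) * interval_minutes
        buckets[ts.replace(minute=minute, second=0, microsecond=0)] += 1
    return buckets

sample = ["2000-05-01T14:03:10,search_form", "2000-05-01T14:12:55,static_page"]
print(bucket_hits(sample))
```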

5 Implementation Information
– The system was implemented as an extension to the HTTP web server daemon, with a mutex semaphore used to implement database locking.
– The system was single-threaded: queued processes executed spin-lock operations until the semaphore was free.
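
A minimal sketch of this locking scheme (Python threading purely for illustration; the real system was part of the HTTP daemon): workers spin on the mutex, burning CPU while they wait, which is exactly the behavior flagged later in slide 13.

```python
import threading
import time

db_lock = threading.Lock()   # the mutex semaphore guarding the database

def handle_request(worker_id):
    """Spin until the database mutex is free, then do the serialized DB work."""
    spins = 0
    while not db_lock.acquire(blocking=False):
        spins += 1                    # each failed attempt burns CPU time
    try:
        time.sleep(0.01)              # stand-in for the database operation
    finally:
        db_lock.release()
    print(f"worker {worker_id}: acquired after {spins} spins")

threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```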

6 Before Performance Testing
– Prior to performance testing, users were complaining about poor performance, and the database was "hanging" several times a day.
– The hypothesis was that these problems were capacity-related; the vendor was contacted but was unable to solve them.

7 Performance Testing Goals
Help the project determine:
– Which resources were overloaded.
– The effect of database size on performance.
– The effect of single vs. multiple transactions.
– The effect of the cache hit rate.
– The effect of the number of HTTP servers.

8 System Information
– The web-server database was modeled as an M/D/1 queue; the arrival process was assumed to be Poisson.
– The cache hit rate was determined to be central to the system's performance. It ranged between 80% and 87%, with an average of 85%.
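
For an M/D/1 queue (Poisson arrivals, deterministic service) the mean wait is W_q = ρ/(2μ(1−ρ)) with ρ = λ/μ. A sketch of the predicted mean response time; the service time below is an illustrative assumption combining the cache figures from slide 18, not a measured value:

```python
def md1_response_time(arrival_rate, service_time):
    """Mean response time of an M/D/1 queue:
    T = D + rho / (2 * mu * (1 - rho))."""
    mu = 1.0 / service_time          # service rate
    rho = arrival_rate / mu          # utilization
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization >= 1")
    return service_time + rho / (2 * mu * (1 - rho))

# Illustrative service time: 85% cache hits at ~100 ms, misses at ~5 s
# (figures from slide 18) give a mean of 0.85*0.1 + 0.15*5.0 = 0.835 s.
print(md1_response_time(arrival_rate=1.0, service_time=0.835))
```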

9 Distribution of User Requests

Page Type       Agent Requests   Customer Requests
Static Page          50%              23%
Error Code           10%              23%
Search Form           7%              30%
Search Result         8%              16%
Other Pages          25%               8%

10 Computing the Cache Hit Probability

Page Type     Frequency   Prob Occur   Cache Prob   Wted Prob
Home             2707       0.2236       0.9996       0.2235
Static           2515       0.2077       0.9407       0.1954
Error Code       1316       0.1087       0.6915       0.0752
Screen Shot      1076       0.0889       0.6078       0.0540
Search Res.      1035       0.0855       0.0463       0.0040
Search Form       832       0.0687       0.9218       0.0633
Index             494       0.0408       0.9797       0.0400
Other            2132       0.1761       0.9484       0.1670
Total          12,106       1.0000                    0.8224
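
The overall hit rate is the occurrence-weighted sum of the per-page cache probabilities; a short sketch reproducing the table's arithmetic:

```python
# (frequency, per-page cache-hit probability) from the table above
page_stats = {
    "Home":        (2707, 0.9996),
    "Static":      (2515, 0.9407),
    "Error Code":  (1316, 0.6915),
    "Screen Shot": (1076, 0.6078),
    "Search Res.": (1035, 0.0463),
    "Search Form": ( 832, 0.9218),
    "Index":       ( 494, 0.9797),
    "Other":       (2132, 0.9484),
}

total = sum(freq for freq, _ in page_stats.values())
hit_rate = sum((freq / total) * p for freq, p in page_stats.values())
print(f"overall cache hit probability: {hit_rate:.4f}")   # ~0.8224
```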

11 System Observations
– Heavier workload for agents on weekdays, peak hours in the afternoon, with little day-to-day variation.
– Customer peak hours occurred during the evening.
– Little change in workload as users become familiar with the system. (Agents are already expert users and execute a well-defined process, while individual customers tend to use the system rarely and therefore also maintain the same usage pattern over time.)

12 What We Achieved
– Characterized the agent and customer workload, and used it as a basis for performance testing.
– Identified performance limits and, as a result, detected a software bottleneck.
– Provided recommendations for performance improvement.
– Increased understanding of the use of data collection for performance issues.

13 What We Learned
– The system was running at only about 20% utilization.
– The Cisco routers were not properly load-balanced.
– The spin-lock operations consumed CPU time, which led to a steep increase in response time. (We used the Sun SE Toolkit to record the number of spin locks.)

14 No Caching
[Chart: average CPU cost (sec)/request vs. load (hits/sec, 0–2.5), for Customer and Agent]

15 No Caching
[Chart: average response time (ms) vs. load (hits/sec, 0–2.5), for Customer and Agent]

16 All Requests Retrieved From Cache
[Chart: average response time (ms) vs. load (hits/sec, 0–180), for Customer and Agent]

17 Simulated 85% Cache Hit Rate
[Chart: average response time (ms) vs. load (hits/sec, 0–4), for Customer and Agent]

18 In Particular
Delay strongly depends on caching:
– Found in cache: ~100 ms
– Retrieved from database: ~5 sec
Current available capacity:
– Customer: 2 hits/sec
– Agent: 2.5 hits/sec
Average demand:
– Customer: 10,000 hits/day = 0.12 hits/sec
– Agent: 25,000 hits/day = 0.29 hits/sec
Busy-hour demand:
– Customer: 784 hits/hour = 0.22 hits/sec
– Agent: 2228 hits/hour = 0.62 hits/sec
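
These figures combine directly; a small sketch of the arithmetic (expected per-request delay at the measured hit rate, and busy-hour utilization against measured capacity):

```python
# Expected per-request delay at the measured 85% cache hit rate
hit_rate, hit_ms, miss_ms = 0.85, 100, 5000
expected_delay = hit_rate * hit_ms + (1 - hit_rate) * miss_ms
print(f"expected delay: {expected_delay:.0f} ms")   # 835 ms

# Busy-hour utilization = busy-hour demand / available capacity
for user, demand, capacity in (("customer", 0.22, 2.0), ("agent", 0.62, 2.5)):
    print(f"{user}: busy-hour utilization {demand / capacity:.0%}")
```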

19 Effect of Database Size: Customer – Cache Off
[Chart: average response time (ms) vs. load (hits/sec, 0–2), for Small DB (200 MB), Medium DB (400 MB), and Large DB (600 MB)]

20 Effect of Database Size: Agent – Cache Off
[Chart: average response time (ms) vs. load (hits/sec, 0–2), for Small DB (200 MB), Medium DB (400 MB), and Large DB (600 MB)]

21 Adding Servers
– For this system, n servers meant n service queues, each operating independently, and hence less lock contention. This led to a significant increase in the workload that could be handled.
– However, since each server maintains its own caching mechanism, there was a distinct decrease in the cache hit probability, and an associated increase in the response time.
– The response time is dominated by the cache hit probability when the load is low; as the load increases, the queuing for the database also increases.
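
A hedged sketch of this tradeoff, modeling each server as an independent M/D/1 queue over its share of the load. The cache-split penalty (hit rate decaying with n) is our assumption for illustration, not a measurement from the study:

```python
def mean_response_time(total_load, n_servers):
    """Toy model: n independent M/D/1 queues, each with its own cache."""
    hit_rate = 0.85 * (1.0 / n_servers) ** 0.2        # assumed decay with n
    service = hit_rate * 0.1 + (1 - hit_rate) * 5.0   # 100 ms hit, 5 s miss
    rho = (total_load / n_servers) * service          # per-queue utilization
    if rho >= 1.0:
        return float("inf")                           # queue saturated
    return service + rho * service / (2 * (1 - rho))  # M/D/1 response time

for load in (0.5, 2.0):   # low load vs. higher load, in hits/sec
    print(f"total load {load} hits/sec:")
    for n in range(1, 7):
        t = mean_response_time(load, n)
        label = "saturated" if t == float("inf") else f"{t:.2f} s"
        print(f"  {n} web server(s): {label}")
```

Under this toy model, extra servers hurt at low load (the cache split dominates) and help at high load (queuing dominates), matching the qualitative observation on this slide.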

22 Multiple Servers
[Chart: average response time (ms) vs. load (hits/sec, 0–8), for 1 through 6 web servers]

23 Recommendations
– Projects should collect traffic data on a daily basis.
– Performance measurements should be made while the software is being tested in the lab, both for new systems and when changes are being made.
– Workload-based testing is a very cost-effective way to do performance testing.

24 THE END

25 No Caching
[Chart: average response time (ms) vs. load (hits/sec, 0–2.5), by page type: Error, Result, Form, Static]

26 Cache Off
[Chart: average CPU cost (sec)/request vs. load (hits/sec, 0–2.5), by page type: Error, Result, Form, Static]

