1
Hardware-based Load Generation for Testing Servers
Lorenzo Orecchia, Madhur Tulsiani
CS 252, Spring 2006, Final Project Presentation
May 1, 2006
2
The workload generation problem
GOAL: Test how servers behave when they receive a large number of requests from multiple users
CHALLENGES:
– The workload generator should mimic the requests sent by real users
– It should keep the server as busy as a large number of users would
3
Why use hardware?
SPEED - Hardware is much faster than software
MULTIPLE USERS - Can be simulated by multiple copies of the circuit
4
Challenges in hardware implementation
– Distributions of web data are complex and diverse: different distributions for file sizes, request sizes, links per page, popularity, etc.
– We require sampling from a combination of all these using only a small and fast circuit
– Implementing communication protocols in hardware (already available from companies like WizNet)
5
Approaches to workload generation
– Trace: Monitor web traffic and store a trace; sample from the trace to simulate real traffic. Difficult to generate a new trace.
– Analytical models: Generate data according to the distributions used to model the data. Complex to implement in hardware.
We combine the two! Generate the data-set using analytical models and sample from it in hardware
6
Overview of design
– Generate the data-set using C code, which outputs a graph of URLs as hex data
– The graph is loaded in memory at start; the memory is shared by multiple circuits (simulating multiple users)
– Each circuit simply performs a random walk on this graph; it can also remember multiple URLs and switch between multiple walks (connections)
7
Distributions modeled - Pareto
A Pareto distribution is given by the cumulative distribution function $\Pr[X \le x] = 1 - (k/x)^t$ for $x \ge k$
k is the minimum value X can take and t governs how fast the tail decays
Its expectation is $tk/(t-1)$ if t > 1, and infinite otherwise
The file-size and request-size distributions are Pareto distributions
– File size: k = 8 KB, t = 1.1
– Request size: k = 8 KB, t = 0.6
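One way a data-set generator could draw Pareto-distributed sizes is inverse-transform sampling. The sketch below is in Python rather than the project's C, and the names `sample_pareto` and `pareto_mean` are illustrative, not taken from the slides.

```python
import random

def sample_pareto(k, t, rng):
    """Draw one Pareto(k, t) sample by inverting Pr[X > x] = (k / x) ** t."""
    u = 1.0 - rng.random()  # uniform in (0, 1], so u is never zero
    return k / u ** (1.0 / t)

def pareto_mean(k, t):
    """Expectation t*k/(t-1) for t > 1; infinite otherwise."""
    return t * k / (t - 1.0) if t > 1.0 else float("inf")

rng = random.Random(0)  # fixed seed, chosen only for repeatability
# File-size parameters from the slide: k = 8 KB, t = 1.1
file_sizes_kb = [sample_pareto(8.0, 1.1, rng) for _ in range(10000)]
```

With the request-size parameters (t = 0.6), `pareto_mean` returns infinity, since the expectation only exists for t > 1.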
8
Power Law and Zipf’s Law
The number of links in a page (out-degree distribution) follows a power law of the form $A\,d^{-t}$ in the degree d (A = 1, t = 2.1), truncated at max degree = 31
Zipf’s law: The number of requests to a page is inversely proportional to its rank when pages are sorted by popularity
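Both laws can be turned into discrete sampling tables, roughly as follows. This Python sketch assumes degrees start at 1 (the slide states only the maximum) and normalises the weights so they sum to one; `sample_degree` and `zipf_probs` are hypothetical helper names.

```python
import random

MAX_DEGREE = 31  # truncation point from the slide
T = 2.1          # power-law exponent from the slide

# Unnormalised weights d ** -t for d = 1..31 (A = 1), then normalised.
weights = [d ** -T for d in range(1, MAX_DEGREE + 1)]
total = sum(weights)
degree_probs = [w / total for w in weights]

def sample_degree(rng):
    """Draw an out-degree in 1..MAX_DEGREE from the truncated power law."""
    return rng.choices(range(1, MAX_DEGREE + 1), weights=degree_probs)[0]

def zipf_probs(n):
    """Request probability of the page at rank r, proportional to 1 / r."""
    h = sum(1.0 / r for r in range(1, n + 1))
    return [1.0 / (r * h) for r in range(1, n + 1)]
```

Under Zipf's law the top-ranked page is requested ten times as often as the page at rank 10, which is the ratio `zipf_probs(n)[0] / zipf_probs(n)[9]`.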
9
Combining the distributions
– Divide the data-set into size bins and generate the number of files in each according to the file size distribution
– Determine popularity for each node
– Determine degree and sample neighbors according to the request size distribution
A simple random walk on this graph combines all distributions!
CAVEAT LECTOR: this assumes independence between the properties
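The binning step can be illustrated by computing the expected number of files per size bin from the Pareto tail. The slides do not give the bin boundaries, so the sketch below assumes doubling bins starting at k and an open-ended last bin; the bin count of 10 is likewise an arbitrary choice for illustration.

```python
def pareto_tail(x, k, t):
    """Pr[X > x] for a Pareto(k, t) variable."""
    return 1.0 if x <= k else (k / x) ** t

def files_per_bin(n_files, k, t, n_bins):
    """Expected file count in bins [k*2^i, k*2^(i+1)); the last bin is open-ended."""
    counts = []
    for i in range(n_bins):
        lo = k * 2 ** i
        # Assumed layout: each bin ends where the next begins, except the last.
        upper_tail = 0.0 if i == n_bins - 1 else pareto_tail(2 * lo, k, t)
        counts.append(n_files * (pareto_tail(lo, k, t) - upper_tail))
    return counts

# 4096 files with the file-size parameters k = 8 KB, t = 1.1
counts = files_per_bin(4096, 8.0, 1.1, 10)
```

The counts telescope to the total number of files, and the first bin (8 KB to 16 KB) holds a fraction 1 - 2^(-1.1) of them, a little over half.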
10
Implementation details
Shared read-only memory is used by multiple (8) copies of the random walk circuit
Each walk chooses between multiple (8) open connections
At each clock cycle, a random walk:
– with probability 0.875, moves to a neighbor file
– otherwise, picks a file uniformly at random
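A software model of one walk circuit might look like the following Python sketch. The slides do not say how a connection is selected each cycle or what happens at a node with no out-links, so uniform selection and a uniform jump are assumptions here, and the toy graph is made-up data.

```python
import random

N_CONNECTIONS = 8   # open connections per walk, from the slide
P_FOLLOW = 0.875    # probability of moving to a neighbor, from the slide

def walk_step(graph, current, rng):
    """One step: follow an out-link w.p. 0.875, otherwise jump uniformly."""
    neighbors = graph[current]
    if neighbors and rng.random() < P_FOLLOW:
        return rng.choice(neighbors)
    return rng.randrange(len(graph))  # uniform jump (also used for dead ends)

def run_walk(graph, n_steps, rng):
    """Interleave N_CONNECTIONS walks; each step advances one chosen connection."""
    connections = [rng.randrange(len(graph)) for _ in range(N_CONNECTIONS)]
    requests = []
    for _ in range(n_steps):
        c = rng.randrange(N_CONNECTIONS)   # assumed: uniform choice of connection
        connections[c] = walk_step(graph, connections[c], rng)
        requests.append(connections[c])    # file requested in this cycle
    return requests

# Toy 4-node graph as adjacency lists (hypothetical data, not the project's).
toy_graph = [[1, 2], [2], [0, 3], [0]]
requests = run_walk(toy_graph, 1000, random.Random(0))
```

Since 0.875 = 7/8, the follow-or-jump decision needs only three random bits; whether the circuit implements it that way is not stated in the slides.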
11
Sketch of Random Walk Module
[Figure: random walk module and its connection to MEMORY]
12
Data-set parameters
Graph Size   Memory Usage   Total data size   Average file size
4096         50 KB          2138 MB           521 KB
65536        1 MB           3121 MB           47.6 KB
1048576      21 MB          21083 MB          20.11 KB
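The average file size in each row is the total data size divided by the graph size. The short Python check below recomputes it, assuming 1 MB = 1000 KB; that convention is inferred from the numbers (with 1024 KB per MB the first row would come out at about 534 KB rather than 521 KB).

```python
# (graph size, total data size in MB, reported average file size in KB)
rows = [(4096, 2138, 521.0), (65536, 3121, 47.6), (1048576, 21083, 20.11)]

def average_kb(n_files, total_mb, kb_per_mb=1000):
    """Average file size in KB; kb_per_mb = 1000 is an inferred assumption."""
    return total_mb * kb_per_mb / n_files

recomputed = [average_kb(n, mb) for n, mb, _ in rows]
```

Each recomputed value agrees with the reported average to within one percent.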
13
Circuit properties
Device: Virtex-E
Maximum delay
– Walk Module: 3.99 ns
– Memory Module: 5.82 ns
Estimated frequency = 1000 / (3.99 + 5.82) = 1000 / 9.81 = 101.93 MHz
Number of LUTs per walk: 593 (out of 64,896)
Number of slices per walk: 307 (out of 32,448)
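The frequency estimate and the area used by the eight walk circuits follow from simple arithmetic, sketched in Python below. The utilisation figures count only the walk modules, because the slides do not report the memory module's resource usage.

```python
WALK_DELAY_NS = 3.99
MEMORY_DELAY_NS = 5.82
LUTS_PER_WALK, LUTS_TOTAL = 593, 64896
SLICES_PER_WALK, SLICES_TOTAL = 307, 32448
N_WALKS = 8  # copies of the walk circuit, from the implementation slide

# Clock period is taken as the sum of the two maximum delays, as on the slide.
period_ns = WALK_DELAY_NS + MEMORY_DELAY_NS
freq_mhz = 1000.0 / period_ns

# Fraction of the device used by the walk circuits alone (memory excluded).
lut_fraction = N_WALKS * LUTS_PER_WALK / LUTS_TOTAL
slice_fraction = N_WALKS * SLICES_PER_WALK / SLICES_TOTAL
```

Under these assumptions the eight walks together occupy between 7% and 8% of the device's LUTs and slices.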
16
Possible Improvements
– Implement memory using block-RAM instead of Verilog registers
– Add a module for packaging URLs into HTTP packets and opening connections
– Implement the timing distribution between requests
17
References
1. P. Barford and M. Crovella, “Generating Representative Web Workloads for Network and Server Performance Evaluation”, Proc. of ACM SIGMETRICS 1998, pp. 151-160
2. M. Arlitt and C. Williamson, “Web Server Workload Characterization: The Search for Invariants”, Proc. of ACM SIGMETRICS 1996, pp. 126-137
3. P. Baldi, P. Frasconi and P. Smyth, “Modeling the Internet and the Web: Probabilistic Methods and Algorithms”, Wiley 2003