Evaluation of Load Balancing Algorithms and Internet Traffic Modeling for Performance Analysis By Arthur L. Blais
Introduction Internet Traffic Modeling Load Balancing Algorithms Conclusion
Internet Traffic Modeling Why is the Internet hard to model? Internet Traffic Characteristics Internet Traffic Models
Why is the Internet hard to model? It’s big: more than 72 million hosts in January 2000 [1]. It is growing rapidly: more than 67% per year. It is constantly changing. Traffic patterns have high variability. [1] Source: www.isc.org
Internet Traffic Patterns Causes of High variability Client Request Rates Server Responses Network Topology
Characteristics of Client Request Rate [1]: Client Sleep Time, Inactive Off Time, Active Off Time, Embedded References. [1] Barford and Crovella, Generating Representative Web Workloads for Network and Server Performance Evaluation, Boston University, BU-CS-97-006, 1997
Client Sleep time
Inactive Off Time: time between requests (think time). Uses a Pareto distribution with shape parameter a = 1.5 and lower bound k = 1.0. To generate a random variate x: draw u ~ U(0,1), then x = k / (1.0 - u)^(1.0/a)
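The generator on this slide is inverse-transform sampling, and a minimal Python sketch of it might look like the following. The function name `pareto_variate`, the fixed seed, and the sample count are illustrative assumptions, not part of the original simulation.

```python
import random


def pareto_variate(shape, lower_bound, rng=random):
    """Draw one Pareto variate by inverse-transform sampling.

    shape is the slide's a, lower_bound is the slide's k.
    """
    u = rng.random()  # u ~ U(0, 1); Python returns a value in [0.0, 1.0)
    # 1.0 - u lies in (0.0, 1.0], so the division is always defined
    # and the result is never below lower_bound.
    return lower_bound / (1.0 - u) ** (1.0 / shape)


# Inactive off (think) times with the slide's parameters a = 1.5, k = 1.0.
rng = random.Random(42)  # fixed seed, only so the sketch is repeatable
think_times = [pareto_variate(1.5, 1.0, rng) for _ in range(10_000)]
```

Passing a different shape and lower bound to the same function covers the other Pareto-distributed quantities in the model.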
Inactive Off Time
Active Off Time: time between embedded references. Uses a Weibull distribution with alpha a = 1.46 (scale parameter) and beta b = 0.382 (shape parameter). To generate a random variate x: draw u ~ U(0,1), then x = a * (-ln(1.0 - u))^(1.0/b)
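As a rough illustration, the Weibull generator above can be written in a few lines of Python. The function name `weibull_variate`, the seed, and the sample count are assumptions made for this sketch.

```python
import math
import random


def weibull_variate(scale, shape, rng=random):
    """Draw one Weibull variate by inverse-transform sampling.

    scale is the slide's a (alpha), shape is the slide's b (beta).
    """
    u = rng.random()  # u ~ U(0, 1), in [0.0, 1.0)
    # -ln(1 - u) is a unit exponential variate; raising it to 1/shape
    # and multiplying by scale gives the Weibull variate.
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


# Active off times with the slide's parameters a = 1.46, b = 0.382.
rng = random.Random(7)  # fixed seed, only so the sketch is repeatable
active_off_times = [weibull_variate(1.46, 0.382, rng) for _ in range(10_000)]
```

Python's standard library offers `random.weibullvariate(alpha, beta)` with the same scale/shape convention, which could replace the hand-written function if matching the slide's formula line by line is not required.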
Active Off Time
Embedded References: number of objects in the requested document (text, audio, video, bit maps). Uses a Pareto distribution with shape parameter a = 2.43 and lower bound k = 1.0. To generate a random variate x: draw u ~ U(0,1), then x = k / (1.0 - u)^(1.0/a)
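The generator formula on this slide follows from inverting the Pareto cumulative distribution function. The derivation below is the standard textbook result written with the slide's symbols a, k, and u. The mean is added only as a sanity check for simulated output, and it describes the continuous distribution; the slide does not say how fractional values are mapped to whole object counts.

```latex
\begin{align*}
F(x) &= 1 - \left(\frac{k}{x}\right)^{a}, \qquad x \ge k \\
u = F(x) \;&\Longrightarrow\; x = \frac{k}{(1-u)^{1/a}}, \qquad u \sim U(0,1) \\
\mathrm{E}[X] &= \frac{a\,k}{a-1} \quad (a > 1), \qquad
\text{so for } a = 2.43,\ k = 1.0:\quad \mathrm{E}[X] = \frac{2.43}{1.43} \approx 1.70
\end{align*}
```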
Example HTML Document with Embedded References <html><head> <title>CS522 F99 Home Page</title> </head> <body background="marble1.jpg"> <BGSOUND SRC="rocky.mid"><embed src="rocky.mid" autostart=true hidden=true loop=false></embed> <td ALIGN=CENTER><img SRC="rainbowan.gif" height=15 width=100%></td>
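To make the notion of an embedded reference concrete, the sketch below counts the references in the example document with Python's standard `html.parser` module. Treating every `src` or `background` attribute as one embedded reference is an assumption of this sketch, and the class name `EmbeddedReferenceCounter` is hypothetical.

```python
from html.parser import HTMLParser

# Assumption: an attribute named "src" or "background" marks one embedded
# reference. The slides do not define the rule precisely.
REFERENCE_ATTRS = {"src", "background"}


class EmbeddedReferenceCounter(HTMLParser):
    """Collect the values of attributes that point at embedded objects."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        for name, value in attrs:
            if name in REFERENCE_ATTRS and value:
                self.references.append(value)


# The example document from the slide, split across lines for readability.
SLIDE_HTML = """<html><head> <title>CS522 F99 Home Page</title> </head>
<body background="marble1.jpg">
<BGSOUND SRC="rocky.mid"><embed src="rocky.mid" autostart=true hidden=true loop=false></embed>
<td ALIGN=CENTER><img SRC="rainbowan.gif" height=15 width=100%></td>"""

counter = EmbeddedReferenceCounter()
counter.feed(SLIDE_HTML)
counter.close()
```

Under that assumption the example document contains four embedded references to three distinct files, because `rocky.mid` is referenced twice.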
Embedded References
Server Characteristics: File Size Distribution (Body: Lognormal Distribution; Tail: Pareto Distribution), Cache Size, Temporal Locality, Number of Connections, System Performance (CPU speed, disk access time, memory, network interface)
File Size Distribution - Body: Lognormal distribution. Build a table with 930 values in the range 92 <= x <= 9020 bytes. To generate a random variate x: draw u ~ U(0,1); if u <= 0.93, look up the value in table[u * 1000]; otherwise use the tail distribution
File Size Distribution - Body
File Size Distribution - Tail: Pareto distribution with shape parameter a = 1.5 and lower bound k = 9,020. To generate a random variate x: draw u ~ U(0,1), then x = k / (1.0 - u)^(1.0/a)
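Putting the body and tail slides together, a hedged sketch of the hybrid file size generator could look like this. The slides do not give the lognormal parameters used to build the body table, so `placeholder_table` is a stand-in of 930 evenly spaced values and is not lognormal. The function name `file_size` and the choice to draw a fresh uniform for the tail are likewise assumptions.

```python
import random

BODY_FRACTION = 0.93     # share of draws answered from the body table
TAIL_SHAPE = 1.5         # Pareto shape parameter a for the tail
TAIL_LOWER_BOUND = 9020  # Pareto lower bound k for the tail, in bytes


def file_size(body_table, rng=random):
    """Return one file size in bytes from the body/tail hybrid model."""
    u = rng.random()  # u ~ U(0, 1)
    if u < BODY_FRACTION:
        # Truncate u * 1000 to an index; the min() guards against
        # floating-point rounding pushing the index to len(body_table).
        index = min(int(u * 1000), len(body_table) - 1)
        return body_table[index]
    # Tail: draw a fresh uniform (an assumption) and invert the Pareto CDF.
    v = rng.random()
    return TAIL_LOWER_BOUND / (1.0 - v) ** (1.0 / TAIL_SHAPE)


# Placeholder body table: 930 evenly spaced values from 92 to 9020 bytes.
# A real table would hold lognormal quantiles instead.
placeholder_table = [92 + (9020 - 92) * i / 929 for i in range(930)]

rng = random.Random(2000)  # fixed seed, only so the sketch is repeatable
sizes = [file_size(placeholder_table, rng) for _ in range(20_000)]
```

The sketch uses a strict `u < 0.93` test because a 930-entry table has no entry at index 930.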
File Size Distribution – Tail
Self-similarity. Fractal-like characteristics: fractals look the same at all size scales. Statistical self-similarity: empirical data has similar variability over a wide range of time scales.
Verification of Self-similarity. Methods: Observation, Variance Time Plot, R/S Plot, Periodogram, Whittle Estimator
Self-similarity - Observation
Variance Time Plot: Hurst parameter H = 1 - b/2, where b is the negative of the slope of the log-log variance time plot. Self-similar traffic has 1/2 < H < 1. H = 0.7
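One way to sketch the variance time estimate in Python: average the series over non-overlapping blocks of size m, regress the log of the variance of the block means on log m, and convert the slope to H. The function name `hurst_variance_time`, the block sizes, and the synthetic test series are assumptions of this sketch, not the data behind the slide.

```python
import math
import random
import statistics


def hurst_variance_time(series, block_sizes):
    """Estimate the Hurst parameter from a variance time plot."""
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        # Mean of each non-overlapping block of m consecutive samples.
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        xs.append(math.log10(m))
        ys.append(math.log10(statistics.variance(means)))
    # Ordinary least-squares slope of log(variance) against log(m).
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    b = -slope  # b is the negative of the fitted slope
    return 1.0 - b / 2.0


# Independent noise has no long-range dependence, so H should be near 0.5.
rng = random.Random(1)  # fixed seed, only so the sketch is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
h_noise = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64])
```

For a self-similar trace, the same call would be expected to return a value between 1/2 and 1.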
Load Balancing vs. Load Sharing. Load Sharing: the system avoids having idle processors by placing new processes on idle processors first. Load Balancing: the system attempts to distribute the load equally across all processors based on some global average. Static: processes are placed and executed on only one processor. Dynamic: processes are initially placed on one processor, but at some point a process may be migrated to another processor based on some decision criteria.
Load Balancing Algorithms. Stateless: select a processor without consideration of the system state (Round Robin, Random). State-based: select a processor based upon some knowledge of the system state (Greedy, Subset, Stochastic).
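The difference between stateless and state-based selection can be sketched with three small Python classes. The class names, the `select(loads)` interface, and the reading of Greedy as "pick the server with the smallest current load" are assumptions of this sketch. The slides do not define Subset or Stochastic, so they are left out rather than guessed.

```python
import itertools
import random


class RoundRobin:
    """Stateless: cycle through the servers in a fixed order."""

    def __init__(self, n_servers):
        self._cycle = itertools.cycle(range(n_servers))

    def select(self, loads):
        return next(self._cycle)  # loads is ignored


class RandomChoice:
    """Stateless: pick a server uniformly at random."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()

    def select(self, loads):
        return self._rng.randrange(len(loads))  # only the count is used


class Greedy:
    """State-based (assumed meaning): pick the least loaded server."""

    def select(self, loads):
        # Ties go to the lowest server index.
        return min(range(len(loads)), key=loads.__getitem__)


# loads[i] is a hypothetical count of requests queued at server i.
loads = [3, 1, 2]
round_robin = RoundRobin(len(loads))
first_five = [round_robin.select(loads) for _ in range(5)]
greedy_pick = Greedy().select(loads)
random_pick = RandomChoice(random.Random(0)).select(loads)
```

Keeping one `select(loads)` signature for every policy lets a simulated load balance manager swap algorithms without changing the request loop.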
Simulation Entities Request Client Load Balance Manager Server
Request Event Loop
Experimental Design: Cooperative Environment. For each algorithm (round robin, random, greedy, subset, stochastic): eight servers with 1 or 4 connections; 8, 16, 32, 64, 128, 256, or 512 clients; 1, 2, or 4 load balance managers
Servers with One Connection
Global vs. Local Info.
Servers with Four Connections
Global vs. Local Info.
Experimental Design: Adversarial Environment. For each algorithm (greedy, subset, stochastic): eight servers with 1 or 4 connections; 8, 16, 32, 64, 128, 256, or 512 clients; 4 load balance managers with 1, 2, or 3 random load balance managers as adversaries
Servers with One Connection
LBM w/ Adversaries
Servers with Four Connections
LBM w/ Adversaries
Analysis of Experimental Results: Single Connection. Improvement in response time. Global: Greedy 2.2-27.6x, Subset 1.8-3.4x, Stochastic 1.3-2.2x. Local: Greedy 2.2-6.6x, Subset 1.4-2.5x, Stochastic 1.1-2.0x
Analysis of Experimental Results – cont.: Four Connections. Improvement in response time. Global: Greedy 1.7-4.3x, Subset 1.8-3.4x, Stochastic 1.1-2.5x. Local: Greedy 1.0-4.1x, Subset 1.0-3.0x, Stochastic 1.0-2.3x
Analysis of Experimental Results – cont.: Single Connection w/ Adversaries. Greedy 1.1-4.3x, Subset 1.1-2.1x, Stochastic 1.0-1.9x
Analysis of Experimental Results – cont.: Four Connections w/ Adversaries. Greedy 1.0-3.7x, Subset 1.0-2.7x, Stochastic 1.0-2.1x
Conclusions: Researched and developed a framework for modeling Internet traffic for simulation. Experimental design. Analysis of experimental results comparing five load balancing algorithms.