Realistic and Responsive Network Traffic Generation. Kashi Venkatesh Vishwanath, Amin Vahdat. University of California, San Diego. April 22, 2019.
Motivation (ZCP evaluation). Setup: a home user and a webserver, plus a background traffic source and a background traffic sink. Did we model the right background traffic? For 2006, as well as 2010? Does web traffic affect background traffic?
Traffic Generation Goals. Realism: application mix and application characteristics; average byte and packet arrival rates; burstiness of traffic at various timescales. Responsiveness: competing applications can alter background traffic. Ability to project into alternate scenarios: doubling access link capacity; changing popularity from HTTP to P2P.
Why Another Traffic Generator? Existing traffic generators: aggregate traffic into a single class, which limits the ability to vary the application mix; ignore the effect of wide-area links on flows, which limits the ability to project to alternate scenarios; cannot reproduce burstiness in traffic, although competing applications could be sensitive to burstiness. Result: no realistic and responsive traffic generator.
Contributions. Swing: a realistic and responsive traffic generator. Reproduces observed burstiness in traffic. Identifies and examines critical components: users, applications, network. Can project traffic into alternate scenarios by changing application mix, user behavior, and network characteristics. Details to follow.
Rest of the Talk. Traffic generation. Traffic validation. Responsiveness. Is traffic burstiness important? What is next?
What Does Traffic Depend On? Complex interaction at various layers. Users: periods of activity, application popularity, etc. Applications: HTTP, SMTP; request/response behavior. Network: packet losses, latency, etc. Goal: capture and model the interaction between layers from the perspective of a single link.
Methodology Overview. Start with a given trace. Extract parameters for users, network, and applications. Model the target scenario (current or alternate) on a dumb-bell topology. Program hosts (running a real OS and TCP stack) to communicate using the structural model. Generate Swing traffic and obtain the Swing trace.
Modeling Traffic. Input: passive TCP trace measurements. Trace fields: (IP, port), TCP SEQ, ACK, timestamps. Classification: HTTP, SMTP, P2P, NNTP. Model parameters by layer. Users: #sessions, intersession. Session: #RRE, interRRE. Request-Response Exchange (RRE): #conn, interconn. Connection: REQ/RSP sizes, numpairs, thinktime. Network: link capacity, link latency, link loss rate. Once the CDFs for the model parameters are complete, the next step is to generate traffic using the CDFs.
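The layered parameters above can be read as a set of empirical CDFs sampled top-down, from session to RRE to connection to REQ/RSP pair. The following is a minimal Python sketch under stated assumptions: the class `EmpiricalCDF`, the function `generate_session`, the dictionary keys, and all toy numbers are hypothetical illustrations and are not taken from the Swing code.

```python
import random


class EmpiricalCDF:
    """Inverse-transform sampler over observed values (hypothetical helper)."""

    def __init__(self, samples):
        if not samples:
            raise ValueError("need at least one sample")
        self.values = sorted(samples)

    def sample(self, rng):
        # Draw u in [0, 1) and return the matching order statistic.
        u = rng.random()
        idx = min(int(u * len(self.values)), len(self.values) - 1)
        return self.values[idx]


def generate_session(cdfs, rng):
    """Expand one session into RREs, connections, and REQ/RSP pairs."""
    session = []
    for _ in range(int(cdfs["num_rre"].sample(rng))):
        connections = []
        for _ in range(int(cdfs["num_conn"].sample(rng))):
            pairs = []
            for _ in range(int(cdfs["numpairs"].sample(rng))):
                pairs.append({
                    "req_bytes": cdfs["req_size"].sample(rng),
                    "rsp_bytes": cdfs["rsp_size"].sample(rng),
                    "thinktime_s": cdfs["thinktime"].sample(rng),
                })
            connections.append({
                "interconn_s": cdfs["interconn"].sample(rng),
                "pairs": pairs,
            })
        session.append({
            "inter_rre_s": cdfs["inter_rre"].sample(rng),
            "connections": connections,
        })
    return session


# Toy distributions (assumed values, for illustration only).
toy_cdfs = {
    "num_rre": EmpiricalCDF([1, 2, 3]),
    "inter_rre": EmpiricalCDF([1.0, 5.0]),
    "num_conn": EmpiricalCDF([1, 2]),
    "interconn": EmpiricalCDF([0.01, 0.05]),
    "numpairs": EmpiricalCDF([1, 2, 4]),
    "req_size": EmpiricalCDF([100, 500]),
    "rsp_size": EmpiricalCDF([1000, 5000]),
    "thinktime": EmpiricalCDF([0.1, 0.5]),
}
rng = random.Random(0)
session = generate_session(toy_cdfs, rng)
```

In a real deployment each CDF would presumably be populated per application class from the trace, and the sampled values would drive real sockets rather than a dictionary.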
Traffic Generation. Hosts (physical machines) run the HTTP, SMTP, P2P, and NNTP models: sessions, RREs, connections, and REQ/RSP exchanges with numpairs and thinktime. Example connection: REQ sizes 500 and 100, RSP sizes 1000 and 5000, wait 100 ms. The network emulator applies per-link settings, e.g. 10 Mbps, 10 ms, 2% loss rate and 100 Mbps, 10 ms, 1% loss rate. Traffic is recorded to produce the Swing traffic/trace.
Results. Compare the Swing trace to the original trace. Realism: coarse-grained behavior (e.g., throughput); modeled parameters (e.g., response size); burstiness of traffic. Responsiveness: what-if scenarios. Is burstiness important?
Target Traces. Mawi: 18 Mbps CAR, ~18 Mbps; HTTP, SMTP, NAPSTER, NNTP. CAIDA: OC48, ~200 Mbps; HTTP, KAZAA, NNTP. Auck: OC3c ATM, ~5.5 Mbps; HTTP, SQUID, SMTP. This talk uses Auck. Photo courtesy of the University of Texas Library.
Experimental Setup. 10-minute Auck trace (2300, July 24, 2001). 11 Linux + 1 FreeBSD machines, multiplexing 1000 virtual hosts. Application classes: HTTP, SQUID, and TCPOTHER. Top host: 4% of traffic. Top 100 hosts: 60% of traffic. Top 1000 hosts: 98% of traffic.
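The top-host figures above are cumulative shares of traffic. A minimal sketch of how such shares could be computed from per-host byte counts follows; the function name `top_share` and the toy byte counts are assumptions for illustration, not values from the Auck trace.

```python
def top_share(bytes_per_host, n):
    """Fraction of total bytes contributed by the n busiest hosts."""
    total = sum(bytes_per_host.values())
    if total == 0:
        raise ValueError("trace carries no bytes")
    top = sorted(bytes_per_host.values(), reverse=True)[:n]
    return sum(top) / total


# Toy per-host byte counts (assumed values).
toy = {"10.0.0.1": 400, "10.0.0.2": 300, "10.0.0.3": 200, "10.0.0.4": 100}
shares = {n: top_share(toy, n) for n in (1, 2, 4)}
```

A skewed distribution like this is what makes multiplexing many virtual hosts onto a dozen physical machines plausible: most virtual hosts carry little traffic.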
Validation: Burstiness. Coarse-grained behavior is insufficient: how do traffic demands peak, and at which timescales? Various techniques: index of dispersion of counts/intervals; power spectral density; wavelet-based Multi-Resolution Analysis (MRA).
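As an illustration of the first technique, the index of dispersion for counts at a given timescale is the variance of per-window packet counts divided by their mean; a Poisson arrival process gives a value near 1, and burstier traffic gives larger values. The sketch below assumes integer millisecond timestamps; the function name and toy inputs are hypothetical.

```python
def index_of_dispersion(arrivals_ms, window_ms):
    """Variance-to-mean ratio of packet counts per window (IDC)."""
    if not arrivals_ms or window_ms <= 0:
        raise ValueError("need arrivals and a positive window")
    start = min(arrivals_ms)
    nbins = (max(arrivals_ms) - start) // window_ms + 1
    counts = [0] * nbins
    for t in arrivals_ms:
        counts[(t - start) // window_ms] += 1
    mean = sum(counts) / nbins
    var = sum((c - mean) ** 2 for c in counts) / nbins
    return var / mean


# Same packet count, different burstiness (toy inputs).
smooth = list(range(100))          # one packet every millisecond
bursty = [0] * 50 + [99] * 50      # two bursts of 50 packets
idc_smooth = index_of_dispersion(smooth, 10)
idc_bursty = index_of_dispersion(bursty, 10)
```

Both toy inputs carry 100 packets over the same interval, so their average rates match while the dispersion differs, which is the sense in which coarse-grained metrics are insufficient.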
Energy Plot Example (figure). Timescale axis: 1 ms to 256 ms, running toward coarser timescales. Energy axis: higher values mean more bursty traffic.
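An energy plot of this kind can be sketched with a Haar wavelet decomposition of a binned count series: at each scale j, the energy is the mean squared detail coefficient, plotted as log2(energy) against j. With 1 ms bins, scale j corresponds to a timescale of 2^j ms, so j = 8 is 256 ms. The code below is a minimal sketch assuming the Haar wavelet and this normalization; it is not the analysis tool used for the talk, and normalization conventions vary between tools.

```python
import math


def haar_energy(counts):
    """Return [(scale j, log2 of mean squared Haar detail)] for a count series."""
    approx = [float(c) for c in counts]
    energies = []
    j = 1
    while len(approx) >= 2:
        pairs = len(approx) // 2  # a trailing odd sample is dropped
        detail = [(approx[2 * k] - approx[2 * k + 1]) / math.sqrt(2)
                  for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / math.sqrt(2)
                  for k in range(pairs)]
        energy = sum(d * d for d in detail) / pairs
        # Zero energy has no finite logarithm; report it as -inf.
        energies.append((j, math.log2(energy) if energy > 0 else float("-inf")))
        j += 1
    return energies


# Toy series: all variability lives at the finest timescale.
alternating = [1, 0] * 4
energies = haar_energy(alternating)
```

For the alternating toy series, only the finest scale carries energy; a real trace would show energy spread across scales, and comparing the curves for the original and generated traces is the validation step.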
Importance of Network Modeling (energy plots; higher energy means more bursty, timescale axis runs toward coarser timescales). Model components: users, application, network (capacity, latency, loss rates). Conclusion: it is important to model the network (capacity, latency, loss rates).
Does burstiness matter? Bittorrent: Sensitivity to Bursty Background Traffic. Two kinds of background traffic: both ~5.5 Mbps, differing in burstiness. Foreground: 100 Bittorrent nodes, 46 MB file download, 10 Mbps, 50 ms access links. Findings: background traffic matters; burstiness of background traffic matters.
Future Work. Improve the model: improve accuracy of model parameter predictions; single model for all applications. Causality of network events: reasons for the impact of background traffic on end-to-end application behavior; application sensitivity to specific levels of burstiness. Generalize to larger topologies.
Contributions. Realistic and responsive traffic generation. Reproduce burstiness (sub-RTT) for live traffic. Extract and reproduce essential properties of users, applications, and network. Impact of burstiness on competing applications.
Questions? More information and code release (soon): http://www.cs.ucsd.edu/~kvishwanath
Extra slides from here on
Limitations. Symmetry of routes. Parameters of our model. Network characteristics: extract all prevailing wide-area conditions using a single packet trace. Responsiveness of UDP traffic.
Bidirectional traces
Sensitivity to interconn, interRRE
Responsiveness to latency and response size (figure). Scenarios shown: latency doubled; response size doubled (RSPDouble).
Validation. Coarse-grained behavior:
         HTTP mbps   HTTP pps   SQUID mbps   SQUID pps
AUCK     3.33        591        0.55         58
SWING    3.24        551        -            57
Modeled parameters, response size:
         HTTP median   HTTP IQR   SQUID median   SQUID IQR
AUCK     747           3371       1649           5224
SWING    735           3357       1423           5225
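The comparison above rests on two simple computations: summarizing a distribution by its median and interquartile range (IQR), and comparing the original and generated values. A minimal sketch using the Python standard library follows. The helper names are hypothetical, and `statistics.quantiles` with its default method is only one of several quartile conventions, so IQR values may differ slightly from those of other tools.

```python
from statistics import median, quantiles


def summarize(sizes):
    """Return (median, IQR) of a list of response sizes."""
    q1, _, q3 = quantiles(sizes, n=4)
    return median(sizes), q3 - q1


def relative_error(original, generated):
    """Absolute difference as a fraction of the original value."""
    return abs(generated - original) / original


# HTTP throughput in mbps from the table above: AUCK 3.33, SWING 3.24.
http_mbps_error = relative_error(3.33, 3.24)   # about 0.027
# HTTP median response size from the table above: AUCK 747, SWING 735.
http_median_error = relative_error(747, 735)   # about 0.016

toy_summary = summarize([1, 2, 3, 4, 5, 6, 7])
```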
UDP. Currently supports two simple models: bulk transfer and constant bit rate transfer. More complex, adaptive UDP protocols require modeling the application's adaptability.
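The two simple UDP models can be pictured as follows: a constant bit rate source sends fixed-size packets at a fixed interval, and a bulk transfer splits a payload into packets sent back to back. The sketch below is an assumption-laden illustration, not the Swing implementation; the function names, integer microsecond timestamps, and toy parameters are hypothetical.

```python
def cbr_schedule_us(rate_bps, packet_bytes, duration_us):
    """Send times (microseconds) for a constant bit rate source."""
    interval_us = packet_bytes * 8 * 1_000_000 // rate_bps
    if interval_us <= 0:
        raise ValueError("rate too high for microsecond resolution")
    return list(range(0, duration_us, interval_us))


def bulk_packets(total_bytes, payload_bytes):
    """Packet payload sizes for a bulk transfer of total_bytes."""
    full, rest = divmod(total_bytes, payload_bytes)
    return [payload_bytes] * full + ([rest] if rest else [])


# 1 Mbps with 1250-byte packets gives one packet every 10 ms.
send_times = cbr_schedule_us(1_000_000, 1250, 50_000)
sizes = bulk_packets(3000, 1400)
```

Neither model reacts to loss or delay, which is why adaptive UDP protocols need a model of the application's adaptability.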
Related work. Harpoon, SURGE: all applications in a single class, no network. Tmix: replay of TCP connections. RAMP: simulation based, only for HTTP and FTP. Tcplib: ability to generate traffic but not to populate the model.
Importance of burstiness. At large timescales (minutes to hours): long-range dependence (LRD) and self-similarity; buffer size requirement of LRD traffic. At small timescales (milliseconds to seconds): loss rate at buffers for bursty traffic. Need to generate bursty traffic to understand its importance!
Changing application mix. Baseline: comparison from an earlier graph. Scenario: increase SQUID traffic by 20 times. What should the overall energy plot look like? SQUID is different from overall AUCK. Conclusion: can project traffic demands to alternate scenarios.
Validate Responsiveness? Cannot, by definition. But: think baby steps, not giant leaps. Better than what you can do today.
Fixed CDFs for trace duration? Yes, at the moment: distributions are stationary for a few minutes. Dynamically change in the future: changes to link bandwidth; shift in application popularity.
Generic Parameterization. Study lots of traces. Classify the shape of each CDF based on link class. Curve-fitting/analytical distributions. Use existing results from the literature to mix and match.