Tomography-based Overlay Network Monitoring and its Applications
Yan Chen
Joint work with David Bindel, Brian Chavez, Hanhee Song, and Randy H. Katz
UC Berkeley
Problem Formulation
Given n end hosts on an overlay network and O(n²) paths, how can we select a minimal subset of paths to monitor so that the loss rates/latency of all other paths can be inferred?
Key idea: select a basis set of k paths that completely describes all O(n²) paths (k ≪ O(n²))
  –Select and monitor k linearly independent paths to compute the loss rates of the basis set
  –Infer the loss rates of all other paths
[Figure: end hosts and an Overlay Network Operation Center; labels: topology, measurements]
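The key idea above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the slide does not say how the basis is chosen, so a simple greedy rank test is swapped in here. The path matrix `G` (one row per path, one column per link), the function names, and the three-link topology are all hypothetical. The sketch also assumes link losses are independent, so that log(1 - p) adds up along a path, and that every measured loss rate is below 1.

```python
import numpy as np


def select_basis(G):
    """Greedily pick indices of linearly independent rows (paths) of G."""
    basis = []
    for i in range(G.shape[0]):
        candidate = basis + [i]
        # Keep path i only if it raises the rank of the selected rows.
        if np.linalg.matrix_rank(G[candidate]) == len(candidate):
            basis = candidate
    return basis


def infer_loss_rates(G, basis, measured_loss):
    """Infer the loss rate of every path from the measured basis paths.

    Assumes independent link losses, so log(1 - p) of a path is the sum
    of log(1 - l) over its links, and assumes all measured rates are < 1.
    """
    b_basis = np.log(1.0 - np.asarray(measured_loss, dtype=float))
    # Minimum-norm solution; it reproduces b for every row in the row space.
    x, *_ = np.linalg.lstsq(G[basis], b_basis, rcond=None)
    return 1.0 - np.exp(G @ x)


# Hypothetical topology: 3 links, 3 paths; path 2 = path 0 + path 1.
G = np.array([[1, 1, 0],
              [0, 0, 1],
              [1, 1, 1]], dtype=float)
basis = select_basis(G)                       # paths that must be monitored
estimated = infer_loss_rates(G, basis, [0.145, 0.2])
print(basis, estimated)                       # path 2 comes out near 0.316
```

Only paths 0 and 1 are monitored in this toy case; the loss rate of path 2 is recovered from them. Links 0 and 1 always appear together, so their individual loss rates are not identifiable, but the loss rate of every path still is.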
Intuition through Topology Virtualization
Virtual links: minimal path segments whose loss rates can be uniquely identified
Virtual links can fully describe all paths
[Figure: real links (solid) and overlay paths (dotted) going through them; virtualization maps them to virtual links; cases k = 1, k = 2, k = 3]
Efficiency and Adaptation
Internet has moderate hierarchical structure [TGJ+02]
For reasonably large n (e.g., 100), k = O(n log n)
Tolerant to topology measurement errors
Incremental topology change detection and update of monitoring paths
  –End host join/leave
  –Routing changes
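One way to picture the incremental update for a host join is to keep an orthonormal basis of the monitored paths and test each new path against it. The sketch below is an assumption-laden illustration rather than the method used in this work: the class `MonitoredBasis`, its `offer` method, and the fixed link index space of size `num_links` are hypothetical, and host leave and routing changes are not handled.

```python
import numpy as np


class MonitoredBasis:
    """Tracks which paths must be monitored as new paths appear.

    Hypothetical helper: keeps orthonormal rows spanning the monitored
    paths (single-pass Gram-Schmidt) and accepts a new path only if it
    is not already a linear combination of them.
    """

    def __init__(self, num_links, tol=1e-9):
        self.q = np.zeros((0, num_links))  # orthonormal rows
        self.tol = tol
        self.monitored = []                # ids of monitored paths

    def offer(self, path_id, row):
        """Return True if the path was added to the monitored set."""
        row = np.asarray(row, dtype=float)
        # Remove the part of the row already explained by monitored paths.
        residual = row - self.q.T @ (self.q @ row)
        norm = np.linalg.norm(residual)
        if norm <= self.tol * max(1.0, np.linalg.norm(row)):
            return False                   # inferable, no need to monitor
        self.q = np.vstack([self.q, residual / norm])
        self.monitored.append(path_id)
        return True


# Hypothetical join: paths are offered one by one as hosts appear.
basis = MonitoredBasis(num_links=3)
basis.offer("A-B", [1, 1, 0])
basis.offer("B-C", [0, 0, 1])
basis.offer("A-C", [1, 1, 1])   # sum of the two above, so it is rejected
print(basis.monitored)
```

Each offer costs one projection against the current basis, so a join only touches the new host's paths instead of recomputing the whole selection.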
Experiments on Planet Lab
51 hosts, each from a different organization
  –51 × 50 = 2,550 paths
Simultaneous loss rate measurement
  –300 trials
  –In each trial, send a 40-byte UDP packet to every other host
Simultaneous topology measurement
  –Traceroute
Experiments: 6/24 – 6/27
  –100 experiments in peak hours

Areas and Domains          # of hosts
US (40)
  .edu                     33
  .org                     3
  .net                     2
  .gov                     1
  .us                      1
International (11)
  Europe (6)
    France                 1
    Sweden                 1
    Denmark                1
    Germany                1
    UK                     2
  Asia (2)
    Taiwan                 1
    Hong Kong              1
  Canada                   2
  Australia                1
Tomography-based Overlay Monitoring Results
Loss rate distribution

loss rate   [0, 0.05)   lossy path [0.05, 1.0] (4.1%)
                        [0.05, 0.1)   [0.1, 0.3)   [0.3, 0.5)   [0.5, 1.0)   1.0
%           95.9%       15.2%         31.0%        23.9%        4.3%         25.6%

Accuracy
  –On average k = 872 out of 2550
  –Absolute error |p – p’|: Average for all paths, for lossy paths
  –Small relative error and good lossy path inference
Topology measurement error tolerance
  –On average 245 out of 2550 paths have no or incomplete routing information
  –No router aliases resolved
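The accuracy figures above compare a measured loss rate p with an inferred one p’. A minimal sketch of that comparison follows, using the 0.05 lossy threshold from the distribution table. The function name, the returned keys, and the sample numbers are made up for illustration; the relative-error definition is not given on the slide, so it is left out.

```python
LOSSY_THRESHOLD = 0.05  # paths with loss rate >= 0.05 count as lossy


def accuracy_summary(measured, inferred):
    """Summarize absolute error |p - p'| and lossy-path detection."""
    errors = [abs(p - q) for p, q in zip(measured, inferred)]
    lossy = [i for i, p in enumerate(measured) if p >= LOSSY_THRESHOLD]
    detected = [i for i in lossy if inferred[i] >= LOSSY_THRESHOLD]
    return {
        "mean_abs_error": sum(errors) / len(errors),
        # None when the sample has no lossy path at all.
        "mean_abs_error_lossy": (
            sum(errors[i] for i in lossy) / len(lossy) if lossy else None
        ),
        "lossy_detected_fraction": (
            len(detected) / len(lossy) if lossy else None
        ),
    }


# Illustrative values only, not data from the experiments.
summary = accuracy_summary(
    measured=[0.0, 0.02, 0.10, 0.30],
    inferred=[0.01, 0.02, 0.08, 0.04],
)
print(summary)
```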
Performance Improvement with Overlay
With single-node relay
Loss rate improvement
  –Among 10,980 lossy paths:
  –5,705 paths (52.0%) have loss rate reduced by 0.05 or more
  –3,084 paths (28.1%) change from lossy to non-lossy
Throughput improvement
  –Estimated with
  –60,320 paths (24%) with non-zero loss rate, throughput computable
  –Among them, 32,939 (54.6%) paths have throughput improved, 13,734 (22.8%) paths have throughput doubled or more
Implications: use overlay paths to bypass congestion or failures
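The single-node relay comparison can be sketched as follows. This is a hedged illustration, not the evaluation code behind the numbers above: `best_single_relay` and the loss matrix are hypothetical, and the two hops of a relayed path are assumed to lose packets independently, so the relayed loss rate is 1 - (1 - loss[src][relay]) * (1 - loss[relay][dst]).

```python
def best_single_relay(loss, src, dst):
    """Pick the relay that minimizes end-to-end loss from src to dst.

    loss[i][j] is the loss rate of the direct path i -> j. Returns
    (relay, loss_rate); relay is None when the direct path is best.
    Assumes losses on the two hops are independent.
    """
    best_relay, best_loss = None, loss[src][dst]
    for relay in range(len(loss)):
        if relay in (src, dst):
            continue
        relayed = 1.0 - (1.0 - loss[src][relay]) * (1.0 - loss[relay][dst])
        if relayed < best_loss:
            best_relay, best_loss = relay, relayed
    return best_relay, best_loss


# Illustrative loss matrix for 3 hosts (values are made up).
loss = [
    [0.00, 0.20, 0.01],
    [0.20, 0.00, 0.02],
    [0.01, 0.02, 0.00],
]
relay, relayed_loss = best_single_relay(loss, src=0, dst=1)
print(relay, relayed_loss)  # relay 2, loss near 0.0298 versus 0.20 direct
```

In this toy case the direct path is lossy (0.20), while the path through host 2 falls below the 0.05 threshold, which is the "lossy to non-lossy" change counted above.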
Adaptive Overlay Streaming Media
[Figure: hosts at UC Berkeley, UC San Diego, Stanford, and HP Labs, with an X marker]
Implemented with Winamp client and SHOUTcast server
Congestion introduced with a Packet Shaper
Skip-free playback: server buffering and rewinding
Total adaptation time < 4 seconds
Pros and Cons About Planet Lab
+ Easy batch processing via SSH
- No root privileges
  –Many measurement tools don’t work!
- Limited tools
  –Only ping and traceroute
  –But people are adding more, like scriptroute
- Linux-only platform
  –New applications (multiplayer games, live media) are mostly on the Windows platform
- Limited programming language choices
  –Only C/C++ and Perl, no Java
Backup Slides
Adaptive Streaming Media Architecture