Scalable and Deterministic Overlay Network Diagnosis Yao Zhao, Yan Chen Northwestern Lab for Internet and Security Technology (LIST) Dept. of Computer.

Scalable and Deterministic Overlay Network Diagnosis Yao Zhao, Yan Chen Northwestern Lab for Internet and Security Technology (LIST) Dept. of Computer Science Northwestern University David Bindel Computer Science Division Dept. of EECS University of California at Berkeley

"When something breaks in the Internet, the Internet's very decentralized structure makes it hard to figure out what went wrong and even harder to assign responsibility." From "Looking Over the Fence at Networks: A Neighbor's View of Networking Research", by the Committee on Research Horizons in Networking, National Research Council, 2001.

Motivation: Internet diagnosis is very important to end users, to overlay network service providers (e.g., Akamai), and to Internet service providers (ISPs). But it is a very challenging problem due to the privacy of network administration. Solution: E2E measurements by end users (overlay networks).

Related Work: Router-based approaches [SOSP03]: mostly ICMP based and subject to ICMP rate limiting; unscalable for simultaneous diagnosis; cannot deterministically separate forward/backward path loss. Statistical approaches [MINC, INFOCOM03]: non-deterministic, because the system is fundamentally under-constrained. Inference based on temporal correlation in a multicast tree: has to compromise for unicast, and is then sensitive to cross traffic. Optimization based on assumptions (the number of lossy links is small): random sampling, linear programming, and Bayesian inference. Unscalable: iterative refinement is slow to converge for large networks.

Problem Formulation: Given an overlay of N end hosts and O(N^2) paths, to what granularity can we deterministically diagnose network faults? Assumptions: the topology is measurable; only E2E paths can be measured, not individual links.

Outline: Architecture and algebraic model; Identifying virtual links; Evaluation with simulations; Internet experiments.

Our Approach: Monitor a basis set of O(n·log n) paths that fully describes the O(n^2) paths. Decompose the paths into minimal deterministically identifiable segments. Compute the loss rate of each segment for diagnosis. [Figure: end hosts send topology and measurements to the Overlay Network Operation Center, which returns trouble spot locations. Diagnosis results: Qwest access link: > ; Peering between UUNET and AOL: > ]

Linear algebraic model: path loss rate p, link loss rate l. [Figure: example topology with nodes B, C, D and path p_1.]
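
The equation on this slide did not survive, so the block below restates the usual log-linear loss model as a sketch. It assumes independent losses across links; the symbols v_j, x_j, b, and s are introduced here for illustration.

```latex
% Assumption: packet losses on different links are independent.
% v_j = 1 if the path traverses link j, 0 otherwise; s = number of links.
1 - p = \prod_{j=1}^{s} (1 - l_j)^{v_j}
\quad\Longrightarrow\quad
\sum_{j=1}^{s} v_j \, x_j = b,
\qquad x_j = \log(1 - l_j), \quad b = \log(1 - p).
```

Taking logarithms turns the product of per-link success rates into a sum, which is what makes the problem linear.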

Putting All Paths Together: [Equation: the per-path equations stacked into one matrix equation with path matrix G.]
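
A minimal sketch of the stacked system, assuming the log-linear loss model with independent link losses. The 2-path, 3-link matrix and the link loss rates are hypothetical values chosen for illustration, not data from the talk.

```python
import numpy as np

# Hypothetical example: path 1 uses links 1 and 2; path 2 uses links 1, 2 and 3.
G = np.array([[1, 1, 0],
              [1, 1, 1]], dtype=float)

link_loss = np.array([0.01, 0.05, 0.10])   # assumed link loss rates l_j
x = np.log(1.0 - link_loss)                # x_j = log(1 - l_j)

b = G @ x                                  # b_i = log(1 - p_i), one row per path
path_loss = 1.0 - np.exp(b)                # back to path loss rates p_i

# Same quantities computed directly as products of per-link success rates.
direct = np.array([1 - 0.99 * 0.95,
                   1 - 0.99 * 0.95 * 0.90])
```

Each row of G marks the links one path traverses, so adding a path only adds a row.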

Identifiable and Unidentifiable: Vectors in the row space of G are identifiable; all other vectors are unidentifiable. [Figure: topology with nodes A, B, C, D and paths p_1, p_2; axes x_1, x_2, x_3; vectors (1,1,0), (1,-1,0), (1,1,1), (0,0,1); the row (path) space is marked identifiable.]
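
The row-space test can be sketched with a rank comparison: a vector is identifiable when appending it to G does not raise the rank. The matrix below assumes the two paths are (1,1,0) and (1,1,1); that assignment is a guess from the vectors in the figure, and `is_identifiable` is a hypothetical helper name.

```python
import numpy as np

def is_identifiable(G, v):
    """Return True if v lies in the row space of G (appending v keeps the rank)."""
    G = np.atleast_2d(np.asarray(G, dtype=float))
    v = np.asarray(v, dtype=float)
    return np.linalg.matrix_rank(np.vstack([G, v])) == np.linalg.matrix_rank(G)

# Assumed path matrix: p1 covers links 1-2, p2 covers links 1-3.
G = np.array([[1, 1, 0],
              [1, 1, 1]])

link3_ok = is_identifiable(G, [0, 0, 1])    # p2 - p1, so identifiable
diff_ok = is_identifiable(G, [1, -1, 0])    # orthogonal to every path
link1_ok = is_identifiable(G, [1, 0, 0])    # links 1 and 2 cannot be separated
```

Under this assumed matrix, link 3 is identifiable while links 1 and 2 are only identifiable together.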

Definition of Virtual Links: uniquely identified shortest path segments that are identifiable, consecutive, and indecomposable. [Figure: example topology with nodes a, b, c, d, e, numbered links, and primed virtual-link labels; 4 paths, 5 links; 5 virtual links.]

One More Example: 6 paths, 8 links; 4 virtual links: Corresponding to links 1, 2, and respectively. [Figure: example topology with numbered links and primed virtual-link labels.]

Computing Virtual Links in Undirected Graph: Check whether a vector is a virtual link. QR decomposition: O(l·k) to check if a vector of length l is in the row space of G. O(l^2) potential virtual links in a path of length l. Total complexity: O(l·k·l^2·k) = O(l^3·k^2). Small constant: only 4.2 sec for a 135-node network. [Figure: axes x_1, x_2, x_3; vectors (1,0,0), (1,1,0), (1,1,1), (0,1,0); row space.]
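
A sketch of the enumeration, under two assumptions: the rows passed in are linearly independent (a basis of the path matrix), and "virtual link" is read as a consecutive, identifiable segment that contains no shorter identifiable segment. The function names and the 4-link topology are hypothetical.

```python
import numpy as np

def row_space_basis(G_basis):
    """Orthonormal basis of the row space via QR of G^T (rows assumed independent)."""
    Q, _ = np.linalg.qr(np.asarray(G_basis, dtype=float).T)
    return Q

def in_row_space(Q, v, tol=1e-9):
    """Project v onto the row space and test whether the residual vanishes."""
    v = np.asarray(v, dtype=float)
    residual = v - Q @ (Q.T @ v)
    return np.linalg.norm(residual) <= tol * max(1.0, np.linalg.norm(v))

def virtual_links(path, Q, num_links):
    """Minimal identifiable consecutive segments of `path` (a list of link ids)."""
    found = []  # (start, end) index ranges into `path`, shortest first
    for length in range(1, len(path) + 1):
        for start in range(len(path) - length + 1):
            end = start + length
            # Skip segments that already contain a shorter virtual link.
            if any(start <= s and e <= end for s, e in found):
                continue
            v = np.zeros(num_links)
            v[path[start:end]] = 1.0
            if in_row_space(Q, v):
                found.append((start, end))
    return [tuple(path[s:e]) for s, e in found]

# Hypothetical undirected topology: links 0 and 1 always appear together.
G = np.array([[1, 1, 1, 0],    # path over links 0, 1, 2
              [1, 1, 0, 1],    # path over links 0, 1, 3
              [0, 0, 1, 1]])   # path over links 2, 3
Q = row_space_basis(G)
vls_path1 = virtual_links([0, 1, 2], Q, num_links=4)
vls_path3 = virtual_links([2, 3], Q, num_links=4)
```

In this example the chain of links 0 and 1 comes out as one virtual link, while links 2 and 3 are each identified on their own.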

Undirected vs. Directed Graphs: Directed graph: any linear combination => Theorem: In a directed graph, no end-to-end path contains an identifiable subpath.
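
The contrast can be illustrated on a hypothetical star topology: three hosts a, b, c attached to one router. With undirected links every link is identifiable; with one directed link per direction none is. This only illustrates the theorem on one example and is not a proof.

```python
import numpy as np

def identifiable(G, v):
    """True if v is in the row space of G."""
    G = np.asarray(G, dtype=float)
    return np.linalg.matrix_rank(np.vstack([G, v])) == np.linalg.matrix_rank(G)

# Undirected star: links 0, 1, 2 connect hosts a, b, c to the router.
G_undirected = np.array([[1, 1, 0],    # a - b
                         [1, 0, 1],    # a - c
                         [0, 1, 1]])   # b - c

# Directed star: links 0..2 are host -> router, links 3..5 are router -> host.
rows = []
for src in range(3):
    for dst in range(3):
        if src != dst:
            row = np.zeros(6)
            row[src] = 1.0        # outgoing access link of the source
            row[3 + dst] = 1.0    # incoming access link of the destination
            rows.append(row)
G_directed = np.array(rows)

undirected_ok = [bool(identifiable(G_undirected, e)) for e in np.eye(3)]
directed_ok = [bool(identifiable(G_directed, e)) for e in np.eye(6)]
directed_rank = int(np.linalg.matrix_rank(G_directed))
```

The directed matrix has 6 paths over 6 links but only rank 5, and no single link falls in its row space.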

Rescue: Good Path Algorithm. Identifying virtual links in undirected graphs uses topology only. For directed graphs, additional information is needed: path loss rates. Use the link property constraint to break the deadlock: all the links in a good path are good links, i.e., they have little or no loss. Most of the paths on the Internet are good paths.
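
A sketch of the reduction step, assuming a path counts as good when its loss rate is below 5% (the threshold used for lossy paths in the diagnosis results). The function name, the directed star topology, and the loss values are hypothetical.

```python
import numpy as np

def good_path_reduction(G, path_loss, threshold=0.05):
    """Drop good paths and every link that appears on a good path."""
    G = np.asarray(G, dtype=float)
    path_loss = np.asarray(path_loss, dtype=float)
    good_paths = path_loss < threshold
    good_links = G[good_paths].sum(axis=0) > 0     # links on at least one good path
    lossy_rows = np.flatnonzero(~good_paths)
    candidate_links = np.flatnonzero(~good_links)
    G_reduced = G[np.ix_(lossy_rows, candidate_links)]
    return G_reduced, lossy_rows, candidate_links

# Directed star: links 0..2 are host -> router, links 3..5 are router -> host.
pairs = [(s, d) for s in range(3) for d in range(3) if s != d]
G = np.zeros((len(pairs), 6))
for i, (s, d) in enumerate(pairs):
    G[i, s] = 1.0
    G[i, 3 + d] = 1.0

# Assumed measurements: only the router -> c link (index 5) is lossy.
path_loss = np.array([0.2 if d == 2 else 0.0 for _, d in pairs])

G_reduced, lossy_rows, candidate_links = good_path_reduction(G, path_loss)
```

After the reduction only link 5 remains as a candidate, so the lossy link is pinned down even though no single link was identifiable from topology alone.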

System Flowchart: Monitors O(n·log n) paths that can fully describe all the O(n^2) paths (SIGCOMM04). Inherits load balancing, monitoring adaptation, etc. [Flowchart stages: Stage 1: set up scalable monitoring system for diagnosis. Stage 2: online update the measurements and diagnosis. Optimization steps: find the minimal basis for identifiability test. Flowchart boxes: Measure topology to get G; Select a basis of G for monitoring; Good path algorithm on Reduced paths G'; Reduced paths G''; Select a basis of G''; Find all lossy virtual links in G; Estimated loss rates for all paths in G; Good path algorithm on G.]
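
The monitoring stage can be sketched as: pick a set of linearly independent paths, measure only those, and infer the rest. A simple greedy rank test stands in here for whatever rank-revealing decomposition a real system would use; the function names, topology, and link loss rates are hypothetical.

```python
import numpy as np

def select_basis(G):
    """Greedily pick indices of linearly independent rows (paths) of G."""
    chosen, rank = [], 0
    for i in range(G.shape[0]):
        r = np.linalg.matrix_rank(G[chosen + [i]])
        if r > rank:
            chosen.append(i)
            rank = r
    return chosen

def infer_all_paths(G, basis_idx, measured_loss):
    """Estimate loss on every path from measurements on the basis paths only."""
    b_bar = np.log(1.0 - np.asarray(measured_loss, dtype=float))
    # Minimum-norm solution of the under-determined system G_bar x = b_bar.
    x, *_ = np.linalg.lstsq(G[basis_idx], b_bar, rcond=None)
    return 1.0 - np.exp(G @ x)

# Directed star: links 0..2 are host -> router, links 3..5 are router -> host.
pairs = [(s, d) for s in range(3) for d in range(3) if s != d]
G = np.zeros((len(pairs), 6))
for i, (s, d) in enumerate(pairs):
    G[i, s] = 1.0
    G[i, 3 + d] = 1.0

link_loss = np.array([0.0, 0.02, 0.0, 0.01, 0.0, 0.2])   # assumed ground truth
true_path_loss = 1.0 - np.exp(G @ np.log(1.0 - link_loss))

basis = select_basis(G)
estimated = infer_all_paths(G, basis, true_path_loss[basis])
```

Because every path is a linear combination of the basis paths, the unmeasured path's loss is recovered exactly even though individual link loss rates are not.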

Metrics: Avg length of lossy virtual links in all lossy paths. Diagnosis granularity: the avg number of potential lossy links in a lossy path. Example (path 1 w/ lossy VL 1 of length 5, paths 2 and 3 w/ lossy VL 2 of length 2): avg lossy VL length = (5+2)/2 = 3.5; avg diagnosis granularity = (5+2+2)/3 = 3. Accuracy: absolute error |p - p'|; relative error.
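
The worked example can be reproduced with a few lines. The dictionary layout and function names are hypothetical; the numbers are the ones from the slide.

```python
def avg_lossy_vl_length(vl_lengths):
    """Average length over the distinct lossy virtual links."""
    return sum(vl_lengths.values()) / len(vl_lengths)

def avg_diagnosis_granularity(path_vls, vl_lengths):
    """Average, over lossy paths, of the number of potentially lossy links."""
    per_path = [sum(vl_lengths[vl] for vl in vls) for vls in path_vls.values()]
    return sum(per_path) / len(per_path)

# Slide example: VL 1 has length 5, VL 2 has length 2.
vl_lengths = {"VL1": 5, "VL2": 2}
# Path 1 contains VL 1; paths 2 and 3 both contain VL 2.
path_vls = {"path1": ["VL1"], "path2": ["VL2"], "path3": ["VL2"]}

vl_len = avg_lossy_vl_length(vl_lengths)                      # (5 + 2) / 2
granularity = avg_diagnosis_granularity(path_vls, vl_lengths) # (5 + 2 + 2) / 3
```

The first metric averages over virtual links and the second over paths, which is why VL 2 is counted twice in the granularity.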

Simulation Methodology: Topology type: three types of BRITE router-level topologies, and the Mercator topology. Topology size: 1000 ~ or 184k nodes. Fraction of end hosts on the overlay network: 10% ~ 50%. Link loss rate distribution: LLRD 1 and LLRD 2 models. Loss model: Bernoulli and Gilbert.
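
A sketch of the two loss models. Bernoulli drops each packet independently; Gilbert is taken here as a two-state chain in which the bad state drops every packet. The 0.35 probability of staying in the bad state and the seed are assumptions for illustration, not values stated on the slide.

```python
import random

def bernoulli_losses(loss_rate, n, rng):
    """Each packet is lost independently with probability loss_rate."""
    return [rng.random() < loss_rate for _ in range(n)]

def gilbert_losses(loss_rate, n, rng, stay_bad=0.35):
    """Two-state chain: good state delivers, bad state drops (bursty loss)."""
    p_bad_to_good = 1.0 - stay_bad
    # Chosen so that the long-run fraction of time in the bad state is loss_rate.
    p_good_to_bad = loss_rate * p_bad_to_good / (1.0 - loss_rate)
    bad = rng.random() < loss_rate      # start from the stationary distribution
    lost = []
    for _ in range(n):
        lost.append(bad)
        if bad:
            bad = rng.random() < stay_bad
        else:
            bad = rng.random() < p_good_to_bad
    return lost

rng = random.Random(1)                  # fixed seed for repeatability
n = 200_000
bernoulli_rate = sum(bernoulli_losses(0.1, n, rng)) / n
gilbert_rate = sum(gilbert_losses(0.1, n, rng)) / n
```

Both models give the same long-run loss rate; the Gilbert model differs only in that losses arrive in bursts.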

Sample of Simulation Results (Barabasi+Gilbert)

Results using Mercator Topology. Columns: # of end hosts on OL; Avg LP; # of LP; # of links in LP; Avg LP Length; Avg VLL in LP; Avg BVLL in LP. Values, one group per row: (4.86), 2.56 (3.54), 2.97 (4.18); (4.5), 1.76 (2.36), 2.21 (3.11); (4.21), 1.6 (2.07), 1.99 (2.74).

Gibbs Sampling (Infocom03): D: observed packet transmission and loss at the clients. Θ: ensemble of loss rates of links in the network. Goal: determine the posterior distribution P(Θ|D). Approach: use Markov Chain Monte Carlo with Gibbs sampling to obtain samples from P(Θ|D), then draw conclusions based on the samples.
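
For reference, the generic form of the posterior and of one Gibbs sweep is sketched below. The notation l_i for individual link loss rates, s for the number of links, and t for the iteration index is introduced here; the slide does not give the prior or the likelihood.

```latex
% Bayes' rule for the ensemble of link loss rates \Theta = (l_1, \dots, l_s):
P(\Theta \mid D) \;\propto\; P(D \mid \Theta)\, P(\Theta)

% One Gibbs update: resample link i conditioned on the data and on the
% current values of all other links.
l_i^{(t+1)} \sim P\bigl(l_i \mid D,\; l_1^{(t+1)}, \dots, l_{i-1}^{(t+1)},\;
                       l_{i+1}^{(t)}, \dots, l_s^{(t)}\bigr)
```

Repeating the update over all links many times yields the samples from which conclusions are drawn.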

Comparison with Bayesian Inference using Gibbs Sampling (1)

Comparison with Bayesian Inference using Gibbs Sampling (2)

Methodology: PlanetLab, 135 end hosts. Topology measured by traceroute; avg path length is 17.2. Path loss rate measured by active UDP probing: byte UDP packets per measured path in 90 sec. Small overhead: 17.9kb even if measuring all paths.

Diagnosis Results: Total end-to-end paths: 18,090. Avg path length: 17.2. After removing 79.5% good paths w/ 80.5% good links: avg lossy path (>5% loss rate) length: 11.5 (9.0); avg lossy virtual link length: 4.3 (3.1); avg granularity: 4.0 (2.7). [Table: loss rate bins [0, 0.05) and lossy path [0.05, 1.0] (15.8%), the latter split into [0.05, 0.1), [0.1, 0.3), [0.3, 0.5), [0.5, 1.0), and 1.0; row of percentages (%).] The numbers in () are those after removing sequential link chains.

Speed Results: On a Pentium-IV 3.2GHz PC. Average setup time (selecting 5,706 paths for monitoring): seconds. Diagnosis of 2,858 lossy paths: 4.2 seconds.

Validation: Cross validation. Divide 5720 paths into two sets (2860 each). Get 571 virtual links from the first set. Check consistency with the second path set: 99.1% of the paths in the second set are consistent with the virtual links computed from the first set.

IP Spoofing based Validation [Figure: hosts a, b, c and routers r_1, r_2, r_3; packets shown: UDP (S:a, D:c, TTL=255); UDP (S:a, D:b, TTL=255); UDP (S:c, D:b, TTL=2); ICMP (S:r3, D:c, TTL=255).]

IP Spoofing based Consistency Checking: Use IP spoofing as a form of source routing to create new path segments. Validation is the same as in cross validation. Results: 1000 new paths that include parts of segments in potentially lossy paths. 94.1% of the lossy spoofed paths are consistent with 361 out of 1664 lossy virtual links. 5.9% of the paths are inconsistent with 45 virtual links.

Conclusions: We propose the first deterministic and scalable overlay diagnosis system based on a linear algebraic approach. Diagnosis with virtual links: identifiable, consecutive, and minimal path segments. Directed topologies are indecomposable into VLs; good path algorithms come to the rescue. Both simulation and Internet experiments show fast and accurate diagnosis with optimal granularity.

Backup Slides

Previous Work: "Computing the unmeasured: An algebraic approach to Internet mapping," INFOCOM'01: can't work on directed graphs. "User-level internet path diagnosis," SOSP'03: needs the support of routers; not accurate. "Multicast-based inference of network-internal loss characteristics," IEEE Transactions on Information Theory: needs multicast support or a unicast approximation. "Server-based inference of Internet link lossiness," INFOCOM'03: can only determine whether a link is lossy or not.

Distribution of Length of lossy Virtual Links

IP Spoof Based Diagnosis