Towards Unbiased End-to-End Diagnosis
Yao Zhao (1), Yan Chen (1), David Bindel (2)
(1) Lab for Internet & Security Tech, Northwestern Univ
(2) EECS department, UC Berkeley

2 Outline
Background and Motivation
MILS in Undirected Graph
MILS in Directed Graph
Evaluation
Conclusions

3 End-to-End Network Diagnosis 93 hours?

4 Linear Algebraic Model
Path loss rate p_i, link loss rate l_j
[Figure: example topology with nodes A, B, C, D, links 1, 2, 3, and paths p_1, p_2]
Usually an underconstrained system
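
The slide does not show the equation itself, so the following is only a sketch under the common assumption that link losses are independent and multiply along a path. Taking logarithms then turns the model into a linear system G x = b. The routing matrix `G`, the link loss rates, and the helper name `path_loss_rates` are illustrative and not taken from the talk.

```python
import math

# Assumed routing matrix for a 3-link example: G[i][j] = 1 if path i uses link j.
G = [
    [1, 1, 0],   # path 1 traverses links 1 and 2
    [1, 1, 1],   # path 2 traverses links 1, 2 and 3
]

def path_loss_rates(G, link_loss):
    """Path loss under independent links: 1 - p_i = prod_j (1 - l_j)^G[i][j]."""
    rates = []
    for row in G:
        success = 1.0
        for uses_link, l in zip(row, link_loss):
            if uses_link:
                success *= 1.0 - l
        rates.append(1.0 - success)
    return rates

link_loss = [0.0, 0.1, 0.05]          # hypothetical link loss rates l_j
p = path_loss_rates(G, link_loss)

# Log transform: x_j = log(1 - l_j), b_i = log(1 - p_i), so that G x = b.
x = [math.log(1.0 - l) for l in link_loss]
b = [math.log(1.0 - pi) for pi in p]
Gx = [sum(g * xj for g, xj in zip(row, x)) for row in G]

# Two equations, three unknowns: the system is underconstrained.
assert all(math.isclose(u, v, abs_tol=1e-12) for u, v in zip(Gx, b))
```

With two measured paths and three unknown link values, many different `x` satisfy the same `b`, which is the underconstrained situation the slide refers to.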

5 Unidentifiable Links
Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable
  – The property of a link (or link sequence) can be computed from the linear system if and only if the corresponding vector is identifiable
Otherwise, Unidentifiable
[Figure: nodes A, B, C, D; links 1, 2, 3; paths p_1, p_2; link vectors [0 0 1] and [1 0 0]; label "?"]
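
The identifiability condition above can be checked mechanically: a vector is a linear combination of the rows of G exactly when appending it to G does not raise the rank. The sketch below uses exact rational Gaussian elimination. The matrix `G` is an assumed example chosen to be consistent with the vectors shown on the slide, and the function names are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate column c from every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_identifiable(G, v):
    """v is identifiable iff appending it to G leaves the rank unchanged."""
    return rank(G + [v]) == rank(G)

G = [[1, 1, 0], [1, 1, 1]]             # assumed routing matrix (2 paths, 3 links)
print(is_identifiable(G, [0, 0, 1]))   # True: row 2 minus row 1
print(is_identifiable(G, [1, 0, 0]))   # False: link 1 cannot be separated from link 2
```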

6 Virtual Link Motivation
Biased statistical assumptions are introduced to infer unidentifiable links
[Figure labels: 0.1; 0; Loss rate?]
  – Loss rate = 0.1 if linear optimization
  – Loss rate = 0 if unicast tomography & RED

7 Least-biased End-to-end Network Diagnosis (LEND)
Basic Assumptions
  – End-to-end measurement can infer the end-to-end properties accurately
  – Link-level properties are independent
Problem Formulation
  – Given end-to-end measurements, what is the finest granularity of link properties that we can achieve under the basic assumptions?
[Figure labels: Basic assumptions; More and stronger statistical assumptions; Virtual link; Diagnosis granularity?; Better accuracy]

8 Least-biased End-to-end Network Diagnosis (LEND)
Contributions
  – Define the minimal identifiable unit under basic assumptions (MILS)
  – Prove that only E2E paths are MILSes with a directed graph topology (e.g., the Internet)
  – Propose the good path algorithm (incorporating measurement path properties) for finer MILSes
[Figure labels: Basic assumptions; More and stronger statistical assumptions; Virtual link; Diagnosis granularity?; Better accuracy]

10 Minimal Identifiable Link Sequence
Definition of MILS
  – The smallest path segments with loss rates that can be uniquely identified through end-to-end path measurements
  – Related to the sparse basis problem
      NP-hard problem
Properties of MILS
  – The MILS is a consecutive sequence of links
  – A MILS cannot be split into MILSes (minimal)
  – MILSes may be linearly dependent, or some MILSes may contain other MILSes

11 Examples of MILSes in Undirected Graph
Real links (solid) and all of the overlay paths (dotted) traversing them
[Figure: links 1 to 5, overlay paths 1' to 4', MILSes a to e]
3'+2'-1'-4' → link 3

13 Identify MILSes in Undirected Graphs
Preparation
  – Active or passive end-to-end path measurement
  – Optimization
      Measure O(n log n) paths and infer the n(n-1) end-to-end paths [SIGCOMM04]

14 Identify MILSes in Undirected Graphs
Preparation
Identify MILSes
  – Enumerate each link sequence to see if it is identifiable
  – Computational complexity: O(r × k × l^2)
      r: the number of paths (O(n^2))
      k: the rank of G (O(n log n))
      l: the length of the paths
  – Only takes 4.2 seconds for the network with 135 PlanetLab hosts and 18,090 Internet paths
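
The enumeration step can be illustrated with a brute-force sketch. It reads "minimal" as "cannot be split into two consecutive identifiable pieces", which is one interpretation of the MILS properties stated earlier in the talk. It is not the authors' implementation and makes no attempt to reach the stated complexity. The names `find_milses` and `rank` and the toy `paths` input are hypothetical, and segments are recorded in the direction each path traverses them.

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def find_milses(paths, num_links):
    """Return MILSes as tuples of link indices.

    paths: list of paths, each an ordered list of link indices.
    """
    def vec(links):
        v = [0] * num_links
        for j in links:
            v[j] = 1
        return v

    G = [vec(p) for p in paths]
    base = rank(G)
    cache = {}

    def identifiable(seg):
        # A segment is identifiable iff its vector lies in the row space of G.
        if seg not in cache:
            cache[seg] = rank(G + [vec(seg)]) == base
        return cache[seg]

    milses = set()
    for p in paths:
        n = len(p)
        for i in range(n):
            for j in range(i + 1, n + 1):
                seg = tuple(p[i:j])
                if not identifiable(seg):
                    continue
                # Minimal: no split point leaves both halves identifiable.
                splittable = any(
                    identifiable(tuple(p[i:k])) and identifiable(tuple(p[k:j]))
                    for k in range(i + 1, j)
                )
                if not splittable:
                    milses.add(seg)
    return milses

paths = [[0, 1], [0, 1, 2]]             # toy input: two paths over three links
print(sorted(find_milses(paths, 3)))    # [(0, 1), (2,)]
```

In the toy input, links 0 and 1 always appear together, so they form one MILS, while link 2 is isolated by subtracting the first path from the second.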

15 What about Directed Graphs?
Directed graphs are essentially different from undirected graphs
Theorem: In a directed graph, no end-to-end path contains an identifiable subpath if only topology information is considered
[Figure labels: [1 0 0 0 0 0] ?; Sum=1; Sum=0]

16 Good Path Algorithm
Consider Only Topology
  – Works for undirected graph
Incorporate Measurement Path Property
  – Most paths have no loss
      PlanetLab experiments show 50% of paths in the Internet have no loss
  – All the links in a path of no loss are good links (Good Path Algorithm)
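
A minimal sketch of the good path idea, assuming a 0/1 routing matrix and one measured loss rate per path: every link on a loss-free path is marked good, good paths are dropped, and the columns of good links are zeroed in the remaining rows. The function name `good_path_reduce` and the 0.05 cutoff for "no loss" are assumptions made for illustration.

```python
def good_path_reduce(G, path_loss, threshold=0.05):
    """Drop loss-free paths and zero out the columns of the links they cover.

    G: 0/1 routing matrix (rows = paths, columns = links).
    path_loss: measured loss rate per path.
    threshold: assumed cutoff below which a path counts as loss-free.
    Returns (reduced_G, good_links).
    """
    good_links = set()
    for row, loss in zip(G, path_loss):
        if loss < threshold:
            good_links.update(j for j, used in enumerate(row) if used)

    # Keep only lossy paths, with good links removed from their rows.
    reduced = [
        [0 if j in good_links else used for j, used in enumerate(row)]
        for row, loss in zip(G, path_loss)
        if loss >= threshold
    ]
    return reduced, good_links

G = [[1, 1, 0], [1, 1, 1]]               # assumed example: 2 paths over 3 links
reduced, good = good_path_reduce(G, [0.0, 0.2])
print(reduced)        # [[0, 0, 1]]
print(sorted(good))   # [0, 1]
```

In this example the loss-free first path clears links 0 and 1, so the lossy second path reduces to link 2 alone, which then becomes identifiable.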

17 Good Path Algorithm
The symmetric property is broken when using the good path algorithm

18 Other Features of LEND
Dynamic Update for Topology and Link Property Changes
  – End hosts join or leave, routing changes or path property changes
  – Incremental update algorithms are very efficient
Combine with Statistical Diagnosis
  – Inference with MILSes is equivalent to inference with the whole end-to-end paths
  – Reduce computational complexity because MILSes are shorter than paths
      Example: applying statistical tomography methods in [Infocom03] on MILSes is 5x faster than on paths

19 Outline
Motivation
MILS in Undirected Graph
MILS in Directed Graph
Evaluation
Conclusions

20 Evaluation Metrics
Diagnosis Granularity
  – Average length of all the lossy MILSes in lossy paths
Accuracy
  – Simulations
      Absolute error and relative error
  – Internet experiments
      Cross validation
      IP spoof based consistency check
Speed
  – Running time for finding all MILSes and loss rate inference

21 Methodology
PlanetLab Testbed
  – 135 end hosts, each from a different institution
  – 18,090 end-to-end paths
Topology Measured by Traceroute
  – Avg path length is 15.2
Path Loss Rate by Active UDP Probing with Small Overhead
Areas and Domains        # of hosts
US (77)
  .edu                   50
  .org                   14
  .net                   2
  .com                   10
  .us                    1
International (58)
  Europe                 25
  Asia                   25
  Canada                 3
  South America          3
  Australia              2

22 Diagnosis Granularity
# of end-to-end paths        18,090
Avg path length              15.2
# of MILSes                  1009
Avg length of MILSes         2.3 virtual links (3.9 physical links)
Avg diagnosis granularity    2.3 virtual links (3.8 physical links)
Path loss rate distribution
Loss rate [0, 0.05)          84.2%
Lossy paths [0.05, 1.0]      15.8%, of which:
  [0.05, 0.1)                17.2%
  [0.1, 0.3)                 15.6%
  [0.3, 0.5)                 24.9%
  [0.5, 1.0)                 15.8%
  1.0                        26.5%

23 Distribution of Length of MILSes
Most MILSes are pretty short
Some MILSes are longer than 10 hops
  – Some paths do not overlap with any other paths
[Figure: distribution of MILS length; annotations: "Most MILSes are short", "A few MILSes are very long"]

24 Other Results
MILS to AS Mapping
  – 33.6% of lossy MILSes comprise only one physical link
      81.8% of them connect two ASes
Accuracy
  – Cross validation (99.0%)
  – IP spoof based consistency check (93.5%)
Speed
  – 4.2 seconds for MILS computations
  – 109.3 seconds for setup of scalable active monitoring [SIGCOMM04]

25 Conclusion
Link-level property inference in directed graphs is completely different from that in undirected graphs
With the least biased assumptions, LEND uses the good path algorithm to infer link-level loss rates, achieving
  – Good inference accuracy
  – Acceptable diagnosis granularity in practice
  – Online monitoring and diagnosis
Continuous monitoring and diagnosis services on PlanetLab are under construction

26 Thank You!
For more info: http://list.cs.northwestern.edu/lend/
Questions?

28 Motivation
End-to-End Network Diagnosis
Under-constrained Linear System
  – Unidentifiable links exist
To simplify presentation, assume undirected graph model
[Figure labels: A, R, B]

29 Linear Algebraic Model (2) … =

30 Identifiable and Unidentifiable
Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable
Otherwise, Unidentifiable
[Figure: nodes A, B, C, D; links 1, 2, 3; paths p_1, p_2; axes x_1, x_2, x_3; vectors (1,1,0), (1,1,1), (0,0,1); row (path) space (identifiable)]

31 Examples of MILSes in Undirected Graph
Real links (solid) and all of the overlay paths (dotted) traversing them
[Figure: three example topologies with Rank(G)=1, Rank(G)=3, and Rank(G)=4; links 1 to 5, overlay paths 1' to 4', MILSes a to e]
3'+2'-1'-4' → link 3

32 Identify MILSes in Undirected Graphs
Preparation
Identify MILSes
  – Compute Q as the orthonormal basis of R(G^T) (saved by preparation step)
  – For a vector v in R(G^T), ||v|| = ||Q^T v||
[Figure: axes x_1, x_2, x_3; vectors v_1, v_2]
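
The norm test on this slide can be sketched in a few lines: if the columns of Q are an orthonormal basis of R(G^T), projecting v onto them preserves its length only when v already lies in that space. The version below builds the basis with modified Gram-Schmidt in plain Python, which is a stand-in for whatever factorization the authors used. The function names, tolerances, and example matrix are assumptions.

```python
import math

def orthonormal_basis(rows, tol=1e-10):
    """Modified Gram-Schmidt over the rows of G; returns basis vectors of R(G^T)."""
    basis = []
    for row in rows:
        w = [float(x) for x in row]
        for q in basis:
            dot = sum(a * b for a, b in zip(w, q))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:                      # skip rows that add no new direction
            basis.append([a / norm for a in w])
    return basis

def in_row_space(basis, v, tol=1e-9):
    """v lies in R(G^T) iff projecting onto the basis preserves its norm."""
    coeffs = [sum(a * b for a, b in zip(q, v)) for q in basis]   # Q^T v
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_proj = math.sqrt(sum(c * c for c in coeffs))
    return abs(norm_v - norm_proj) <= tol * max(1.0, norm_v)

Q = orthonormal_basis([[1, 1, 0], [1, 1, 1]])   # assumed routing matrix
print(in_row_space(Q, [0, 0, 1]))   # True
print(in_row_space(Q, [1, 0, 0]))   # False
```

Because Q is computed once and reused, each candidate link sequence costs only a matrix-vector product, which is consistent with the slide's remark that Q is saved by the preparation step.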

33 Flowchart of LEND System
Step 1
  – Monitors O(n log n) paths that can fully describe all the O(n^2) paths (SIGCOMM04)
  – Or passive monitoring
Step 2
  – Apply good path algorithm before identifying MILSes as in undirected graph
[Flowchart labels: Measure topology to get G; Active or passive monitoring; Good path algorithm on G; Iteratively check all possible MILSes; Compute loss rates of MILSes; Stage 1: set up scalable monitoring system for diagnosis; Stage 2: online update the measurements and diagnosis]

34 Evaluation with Simulation
Metrics
  – Diagnosis granularity
      Average length of all the lossy MILSes in lossy paths (in the unit of link or virtual link)
  – Accuracy
      Absolute error: |p – p'|
      Relative error

35 Simulation Methodology
Topology type
  – Three types of BRITE router-level topologies
  – Mercator topology
Topology size
  – 1000 ~ 20000 or 284k nodes
Number of end hosts on the overlay network
  – 50 ~ 300
Link loss rate distribution
  – LLRD1 and LLRD2 models
Loss model
  – Bernoulli and Gilbert

36 Sample of Simulation Results
Mercator (284k nodes) with Gilbert loss model and LLRD1 loss distribution
# of end hosts on OL | # of paths | Avg PL | # of links | # of LP | # of links in LP | Avg MILS length | Avg diagnosis granularity
50  | 2450  | 8.86 | 3798  | 1042  | 903  | 2.23 (3.03) | 2.24 (3.07)
100 | 9900  | 8.80 | 9802  | 3551  | 1993 | 1.71 (2.27) | 2.05 (2.95)
200 | 39800 | 8.80 | 22352 | 14706 | 4335 | 1.49 (1.92) | 1.77 (2.38)

37 Related Works
Pure End-to-End Approaches
  – Internet Tomography
      Multicast or unicast with loss correlation
  – Uncorrelated end-to-end schemes
Router Response Based Approach
  – Tulip and Cing

38 MILS to AS Mapping
IP-to-AS mapping constructed from BGP routing tables
Consider the short MILSes with length 1 or 2
  – They constitute about 44% of all lossy MILSes
  – Most lossy links connect two different ASes
                         | 1 AS | 2 ASes | 3 ASes | >3 ASes
Len 1 MILSes (33.6%)     | 6.1% | 27.5%  | 0      | 0
Len 2 MILSes (9.8%)      | 2.6% | 5.8%   | 1.3%   | 0
Len > 2 MILSes (56.6%)   | 6.8% | 17.8%  | 21.8%  | 10.2%

39 Accuracy Validation
Cross Validation (99.0% consistent)
IP Spoof based Consistency Checking (93.5% consistent)
[Figure: hosts A, B, C and routers R_1, R_2, R_3; packets shown: UDP Src: A, Dst: C, TTL=255; UDP Src: A, Dst: B, TTL=255; UDP Src: C, Dst: B, TTL=2; ICMP Src: R_3, Dst: C, TTL=255]
