Router-level Internet Topology Discovery Mehmet H. Gunes.


1 Router-level Internet Topology Discovery Mehmet H. Gunes

2 Internet Topology Discovery 2 Web of interconnected networks Grows with no central authority Autonomous Systems optimize local communication efficiency The building blocks are engineered and studied in depth Global entity has not been characterized Most real world complex-networks have non-trivial properties. Global properties can not be inferred from local ones Engineered with large technical diversity Range from local campuses to transcontinental backbone providers Internet

3 Internet Measurements. Understand topological and functional characteristics of the Internet: essential to design, implement, protect, and operate underlying network technologies, protocols, services, and applications. Need for Internet measurements arises due to commercial, social, and technical issues: realistic simulation environment for developed products; improve network management; robustness with respect to failures/attacks; comprehend spreading of worms/viruses; know social trends in Internet use; scientific discovery (scale-free (power-law), small-world, rich-club, disassortativity, …).

4 Types of Internet topology maps Autonomous System (AS) level maps Router level maps A router level Internet map consists of Nodes: End-hosts and routers Links: Point-to-point or multi-access links Router level Internet topology discovery A process of identifying nodes and links among them Internet Topology Measurement 4 Internet Topology Discovery Lumenta Jan 06CAIDA Jan 08CAIDA Jan 00

5 Outline Introduction Internet Topology Measurement Topology Discovery Issues Impact of IP Alias Resolution Topology Discovery Resolving Anonymous Routers Graph-based Induction Technique Resolving Alias IP Addresses Analytical and Probe-based Alias Resolution Resolving Genuine Subnets Dynamic Subnet Inference Summary 5 Internet Topology Discovery

6 Internet topology measurement studies Involves topology collection / construction / analysis Current state of the research activities Distributed topology data collection studies/platforms iPlane, Skitter, Dimes, DipZoom, … 20M path traces with over 20M nodes (daily) Topology discovery issues 1.Sampling 2.Anonymous routers 3.Alias IP addresses 4.Genuine subnets Internet Topology Measurement Background 6 Internet Topology Discovery

7 Direct probing Indirect probing A DBC Internet Topology Measurements Probing IP B TTL=64 IP B IP D TTL=64 IP D Vantage Point A DBC IP B IP D TTL=2IP D TTL=1 IP C 7 Internet Topology Discovery

8 Probe packets are carefully constructed to elicit intended response from a probe destination traceroute probes all nodes on a path towards a given destination TTL-scoped probes obtain ICMP error messages from routers on the path ICMP messages includes the IP address of intermediate routers as its source Merging end-to-end path traces yields the network map S DABC Destination Internet Topology Measurement Topology Collection (traceroute) TTL=1 IP A TTL=2 IP B TTL=3 IP C TTL=4 IP D Vantage Point 8 Internet Topology Discovery

9 Internet Topology Measurement: Background. Internet Topology Mapping. [Figure: Internet2 backbone with routers S, L, U, K, C, H, A, W, N, their interface labels (s.2, l.1, h.4, ...), and end host d.] Trace to Seattle: h.4, l.3, s.2. Trace to NY: h.4, a.3, w.3, n.3.

10 Internet Topology Measurement: Background. Internet Topology Mapping. [Figure: Internet2 backbone with routers S, L, U, K, C, H, A, W, N, their interface labels (s.1, s.2, s.3, l.1, ...), and end hosts d, e, f.]

11 Internet Topology Measurement Topology Collection Internet2 backbone Traces d - H - L - S - e d - H - A - W - N - f e - S - L - H - d e - S - U - K - C - N - f f - N - C - K- H - d f - N - C - K - U - S - e S L U K C H A W N e d f 11 Internet Topology Discovery
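The merge step on this slide can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the talk: the traces are copied from the slide, while the function name `merge_traces` and the space-separated trace format are assumptions made for the example.

```python
# Path traces copied from the slide (end hosts d, e, f; routers in upper case).
TRACES = [
    "d H L S e",
    "d H A W N f",
    "e S L H d",
    "e S U K C N f",
    "f N C K H d",
    "f N C K U S e",
]

def merge_traces(traces):
    """Return (nodes, links) of the undirected map implied by the traces."""
    nodes, links = set(), set()
    for trace in traces:
        hops = trace.split()
        nodes.update(hops)
        # Every pair of consecutive hops is one link; frozenset ignores direction.
        for a, b in zip(hops, hops[1:]):
            links.add(frozenset((a, b)))
    return nodes, links

nodes, links = merge_traces(TRACES)
print(len(nodes), "nodes,", len(links), "links")  # 12 nodes, 13 links
```

With all routers responsive and named consistently, six traces recover 9 routers, 3 end hosts, and 13 links of the sampled backbone.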

12 12 Sampling to discover networks Infer characteristics of the topology Different studies considered Effect of sample size [Barford 01] Sampling bias [Lakhina 03] Path accuracy [Augustin 06] Sampling approach [Gunes 07] Utilized protocol [Gunes 08] ICMP echo request TCP syn UDP port unreachable Topology Sampling Issues Internet Topology Discovery

13 13 Sampling techniques Path sampling Diameter Edge sampling Capacity Node sampling Degree characteristics Sampling approach (n,n) – traceroute based topology Returns the Internet map among n vantage points (k,m) – traceroute based topology where k<<m (k=n) Returns the Internet map between k sources and m destinations Topology Sampling Approaches Path sampling vs Node sampling (k,m)-sampling vs (n,n)-sampling Internet Topology Discovery

14 ICMP path traces from skitter 1 st collection cycle of each year (from 1999 to 2008) Skitter had updates to destination IP addresses major update in the system in 2004 Processing Alias IP addresses Analytical Alias Resolver (AAR) [Gunes-06] Analytical and Probe Based Alias Resolver (APAR) [Gunes-09] Anonymous routers Graph Based Induction (GBI) [Gunes-08] Historical Perspective on Responsiveness Data Set 14 Internet Topology Discovery

15 Historical Perspective on Responsiveness: Anonymous node ratio

Year   Traces (million)   Reached (%)   Nodes (million)   Anonymous (%)
1999    3.5               86.5          0.2               59.0
2000   14.8               83.5          0.7               80.6
2001   13.4               73.6          2.1               72.7
2002   19.1               50.4          1.5               51.2
2003   24.3               54.3          1.9               42.0
2004   22.9               53.0          2.4               64.1
2005   21.0               46.4          6.8               85.9
2006   18.4               37.2          6.4               87.4
2007   17.5               30.6          4.9               85.3
2008   10.7               23.2          2.8               77.2

16 Historical Perspective on Responsiveness: Anonymous node ratio after processing

       Initial                                  Final
Year   Nodes (million)   Anonymous (million)    Nodes (thousand)   Anonymous (thousand)   Anonymous (%)
1999   0.2               0.1                     17                0.2                     1.1
2000   0.7               0.6                     18                0.3                     1.8
2001   2.1               1.5                    575                4.0                     0.7
2002   1.5               0.8                    369                3.0                     0.8
2003   1.9               0.8                    703                4.2                     0.6
2004   2.4               1.5                     45                0.5                     1.0
2005   6.8               5.8                     86                7.3                     8.5
2006   6.4               5.6                     73                7.1                     9.7
2007   4.9               4.2                     79                9.9                    12.5
2008   2.8               2.2                     61                5.7                     9.4

17 Historical Perspective on Responsiveness: Unique substrings

Year   Unique substrings   2         3        4        5
1999         1                   1   -        -        -
2000        24                  24   -        -        -
2001        57                  57   -        -        -
2002        41                  41   -        -        -
2003        79                  79   -        -        -
2004        86                  86   -        -        -
2005   225,456             151,133   63,662    6,360   4,301
2006   207,067             137,829   59,171    5,828   4,239
2007   305,331             212,263   73,263   14,019   5,779
2008   231,633             148,182   63,944   13,733   5,772
Share                      67%       27%      4%       2%

18 Historical Perspective on Responsiveness: Summary. End system responsiveness is in considerable decline: 86% to 23%. ICMP rate limiting has increased, especially since 2004: ~0% to 7%. Overall router responsiveness has reduced: ~98% to ~90%. Most anonymous regions are single hop (67%), then two hop (27%).

19 536,743 destination IP addresses from skitter and iPlane projects Between 7-11 April 2008 Probes ICMP echo request TCP SYN UDP to random ports Direct probes ping Indirect probes traceroute Current Practices in Responsiveness Data Set 19 Internet Topology Discovery

20 Current Practices in Responsiveness: Direct probes

Probe    Responsive (%)   Anonymous (%)   Router (%)   End-host (%)
ICMP     81.9             18.1            84.6         77.9
TCP      67.3             32.7            70.4         62.8
UDP      59.9             40.1            64.7         50.3
IPs      537 K                            320 K        217 K

21 Current Practices in Responsiveness: Direct probes (domain)

Probe    Anonymous (%)   .net (%)   .com (%)   .edu (%)   .org (%)   .gov (%)
ICMP     18.1             7.7       13.6       11.1        4.5        7.1
TCP      32.7            23.3       27.4       16.8       22.7       17
UDP      40.1            36.5       38.3       42.7       35.6       37.2
IPs      537 K           5 K        1.7 K      25.5 K     10.1 K     0.5 K

22 Current Practices in Responsiveness: Indirect probes (306 K traces)

                       Initial                             Final
Probe    Reached (%)   Nodes (thousand)   Anonymous (%)    Nodes (thousand)   Anonymous (%)
ICMP     93.1          1,005              68.7             45                  9.7
TCP      73.4            965              72.3             35                 12.5
UDP      45.0          1,479              86.0             41                  9.4

23 Current Practices in Responsiveness: Summary. Nodes that respond to indirect probes might not respond to direct probes. Nodes are most responsive to ICMP probes (82%) and least responsive to UDP probes (60%). End hosts are less responsive than routers. Responsiveness is similar for different domains.

24 Anonymous Router Resolution Problem. Anonymous routers do not respond to traceroute probes and appear as a * in path traces. The same router may appear as a * in multiple traces. Anonymous nodes belonging to the same router should be resolved. Anonymity types: 1. Ignore all ICMP packets 2. ICMP rate-limiting 3. Ignore ICMP when congested 4. Filter ICMP at border 5. Private IP address

25 Anonymous Router Resolution Problem. Internet2 backbone: routers S, L, U, K, C, H, A, W, N and end hosts d, e, f. Traces (an anonymous hop is shown as *):
d - * - L - S - e
d - * - A - W - * - f
e - S - L - * - d
e - S - U - * - C - * - f
f - * - C - * - * - d
f - * - C - * - U - S - e

26 Anonymous Router Resolution Problem UKCN LHAW S d e f Sampled network d e f S U L C A W Resulting network 26 Internet Topology Discovery Traces d -  - L - S - e d -  - A - W -  - f e - S - L -  - d e - S - U -  - C -  - f f -  - C -  -  - d f -  - C -  - U - S - e
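To see why unresolved anonymous hops inflate the map, the sketch below builds the raw topology from the six traces on this slide, treating every non-responding hop as a brand-new node. It is a minimal illustration: writing the anonymous hop as `*` follows traceroute's output convention, and the `anonN` labels and the function name `build_raw_map` are assumptions for the example.

```python
from itertools import count

# Traces from the slide; "*" marks a hop that returned no ICMP response.
TRACES = [
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
]

def build_raw_map(traces):
    """Build the map with no resolution: each "*" becomes its own node."""
    fresh = count(1)
    nodes, links = set(), set()
    for trace in traces:
        hops = [f"anon{next(fresh)}" if h == "*" else h for h in trace.split()]
        nodes.update(hops)
        links.update(frozenset(pair) for pair in zip(hops, hops[1:]))
    return nodes, links

raw_nodes, raw_links = build_raw_map(TRACES)
anonymous = {n for n in raw_nodes if n.startswith("anon")}
print(len(raw_nodes), "nodes,", len(anonymous), "anonymous,", len(raw_links), "links")
# 20 nodes, 11 anonymous, 25 links
```

Only three routers (H, K, N) are anonymous in the sampled network, yet the raw map contains 11 anonymous nodes and 20 nodes in total instead of 12.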

27 Each interface of a router has an IP address. A router may respond with different IP addresses to different queries. Alias Resolution is the process of grouping the interface IP addresses of each router into a single node. Inaccuracies in alias resolution may result in a network map that includes artificial links/nodes misses existing links Alias Resolution:.5.33.18.13.7 Denver 27 Internet Topology Discovery

28 28 S L U C N W A s.2 l.1 s.3 u.1 l.3 u.3 h.1 k.3 h.2 a.3 u.2 k.1 c.4 a.1 a.2 w.3 c.3 w.1 c.2 n.1 n.3 w.2 l.2 K c.1 k.2 h.3 d h.4 s.1 e f n.2 H Traces d - h.4 - l.3 - s.2 - e d - h.4 - a.3 - w.3 - n.3 - f e - s.1 - l.1 - h.1 - d e - s.1 - u.1 - k.1 - c.1 - n.1 - f f - n.2 - c.2 - k.2 - h.2 - d f - n.2 - c.2 - k.2 - u.2 - s.3 - e IP Alias Resolution Problem Internet Topology Discovery

29 29 IP Alias Resolution Problem Internet Topology Discovery UKCN LHAW S d e f Sampled network Sample map without alias resolution s.3 s.1 s.2 l.3 l.1 u.1 u.2 k.1 c.1n.1 n.2 k.2 c.2 w.3 a.3 h.2 h.4 h.1 e d f n.3 Traces d - h.4 - l.3 - s.2 - e d - h.4 - a.3 - w.3 - n.3 - f e - s.1 - l.1 - h.1 - d e - s.1 - u.1 - k.1 - c.1 - n.1 - f f - n.2 - c.2 - k.2 - h.2 - d f - n.2 - c.2 - k.2 - u.2 - s.3 - e
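The effect of skipping alias resolution can be reproduced with the traces on this slide. The sketch below is illustrative only: `oracle` is a hypothetical perfect resolver that relies on the slide's labeling convention (interface `h.4` belongs to router `H`), which no real data set provides.

```python
# Interface-level traces copied from the slide.
TRACES = [
    "d h.4 l.3 s.2 e",
    "d h.4 a.3 w.3 n.3 f",
    "e s.1 l.1 h.1 d",
    "e s.1 u.1 k.1 c.1 n.1 f",
    "f n.2 c.2 k.2 h.2 d",
    "f n.2 c.2 k.2 u.2 s.3 e",
]

def build_map(traces, resolve=lambda addr: addr):
    """Merge traces into (nodes, links) after mapping each hop through `resolve`."""
    nodes, links = set(), set()
    for trace in traces:
        hops = [resolve(h) for h in trace.split()]
        nodes.update(hops)
        links.update(frozenset(p) for p in zip(hops, hops[1:]) if p[0] != p[1])
    return nodes, links

def oracle(addr):
    """Hypothetical perfect resolver: "h.4" -> "H"; end hosts stay unchanged."""
    return addr.split(".")[0].upper() if "." in addr else addr

unresolved = build_map(TRACES)
resolved = build_map(TRACES, oracle)
print(len(unresolved[0]), len(unresolved[1]))  # 22 nodes, 25 links
print(len(resolved[0]), len(resolved[1]))      # 12 nodes, 13 links
```

Without alias resolution the same six traces yield 22 nodes and 25 links; grouping interfaces by router recovers the 12-node, 13-link sampled network.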

30 30 Genuine Subnet Resolution Problem Alias resolution IP addresses that belong to the same router Subnet resolution IP addresses that are connected over the same medium IP2IP3 IP4 IP1 IP6IP5 IP2 IP3 IP1 IP2IP3 IP1 Internet Topology Discovery

31 31 Impact of IP Alias Resolution Effects on Topological Characteristics Generate synthetic network topology (Random, Power-law, Transit-Stub), Annotate it to add interface addresses, Emulate traceroute to collect path traces, Build sample topologies with different alias resolution success rate s r t r.1 r.2 r.3 Consider an example A path from s to t: s.2 – r.2 – t.1 A path from t to s: t.1 – r.3 – s.2 Case 1: resolve aliases @ r Case 2: do not resolve aliases @ r s.1 s.2 t.1 t.2 s r t r.2 r.3 s.2 t.1 s t r.2 s.2 t.1 r.3 Internet Topology Discovery

32 Alias Resolution: Experimental Procedure. Apply alias resolution with different success rates: 0%, 20%, 40%, 60%, 80%, and 100%. Generate various synthetic graphs to represent the Internet: Random: Waxman (WA); Power-law: Barabasi-Albert (BA) and Inet; Hierarchical: Transit-Stub (TS). Analyze changes in topological characteristics: Topology Size, Characteristic Path Length, Node Degree, Hop Distribution, Degree Distribution, Betweenness, Joint Degree Distribution, Clustering. Analyze a genuine Internet sample: utilize state-of-the-art alias resolution tools.

33 33 Impact of IP Alias Resolution Graph Size With no alias resolution, average artificial nodes 57% edges 62% Impact of imperfect alias resolution increases with sample size. Internet Topology Discovery

34 Observed degree: degrees with imperfect alias resolution True degree: degrees with perfect alias resolution. Frequency distribution: number of nodes at each node degree Effects on Topological Characteristics : Node Degree Degree of these nodes are underestimated since their aliases are not resolved. WA – 0% success Degree of these nodes are overestimated due to non-resolved neighbors. WA – 40% successWA – 80% success Frequency distribution 34 Internet Topology Discovery

35 Observed degree: degrees with imperfect alias resolution True degree: degrees with perfect alias resolution. Frequency distribution: number of nodes at each node degree Effects on Topological Characteristics : Node Degree WA – 0% successWA – 40% successWA – 80% success Frequency distribution With improving alias resolution, some of the underestimation cases change to overestimation Alias resolution problems at a node may introduce a significantly large number of artificial nodes 35 Internet Topology Discovery

36 Effects on Topological Characteristics : Degree Distribution The probability P(k) that a randomly chosen node has degree k. Imperfect alias resolution, especially, distorts the power-law characteristic of BA- and Inet-based samples, impacts especially low degree ranges (3-13) of TS-based samples, impacts especially high degree ranges (20-up) of WA-based samples. Degree-related characteristics do not always improve with an increasing success rate 36 Internet Topology Discovery

37 Effects on Topological Characteristics : Joint Degree Distribution The probability P(k1, k2) that a node of degree k1 and a node of degree k2 are connected. Assortativity coefficient: The tendency of a network to connect nodes of the same or different degrees. Positive values indicate assortativity most of the links are between similar degree nodes. Negative values indicate disassortativity most of the links are between dissimilar degree nodes. 0 indicates non-assortativity. seem to be assortative with 0% alias resolution, but is non-assortative 37 Internet Topology Discovery
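As a concrete reading of the assortativity coefficient, the sketch below computes it as the Pearson correlation between the degrees found at the two ends of each link. This is one standard formulation and is assumed here; the slide does not say which estimator the study used. The function name and the two toy graphs are hypothetical.

```python
def degree_assortativity(links):
    """Pearson correlation of end-point degrees over all links.

    `links` is a list of (a, b) pairs, each undirected link listed once,
    with no self-loops.
    """
    degree = {}
    for a, b in links:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Count every link in both directions so the measure is symmetric.
    xs, ys = [], []
    for a, b in links:
        xs += [degree[a], degree[b]]
        ys += [degree[b], degree[a]]
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    if var == 0:
        return float("nan")  # all link ends have equal degree: undefined
    return cov / var

star = [("hub", leaf) for leaf in ("a", "b", "c", "d")]
path = [("a", "b"), ("b", "c"), ("c", "d")]
print(degree_assortativity(star))  # -1.0: every link joins degree 4 to degree 1
print(degree_assortativity(path))
```

A star is the extreme disassortative case: every link connects the highest-degree node to a degree-1 node, so the coefficient is -1.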

38 Effects on Topological Characteristics: Characteristic Path Length & Hop Distribution. Characteristic Path Length: the average of the shortest path lengths between all node pairs. Reduces with increasing alias resolution success rate, on average by 30% for BA, Inet and WA-based sample topologies. Hop Distribution: the average percentage of the nodes reached at each hop. As alias resolution improves, fewer hops are required to reach other nodes: 24%, 60%, 78%, and 83% of the nodes are reachable within 7 hops with 0%, 40%, 80% and 100% alias resolution, respectively.
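The characteristic path length defined on this slide can be computed with one breadth-first search per node. The sketch below is a minimal version under two assumptions that the slide does not state: links are unweighted, and unreachable pairs are left out of the average. The function names are chosen for the example.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def characteristic_path_length(links):
    """Average shortest-path length over all reachable ordered node pairs."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    total = pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

# Toy example: a three-node path has distances 1, 1 and 2.
cpl = characteristic_path_length([("a", "b"), ("b", "c")])
print(round(cpl, 3))  # 1.333
```

The same per-node distance tables also give the hop distribution: counting how many entries of `bfs_distances` fall at or below h hops yields the fraction of nodes reached within h hops.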

39 39 Impact of IP Alias Resolution Betweenness & Clustering Betweenness Centrality As the alias resolution success rate increases The average betweenness reduces The normalized betweenness increases Clustering Coefficient All samples yield a clustering coefficient of 0 with 0% alias resolution success rate It almost always increases with the improving alias resolution. Internet Topology Discovery
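A small example shows how a clustering coefficient of 0 can arise with 0% alias resolution. The sketch below uses the average local clustering coefficient, which is an assumption (the slide does not name the variant), and three hypothetical routers a, b, c that form a triangle but are observed through six distinct interface addresses.

```python
from itertools import combinations

def average_clustering(links):
    """Mean over nodes of (links among neighbors) / (possible neighbor pairs)."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # a node with fewer than two neighbors contributes 0
        closed = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * closed / (k * (k - 1))
    return total / len(adj)

# Hypothetical observation: each link of the triangle is seen via different interfaces.
observed = [("a.1", "b.1"), ("b.2", "c.1"), ("c.2", "a.2")]
resolved = [(x.split(".")[0], y.split(".")[0]) for x, y in observed]

print(average_clustering(observed))  # 0.0: three disconnected links
print(average_clustering(resolved))  # 1.0: a triangle
```

Until the interfaces of each router are merged, no node has two connected neighbors, so triangles in the underlying network are invisible.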

40 Impact of IP Alias Resolution: Impact on a Genuine Topology. Effect of alias resolution on a genuine topology: changes in observed topological characteristics. Ally is the current state-of-the-art probe-based approach. APAR is our analytical approach.

                              Initial   Ally     APAR     Ally & APAR
Number of Nodes               4085      3080     2659     2376
Number of Edges               7313      5502     4132     3727
Average Degree                3.58      3.57     3.11     3.14
Assortativity Coefficient     0.39      0.14     0.08     0.09
Characteristic Path Length    9.77      8.67     8.90     8.57
Normalized Betweenness        0.0021    0.0025   0.0030   0.0032
Clustering Coefficient        0.0061    0.0275   0.074    0.0566

41 Impact of IP Alias Resolution: Impact on a Genuine Topology. The accuracy of the alias resolution process may significantly distort almost all topological characteristics considered in this study. Internet measurement studies should employ all possible means to increase the accuracy and completeness of the alias resolution process. Path length related characteristics are closer to those of TS samples. Degree related characteristics are mostly similar to those of BA samples.

42 Outline Introduction Internet Topology Measurement Topology Discovery Issues Impact of IP Alias Resolution Topology Discovery Resolving Anonymous Routers Graph-based Induction Technique Resolving Alias IP Addresses Analytical and Probe-based Alias Resolution Resolving Genuine Subnets Dynamic Subnet Inference Summary 42 Internet Topology Discovery

43 Anonymous Router Resolution Problem. Anonymous routers do not respond to traceroute probes and appear as * in traceroute output. The same router may appear as * in multiple traces. Example with router L anonymous: the underlying traces y: S – L – H – x and x: H – L – S – y are observed as y: S – * – H – x and x: H – * – S – y, so the collected map shows two separate anonymous nodes between S and H. Current daily raw topology data sets include ~20 million path traces with ~20 million occurrences of *s along with ~500K public IP addresses. The raw topology data is far from representing the underlying sampled network topology.

44 Anonymous Router Resolution Previous Approaches Internet2 backbone S L U K C H A W N e d f Traces d -  - L - S - e d -  - A - W -  - f e - S - L -  - d e - S - U -  - C -  - f f -  - C -  -  - d f -  - C -  - U - S - e 44 Internet Topology Discovery

45 Anonymous Router Resolution: Previous Approaches. Basic heuristics: IP: combine anonymous nodes between the same known nodes [Bilir 05] (limited resolution). NM: combine all anonymous neighbors of a known node [Xin 06] (high false positives). More theoretic approaches: Graph minimization approach [Yao 03]: combine anonymous nodes as long as they do not violate two accuracy conditions, (1) trace preservation condition and (2) distance preservation condition; high complexity, O(n^5), where n is the number of anonymous nodes. ISOMAP based dimensionality reduction approach [Xin 06]: build an n x n distance matrix, then use ISOMAP to reduce it to an n x 5 matrix; distance is (1) hop count or (2) link delay; high complexity, O(n^3), where n is the number of nodes. [Figure: sampled network, network after resolution, and resulting network.]

46 46 Anonymous Router Resolution Problem Complexity Graph minimization For an observed topology, accept the minimal underlying network as the underlying topology. [Yao 03] Mergeability relation Reduction from Graph Coloring Given a graph, find minimum number of colors such that no two neighboring vertices have the same color Build a graph G = (V, E) Add a vertex for each node in the network topology Add edges between non-mergeable vertices all pair of vertices representing non-anonymous nodes; all pair of vertices that appear in the same trace; etc. Find minimum set of colors for G and merge nodes that have the same color in the network topology

47 47 Graph Based Induction (GBI) Approach Graph based induction A graph data mining technique Find frequent substructures in a graph data Commonly used in mining biological and chemical graph data Use of GBI for anonymous router resolution Observe common graph structures due to anonymous routers Develop localized algorithms with manageable computational and storage overhead Trace Preservation Condition Merge anonymous nodes as long as they cause no loops in path traces Internet Topology Discovery

48 48 Anonymous Router Resolution Anonymity Types Type 1: Do not send any ICMP responses Type 2: Filtered ICMP responses at border routers Type 3: Rate limit ICMP responses Type 4: Do not send ICMP responses when congested Type 5: ICMP responses with private source IP address Internet Topology Discovery

49 49 Graph Based Induction Common Structures Parallel nodes A x C y2 y1 y3    A x C y2 y1 y3  Star DA wx C y E z  DA wx C y E z    Complete Bipartite A C x y D w F v E z  A C x y D w F v E z       Clique A C x y D w E z  A C x y D w E z       Internet Topology Discovery

50 50 Graph Based Induction Parallel nodes Algorithm For each  -substrings (a,  i,c), represent it as a tuple (a||c,  i ) a||c is the tuple identifier and a<c Read path traces and build the sorted list L of two tuples Subsequently read tuples are compared to the ones in the list based on tuple identifiers and duplicates are excluded from L Handling anonymity due to ICMP rate limiting or congestion A second scan of path traces looking for substrings of the form (a,b,c) corresponding to (a,  i,c) in L a c b a c b     Internet Topology Discovery
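The parallel-nodes step can be sketched as two scans over the traces. This is a simplified reading of the slide, not the authors' implementation: a dictionary keyed by the sorted pair (a, c) stands in for the sorted tuple list L, anonymous hops are written as `*`, and the `anonN` labels, function names, and the extra trace in the usage lines are hypothetical.

```python
TRACES = [t.split() for t in (
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
)]

def parallel_nodes(traces):
    """First scan: one merged anonymous node per pair of known neighbors {a, c}."""
    merged = {}
    for trace in traces:
        for a, x, c in zip(trace, trace[1:], trace[2:]):
            if x == "*" and a != "*" and c != "*":
                key = tuple(sorted((a, c)))  # identifier a||c with a < c
                merged.setdefault(key, f"anon{len(merged) + 1}")  # skip duplicates
    return merged

def rate_limited(traces, merged):
    """Second scan: a known b seen in (a, b, c) explains the matching (a, *, c)."""
    resolved = {}
    for trace in traces:
        for a, b, c in zip(trace, trace[1:], trace[2:]):
            key = tuple(sorted((a, c)))
            if "*" not in (a, b, c) and key in merged:
                resolved[key] = b
    return resolved

merged = parallel_nodes(TRACES)
print(len(merged))  # 5 merged nodes replace 9 single-hop anonymous occurrences

# Hypothetical extra trace in which router K did respond between U and C.
extra = TRACES + ["e S U K C".split()]
print(rate_limited(extra, merged))  # {('C', 'U'): 'K'}
```

Anonymous hops that sit next to another anonymous hop (as in `f * C * * d`) are not handled by this step and are left for the later structure-based steps.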

51 Generate a new graph G* = (V*,E*) For each  -substring of type (a,  e, b), V* ← V* U {a, b} E* ← E* U {e(a,b)} First identify 4-cliques and grow them by adding nodes that are connected to at least 4 nodes of the structure Helps in tolerating few missing links in large cliques Then, process all 3-cliques 51 Graph Based Induction Clique-like a c d e a c d e a c d e        Internet Topology Discovery

52 First search for a small size, i.e., K 2,3, complete bipartite structure in G* and then grow it to a larger one Take each pair of nodes and look whether they are in a K 2,3 Identifying a K 2,3, look for larger complete bipartite graphs K 2,m and then K n,m that contain the identified K 2,3. Then, process all K 2,2 ’s 52 Graph Based Induction Complete Bipartite A C D F E A C D F E In G C D F E In G* In G A        Internet Topology Discovery

53 53 Graph Based Induction Star Combine anonymous neighbors of a known node under trace preservation condition Starting from ones with smallest number of anonymous neighbors DA w C y E z DA w C y E z Note: Operate on G and not on G*     Internet Topology Discovery

54 Evaluations: Effectiveness. iPlane data set: 229,425 IP addresses and ~9M anonymous nodes; ~18M traces from 190 vantage points toward ~90K destinations. ISOMAP dimensionality reduction approach takes ~10^18 operations. Graph minimization approach takes ~10^30 operations.

Step                  # Anonymous   # Resolved
Parallel nodes        8,972,939     8,171,360
ICMP rate limiting      801,579       585,887
Clique-like             215,692           533
Complete bipartite      215,159        61,968
Star                    153,191        54,581
Final                    98,610     8,874,329

4.5 x 10^9 operations

55 Evaluations: Accuracy. Graph edit distance: # of node splits (# of false positives in resolution) plus # of node merges (# of false negatives in resolution). Experimental setup: genuine AMP topology with 2376 routers and 3770 links; synthetic transit-stub topology with 50K nodes and 138.5K links. Samples from these topologies: from AMP topology, (10,500) and (10,1000) path traces; from TS topology, (10,1000), (10,2000), and (10,3000) path traces.

Average Graph Edit Distance
          2%      4%      6%      8%       10%      12%      14%
Initial   3,798   4,576   9,093   10,519   11,045   16,383   19,079
IP          135     229     501      666      718      967    1,252
NM           32      58     189      272      319      502    1,190
GBI          23      37     146      215      274      430      633

56 56 Summary Anonymous Router Resolution DA C E GBI DA C E Underlying   DA C E Collected   DA C E Neighbor Matching  Internet Topology Discovery Responsiveness reduced in the last decade NP-hard problem Graph Based Induction Technique Practical approach for anonymous router resolution Takes ~6 hours to process data sets of ~20M path traces Identifies common structures Handles all anonymity types Helpful in resolving multiple anonymous routers in a locality

57 Outline Introduction Internet Topology Measurement Topology Discovery Issues Impact of IP Alias Resolution Topology Discovery Resolving Anonymous Routers Graph-based Induction Technique Resolving Alias IP Addresses Analytical and Probe-based Alias Resolution Resolving Genuine Subnets Dynamic Subnet Inference Summary 57 Internet Topology Discovery

58 IP Alias Resolution Problem a c d b e a sub-graph a1 c1 b2 b1 c2 with no alias resolution w zy x A set of collected traces w, …,b1, a1, c1, …, x z, …,d1, a2, e1, …, y x, …,c2, a3, b2, …, w y, …,e2, a4, d2, …, z xw a3 a2 e1 d2 d1 e2 yz a4 Sample map from the collected path traces 1 3 4 1 1 1 1 2 2 2 2 2 Internet Topology Discovery 58 A router may appear with different IP addresses in different path traces Need to resolve IP addresses belonging to the same router

59 IP Alias Resolution Problem a c1 b2 b1 c2 partial alias resolution (only router a is resolved) x w e1 d2d1 e2 y z partial alias resolution (only router a is not resolved) a2 c d b e w zy x a3 a4 a1 59 Internet Topology Discovery a c d b e sub-graph w zy x 1 3 4 1 1 1 1 2 2 2 2 2

60 60 IP Alias Resolution Previous Approaches Dest = A B Dest = B A, ID=100 Dest = B B, ID=99 B, ID=103 A B A B Source IP Address Based Method [Pansiot 98] Relies on a particular implementation of ICMP error generation. IP Identification Based Method (ally) [Spring 03] Relies on a particular implementation of IP identifier field, Many routers ignore direct probes. DNS Based Method [Spring 04] Relies on similarities in the host name structures sl-bb21-lon-14-0.sprintlink.net sl-bb21-lon-8-0.sprintlink.net Works when a systematic naming is used. Record Route Based Method [Sherwood 06] Depends on router support to IP route record processing Internet Topology Discovery

61 Analytical Alias Resolution Approach Leverage IP address assignment convention to infer IP aliases Identify symmetric path segments within the collected set of path traces Infer IP aliases Use a number of checks to Remove false positives Increase confidence in the identified IP aliases Internet Topology Discovery 61

62 IP Address Assignment Practices: Point-to-point Links. For a point-to-point link, use either a /30 subnet or a /31 subnet. The interface IP addresses on the link are consecutive and are within a /30 or /31 subnet. Use ↔ to represent the subnet relation between two IP addresses, and use the subnet relation (↔) to infer IP aliases. Example link between A and B: as a /30 network, 192.168.1.4/30 with interfaces 192.168.1.5 and 192.168.1.6; as a /31 network, 192.168.1.4/31 with interfaces 192.168.1.4 and 192.168.1.5.

63 IP address Assignment Practices Multi-access Links A similar relation between IP addresses belonging to the same multi-access link holds Example: Consider two IP addresses A:129.119.1.10 and B: 129.119.1.13 A and B are not together in a /30 or a /31 subnet However, they are together in /29 subnet 129.119.1.8/29 A: 129.119.1.00001010 B: 129.119.1.00001101 AB.10.13 129.119.1.8/29 subnet Internet Topology Discovery 63
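The subnet relation used on these two slides can be checked with Python's standard `ipaddress` module. The helper names below are assumptions for the sketch; the addresses are the ones shown on the slides.

```python
import ipaddress

def common_prefix_length(a, b):
    """Length of the longest prefix shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()

def same_subnet(a, b, prefix):
    """True if a and b fall in the same /prefix network."""
    net = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in net

# Point-to-point examples: consecutive addresses share a /30 or a /31.
print(common_prefix_length("192.168.1.5", "192.168.1.6"))  # 30
print(common_prefix_length("192.168.1.4", "192.168.1.5"))  # 31

# Multi-access example: .10 and .13 are not in one /30 but share 129.119.1.8/29.
print(same_subnet("129.119.1.10", "129.119.1.13", 30))  # False
print(same_subnet("129.119.1.10", "129.119.1.13", 29))  # True
```

The XOR of the two addresses has its highest set bit at the first position where they differ, which is why `bit_length` gives the number of trailing bits outside the common prefix.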

64 64 Analytical Alias Resolution Sample traceroute pairs MIT UTD 18.7.21.1 18.168.0.27 129.110.95.1 129.110.5.1 206.223.141.73 192.5.89.89 206.223.141.70 192.5.89.10 198.32.8.34 198.32.8.85 198.32.8.66 198.32.8.65 198.32.8.84 198.32.8.33 192.5.89.9 206.223.141.69 192.5.89.90 206.223.141.74 18.168.0.25 no response 18.7.21.84 no response Aliases 129.110.5.1 - 206.223.141.74 206.223.141.73 - 206.223.141.69 206.223.141.70 - 198.32.8.33 … Internet Topology Discovery

65 65 There is possibility of incorrect subnet assumption, Two /30 subnets assumed as a /29, incorrect alignment of path traces. IP 4 and IP 8 are thought of as aliases. To prevent false positives, some conditions are defined Trace preservation, Distance preservation (probing component of APAR), Completeness, Common neighbor. APAR Analytical and Probe-based Alias Resolution a sample network a cd b ef IP 1 IP 2 IP 9 IP 3 IP 4 IP 8 IP 7 Internet Topology Discovery

66 Analytical Alias Resolution Main Idea Use traceroute collected path traces only No probing is required at this point Study the relations between IP addresses in different traces Infer subnets: Use the IP address assignment convention to infer Point-to-point (/30 or /31) subnets, or Multi-access (/x where x<30) subnets from the path traces Infer IP aliases: Align path segments to infer IP aliases from the detected subnets 66 Internet Topology Discovery

67 Analytical Alias Resolution: Potential Issues Problems with inferring subnets accurately False positive: two separate subnets with consecutive /30 subnet numbers may be inferred as one /29 subnet False negative: a /29 subnet may be inferred as two separate /30 subnets Problems with inferring IP aliases accurately False positives and false negatives possible due to incorrectly formed subnets Both false positives and false negatives introduce inaccuracies to the resulting topology map 67 Internet Topology Discovery

68 Analytical Alias Resolution: Potential Solutions. How to verify the accuracy of formed subnets. Accuracy condition: two or more IP addresses from the same subnet cannot appear in a loop-free trace (unless they are consecutive); check whether a newly formed subnet violates this condition for any pair of available IP addresses from this subnet in any other path trace. Completeness condition: to infer a /x subnet among a set of IP addresses that belong to the address range, require that some fraction (e.g., 50%) of these addresses appear in our data set; this is needed to increase our confidence in the inferred subnet. Processing order: start with subnets with a higher completeness ratio.

69 Analytical Alias Resolution Potential Solutions How to verify the accuracy of inferred IP aliases No loop condition: No inferred IP aliases should introduce any routing loops in any of the path traces Example: Consider two traces (…, a, b, c, d, …) (…, e, f, g, h, b, i, …)(reverse trace) Assume a subnet relation (g ↔ c) Inferred alias pair: (b,g) ----- CAUSES LOOP! 69 Internet Topology Discovery
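The no-loop condition can be tested mechanically: group addresses by inferred alias pairs, relabel each trace, and reject the inference if any router then appears twice in one trace. The sketch below uses a small union-find structure, which is an implementation choice made for this example, and only the hops the slide spells out (the elided "…" portions are left out).

```python
def introduces_loop(traces, alias_pairs):
    """Return True if treating `alias_pairs` as same-router pairs creates a loop."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in alias_pairs:
        parent[find(a)] = find(b)

    for trace in traces:
        routers = [find(hop) for hop in trace]
        if len(set(routers)) != len(routers):
            return True  # some router is visited twice in this trace
    return False

traces = [list("abcd"), list("efghbi")]
print(introduces_loop(traces, [("b", "g")]))  # True: g ... b revisits one router
print(introduces_loop(traces, []))            # False
```

In the slide's example, merging b and g makes the second trace pass through the same router at positions 3 and 5, so the candidate pair is discarded.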

70 Analytical Alias Resolution Potential Solutions How to verify the accuracy of inferred IP aliases Common neighbor condition: Given two IP addresses s and t that are candidate aliases belonging to a router R, one of the following cases should hold: 1.s and t have a common neighbor in some path trace 2.There exists an alias pair (b,o) such that b is a successor (or predecessor) of s o is a predecessor (or successor) of t 3.involved traces are aligned such that they form two subnets, one at each side of router R Distance condition: Given two IP addresses s and t that are candidate aliases for a router R, s and t should be at similar distance to a vantage point Adds an active probing component to the solution 70 Internet Topology Discovery

71 AMP: ally (1,884 pairs) and APAR (2,034 pairs) iPlane: ally (39,191 pairs) and APAR (50,206 pairs) 71 Evaluations Coverage Comparisons 1,003 Causing LoopAlly APARAlly disagree 864 986 45 34 AllyAPAR Ally disagree Causing loop Source IP based 11,070 2,514 8,206 3,058 6,179 iPlane10,67822,886 ? Complete ally requires (275K) 2 probes Internet Topology Discovery

72 Summary: Analytical and Probe-based Alias Resolution. The IP alias resolution task has a considerable effect on most of the analyzed topological characteristics. In general, false negatives have more impact than false positives. APAR benefits from IP address assignment of links, focuses on structural connections between routers, is more effective on data sets that include symmetric path segments collected from a large number of vantage points, requires no/minimal probing overhead, and complements probe-based approaches.

73 Outline Introduction Internet Topology Measurement Topology Discovery Issues Impact of IP Alias Resolution Topology Discovery Resolving Anonymous Routers Graph-based Induction Technique Resolving Alias IP Addresses Analytical and Probe-based Alias Resolution Resolving Genuine Subnets Dynamic Subnet Inference Summary 73 Internet Topology Discovery

74 74 Genuine Subnet Resolution Problem Subnet resolution Identify IP addresses that are connected over the same medium Improve the quality of resulting topology map IP2 IP3 IP1 IP2IP3 IP1 Internet Topology Discovery (observed topology)(inferred topology)(underlying topology) CD AB CD AB CD AB CD AB

75 Subnet Resolution: Advantages Improve the quality of resulting topology map vs Increase the scope of the map (observed topology)(inferred topology)(genuine topology) CD AB CD AB CD AB CD AB CD AB CD AB 75 Internet Topology Discovery

76 Subnet Resolution: Advantages. Improve the alias resolution process by reducing the number of probes in ally based alias resolution. The ally tool requires O(n^2) probes to resolve aliases among n IP addresses. We could determine ally probes based on subnets; this approach reduces the number of probes to O(n·s), where s is the average number of IP addresses in a subnet. Trace: IP a ……...IP b ……... IP c ……... IP d IP e IP f IP g IP h IP i IP k IP l subnets

77 Subnet Resolution: Approach. [Figure: the 129.110.0.0/16 address space divided into subnets of different sizes (/24, /28, /29, /30, /31): 129.110.1.0/30, 129.110.2.0/31, 129.110.4.0/24, 129.110.6.0/28, 129.110.12.0/29, 129.110.17.0/24, and 129.110.219.0/24, with host addresses .1 through .6 shown on 129.110.12.0/29.]

78 Genuine Subnet Resolution: Trace Preservation. Observed IP addresses in 129.110.0.0/16: 129.110.1.1, 129.110.1.2, 129.110.2.0, 129.110.2.1, 129.110.4.1, 129.110.4.83, 129.110.4.217, 129.110.12.1, 129.110.12.2, 129.110.12.6, 129.110.17.1, 129.110.17.135, 129.110.219.1. Subnets in the address space: 129.110.1.0/30, 129.110.2.0/31, 129.110.4.0/24, 129.110.6.0/28, 129.110.12.0/29, 129.110.17.0/24, 129.110.219.0/24. [Figure labels: candidate subnets 129.110.0.0/16, 129.110.0.0/21, and 129.110.0.0/22; addresses 129.110.4.1, 129.110.1.2, 129.110.2.1, and 129.110.12.2; subnets 129.110.12.0/29, 129.110.17.0/24, and 129.110.4.0/24.]

79 Genuine Subnet Resolution: Distance Preservation. Observed IP addresses: 129.110.1.1, 129.110.1.2, 129.110.2.0, 129.110.2.1, 129.110.4.1, 129.110.4.83, 129.110.4.217, 129.110.12.1, 129.110.12.2, 129.110.12.6, 129.110.17.1, 129.110.17.135, 129.110.219.1. Subnets in the address space: 129.110.1.0/30, 129.110.2.0/31, 129.110.4.0/24, 129.110.6.0/28, 129.110.12.0/29, 129.110.17.0/24, 129.110.219.0/24. [Figure: hop distances from the vantage point (V.P.) to the observed addresses, appearing on the slide as 2, 3, 3, 4, 2, 1, 2, 4, 5, 5, 4, 5, 3 (the mapping to addresses is not recoverable); subnet labels 129.110.2.0/30, 129.110.4.0/24, 129.110.12.0/29, 129.110.17.0/24, 129.110.0.0/16, and 129.110.1.0/31.]

80 Genuine Subnet Resolution: Dynamic Subnet Inference Approach. Inferring subnets: cluster IP addresses into maximal subnets up to a given size (e.g. /24), then perform accuracy and distance analysis on the candidate subnets and break them down as necessary. Completeness: ignore candidate subnets that have fewer than one quarter of their IP addresses present. A /27 subnet can have up to 2^5 IP addresses. [Figure: addresses IP1 through IP9 grouped into candidate subnets of sizes /24 through /31.]
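The clustering and completeness steps above can be sketched in Python with the standard `ipaddress` module. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes a candidate that fails the one-quarter rule is split into its two halves and retried. It counts capacity as 2^(32 - prefix length), following the slide's /27 example. It leaves out the accuracy and distance analysis, which need trace data. The function names are hypothetical.

```python
import ipaddress


def is_complete(network, observed, min_fraction=0.25):
    """Completeness rule: at least a quarter of the subnet's addresses were observed."""
    present = sum(1 for ip in observed if ip in network)
    return present >= min_fraction * network.num_addresses


def candidate_subnets(addresses, largest_prefix=24, smallest_prefix=31):
    """Cluster observed IPv4 addresses into maximal candidate subnets.

    Starts from /largest_prefix blocks and, as an assumed breakdown strategy,
    halves any block that fails the completeness rule.
    """
    observed = sorted({ipaddress.ip_address(a) for a in addresses})
    result = []

    def visit(network, members):
        if len(members) < 2:
            return  # a lone address does not reveal a shared medium
        if is_complete(network, members):
            result.append(network)
            return
        if network.prefixlen >= smallest_prefix:
            return
        for half in network.subnets(prefixlen_diff=1):
            visit(half, [ip for ip in members if ip in half])

    # strict=False masks the host bits to get the enclosing block.
    roots = {ipaddress.ip_network(f"{ip}/{largest_prefix}", strict=False)
             for ip in observed}
    for root in sorted(roots):
        visit(root, [ip for ip in observed if ip in root])
    return result


# Three addresses taken from the example slides.
print(candidate_subnets(["129.110.12.1", "129.110.12.2", "129.110.12.6"]))
```

For these three addresses every block from /24 down to /28 fails the one-quarter rule, so the sketch settles on 129.110.12.0/29, matching the subnet drawn on the slides. A real run would still need the accuracy and distance checks to reject blocks that only look complete.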

81 Evaluations: Internet2 backbone verification. Internet2 backbone topology on Apr 29, 2007. Inferred 116 verifiable subnets: 95 of exact size, 12 smaller (the observed IPs formed a smaller subnet), and 9 bigger (false positives). 150 subnets, 547 routers, 793 IPs. [Figure: example /28 and /29 subnets connecting routers R1 through R11 and host H1.]

82 Summary: Genuine Subnet Resolution. Identified a new step (i.e., subnet inference) to improve topology mapping studies. Introduced a technique to infer subnets and demonstrated its effectiveness. Subnets help detect connectivity between nodes: an inferred /24 subnet had only a single link between two of its 73 observed IP addresses. Using subnets, we may reduce the number of ally probes for alias IP resolution, e.g. from 362K to 35.5K.

83 Outline. Introduction: Internet Topology Measurement. Topology Discovery Issues: Impact of IP Alias Resolution. Topology Discovery: Resolving Anonymous Routers (Graph-based Induction Technique); Resolving Alias IP Addresses (Analytical and Probe-based Alias Resolution); Resolving Genuine Subnets (Dynamic Subnet Inference). Summary.

84 Summary. The Internet is man-made, so why do we need to measure it? Because we still don’t really understand it, and sometimes things go wrong. Measurement for network operations: detecting and diagnosing problems; what-if analysis of future changes. Measurement for scientific discovery: creating accurate models that represent reality; identifying new features and phenomena. Researchers have been sampling and analyzing Internet topology. Building a network graph from raw data is not easy: there are several issues due to sampling (resolving anonymous routers, IP aliases, and genuine subnets), and the very large data size causes a huge computational and probing overhead.

85 Questions?

