Chelebi: Subnet-level Internet Mapper Mehmet H. Gunes University of Nevada, Reno
Goal Build an efficient system that produces a map of the Internet such that – Alias IP addresses that belong to the same router, – Star (*) occurrences that stand for the same router, – IPs that belong to the same subnet are identified. Subnet-level Internet mapping 2
Outline Goal – Subnet-level Internet Mapping Issues – Anonymous Routers Resolution Structural Graph Indexing – Subnet Inference Distance Preservation – Alias IP Addresses Resolution Ally, Analytical & Probe based (APAR) Chelebi – Mapping System – Outer Space 3D Visualization Subnet-level Internet mapping 3
Anonymous Routers Anonymous routers do not respond to traceroute probes and appear as in traceroute output – Same router may appear as in multiple traces. Subnet-level Internet mapping : Anonymous Routers 4 y: S – L – H – x x: H – L – S – y y: S – – H – x x: H – – S – y S L H y x S L H y x y S 11 22 H x Current daily raw topology data sets include ~ 20 million path traces with ~ 20 million occurrences of s along with ~ 500K public IP addresses The raw topology data is far from representing the underlying sampled network topology
Anonymous Router Resolution Subnet-level Internet mapping : Anonymous Routers 5 UKCN LHAW S d e f Sampled network d e f S U L C A W Resulting network Traces d - - L - S - e d - - A - W - - f e - S - L - - d e - S - U - - C - - f f - - C - - - d f - - C - - U - S - e
Previous Approaches Basic heuristics – IP: Combine anonymous nodes between same known nodes [Bilir 05] Limited resolution – NM: Combine all anonymous neighbors of a known node [Jin 06] High false positives Subnet-level Internet mapping : Anonymous Routers 6 UK C N L HA W S x y z Sampled network x y z S U L C A W After resolution x y z S U L C A W H x y z S U L C A W Resulting network
Previous Approaches More theoretic approaches – Graph minimization [Yao 03] Combine s as long as they do not violate two accuracy conditions: (1) Trace preservation condition and (2) distance preservation condition High complexity O(n 5 ) – n is number of s – ISOMAP based dimensionality reduction [Jin 06] Build an nxn distance matrix then use ISOMAP to reduce it to a nx5 matrix Distance: (1) hop count or (2) link delay High complexity O(n3) – n is number of nodes – Semisupervised Spectral Clustering [Shavitt 08] A node will not be chosen to be an unknown root if it shares two or more neighbors with an unknown root. Nodes that share two or more neighbors are usually very close to each other, and it is difficult to distinguish between them even manually. After splitting them into unknowns, these nodes will have at least one common unknown node. – This makes the task of cleanly separating the unknowns impossible Subnet-level Internet mapping : Anonymous Routers 7
Structural Graph Indexing (SGI) Structural Graph Indexing – A graph data mining technique Index all pre-defined substructures in a graph data Use of SGI for anonymous router resolution – Apply SGI to collected path traces – Merge anonymous routers using identified structures Trace Preservation Condition – Don’t merge anonymous routers within the same trace Subnet distance as tie-breaker Subnet-level Internet mapping : Anonymous Routers 8
Common Structures due to ARs 9 A x C y2 A x C Parallel -substring y1 y3 y1 y3 DA wx C y E z DA wx C y E z Star A C x y D w F v E z A C x y D w F v E z Complete Bipartite A C x y D w E z A C x y D w E z Clique Subnet-level Internet mapping : Anonymous Routers
Graph Indexing based Resolution Indexing Phase parallel star bipartite clique Subnet-level Internet mapping : Anonymous Routers 10 Resolution Phase parallel clique bipartite star
Outline Goal – Subnet-level Internet Mapping Issues – Anonymous Routers Resolution Structural Graph Indexing – Subnet Inference Distance Preservation – Alias IP Addresses Resolution Ally, Analytical & Probe based (APAR) Chelebi – Mapping System – Outer Space 3D Visualization Subnet-level Internet mapping 11
Subnet Inference Subnet-level Internet mapping : Subnet Inference 12 IP2 IP3 IP1 IP2IP3 IP1 (observed topology)(inferred topology)(underlying topology) CD AB CD AB Subnet resolution Identify IP addresses that are connected over the same medium Improve the quality of resulting topology map CD AB CD AB
Subnet Inference Approach Subnet-level Internet mapping : Subnet Inference V.P. /30 /31 /24 /28 / / / / / / / / / / / / / /31
Subnet Inference Approach Inferring Subnets Cluster IP addresses into maximal subnets up to a given size (e.g. /22) Distance analysis on candidate subnets to break them down as necessary IP1 IP2 IP3 IP4 IP5 IP6 IP7 IP8 IP9 Completeness: Ignore candidate subnets that have less than one quarter of their IP addresses present after additional probing /25 /29 /26 /30 /31 /27 A /27 subnet can have up to 2 5 IP addresses. /22 Subnet-level Internet mapping : Subnet Inference 14
Inference with Distance Matrix Obtain distance of each IP from 8 vantage points (VP) Only one IP at a subnet might be at a distance ‘hop-1’ per VP IPs after per-destination and per-packet load-balancers – Get minimum hop (seen at any ICMP Paris Traceroute) of an IP per VP – IP hops after a LB has lower trust Two rounds of computations Compensate for diamond asymmetry if per-destination LB Subnet-level Internet mapping : Subnet Inference 15 VP:12345…672 IP105400…7 IP200350…7 IP325040…6 …
Outline Goal – Subnet-level Internet Mapping Issues – Anonymous Routers Resolution Structural Graph Indexing – Subnet Inference Distance Preservation – Alias IP Addresses Resolution Ally, Analytical & Probe based (APAR) Chelebi – Mapping System – Outer Space 3D Visualization Subnet-level Internet mapping 16
IP Alias Resolution 17 S L U C N W A s.2 l.1 s.3 u.1 l.3 u.3 h.1 k.3 h.2 a.3 u.2 k.1 c.4 a.1 a.2 w.3 c.3 w.1 c.2 n.1 n.3 w.2 l.2 K c.1 k.2 h.3 d h.4 s.1 e f n.2 H Traces d - h.4 - l.3 - s.2 - e d - h.4 - a.3 - w.3 - n.3 - f e - s.1 - l.1 - h.1 - d e - s.1 - u.1 - k.1 - c.1 - n.1 - f f - n.2 - c.2 - k.2 - h.2 - d f - n.2 - c.2 - k.2 - u.2 - s.3 - e Subnet-level Internet mapping : IP Aliases
IP Alias Resolution 18 UKCN LHAW S d e f Sampled network Sample map without alias resolution s.3 s.1 s.2 l.3 l.1 u.1 u.2 k.1 c.1n.1 n.2 k.2 c.2 w.3 a.3 h.2 h.4 h.1 e d f n.3 Traces d - h.4 - l.3 - s.2 - e d - h.4 - a.3 - w.3 - n.3 - f e - s.1 - l.1 - h.1 - d e - s.1 - u.1 - k.1 - c.1 - n.1 - f f - n.2 - c.2 - k.2 - h.2 - d f - n.2 - c.2 - k.2 - u.2 - s.3 - e Subnet-level Internet mapping : IP Aliases
19 Previous Approaches Dest = A B Dest = B A, ID=100 Dest = B B, ID=99 B, ID=103 A B A B Source IP Address Based Method [Pansiot 98] – Relies on a particular implementation of ICMP error generation. IP Identification Based Method (ally) [Spring 03] – Relies on a particular implementation of IP identifier field, – Many routers ignore direct probes. DNS Based Method [Spring 04] – Relies on similarities in the host name structures sl-bb21-lon-14-0.sprintlink.net sl-bb21-lon-8-0.sprintlink.net – Works when a systematic naming is used. Record Route Based Method [Sherwood 06] – Depends on router support to IP route record processing Subnet-level Internet mapping : IP Aliases
Analytical Alias Resolution 20 MIT UTD no response no response Aliases … Subnet-level Internet mapping : IP Aliases
Analytical & Probe-based Alias Resolution There is possibility of – incorrect subnet assumption, Two /30 subnets assumed as a /29, – incorrect alignment of path traces. IP 4 and IP 8 are thought of as aliases. To prevent false positives, some conditions are defined – Trace preservation, – Distance preservation (probing component of APAR), – Completeness, – Common neighbor. 21 a sample network a cd b ef IP 1 IP 2 IP 9 IP 3 IP 4 IP 8 IP 7 Subnet-level Internet mapping : IP Aliases
Outline Goal – Subnet-level Internet Mapping Issues – Anonymous Routers Resolution Structural Graph Indexing – Subnet Inference Distance Preservation – Alias IP Addresses Resolution Ally, Analytical & Probe based (APAR) Chelebi – Mapping System – Outer Space 3D Visualization Subnet-level Internet mapping 22
Chelebi Mapping System Subnet-level Internet mapping : Chelebi Mapping System 23 Chelebi Server Route Views AS IP query IP range DNS server DNS query DNS names PlanetLab Node Paris ICMP trace Path Traces Region 1Region 2Region 3Region 8 PlanetLab Node Paris ICMP trace Path Traces … PlanetLab Node
Chelebi Mapping System Request the IP address range from the server – user defined, RouteViews etc. Distributed bulk reverse DNS lookup by VPs, – if a name assigned to an IP it is more likely that it is in use Gather alive IPs data at server and redistribute IPs to be probed – Each destination is traced from each of the 8 regions Partial trace each destination IP by VPs and send data to server – Stack ADT for “to be probed” list new incoming IPs probed first – Stopping condition is if the last two IPs have already been probed by the same VP Subnet-level Internet mapping 24
Outer Space 3D Visualization – Multiple zoom levels Autonomous System-level Router-level Subnet-level Subnet-level Internet mapping : Chelebi Mapping System 25 idea
Questions Subnet-level Internet mapping 26