University of Nevada, Reno Resolving Anonymous Routers Hakan KARDES CS 790g Complex Networks.

1 University of Nevada, Reno Resolving Anonymous Routers Hakan KARDES CS 790g Complex Networks

2 Outline
- Introduction
- Anonymous router resolution
  - Problem
  - Previous approaches
- Anonymity types
- Anonymity resolution via graph-based induction (GBI)
- Conclusions

3 Internet Topology Measurement
- Internet topology measurement studies involve topology collection, construction, and analysis
- Current state of the research activities
  - Distributed topology data collection studies/platforms: Skitter, AMP, iPlane, Dimes, DipZoom, …
  - 20M path traces with over 20M nodes
- Issues in topology construction
  1. Verifying accuracy of path traces
  2. IP alias resolution
  3. Subnet inference
  4. Anonymous router resolution

4 Topology Collection (traceroute)
- Probe packets are carefully constructed to elicit an intended response from a probe destination
- traceroute probes all nodes on a path towards a given destination
  - TTL-scoped probes obtain ICMP error messages from routers on the path
  - Each ICMP message carries the IP address of an intermediate router as its source
- Merging end-to-end path traces yields the network map
[Figure: vantage point S probing a path through A, B, C, and D towards the destination; the probes with TTL=1, 2, 3, and 4 return IP A, IP B, IP C, and IP D]
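
The collection step can be illustrated with a small simulation. This is only a sketch under stated assumptions: no packets are sent, `simulate_traceroute` and `merge_traces` are hypothetical helper names, and the second destination `E` is invented for the example. Every router is assumed to answer its probe.

```python
from typing import List, Set, Tuple


def simulate_traceroute(path: List[str]) -> List[str]:
    """Return the hops reported by TTL-scoped probing along `path`.

    path[0] is the vantage point. The probe sent with TTL = i expires at
    path[i], and the source of the ICMP reply identifies that node.
    """
    hops = []
    for ttl in range(1, len(path)):
        hops.append(path[ttl])  # node at which the TTL reaches zero
    return hops


def merge_traces(traces: List[List[str]]) -> Set[Tuple[str, str]]:
    """Merge path traces into an undirected edge set (the network map)."""
    edges = set()
    for trace in traces:
        for u, v in zip(trace, trace[1:]):
            edges.add(tuple(sorted((u, v))))
    return edges


# Path from the slide (S, A, B, C, D) plus a hypothetical path to E.
trace_to_d = ["S"] + simulate_traceroute(["S", "A", "B", "C", "D"])
trace_to_e = ["S"] + simulate_traceroute(["S", "A", "B", "E"])
network_map = merge_traces([trace_to_d, trace_to_e])
print(sorted(network_map))
```

The two traces share the prefix S, A, B, so the merged map contains each shared link once.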

5 Outline
- Introduction
- Anonymous router resolution
  - Problem
  - Previous approaches
- Anonymity types
- Anonymity resolution via graph-based induction (GBI)
- Conclusions

6 Problem
- Anonymous routers do not respond to traceroute probes and appear as * in traceroute output
  - The same router may appear as * in multiple traces
- Example traces with every router responding:
    y: S - L - H - x
    x: H - L - S - y
  The same traces when L is anonymous:
    y: S - * - H - x
    x: H - * - S - y
  [Figure: the path y, S, L, H, x as sampled, and the collected map in which L shows up as two separate anonymous nodes, *1 and *2, between S and H]
- Current daily raw topology data sets include ~20 million path traces with ~20 million occurrences of * along with ~500K public IP addresses
- The raw topology data is far from representing the underlying sampled network topology

7 Problem: Internet2 backbone
[Figure: Internet2 backbone with nodes S, L, U, K, C, H, A, W, N and end points x, y, z]
Traces
  x - H - L - S - y
  x - H - A - W - N - z
  y - S - L - H - x
  y - S - U - K - C - N - z
  z - N - C - K - H - x
  z - N - C - K - U - S - y

8 Problem: Internet2 backbone
[Figure: Internet2 backbone with nodes S, L, U, K, C, H, A, W, N and end points x, y, z]
Traces
  x - * - L - S - y
  x - * - A - W - * - z
  y - S - L - * - x
  y - S - U - * - C - * - z
  z - * - C - * - * - x
  z - * - C - * - U - S - y

9 Problem
[Figure: the sampled network (nodes S, L, U, K, C, H, A, W, N; end points d, e, f) next to the resulting network, in which only S, U, L, C, A, W and d, e, f are identified]
Traces
  d - * - L - S - e
  d - * - A - W - * - f
  e - S - L - * - d
  e - S - U - * - C - * - f
  f - * - C - * - * - d
  f - * - C - * - U - S - e
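
To see how far the raw data drifts from the sampled network, the six traces above can be turned into a graph in which every anonymous occurrence becomes its own node. This is a minimal sketch: anonymous hops are written as `*` (as traceroute prints them), and `build_raw_graph` is a hypothetical helper name.

```python
from typing import List, Set, Tuple

RAW_TRACES = [
    "d - * - L - S - e",
    "d - * - A - W - * - f",
    "e - S - L - * - d",
    "e - S - U - * - C - * - f",
    "f - * - C - * - * - d",
    "f - * - C - * - U - S - e",
]


def build_raw_graph(lines: List[str]) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Build nodes and undirected edges, giving each '*' a unique label."""
    nodes: Set[str] = set()
    edges: Set[Tuple[str, str]] = set()
    counter = 0
    for line in lines:
        labelled = []
        for hop in line.split(" - "):
            if hop == "*":
                counter += 1
                hop = f"*{counter}"  # nothing ties this '*' to any other
            labelled.append(hop)
        nodes.update(labelled)
        for u, v in zip(labelled, labelled[1:]):
            edges.add(tuple(sorted((u, v))))
    return nodes, edges


nodes, edges = build_raw_graph(RAW_TRACES)
anonymous = {n for n in nodes if n.startswith("*")}
print(len(nodes), "nodes, of which", len(anonymous), "are anonymous")
```

The raw graph has 20 nodes (9 known, 11 anonymous), while the sampled network on the slide has 12 (nine backbone nodes and three end points). Resolution has to fold the 11 anonymous occurrences back into the nodes that did not respond.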

10 Previous Approaches
- Basic heuristics
  - IP: combine anonymous nodes between the same known nodes [Bilir 05]
    - Limited resolution
  - NM: combine all anonymous neighbors of a known node [Jin 06]
    - High false positives
- More theoretical approaches
  - Graph minimization approach [Yao 03]
    - Combine *s as long as they do not violate two accuracy conditions: (1) the trace preservation condition and (2) the distance preservation condition
    - High complexity: O(n^5), where n is the number of *s
  - ISOMAP-based dimensionality reduction approach [Jin 06]
    - Build an n × n distance matrix, then use ISOMAP to reduce it to an n × 5 matrix
    - Distance: (1) hop count or (2) link delay
    - High complexity: O(n^3), where n is the number of nodes
[Figure: the sampled network (nodes S, L, U, K, C, H, A, W, N; end points x, y, z), the resulting network, and the networks obtained after resolution]
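
The false positives of the NM heuristic show up on a single trace. The sketch below is an illustrative reading of "combine all anonymous neighbors of a known node", not the implementation from [Jin 06]; `neighbor_matching` and the labels `*5` and `*6` are assumptions made for the example. It uses the trace y - S - U - * - C - * - z, whose two anonymous hops are K and N in the sampled network.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def is_anon(node: str) -> bool:
    return node.startswith("*")


def neighbor_matching(edges: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each anonymous node to a representative after NM-style merging."""
    parent: Dict[str, str] = {}

    def find(n: str) -> str:
        parent.setdefault(n, n)
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    def union(a: str, b: str) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Collect the anonymous neighbours of every known node.
    anon_nbrs = defaultdict(set)
    for u, v in edges:
        if is_anon(u) and not is_anon(v):
            anon_nbrs[v].add(u)
        elif is_anon(v) and not is_anon(u):
            anon_nbrs[u].add(v)

    # Merge all anonymous neighbours of the same known node.
    for nbrs in anon_nbrs.values():
        first, *rest = sorted(nbrs)
        for other in rest:
            union(first, other)
    return {n: find(n) for nbrs in anon_nbrs.values() for n in nbrs}


trace = ["y", "S", "U", "*5", "C", "*6", "z"]
merged = neighbor_matching(zip(trace, trace[1:]))
print(merged)
```

Both anonymous hops are neighbors of C, so NM merges them even though they stand for two different routers, and the relabelled trace would visit the merged node twice.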

11 Outline
- Introduction
- Anonymous router resolution
  - Problem
  - Previous approaches
- Anonymity types
- Anonymity resolution via graph-based induction (GBI)
- Conclusions

12 Anonymity Types
- Type 1: Do not send any ICMP responses
- Type 2: Rate limit ICMP responses
- Type 3: Do not send ICMP responses when congested
- Type 4: ICMP responses filtered at border routers
- Type 5: ICMP responses with a private source IP address

13 Graph Based Induction (GBI): Our Approach
- Graph based induction
  - A graph data mining technique
    - Finds frequent substructures in graph data
    - Commonly used in mining biological and chemical graph data
- Use of GBI for anonymous router resolution
  - Observe common graph structures due to anonymous routers
  - Develop localized algorithms with manageable computational and storage overhead
  - Trace preservation condition: merge anonymous nodes as long as they cause no loops in path traces

14 Common Structures
- Parallel *-substring
- Star
- Complete bipartite
- Clique
[Figure: two drawings of each structure over known nodes such as A, C, D, E, F, with * marking the anonymous nodes]

15 Parallel *-substring Algorithm
- Represent each *-substring (a, *i, c) as a tuple (a||c, *i)
  - a||c is the tuple identifier, with a < c
- Read the path traces and build the sorted list L of 2-tuples
  - Each subsequently read tuple is compared, by tuple identifier, to those already in the list, and duplicates are excluded from L
- Handling anonymity due to ICMP rate limiting or congestion
  - A second scan of the path traces looks for substrings of the form (a, b, c) that correspond to an (a, *i, c) in L
[Figure: known nodes a and c joined through parallel anonymous nodes, and through the known node b]
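
The two scans described above can be sketched as follows. This is one possible reading, not the authors' code: a dictionary keyed by the (a, c) identifier stands in for the sorted list L, each anonymous occurrence is assumed to carry a unique label such as `*1`, and, as a conservative choice made only for this sketch, the second scan resolves a substring only when exactly one known node b is seen between a and c.

```python
from collections import defaultdict
from typing import Dict, List


def is_anon(node: str) -> bool:
    return node.startswith("*")


def triples(trace: List[str]):
    return zip(trace, trace[1:], trace[2:])


def resolve_parallel_substrings(traces: List[List[str]]) -> Dict[str, str]:
    """Map each anonymous label to the node it is resolved to."""
    representative: Dict[tuple, str] = {}  # identifier (a, c) -> first *i seen
    mapping: Dict[str, str] = {}

    # First scan: (a, *i, c) substrings with the same identifier are duplicates.
    for trace in traces:
        for a, mid, c in triples(trace):
            if is_anon(mid) and not is_anon(a) and not is_anon(c):
                key = (min(a, c), max(a, c))
                mapping[mid] = representative.setdefault(key, mid)

    # Second scan: look for (a, b, c) with a known b for identifiers in L.
    candidates = defaultdict(set)
    for trace in traces:
        for a, b, c in triples(trace):
            key = (min(a, c), max(a, c))
            if key in representative and not is_anon(b):
                candidates[key].add(b)

    for key, known in candidates.items():
        if len(known) == 1:  # assumption: skip ambiguous cases
            (b,) = known
            rep = representative[key]
            for anon, target in mapping.items():
                if target == rep:
                    mapping[anon] = b
    return mapping


traces = [
    ["x", "*1", "L", "S", "y"],
    ["y", "S", "L", "*2", "x"],
    ["a", "*3", "c"],
    ["a", "b", "c"],
]
print(resolve_parallel_substrings(traces))
```

Here `*1` and `*2` share the identifier (L, x) and collapse into one node, while `*3` is resolved to the known node `b` that another trace reports between `a` and `c`.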

16 Clique
- Generate a new graph G* = (V*, E*)
  - For each *-substring of type (a, *e, b):
      V* ← V* ∪ {a, b}
      E* ← E* ∪ {e(a, b)}
- First identify 4-cliques and grow them by adding nodes that are connected to at least 4 nodes of the structure
  - Helps in tolerating a few missing links in large cliques
- Then process all 3-cliques
[Figure: known nodes a, c, d, e connected to one another through anonymous nodes]
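
A brute-force sketch of this step is given below, assuming uniquely labelled anonymous hops. The helper names (`build_g_star`, `find_cliques`, `grow`, `anonymous_in`) are hypothetical, and enumerating all k-subsets is only practical for small graphs; the localized algorithm on the slide is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def is_anon(node):
    return node.startswith("*")


def build_g_star(traces):
    """G*: an edge (a, b) for every *-substring (a, *e, b) between known nodes."""
    adj = defaultdict(set)
    labels = defaultdict(set)  # edge {a, b} -> anonymous labels seen on it
    for trace in traces:
        for a, mid, b in zip(trace, trace[1:], trace[2:]):
            if is_anon(mid) and not is_anon(a) and not is_anon(b):
                adj[a].add(b)
                adj[b].add(a)
                labels[frozenset((a, b))].add(mid)
    return adj, labels


def find_cliques(adj, k):
    """All k-node subsets of G* whose nodes are pairwise adjacent."""
    return [set(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def grow(adj, clique, min_links=4):
    """Add nodes linked to at least `min_links` members of the structure."""
    grown = set(clique)
    changed = True
    while changed:
        changed = False
        for node in sorted(set(adj) - grown):
            if len(adj[node] & grown) >= min_links:
                grown.add(node)
                changed = True
    return grown


def anonymous_in(labels, structure):
    """Anonymous labels on G* edges inside the structure (merge candidates)."""
    merged = set()
    for pair in combinations(sorted(structure), 2):
        merged |= labels.get(frozenset(pair), set())
    return merged


traces = [["a", "*1", "c"], ["a", "*2", "d"], ["a", "*3", "e"],
          ["c", "*4", "d"], ["c", "*5", "e"], ["d", "*6", "e"]]
adj, labels = build_g_star(traces)
for clique in find_cliques(adj, 4):
    structure = grow(adj, clique)
    print(sorted(structure), sorted(anonymous_in(labels, structure)))
```

In this example the four known nodes form a 4-clique in G*, and the six anonymous labels on its edges are the candidates to be merged into one router.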

17 Complete Bipartite
- First search G* for a small complete bipartite structure, i.e., K_{2,3}, and then grow it to a larger one
  - Take each pair of nodes and check whether they are in a K_{2,3}
  - After identifying a K_{2,3}, look for larger complete bipartite graphs K_{2,m} and then K_{n,m} that contain the identified K_{2,3}
- Then process all K_{2,2}'s
[Figure: nodes A, C, D, E, F shown in G and in G*]
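
The pairwise search can be sketched with common-neighbor sets. The code assumes G* is already available as an adjacency dictionary, treats a K_{2,m} as a pair of nodes with at least m common neighbors (without checking for extra edges inside either side), and uses hypothetical helper names. The node `B` is invented to show the growing step.

```python
from itertools import combinations


def find_k2m(adj, min_m=3):
    """Pairs of nodes with at least `min_m` common neighbours in G*.

    Each hit is (pair, common neighbours), i.e. a K_{2,m} with m >= min_m.
    """
    found = []
    for u, v in combinations(sorted(adj), 2):
        common = adj[u] & adj[v]
        if len(common) >= min_m:
            found.append(({u, v}, set(common)))
    return found


def grow_to_knm(adj, left, right):
    """Extend K_{2,m} to K_{n,m}: add nodes adjacent to every node in `right`."""
    grown_left = set(left)
    for node in sorted(set(adj) - set(left) - set(right)):
        if right <= adj[node]:
            grown_left.add(node)
    return grown_left, set(right)


# G* for the slide's example: A and C each reach D, E, F through anonymous hops.
g_star = {
    "A": {"D", "E", "F"},
    "C": {"D", "E", "F"},
    "D": {"A", "C"},
    "E": {"A", "C"},
    "F": {"A", "C"},
}
print(find_k2m(g_star))

# Hypothetical extra node B that also reaches D, E and F.
g_star["B"] = {"D", "E", "F"}
for n in ("D", "E", "F"):
    g_star[n].add("B")
print(grow_to_knm(g_star, {"A", "C"}, {"D", "E", "F"}))
```

Taking all common neighbors of a pair directly yields the largest K_{2,m} for that pair, so only the growth from 2 to n nodes on the other side needs a second pass.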

18 Star
- Combine anonymous neighbors of a known node under the trace preservation condition
  - Start from the known nodes with the smallest number of anonymous neighbors
- Note: operate on G and not on G*
[Figure: known nodes D, A, C, E, with end points w, y, z, around anonymous nodes]
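
A sketch of star resolution with an explicit trace preservation check follows. It assumes uniquely labelled anonymous hops, interprets "no loops" as "no node appears twice in any relabelled trace", and counts anonymous neighbors once on the raw labels rather than updating the counts after each merge. `preserves_traces` and `resolve_stars` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List


def is_anon(node: str) -> bool:
    return node.startswith("*")


def preserves_traces(traces: List[List[str]], mapping: Dict[str, str]) -> bool:
    """True if no trace visits the same node twice after relabelling."""
    for trace in traces:
        relabelled = [mapping.get(hop, hop) for hop in trace]
        if len(set(relabelled)) != len(relabelled):
            return False
    return True


def resolve_stars(traces: List[List[str]]) -> Dict[str, str]:
    """Merge anonymous neighbours of each known node when traces stay loop-free."""
    anon_neighbors = defaultdict(set)
    for trace in traces:
        for u, v in zip(trace, trace[1:]):
            if is_anon(u) and not is_anon(v):
                anon_neighbors[v].add(u)
            elif is_anon(v) and not is_anon(u):
                anon_neighbors[u].add(v)

    mapping: Dict[str, str] = {}  # merged label -> representative label
    # Known nodes with the fewest anonymous neighbours are handled first.
    order = sorted(anon_neighbors, key=lambda k: (len(anon_neighbors[k]), k))
    for known in order:
        group = sorted(anon_neighbors[known])
        for other in group[1:]:
            rep = mapping.get(group[0], group[0])
            old = mapping.get(other, other)
            if old == rep:
                continue  # already merged
            # Tentatively re-point everything merged into `old` to `rep`.
            trial = {n: (rep if r == old else r) for n, r in mapping.items()}
            trial[old] = rep
            if preserves_traces(traces, trial):
                mapping = trial
    return mapping


star = [["A", "*1", "C"], ["A", "*2", "D"], ["C", "*3", "E"]]
print(resolve_stars(star))

blocked = [["y", "S", "U", "*5", "C", "*6", "z"]]
print(resolve_stars(blocked))
```

In the first input the three anonymous labels collapse into one center node. In the second, merging the two anonymous neighbors of C would make the trace pass through the same node twice, so the merge is rejected.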

19 Outline
- Introduction
- Anonymous router resolution
  - Problem
  - Previous approaches
- Anonymity types
- Anonymity resolution via graph-based induction (GBI)
- Conclusions

20 Summary
- Router responsiveness has decreased in the last decade
- NP-hard problem
- Graph Based Induction technique
  - Practical approach for anonymous router resolution
  - Identifies common structures
  - Handles all anonymity types
  - Helpful in resolving multiple anonymous routers in a locality
  - Uses subnet info to reduce the false positives
[Figure: the neighborhood of D, A, C, E shown as the underlying topology, as collected, after neighbor matching, and after GBI]

21 References
- M. H. Gunes and K. Sarac. Resolving anonymous routers in Internet topology measurement studies. IEEE INFOCOM, Apr. 2008.
- S. Bilir, K. Sarac, and T. Korkmaz. Intersection characteristics of end-to-end Internet paths and trees. IEEE International Conference on Network Protocols (ICNP), Boston, MA, USA, Nov. 2005.
- A. Broido and K. Claffy. Internet topology: Connectivity of IP graphs. Proceedings of SPIE ITCom Conference, Denver, CO, USA, Aug. 2001.
- B. Cheswick, H. Burch, and S. Branigan. Mapping and visualizing the Internet. ACM USENIX, San Diego, CA, USA, June 2000.
- B. Yao, R. Viswanathan, F. Chang, and D. Waddington. Topology inference in the presence of anonymous routers. IEEE INFOCOM, San Francisco, CA, USA, Mar. 2003.
- P. Tan, M. Steinbach, and V. Kumar. Introduction to data mining. Addison-Wesley, Reading, MA, USA, 2005.
- X. Jin, W.-P. K. Yiu, S.-H. G. Chan, and Y. Wang. Network topology inference based on end-to-end measurements. IEEE Journal on Selected Areas in Communications, special issue on Sampling the Internet, 24(12):2182-2195, Dec. 2006.
- D. Cook and L. Holder. Mining graph data. John Wiley & Sons, 2006.
- T. Matsuda, H. Motoda, and T. Washio. Graph-based induction and its applications. Advanced Engineering Informatics, 16(2):135-143, Apr. 2002.
- M. Kuramochi and G. Karypis. Frequent subgraph discovery. First IEEE International Conference on Data Mining (ICDM'01), p. 313, 2001.
- M. Kuramochi and G. Karypis. An efficient algorithm for discovering frequent subgraphs. IEEE Transactions on Knowledge and Data Engineering, 16(9):1038-1051, Sept. 2004.
- A. Inokuchi, T. Washio, and H. Motoda. Complete mining of frequent patterns from graphs: Mining graph data. Machine Learning, 50(3):321-354, Mar. 2003. DOI: http://dx.doi.org/10.1023/A:1021726221443
- A. Inokuchi, T. Washio, and H. Motoda. A general framework for mining frequent subgraphs from labeled graphs. Fundamenta Informaticae, 66(1-2):53-82, Nov. 2004.

22 QUESTIONS

