Near-Deterministic Inference of AS Relationships Udi Weinsberg A thesis submitted toward the degree of Master of Science in Electrical and Electronic Engineering.


Near-Deterministic Inference of AS Relationships Udi Weinsberg A thesis submitted toward the degree of Master of Science in Electrical and Electronic Engineering Under the guidance of Dr. Yuval Shavitt and Eran Shir.

Outline Introduction and Theory Problem and Algorithm Experimental Results Conclusion

Introduction and Theory

Introduction Today's Internet consists of thousands of networks administered by various Autonomous Systems (ASes). Large provider: AS7018 (AT&T). Small provider: AS1680 (NetVision). Educational network: AS378 (ILAN). ASes are assigned one or more blocks of IP prefixes and communicate routing information to each other using the Border Gateway Protocol (BGP).

Type-of-Relationship ASes use a set of local policies for selecting the best route for each reachable prefix. These policies are based on the Type-of-Relationship (ToR) that exists between ASes. ToRs are used to calculate paths between ASes. ToRs are regarded as proprietary information. Deducing them is an important yet difficult problem.

Type-of-Relationship (2) Three major commercial relationships between neighboring ASes: Customer-to-Provider (C2P), Peer-to-Peer (P2P), and Sibling-to-Sibling (S2S).

Customer-to-Provider (C2P) A customer AS pays a provider AS for traffic that is sent between the two. The provider AS is usually larger than the customer. (Figure: Provider and Customer ASes connected by a C2P link.)

Peer-to-Peer (P2P) Two ASes freely exchange traffic between themselves and their customers. They do not exchange traffic from or to their providers or other peers. (Figure: two Peer ASes connected by a P2P link.)

Sibling-to-Sibling (S2S) Two ASes administratively belong to the same organization. They freely exchange traffic between their providers, customers, peers, or other siblings. (Figure: two Sibling ASes connected by an S2S link.)

Valley Free Routing BGP paths must comply with the following Valley-Free hierarchical pattern: An uphill segment of zero or more c2p or s2s links, Followed by zero or one p2p links, Followed by a downhill segment of zero or more p2c or s2s links.
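
The valley-free pattern above can be checked with a small state machine. This is a minimal sketch, assuming each path is already given as a list of edge labels; the label strings and the function name `is_valley_free` are illustrative and do not come from the thesis.

```python
def is_valley_free(rels):
    """Return True if the edge labels along a path are valley-free.

    rels: list of labels, each one of "c2p", "p2c", "p2p", "s2s"
    (these label strings are an assumption of this sketch).
    """
    descending = False  # becomes True after the p2p link or the first p2c link
    for rel in rels:
        if rel == "s2s":
            continue  # sibling links are allowed in both segments
        if rel == "c2p":
            if descending:
                return False  # climbing again after the peak forms a valley
        elif rel == "p2p":
            if descending:
                return False  # a second p2p link, or p2p after going downhill
            descending = True
        elif rel == "p2c":
            descending = True
        else:
            raise ValueError(f"unknown relationship: {rel}")
    return True
```

For example, `["c2p", "p2p", "p2c"]` is accepted, while `["p2c", "c2p"]` is rejected because it descends and then climbs again.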

Valley Free Routing (1)

Valley Free Routing (2)  

Problem and Algorithm

The Problem Given the AS graph (ASes as vertices with interconnecting edges), find the type-of-relationship between all adjacent ASes. Inferring ToR = classifying edges. (Figure: AS graph with edges labeled Provider, Customer, and Peer, and unclassified edges marked "??".)

Related Works Current relationship inference algorithms use one of two techniques: (1) heuristic assumptions, such as comparing AS degrees to determine the “larger” AS; (2) optimizing some aspect of the ToR assignment, such as minimizing the number of paths that are not valley-free, or not allowing cycles in the resulting directed AS graph.

The Gap Using heuristic assumptions throughout the relationship inference process causes erroneous ToRs to spread over all interconnecting AS links. Optimization models fail to capture the true Internet hierarchy.

Work Goal Improve on existing methods by providing a near-deterministic inference algorithm for solving the ToR problem. We use the Internet Core, a sub-graph that consists of the global top-level providers of the Internet. Their interconnecting edges are already classified.

Near-Deterministic Inference Theoretically, given an accurate core with no relationship errors, the algorithm deterministically infers most of the remaining AS relationships using the AS-level paths relative to this core, without incurring additional inference errors! In real-world scenarios, where the core and AS-level paths can contain errors, the algorithm introduces minimal inference errors.

Why Near-? For the remaining set of relationships that cannot be inferred deterministically, a heuristic inference method is deployed. This group is relatively small, so it is still possible to provide a strict bound on the inference error.

Algorithm - Definitions Input: S, a set of AS-level routing paths. G(V_G, E_G), the set of vertices that represent all ASes, and the interconnecting edges that need to be classified. Core(V_C, E_C), the vertices and interconnecting edges that represent the core of G, which is assumed to contain all the top-level ASes. Output: E_G, the edges of the input graph with votes for ToRs.

Deterministic Algorithm Pre-processing Prior to starting the relationship inference algorithm, we infer S2S relationships. We use S2S data collected from CAIDA, obtained from IRR databases (RIPE, ARIN, APNIC).

Deterministic Algorithm Phase 1 Assuming that the input core consists of the global top-level ASes, we use the valley-free model of Internet routing. All paths that pass through the core are split into three segments: a segment of zero or more uphill C2P edges towards the core, at most one P2P edge in the core, and a downhill segment of zero or more P2C edges from the core.
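
A minimal sketch of how this first phase could cast votes, under several assumptions that are not in the slides: a path is a list of AS numbers, the core is a set of AS numbers, votes are keyed by a `(customer, provider)` pair, and edges between the first and last core vertex on a path are skipped because core edges are taken as already classified. The name `phase1_votes` is hypothetical.

```python
from collections import Counter

def phase1_votes(paths, core):
    """Vote on edges of paths that traverse the core.

    paths: iterable of AS-number lists (assumed input format).
    core:  set of AS numbers forming the core.
    Returns a Counter keyed by (customer, provider) pairs.
    """
    votes = Counter()
    for path in paths:
        core_idx = [i for i, asn in enumerate(path) if asn in core]
        if not core_idx:
            continue  # paths that miss the core are left to a later phase
        first, last = core_idx[0], core_idx[-1]
        # Uphill segment: every hop before the core goes customer -> provider.
        for i in range(first):
            votes[(path[i], path[i + 1])] += 1
        # Downhill segment: every hop after the core goes provider -> customer.
        for i in range(last, len(path) - 1):
            votes[(path[i + 1], path[i])] += 1
    return votes
```

Keying votes by `(customer, provider)` means the same physical edge seen in both directions accumulates in one place when the paths agree, and in two competing entries when they do not.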

Deterministic Algorithm Phase 2 Paths that do not traverse the core fail to provide us with a direct method for classification. However, some of these paths partly overlap other paths that traverse the core. For each of the remaining paths: edges that precede a C2P edge must reside in an uphill segment and be of type C2P; edges that follow a P2C edge must be in a downhill segment and be of type P2C.
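
The propagation rule above can be sketched as follows. The input format (a path as a list of AS numbers, plus a set of already inferred `(customer, provider)` pairs) and the name `phase2_votes` are assumptions made for illustration.

```python
from collections import Counter

def phase2_votes(path, c2p_edges):
    """Propagate known relationships along a path that misses the core.

    path:      list of AS numbers (assumed input format).
    c2p_edges: set of (customer, provider) pairs inferred earlier.
    Returns a Counter keyed by (customer, provider) pairs.
    """
    edges = list(zip(path, path[1:]))
    # Index of the last hop known to go uphill (customer -> provider).
    last_up = max((i for i, e in enumerate(edges) if e in c2p_edges), default=-1)
    # Index of the first hop known to go downhill (provider -> customer).
    first_down = min(
        (i for i, (a, b) in enumerate(edges) if (b, a) in c2p_edges),
        default=len(edges),
    )
    votes = Counter()
    for a, b in edges[:max(last_up, 0)]:
        votes[(a, b)] += 1  # precedes a C2P edge, so a is b's customer
    for a, b in edges[first_down + 1:]:
        votes[(b, a)] += 1  # follows a P2C edge, so b is a's customer
    return votes
```

Edges that lie between the last known uphill hop and the first known downhill hop receive no vote here; they are the ones left for the heuristic stage.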

Deterministic Algorithm Voting The data we use might be noisy and reflect transient routing effects, especially when performing relationship inference over a long time frame. To avoid incorrect inferences resulting from these effects, we use a voting technique: the above methods vote for the ToR of each traversed edge. Once the algorithm is finished, we count the votes and assign each edge the type whose relative vote count passes a given threshold.
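
A minimal sketch of the final vote count. The slides do not state the threshold value, so the default of 0.8 below is purely an assumption, as are the data layout and the name `decide_types`.

```python
def decide_types(votes, threshold=0.8):
    """Pick a relationship type per edge from its vote counts.

    votes:     {edge: {type_label: count}} (assumed layout).
    threshold: minimum fraction of votes the winning type needs;
               0.8 is an illustrative value, not taken from the thesis.
    Returns {edge: type_label or None}; None marks an unresolved edge.
    """
    result = {}
    for edge, counts in votes.items():
        total = sum(counts.values())
        if total == 0:
            result[edge] = None
            continue
        best_type, best_count = max(counts.items(), key=lambda kv: kv[1])
        result[edge] = best_type if best_count / total >= threshold else None
    return result
```

Edges that come back as `None` are the ones a heuristic step would then have to resolve.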

Non-Deterministic Algorithm The deterministic algorithm fails to classify several types of edges. We use heuristic assumptions to classify these edges.

Non-Deterministic Algorithm P2P Peers Edges that appear in paths that do not traverse the core, and reside between a c2p edge and a p2c edge. A c2p or p2c edge should participate in at least one path that passes through the core. The path may have a p2p relationship between its two top-level vertices.

Non-Deterministic Algorithm Voting Ties Edges that have a similar number of votes for two or more types of relationships. Possible causes: changes in the commercial relationship over the measurement period; more complex peering agreements that can cause the same edge to behave differently as seen from different viewpoints in the Internet; Internet Exchange Points. We compare AS degrees to resolve ambiguities.
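
The slides only say that AS degrees are compared to resolve ties; they do not give the rule. The sketch below is one hypothetical way to do it, where the ratio test, its value of 2.0, and the name `break_tie_by_degree` are all illustrative assumptions rather than the author's method.

```python
def break_tie_by_degree(a, b, degree, ratio=2.0):
    """Resolve an ambiguous edge (a, b) by comparing AS degrees.

    degree: {asn: number of neighbors in the AS graph}.
    ratio:  how much larger one degree must be to call that AS the
            provider; 2.0 is an assumed, illustrative value.
    Returns the relationship of a towards b: "p2c", "c2p" or "p2p".
    """
    deg_a, deg_b = degree[a], degree[b]
    if deg_a >= ratio * deg_b:
        return "p2c"  # a is much larger, so treat it as the provider
    if deg_b >= ratio * deg_a:
        return "c2p"  # b is much larger, so treat it as the provider
    return "p2p"      # comparable sizes, so treat the ASes as peers
```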

Non-Deterministic Algorithm Valleys Edges might appear in non-valley-free paths. These are the result of valid paths that pass a malformed core, or invalid paths that pass an accurate core. These invalid paths occur in only a small fraction of paths: on average, less than 1% of the investigated paths per week.

Core Graph Construction We use three core construction methods that result in cores that vary in size and density: Greedy Max Clique, K_max-Core, and CAIDA Peers.

Core Graph Construction (1) Greedy Max Clique Tauro et al. proposed the Jellyfish model. The core is a clique of high-degree vertices. Vertices are sorted in non-increasing degree order, so the first vertex in the core is the one with the highest degree. A vertex is added to the core only if it forms a clique with the vertices already in the core. The resulting core is a clique but not necessarily the maximum clique of the graph.
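
The greedy construction described above can be sketched in a few lines, assuming the AS graph is given as an adjacency dictionary; the name `greedy_max_clique` is illustrative.

```python
def greedy_max_clique(adj):
    """Build a clique greedily, scanning vertices by decreasing degree.

    adj: {vertex: set of neighbor vertices} for an undirected graph.
    Returns the list of core vertices in the order they were added.
    """
    # Non-increasing degree order; ties keep the dictionary order.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    core = []
    for v in order:
        # Accept v only if it is adjacent to every vertex already chosen.
        if all(u in adj[v] for u in core):
            core.append(v)
    return core
```

Because each vertex is tested once against the core built so far, the result depends on the degree ordering, which is why it can miss a larger clique elsewhere in the graph.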

Core Graph Construction (2) K_max-Core (kCore) Carmi et al. proposed the Medusa model. It uses a k-pruning algorithm to decompose the Internet AS graph and extract a nucleus, the K_max-Core, which is a very well connected, globally distributed subgraph. This algorithm extracts a core by looking at the entire graph (global approach). The nucleus plays a critical role in BGP routing, since its vertices lie in a large fraction of the paths that connect different ASes.
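
A minimal sketch of k-pruning, assuming an adjacency-dictionary graph: for increasing k, vertices with fewer than k remaining neighbors are removed repeatedly, and the last non-empty subgraph is returned as the K_max-Core. The function name `kmax_core` is illustrative, and this simple version favors clarity over speed.

```python
def kmax_core(adj):
    """Return (k_max, vertices of the k_max-core) by repeated pruning.

    adj: {vertex: set of neighbor vertices} for an undirected graph.
    """
    remaining = {v: set(ns) for v, ns in adj.items()}  # working copy
    k, k_max, core = 0, 0, set()
    while remaining:
        k_max, core = k, set(remaining)  # the current graph is the k-core
        k += 1
        # Prune until every surviving vertex has at least k neighbors.
        while True:
            low = [v for v, ns in remaining.items() if len(ns) < k]
            if not low:
                break
            for v in low:
                for u in remaining.pop(v):
                    if u in remaining:
                        remaining[u].discard(v)
    return k_max, core
```

On a triangle with one extra pendant vertex, the pendant is pruned at k = 2 and the triangle survives as the 2-core, which is also the K_max-Core of that graph.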

Core Graph Construction (3) Taken from

Core Graph Construction (4) CAIDA Peers Constructed from ASes and edges that exhibit a P2P relationship under the inference method of Dimitropoulos et al. We used the automated AS ranking provided by CAIDA and constructed a graph that contains all the edges classified as P2P. We then selected the largest connected component that contains some of the largest tier-1 ASes.

Algorithm – Recall in brief Construct the AS-level graph and extract the core. Classify all edges in paths relative to the core: uphill to the core, downhill from the core. Classify all edges in remaining paths that now have some classified edges. Count votes to decide on types. Classify remaining edges using heuristics: a single edge between a C2P edge and a P2C edge is probably P2P; break voting ties using AS degree.

Experimental Results

Data Sources We combined data from RouteViews and DIMES to maximize the size and density of the topology. RouteViews collects BGP advertisements using several routers. DIMES performs ~2 million daily active traceroute measurements from hundreds of agents. The raw DIMES data was filtered in order to reduce inference mistakes.

Data Sources Topology On a weekly average, we filtered approximately 5,100 DIMES edges that were measured only once, which is over 15% of the edges measured by DIMES. Around half of these edges appear in RouteViews.

Sensitivity Analysis Core Construction The smallest core, GMC, results in the lowest deterministic inference percentage, while the largest core, CAIDA Peers, has the highest percentage.

Sensitivity Analysis Core Construction kCore provides an excellent overall inference percentage (over 95% deterministically inferred and around 75% matching CAIDA). The CAIDA Peers core seems to result in the best overall performance. However, almost all 6,000 edges marked as P2P in CAIDA are interconnected. This is very unlikely to be the case, and causes a bias.

Size Sensitivity Analysis Robustness to Core Size For more than 20 vertices in the core, the algorithm's classification success and similarity to CAIDA do not change significantly, while the number of deterministically classified edges increases.

Size Sensitivity Analysis Non-Valley-Free paths The increase in the number of deterministically classified edges comes with an increase in the percentage of non-valley-free paths.

Time Sensitivity Analysis Increasing Time Frame Using data from a single week results in over 90% of the edges being classified for all core types.

Time Sensitivity Analysis Matching CAIDA At any time frame, the algorithms agree on over 92% of the edges.

Mistake Sensitivity Analysis Heuristically Classified Edges While the algorithm's performance decreases as we increase the randomness of the core, the overall degradation is not as high as one would expect.

Mistake Sensitivity Analysis Type of Heuristically Classified Edges As more errors are injected, the algorithm needs to use more heuristics.

P2P Analysis Validating the DIMES Promise While on average the p2p relationships comprise 4-5% of the total number of edges, this goes up to around 12% of the edges that appear only in DIMES. Approximately 40% of the p2p edges inferred by our algorithm do not appear in RouteViews.

Conclusion

Conclusion (1) The common weakness of previously proposed AS relationship inference algorithms is the lack of any guarantee on the inference errors introduced during the process. This work improves on existing methods by providing a near-deterministic algorithm that, given a classified error-free input core, does not introduce additional inference errors.

Conclusion (2) The proposed algorithm provides accurate inferences and is robust under changes in the core's size and creation technique. A core containing as few as 20 almost fully-connected ASes is sufficient for good inference results. Heuristic methods can still play an important role in inferring the remaining relationships. Using a single week’s data, the algorithm runs for only about 2 hours and yields over 95% deterministically inferred relationships.

Thank You!

Backup Slides…

Voting Threshold Validation On average, over 94% of the edges have votes for exactly one relationship type, and almost 99% of the edges have over 80% of the votes for a single relationship type.

Deterministic Algorithm Phase 1

Deterministic Algorithm Phase 2

Deterministic Algorithm Voting

Data Sources Filtering The raw DIMES data was filtered in order to reduce inference mistakes and the inclusion of false links: included only edges that were seen from at least two agents; trimmed all traces that exhibit known traceroute problems (routing loops, destination impersonation).
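
The "seen from at least two agents" rule can be sketched as below. The observation format `(agent_id, as_a, as_b)` and the name `filter_edges` are assumptions for illustration; the actual DIMES data layout is not described in the slides.

```python
from collections import defaultdict

def filter_edges(observations, min_agents=2):
    """Keep undirected AS edges reported by at least min_agents agents.

    observations: iterable of (agent_id, as_a, as_b) tuples
                  (an assumed record format).
    Returns a set of edges as sorted (as_low, as_high) tuples.
    """
    seen_by = defaultdict(set)
    for agent, a, b in observations:
        if a == b:
            continue  # ignore self-loops
        seen_by[tuple(sorted((a, b)))].add(agent)
    return {edge for edge, agents in seen_by.items() if len(agents) >= min_agents}
```

Counting distinct agents rather than raw sightings means an edge reported many times by a single vantage point is still dropped.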

Internet Exchange Points Definition An Internet exchange point (IXP) is a physical infrastructure that allows different ASes to exchange traffic between them. Edges connecting an IXP to adjacent ASes can exhibit different ToRs depending on the AS-level path they participate in.

Internet Exchange Points Analysis We identify edges between ASes and IXPs and create corresponding virtual edges. A virtual edge is an edge that connects two ASes that have indirect peering via an IXP. The algorithm then infers relationships in the paths using these virtual edges instead of the original edges between the ASes and IXPs.
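
One simple way to obtain such virtual edges is to drop IXP hops from each path, so that the two ASes on either side of the IXP become adjacent. This is a sketch under the assumption that an IXP shows up in a path as a hop with its own AS number and that the set of IXP AS numbers is known; the name `with_virtual_edges` is hypothetical.

```python
def with_virtual_edges(path, ixps):
    """Rewrite a path so that ASes peering via an IXP become adjacent.

    path: list of AS numbers (assumed input format).
    ixps: set of AS numbers known to belong to IXPs.
    Returns (new_path, virtual_edges), where virtual_edges lists the
    AS pairs that were joined across a removed IXP hop.
    """
    new_path = [asn for asn in path if asn not in ixps]
    original = set(zip(path, path[1:]))
    # Adjacent pairs that did not exist before are the virtual edges.
    virtual = [e for e in zip(new_path, new_path[1:]) if e not in original]
    return new_path, virtual
```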

Sensitivity Analysis Comparing Cores Less than 6% of the edges were classified differently when using two different cores in each week. The difference between kCore and GMC is much smaller.