RESOLVING IP ALIASES USING DISTRIBUTED SYSTEMS
PRESENTATION CONTENT
  Introduction
  Internet Topology Mapping
  Challenges
  Related Works
  Intended Works
  Conclusion
INTRODUCTION: WHY MAP THE INTERNET?
  Studying the Internet
  Detecting problems
  Studying virus spread
HOW IS THE INTERNET MEASURED?
  Traceroute
  Time To Live (TTL)
  [Figure: traceroute from a vantage point toward a destination; node labels S, A, B, C, D; addresses IPA, IPB, IPC, IPD; probes with TTL=1 to TTL=4]
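The TTL mechanism behind traceroute can be sketched without touching the network. This is a minimal simulation, not a real prober: the function names `send_probe` and `traceroute` are hypothetical, the path is assumed to be non-empty, and treating IPD as the destination is an assumption about the figure.

```python
def send_probe(path, ttl):
    """Forward one probe along `path` (router addresses, destination last).

    Each router decrements the TTL; the router that brings it to zero
    drops the probe and answers with an ICMP "time exceeded" message.
    """
    for hop in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return hop, "time exceeded"
    return path[-1], "destination reached"


def traceroute(path, max_ttl=30):
    """Send probes with TTL = 1, 2, 3, ... and collect who answered."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        replier, message = send_probe(path, ttl)
        hops.append(replier)
        if message == "destination reached":
            break
    return hops


# Hypothetical path using the address labels from the slide.
print(traceroute(["IPA", "IPB", "IPC", "IPD"]))
```

Each probe reveals exactly one address per hop, which is why a router with several interfaces can show up under different addresses in different traces.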
TRACEROUTES
  Archipelago (Ark)
  RIPE
  M-Lab
  iPlane
CHALLENGES
  Load Balancing
  Unresponsive Routers
  IP Aliases
  NAT Boxes
LOAD BALANCING
  Per destination
  Per packet
  Random
  Paris traceroute
  [Figure: nodes A, B, C, Y, D; B answers “time exceeded” for the probe with Dest = D, TTL = 1; Y answers “time exceeded” for the probe with Dest = D, TTL = 2]
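Why Paris traceroute helps can be illustrated with a toy load balancer. This sketch assumes a balancer that picks the next hop by hashing the flow identifier (source, destination, protocol, ports); classic traceroute changes the UDP destination port on every probe, while Paris traceroute keeps the flow identifier constant. All function names, the source port 50000, and the CRC32 hash are illustrative assumptions, not the behaviour of any specific router.

```python
import zlib


def next_hop(branches, flow):
    """Toy load balancer: hash the flow identifier to choose a branch."""
    key = "|".join(str(field) for field in flow).encode()
    return branches[zlib.crc32(key) % len(branches)]


def classic_probe_flows(src, dst, count, base_port=33434):
    """Classic traceroute: a new destination port for every probe."""
    return [(src, dst, "udp", 50000, base_port + i) for i in range(count)]


def paris_probe_flows(src, dst, count, dst_port=33434):
    """Paris traceroute: every probe carries the same flow identifier."""
    return [(src, dst, "udp", 50000, dst_port) for _ in range(count)]


branches = ["upper", "lower"]  # hypothetical equal-cost next hops
classic = [next_hop(branches, f) for f in classic_probe_flows("192.0.2.1", "198.51.100.7", 6)]
paris = [next_hop(branches, f) for f in paris_probe_flows("192.0.2.1", "198.51.100.7", 6)]
print(classic)  # may mix branches, producing links that do not exist
print(paris)    # always the same branch
```

Keeping the flow identifier fixed does not help against per-packet or random balancing, where the choice does not depend on the header fields at all.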
UNRESPONSIVE ROUTERS
IP ALIASES
  Routers have multiple interfaces
  Each interface has its own IP address
NAT BOXES
  Multiple routers can use one IP address
  IPv4 addresses are running out
  Internal policies
RELATED WORKS
  Iffinder & Mercator
  Ally
  RadarGun
  MIDAR
  APAR
  KAPAR
IFFINDER AND MERCATOR
  Fingerprint method
  Send UDP packets to an unused port number on routers
  Receive ICMP PORT UNREACHABLE messages from routers
  Messages contain the IP address of the source (router)
  Does the probed address match the source address in the message? If not, the two addresses are aliases
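The address comparison at the heart of this method is small enough to sketch. The code below only covers the comparison step on already-collected replies; `infer_aliases` is a hypothetical name, and the addresses come from documentation ranges.

```python
def infer_aliases(replies):
    """Return alias pairs from (probed_ip, reply_source_ip) records.

    A reply whose source address differs from the probed address means
    both addresses sit on the same router. `None` marks a missing reply.
    """
    aliases = set()
    for probed, source in replies:
        if source is not None and source != probed:
            aliases.add(frozenset((probed, source)))
    return aliases


replies = [
    ("192.0.2.1", "192.0.2.9"),  # answered from another interface
    ("192.0.2.2", "192.0.2.2"),  # answered from the probed interface
    ("192.0.2.3", None),         # no answer: nothing can be inferred
]
print(infer_aliases(replies))
```

Routers that always answer from the probed interface, or that do not answer at all, give no alias information with this test.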
ALLY
  Another fingerprint method
  Uses IP IDs of routers (16-bit number stored in the IP ID field of the IP header)
  Sends messages to two or more alias candidates
  Inspects the response messages
  Problems (false negatives)
    Sometimes routers don’t send messages back
    Some routers increment IP IDs very fast
    Some routers don't increment IP IDs
    Some routers use different IP IDs for different interfaces
    O(n^2) pairwise tests
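An IP ID comparison in the spirit of Ally can be sketched as follows. It assumes three replies collected in order from candidate A, candidate B, then A again, and it treats the IDs as a shared counter only if each step moves forward by a small amount. The name `ids_in_sequence` and the `max_gap` default of 200 are assumptions for illustration.

```python
ID_SPACE = 1 << 16  # the IP ID field is 16 bits wide


def ids_in_sequence(first, second, third, max_gap=200):
    """Check that three IP IDs look like samples of one shared counter.

    Differences are taken modulo 2**16 so a counter that wraps from
    65535 back to 0 is still seen as moving forward.
    """
    step_1 = (second - first) % ID_SPACE
    step_2 = (third - second) % ID_SPACE
    return 0 < step_1 <= max_gap and 0 < step_2 <= max_gap


print(ids_in_sequence(100, 103, 107))      # close and increasing
print(ids_in_sequence(65534, 1, 4))        # increasing across the wrap
print(ids_in_sequence(100, 40000, 107))    # unrelated counters
```

The false negatives listed on the slide map directly onto this check: a constant counter fails the `0 <` test, and a fast or per-interface counter fails the `max_gap` test.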
RADARGUN
  30 probes within a time limit for each IP, O(n)
  Saves timestamps and IP ID values of the responses and inspects them
  Ignore an address if
    responses are received for fewer than 25% of the 30 probes
    all IP IDs of a particular router are either zero or the same value
    the time series is nonlinear (counter advancing too quickly or generating random values)
  If two addresses share an IP ID counter, then their time series should have nearby IP ID values when overlapping in time
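The ignore conditions can be expressed as a filter over one address's samples. This is a rough sketch: `is_usable` is a hypothetical name, the nonlinearity test is reduced to a simple rate limit on the unwrapped counter, and the `max_rate` default is an assumed value, not one taken from RadarGun.

```python
ID_SPACE = 1 << 16


def is_usable(samples, probes_sent=30, min_response=0.25, max_rate=1000.0):
    """Decide whether one address's IP ID time series is worth comparing.

    `samples` is a time-ordered list of (timestamp_seconds, ip_id) pairs,
    one per response received.
    """
    # Too few responses out of the probes sent.
    if len(samples) < min_response * probes_sent:
        return False
    ids = [ip_id for _, ip_id in samples]
    # Counter stuck at zero or at any other single value.
    if len(set(ids)) == 1:
        return False
    # Unwrap the 16-bit counter and reject it if it advances too quickly;
    # random IDs also look like a very fast counter after unwrapping.
    advance = sum((b - a) % ID_SPACE for a, b in zip(ids, ids[1:]))
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0 or advance / elapsed > max_rate:
        return False
    return True


steady = [(t, 100 + 5 * t) for t in range(10)]
print(is_usable(steady))
```

Only addresses that pass this filter are compared against each other, which keeps the probing cost linear in the number of addresses.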
MIDAR: MONOTONIC ID-BASED ALIAS RESOLUTION
  Same ignore conditions as RadarGun
  Higher threshold than RadarGun
  Monotonic Bounds Test (MBT)
    MBT checks that two time series A and B meet the monotonicity requirement by individually checking that each sample of B meets the monotonicity requirement with respect to the samples of A, and vice versa
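A simplified version of the Monotonic Bounds Test can be sketched as below. It assumes each series is a time-ordered list of (timestamp, ip_id) pairs and that the counter wraps at most once between two consecutive samples of the same series; the real MIDAR test is more careful than this. The function names are hypothetical.

```python
from bisect import bisect_left

ID_SPACE = 1 << 16


def within_bounds(reference, other):
    """Check every sample of `other` against its neighbours in `reference`.

    A sample taken between two reference samples must carry an IP ID that
    lies between their IDs, going forward modulo 2**16.
    """
    ref_times = [t for t, _ in reference]
    for timestamp, ip_id in other:
        i = bisect_left(ref_times, timestamp)
        if i == 0 or i == len(reference):
            continue  # no reference sample on one side: nothing to check
        lower = reference[i - 1][1]
        upper = reference[i][1]
        if (ip_id - lower) % ID_SPACE > (upper - lower) % ID_SPACE:
            return False
    return True


def monotonic_bounds_test(series_a, series_b):
    """Both directions must hold for the two addresses to stay candidates."""
    return within_bounds(series_a, series_b) and within_bounds(series_b, series_a)


a = [(0, 100), (2, 110), (4, 120)]
b = [(1, 105), (3, 115)]
print(monotonic_bounds_test(a, b))
```

A single out-of-bounds sample is enough to reject the pair, which is what makes the test strict about false positives.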
APAR: ANALYTICAL AND PROBE-BASED ALIAS RESOLVER
  Analytical alias resolution
  Analyzes IP addresses in traces to identify candidate subnets
  Resolves IP aliases based on the inferred subnets
APAR CONTINUED
  Rules
    Accuracy: nodes in the same subnet should appear one hop away from each other
    Completeness: ignore candidate subnets with fewer than half of their IPs found
    Processing order: priority is given to subnets with more path traces
    No loops
    Common neighbors
    Distance: candidate aliases should be at the same distance from vantage points
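Of the rules above, the completeness rule is the easiest to show in code. The sketch below assumes that "IPs" means the usable host addresses of the candidate subnet, which is an interpretation rather than something the slide states; `is_complete` is a hypothetical name and the prefix comes from a documentation range.

```python
import ipaddress


def is_complete(candidate_prefix, observed_ips):
    """Keep a candidate subnet only if at least half of its hosts were seen."""
    subnet = ipaddress.ip_network(candidate_prefix)
    observed = {ipaddress.ip_address(ip) for ip in observed_ips}
    usable = list(subnet.hosts())
    found = sum(1 for address in usable if address in observed)
    return 2 * found >= len(usable)


# A /29 has six usable host addresses, so three observed ones are enough.
print(is_complete("192.0.2.0/29", ["192.0.2.1", "192.0.2.2", "192.0.2.3"]))
print(is_complete("192.0.2.0/29", ["192.0.2.1", "192.0.2.2"]))
```

The rule trades coverage for accuracy: sparse candidate subnets are more likely to be coincidences of address allocation than real links.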
Trace (h1:h4) - a b q m g
Trace (h2:h1) - c d o a
Trace (h3:h1) - e f k o a
KAPAR
  Optimized version of APAR
  Avoids storing the complete set of paths in memory
  Identifies /24 or bigger subnets in the same traces that cannot exist
  Gives a unique ID to each trace and stores a list of all IPs in the traces
  Works with sets of aliases obtained from other sources (results of fingerprint methods)
  Uses stricter subnet formation (probes IPs not found in the AS)
  Uses a stricter common neighbor condition
INTENTION
PRELIMINARIES
  AS (Autonomous System)
    Within the Internet, an autonomous system (AS) is a collection of connected Internet Protocol (IP) routing prefixes under the control of one or more network operators on behalf of a single administrative entity or domain that presents a common, clearly defined routing policy to the Internet
  BGP (Border Gateway Protocol) announcements
INTENTION
  Parse IPs in trace files
  Detect which ASes they belong to according to BGP announcements
  Collect and separate all traces that belong to the same AS
  Create a file for each AS that stores the traces belonging to that particular AS (more than 50,000 files)
  Distribute all AS files among cluster nodes
  Process each AS file independently
  Use the IP alias resolution technique that best fits our case
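The first steps of this plan (mapping IPs to ASes and grouping traces per AS) could look roughly like the sketch below. It assumes BGP announcements are already available as (prefix, AS number) pairs and that a trace is assigned to every AS it passes through; both are assumptions, as are the function names. The prefixes and AS numbers come from ranges reserved for documentation, and the linear scan is only meant to show longest-prefix matching, not to scale.

```python
import ipaddress
from collections import defaultdict


def build_table(announcements):
    """Sort (prefix, asn) pairs so the most specific prefix is tried first."""
    table = [(ipaddress.ip_network(prefix), asn) for prefix, asn in announcements]
    return sorted(table, key=lambda entry: entry[0].prefixlen, reverse=True)


def lookup_asn(table, ip):
    """Longest-prefix match; returns None for addresses nobody announces."""
    address = ipaddress.ip_address(ip)
    for network, asn in table:
        if address in network:
            return asn
    return None


def group_traces_by_as(table, traces):
    """Collect, for every AS, the traces that contain one of its addresses."""
    groups = defaultdict(list)
    for trace in traces:
        for asn in {lookup_asn(table, ip) for ip in trace}:
            if asn is not None:
                groups[asn].append(trace)
    return dict(groups)


table = build_table([("198.51.100.0/24", 64496), ("198.51.100.128/25", 64497)])
traces = [["198.51.100.5", "198.51.100.200"], ["198.51.100.7", "192.0.2.1"]]
for asn, members in sorted(group_traces_by_as(table, traces).items()):
    print(asn, members)
```

Each per-AS group is independent of the others, which is what allows the AS files to be spread across cluster nodes and processed in parallel.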
TECHNOLOGIES INTENDED TO BE USED
  Highly available, scalable distributed systems
CONCLUSION: WHAT WE TALKED ABOUT
MAPPING THE INTERNET
  Why mapping is important
  What challenges there are
  What has been done
  What we intend to do