Published by Nathaniel Haynes. Modified over 6 years ago.
1
De-anonymizing the Internet Using Unreliable IDs
By Yinglian Xie, Fang Yu, and Martín Abadi
Presented by Yinzhi Cao and Ionut Trestian
2
Problem
A free but troublesome network.
Problems we try to solve:
To what extent can we use IP addresses to track hosts?
Can we use the binding information between hosts and IP addresses to strengthen network security?
3
Host-Tracking Graph
Formally, we define the host-tracking graph G : H × T → IP, where H is the space of all hosts on the Internet, T is the space of time, and IP is the IP-address space.
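One way to picture the mapping G : H × T → IP is as a lookup over per-host binding windows. The sketch below is only illustrative: the class name `HostTrackingGraph`, the `(start, end, ip)` window layout, and the integer timestamps are assumptions made for this example and are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class HostTrackingGraph:
    # Hypothetical layout: host -> list of (start, end, ip) binding windows.
    bindings: dict = field(default_factory=dict)

    def add_binding(self, host, start, end, ip):
        """Record that `host` used `ip` during the closed interval [start, end]."""
        self.bindings.setdefault(host, []).append((start, end, ip))

    def lookup(self, host, t):
        """Return G(host, t): the IP bound to `host` at time `t`, or None."""
        for start, end, ip in self.bindings.get(host, []):
            if start <= t <= end:
                return ip
        return None


g = HostTrackingGraph()
g.add_binding("h1", 0, 10, "10.0.0.1")
g.add_binding("h1", 20, 30, "10.0.0.2")
print(g.lookup("h1", 5))
print(g.lookup("h1", 15))
```

A host with no binding covering the queried time simply has no value under G at that time, which is why `lookup` returns `None` rather than guessing.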
4
Host-Tracking Graph
5
Host Representation
Since we lack strong authentication mechanisms, we consider leveraging application-level identifiers such as user IDs, messenger login IDs, social network IDs, or cookies.
6
Goals
We would like to generate two outputs:
The first is an identity-mapping table that represents the mappings from unreliable IDs to hosts.
The second is the host-tracking graph that tracks each host’s activity across different IP addresses over time.
7
Tracking Host Activities
8
Application-ID Grouping
To quantitatively compute the probability of two independent user IDs u1 and u2 appearing consecutively, let us assume that each host’s connection (hence the corresponding user login) to the Internet is a random, independent event.
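The slide states the independence assumption but not a formula. As a rough sketch under that assumption only, suppose an IP address sees n login events in total, of which n1 belong to u1 and n2 belong to u2. If logins are random and independent, each login of u1 is followed by a login of u2 with probability of about n2 / n, so the number of consecutive appearances can be modeled as binomial. The function name and the parameters `k`, `n1`, `n2`, and `n` are introduced here for illustration and this is not necessarily the exact model used by the authors.

```python
from math import comb


def prob_adjacent_at_least(k, n1, n2, n):
    """P(u1 is followed by u2 at least k times) under a binomial model.

    Assumption: each of u1's n1 logins is independently followed by a
    login of u2 with probability p = n2 / n.
    """
    p = n2 / n
    return sum(
        comb(n1, i) * p**i * (1 - p) ** (n1 - i)
        for i in range(k, n1 + 1)
    )


# Two IDs that each account for a small share of 1000 logins on an IP:
# seeing them back to back several times is unlikely by chance alone.
print(prob_adjacent_at_least(3, 10, 10, 1000))
```

When this probability is very small but the two IDs are nevertheless observed next to each other repeatedly, the pattern is better explained by both IDs belonging to the same host than by coincidence.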
9
Host-Tracking Graph Construction
10
Resolving Inconsistency
Proxy Identification
To find both types of proxies/NATs, HostTracker gradually expands all the overlapping conflict binding windows associated with a common IP address.
Guest Removal
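The window expansion described for proxy identification can be pictured as interval merging: conflicting binding windows on the same IP address that overlap are grown into one larger window. The sketch below uses the standard merge-intervals technique as a stand-in; the function name and the `(start, end)` tuple representation are assumptions, and the actual HostTracker procedure may differ in its details.

```python
def merge_conflict_windows(windows):
    """Merge overlapping (start, end) windows observed on one IP address.

    Returns a sorted list of non-overlapping windows, each covering a
    maximal run of mutually overlapping input windows.
    """
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps the current expanded window: extend its end.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # No overlap: start a new expanded window.
            merged.append((start, end))
    return merged


print(merge_conflict_windows([(5, 8), (1, 3), (2, 6), (10, 12)]))
# [(1, 8), (10, 12)]
```

Each merged window marks a period during which the IP address carried several conflicting bindings, which makes it a candidate proxy/NAT interval.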
11
Input Dataset
A month-long user-login trace collected at a large Web-service provider in October 2008 (about 330 GB). Each entry has 3 fields:
(1) an anonymized user ID (550 million)
(2) the IP address that was used to perform the login (220 million)
(3) the timestamp of the login event
For validation: a month-long software-update log collected by a global software provider during the same period of October 2008. Each entry has:
(1) a unique hardware ID for each remote host that performs an update
(2) the IP address of the remote host
(3) the software-update timestamp
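To make the three-field login entry concrete, here is a minimal sketch of how such a trace could be loaded and grouped per IP address in time order. The slides do not describe the on-disk format, so the tab-separated layout, the integer timestamp, and the names `LoginEvent`, `parse_line`, and `group_by_ip` are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class LoginEvent(NamedTuple):
    user_id: str    # anonymized user ID
    ip: str         # IP address used to perform the login
    timestamp: int  # login time (assumed here to be an integer)


def parse_line(line):
    """Parse one hypothetical tab-separated line: user_id, ip, timestamp."""
    user_id, ip, ts = line.rstrip("\n").split("\t")
    return LoginEvent(user_id, ip, int(ts))


def group_by_ip(events):
    """Group login events by IP address, each group sorted by timestamp."""
    by_ip = defaultdict(list)
    for ev in events:
        by_ip[ev.ip].append(ev)
    for evs in by_ip.values():
        evs.sort(key=lambda e: e.timestamp)
    return dict(by_ip)


lines = ["u1\t1.2.3.4\t20", "u2\t1.2.3.4\t10", "u3\t5.6.7.8\t15"]
grouped = group_by_ip(parse_line(l) for l in lines)
print([e.user_id for e in grouped["1.2.3.4"]])
# ['u2', 'u1']
```

Ordering logins per IP address by time is what makes it possible to ask which user IDs appear consecutively on the same address.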
12
Tracked events
13
Tracked Hosts vs. Active Hosts
14
Validation results
15
Tracked User Population
16
Signup Date
17
Email sending behavior
18
Applications – Detecting Malicious Activity
In a previous study, we identified 5.6 million malicious IDs that were used to conduct spam campaigns.
The intersection between malicious IDs and tracked IDs (220 million) is small (50k).
19
Signup Date - Revisited
20
Host Tracking – Security
21
Country Code Comparison
22
Seed Size Analysis
23
Conclusions
Although accesses provide only a limited view of the Internet, one can use other information for tracking, such as social network IDs and cookies.
It is hard to evade HostTracker and maintain attack effectiveness at the same time.