De-anonymizing the Internet Using Unreliable IDs

By Yinglian Xie, Fang Yu, and Martín Abadi
Presented by Yinzhi Cao, Ionut Trestian

Problem
A free but troublesome network. Problems we try to solve:
To what extent can we use IP addresses to track hosts?
Can we use the binding information between hosts and IP addresses to strengthen network security?

Host-Tracking Graph
Formally, we define the host-tracking graph G : H × T → IP, where H is the space of all hosts on the Internet, T is the space of time, and IP is the IP-address space.
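
As a rough illustration of the mapping G : H × T → IP, the sketch below stores, for each host, a list of time windows during which the host was bound to one IP address, and answers "which IP did host h use at time t?". The class name HostTrackingGraph, the methods add_binding and lookup, and the integer timestamps are all hypothetical choices made for this example; the slides do not specify a storage format. The sketch also assumes that one host's windows do not overlap.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Optional


class HostTrackingGraph:
    """Toy version of G : H x T -> IP, stored as per-host binding windows.

    Assumption: windows belonging to one host never overlap.
    """

    def __init__(self) -> None:
        # host -> sorted list of (start, end, ip) binding windows
        self._windows = defaultdict(list)

    def add_binding(self, host: str, start: int, end: int, ip: str) -> None:
        """Record that `host` used `ip` during the closed interval [start, end]."""
        self._windows[host].append((start, end, ip))
        self._windows[host].sort()

    def lookup(self, host: str, t: int) -> Optional[str]:
        """Return the IP bound to `host` at time `t`, or None if unknown."""
        windows = self._windows.get(host, [])
        # Index of the last window whose start time is <= t.
        i = bisect_right(windows, (t, float("inf"), "")) - 1
        if i >= 0 and windows[i][0] <= t <= windows[i][1]:
            return windows[i][2]
        return None


graph = HostTrackingGraph()
graph.add_binding("host-A", 0, 10, "192.0.2.1")
graph.add_binding("host-A", 20, 30, "198.51.100.7")
print(graph.lookup("host-A", 5))   # inside the first window
print(graph.lookup("host-A", 15))  # falls in a gap between windows
```

Returning None for times outside every window reflects that the graph is only partially known: a host is tracked only while there is evidence binding it to an address.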

Host-Tracking Graph

Host Representation
Since we lack strong authentication mechanisms, we consider leveraging application-level identifiers such as user IDs, messenger login IDs, social network IDs, or cookies.

Goals
We would like to generate two outputs:
The first is an identity-mapping table that represents the mappings from unreliable IDs to hosts.
The second is the host-tracking graph that tracks each host’s activity across different IP addresses over time.

Tracking Host Activities

Application-ID Grouping
To quantitatively compute the probability of two independent user IDs u1 and u2 appearing consecutively, let us assume that each host’s connection (hence the corresponding user login) to the Internet is a random, independent event.
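
The slide does not show the resulting formula. One simple model consistent with the independence assumption is a binomial one: if u1 logs in n1 times and u2 accounts for n2 of n total login events, each login by u1 is followed by a login from u2 with probability p2 = n2 / n. The sketch below computes the chance of seeing k consecutive appearances under that assumed model. The function names and the parameters n1, n2, n, and k are illustrative and not taken from the slides.

```python
from math import comb


def prob_consecutive(n1: int, n2: int, n: int, k: int) -> float:
    """Probability that exactly k of u1's n1 logins are followed by a u2 login.

    Assumed binomial model: each follow-up login is u2 with p2 = n2 / n.
    """
    p2 = n2 / n
    return comb(n1, k) * p2**k * (1 - p2) ** (n1 - k)


def prob_at_least(n1: int, n2: int, n: int, k: int) -> float:
    """Probability of k or more consecutive appearances under the same model."""
    return sum(prob_consecutive(n1, n2, n, j) for j in range(k, n1 + 1))


# Two rarely seen IDs among 1000 login events: how likely is it that they
# appear back to back at least 3 times purely by chance?
print(prob_at_least(n1=10, n2=5, n=1000, k=3))
```

Under this model, a very small value suggests that repeated adjacency of two IDs is unlikely to be coincidental, which is the intuition behind grouping them as belonging to one host.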

Host-Tracking Graph Construction

Resolving Inconsistency
Proxy Identification: To find both types of proxies/NATs, HostTracker gradually expands all the overlapped conflict binding windows associated with a common IP address.
Guest Removal
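
The slides give no detail on how the conflict windows are expanded. As a simplified reading of the proxy-identification step, the sketch below merges overlapping (start, end) windows that share one IP address into larger windows, which is a standard interval merge. The function name expand_conflict_windows and the integer timestamps are assumptions for illustration; this is not HostTracker's actual procedure.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # (start, end) of a binding window, inclusive


def expand_conflict_windows(windows: List[Window]) -> List[Window]:
    """Merge overlapping binding windows observed on a single IP address."""
    merged: List[Window] = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous window: grow it to cover both.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Three windows on the same IP; the first two overlap and are merged.
print(expand_conflict_windows([(5, 8), (1, 6), (10, 12)]))
```

An IP address whose merged window covers many distinct hosts at once is a natural candidate for a proxy or NAT under this reading.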

Input Dataset
A month-long user-login trace collected at a large Web-service provider in October 2008 (about 330 GB). Each entry has 3 fields:
(1) an anonymized user ID (550 million)
(2) the IP address that was used to perform the login (220 million)
(3) the timestamp of the login event
For validation: a month-long software-update log collected by a global software provider during the same period of October 2008. Each entry has:
(1) a unique hardware ID for each remote host that performs an update
(2) the IP address of the remote host
(3) the software-update timestamp

Tracked events

Tracked Hosts vs. Active Hosts

Validation results

Tracked User Population

Signup Date

Email sending behavior

Applications – Detecting Malicious Activity
In a previous study we identified 5.6 million malicious IDs that are used to conduct spam campaigns.
Intersection between malicious IDs and tracked IDs (220 million) is small (50k).

Signup Date - Revisited

Host Tracking – Security

Country Code Comparison

Seed Size Analysis

Conclusions
Although accesses provide only a limited view of the Internet, one can use other information for tracking, such as social network IDs and cookies.
It is hard to evade HostTracker and maintain attack effectiveness at the same time.
