Published by Brian Wilkinson. Modified over 11 years ago.
1
Google-based Traffic Classification
Aleksandar Kuzmanovic, Northwestern University
IEEE Computer Communications Workshop (CCW 08), October 23, 2008
http://networks.cs.northwestern.edu
2
I. Trestian Unconstrained Endpoint Profiling (Googling the Internet)
Traffic Classification
Problem: traffic classification
Current approaches: port-based, payload signatures, numerical and statistical, etc.
Our approach:
– Use information about destination IP addresses available on the Internet
A. Kuzmanovic Google-based Traffic Classification
3
Getting External Information
Huge amount of endpoint information available on the web
Can we systematically exploit search engines to harvest endpoint information available on the Internet?
Use Google!
4
Where Does the Information Come From?
Endpoint categories: Servers, Clients, P2P, Malicious
Websites run logging software and display statistics
Some popular proxy services also display logs
IP addresses of popular servers (e.g., gaming) are listed
Blacklists, banlists, and spamlists also have web interfaces
Even P2P information is available on the Internet, since the first point of contact with a P2P swarm is a publicly available IP address
5
Methodology – Web Classifier and IP Tagging
[Figure: diagram with the labels IP address (xxx.xxx.xxx.xxx), search hits (URL, hit text), website cache (domain name, keywords), Rapid Match, and IP tagging]
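The diagram's labels suggest a pipeline in which the search hits for an IP address are matched against known domain names and keywords, and the matches become tags for that address. The following is a minimal sketch of that idea, not the authors' implementation: the cache contents, the keyword list, the function names `rapid_match` and `tag_ip`, and the majority-count tagging rule are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical website cache: known domain names mapped to class tags.
WEBSITE_CACHE = {
    "example-gameservers.test": "game server",
    "example-blacklist.test": "malicious",
}

# Hypothetical keyword list mapped to class tags.
KEYWORD_TAGS = {
    "mail server": "mail server",
    "spam": "malicious",
    "counter-strike": "game server",
}


def rapid_match(url, hit_text):
    """Return the tags suggested by one search hit (URL plus hit text)."""
    tags = []
    domain = urlparse(url).hostname or ""
    if domain in WEBSITE_CACHE:
        tags.append(WEBSITE_CACHE[domain])
    text = hit_text.lower()
    for keyword, tag in KEYWORD_TAGS.items():
        if keyword in text:
            tags.append(tag)
    return tags


def tag_ip(search_hits):
    """Count tags over all hits for one IP address, most common first.

    Ranking by count is an assumed rule, chosen only for this sketch.
    """
    counts = Counter()
    for url, hit_text in search_hits:
        counts.update(rapid_match(url, hit_text))
    return [tag for tag, _ in counts.most_common()]


# Made-up search hits for a single IP address.
hits = [
    ("http://example-gameservers.test/list", "Counter-Strike servers online"),
    ("http://forum.example.test/t/1", "this host keeps sending spam"),
]
print(tag_ip(hits))
```

Keeping the per-hit matching (`rapid_match`) separate from the per-address aggregation (`tag_ip`) makes it easy to swap in a different tagging rule without touching the matching step.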
6
Traffic Classification
Tagged IP Cache:
165.124.182.169   Mail server
193.226.5.150     Website
68.87.195.25      Router
186.25.13.24      Halo server
Hold a small % of the IP addresses seen
Look at source and destination IP addresses and classify traffic
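Classification against the tagged IP cache amounts to a lookup on the two endpoints of a packet or flow. A minimal sketch follows, using the four example entries from the slide; the function name `classify` and the choice to check the destination before the source are assumptions, since the slide only says that both addresses are examined.

```python
# Tagged IP cache, using the example entries from the slide.
TAGGED_IP_CACHE = {
    "165.124.182.169": "Mail server",
    "193.226.5.150": "Website",
    "68.87.195.25": "Router",
    "186.25.13.24": "Halo server",
}


def classify(src_ip, dst_ip, cache=TAGGED_IP_CACHE):
    """Classify a packet or flow from its endpoint addresses alone.

    The destination is checked first, then the source (an assumed order).
    Returns None when neither endpoint is in the cache.
    """
    for ip in (dst_ip, src_ip):
        tag = cache.get(ip)
        if tag is not None:
            return tag
    return None


# The 10.0.0.x addresses are made-up endpoints that are not in the cache.
print(classify("10.0.0.7", "186.25.13.24"))
print(classify("165.124.182.169", "10.0.0.7"))
print(classify("10.0.0.7", "10.0.0.8"))
```

Because the lookup needs nothing but the two addresses, no payload or per-flow state is involved.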
7
Working with Sampled Traffic
When no sampling is done, UEP outperforms BLINC
UEP maintains a large classification ratio even at higher sampling rates
BLINC stays in the dark: 2% at sampling rate 100
UEP retains high classification capabilities with sampled traffic
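One plausible reading of this result is that an endpoint lookup works on each packet independently, so it still applies to whatever packets survive sampling. The toy sketch below illustrates only that mechanism. It does not reproduce the UEP or BLINC measurements; the addresses, the cache, and the helper names `sample_one_in_n` and `classification_ratio` are made up for illustration.

```python
def sample_one_in_n(packets, n):
    """Keep every n-th packet (deterministic 1-in-n sampling)."""
    return packets[::n]


def classification_ratio(packets, cache):
    """Fraction of packets with at least one endpoint in the tagged cache."""
    if not packets:
        return 0.0
    classified = sum(1 for src, dst in packets if src in cache or dst in cache)
    return classified / len(packets)


# Illustrative data only: made-up documentation and private addresses,
# not taken from any trace.
cache = {"192.0.2.1": "Website"}
packets = [("10.0.0.1", "192.0.2.1"), ("10.0.0.2", "10.0.0.3")] * 50

for n in (1, 5):
    sampled = sample_one_in_n(packets, n)
    print(n, len(sampled), classification_ratio(sampled, cache))
```

Each sampled packet is classified on its own, so the sketch needs no flow reassembly before or after sampling.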
8
Summary
Shift research focus from mining operational network traces to harnessing information that is already available on the web
Deep packet inspection and legal issues:
– Federal Wiretap Act: thou shalt not intercept the contents of communications. Violations can result in civil and criminal penalties. The worst offenses may be investigated by the FBI, Secret Service, DEA, and IRS as felony prosecutions.
– Only 2 exceptions:
   The provider protection exception
   Consent