Detecting P2P Traffic from the P2P Flow Graph
Jonghyun Kim, Khushboo Shah, Stephen Bohacek
Electrical and Computer Engineering

Outline
Introduction and Objectives
Flow Data
Identification Methods
◦ Class A-1 : Degree-Based P2P Detection
◦ Class A-2 : Known Port
◦ Class B-1 : Repeated Communication
◦ Class B-2 : P2P Port-Based Identification
◦ Class B-3 : Triggered P2P Detection
Results
Conclusion
Future Work

Introduction
Why detect P2P traffic?
◦ Helpful for network capacity planning, provisioning, traffic shaping/policing, etc.
How to detect P2P traffic?
◦ Port based
◦ Signature based
◦ Behavior based
◦ Machine learning based
◦ Host graph based

Objectives
No deep packet inspection
Simpler, but still effective
Based on the P2P flow graph

Flow Data
SIP : source IP
DIP : destination IP
SP : source port
DP : destination port
PR : protocol (tcp or udp)
ST : flow start time
EID : event ID (info for signature matching)

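The flow fields listed above can be held in a small record type. The following is a minimal sketch, not the authors' data format: the class name `Flow`, the field types, the use of seconds for ST, and the sample addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flow:
    """One flow record with the fields named on the slide (types are assumed)."""
    sip: str                   # SIP: source IP
    dip: str                   # DIP: destination IP
    sp: int                    # SP: source port
    dp: int                    # DP: destination port
    pr: str                    # PR: protocol, "tcp" or "udp"
    st: float                  # ST: flow start time (seconds assumed)
    eid: Optional[int] = None  # EID: event ID for signature matching (type assumed)


# Hypothetical example record; the addresses are documentation-only values.
flow = Flow(sip="10.0.0.1", dip="192.0.2.7", sp=51000, dp=6881, pr="tcp", st=0.0)
```

Making the record frozen keeps it hashable, so flows can be stored in sets and used as graph vertices.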
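Flow Data
Each flow has components.
[Figure: mathematical expression and pictorial view of a flow. A TCP flow from host A to host B is labeled with SIP, SP, PR, DP, and DIP, and its SYN marks the start time ST on the time axis.]
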
Identification Methods
Class A methods detect flow 1 (an initial P2P flow)
Class B methods connect flow 1 to flow 2
[Figure: P2P flow graph by methods, showing flow 1 and flow 2.]

Class A-1 : Degree-Based P2P Detection
[Figure: host A exchanging TCP and UDP flows with hosts X1 to X13 along the time axis t, with a window T on each side. The peers are grouped into in-degree hosts and out-degree hosts.]

Class A-1 : Degree-Based P2P Detection
Out-degree
In-degree
Detector
P2P active time ( ID is not considered)

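The slide's formulas did not survive extraction, so the following is only a rough sketch of a degree-based detector under stated assumptions. It assumes out-degree and in-degree are counts of distinct peer IPs seen within a window T around a reference time t, and that a host is flagged when both counts exceed the threshold R. The names `degrees` and `is_p2p_active`, and the "both exceed R" rule, are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Flow(NamedTuple):
    sip: str   # source IP
    dip: str   # destination IP
    st: float  # flow start time (seconds assumed)


def degrees(flows: Iterable[Flow], host: str, t: float, T: float) -> Tuple[int, int]:
    """Count distinct peers of `host` among flows starting within T of time t."""
    out_peers, in_peers = set(), set()
    for f in flows:
        if abs(f.st - t) > T:
            continue  # outside the time window
        if f.sip == host:
            out_peers.add(f.dip)  # host initiated the flow
        if f.dip == host:
            in_peers.add(f.sip)   # host received the flow
    return len(out_peers), len(in_peers)


def is_p2p_active(flows: Iterable[Flow], host: str, t: float, T: float, R: int) -> bool:
    """Assumed rule: flag the host when both degrees exceed the threshold R."""
    out_deg, in_deg = degrees(list(flows), host, t, T)
    return out_deg > R and in_deg > R
```

Under this rule, shrinking T or raising R flags fewer hosts, which makes the detector more conservative.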
Class A-2 : Known Port
P2P active time
Detector

Identification Methods
Done with Class A methods (flow 1)
Take a look at Class B methods (flow 2)
[Figure: P2P flow graph by methods, showing flow 1 and flow 2.]

Class B-1 : Repeated Communication between Known P2P Peers
[Figure: a TCP flow between host A and host X, followed by further flows between A and X.]

Class B-1 : Repeated Communication between Known P2P Peers
Detector given an initial P2P flow
Detector given a set of P2P flows
P2P peers =

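The idea of repeated communication can be sketched as follows. The slide's definition of the peer set was lost in extraction, so this sketch assumes that P2P peers are the unordered host pairs that appear in already-detected P2P flows, and that any other flow between such a pair is labeled P2P. The function names are hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Flow(NamedTuple):
    sip: str   # source IP
    dip: str   # destination IP
    st: float  # flow start time (seconds assumed)


def peer_pairs(p2p_flows: Iterable[Flow]) -> Set[FrozenSet[str]]:
    """Assumed peer set: unordered (SIP, DIP) pairs of known P2P flows."""
    return {frozenset((f.sip, f.dip)) for f in p2p_flows}


def repeated_communication(p2p_flows: Iterable[Flow],
                           candidates: Iterable[Flow]) -> List[Flow]:
    """Return candidate flows whose two endpoints are already known P2P peers."""
    peers = peer_pairs(p2p_flows)
    return [f for f in candidates if frozenset((f.sip, f.dip)) in peers]


known = [Flow("A", "X", 0.0)]
later = [Flow("X", "A", 5.0), Flow("A", "Y", 6.0)]
linked = repeated_communication(known, later)  # only the X-to-A flow matches
```

Using unordered pairs means a reply flow from X to A is linked to an earlier flow from A to X, which is a design choice of this sketch rather than something the slide states.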
Class B-2 : P2P Port Identification and Port-Based P2P Detection
[Figure: host A exchanging TCP and UDP flows with hosts X1, X2, X3, X7, X10, X11, and X13.]

Class B-2 : P2P Port Identification and Port-Based P2P Detection
[Figure: incoming and outgoing TCP or UDP flows at one IP within a time window T on each side, identifying a P2P port.]

Class B-2 : P2P Port Identification and Port-Based P2P Detection
Detector given a P2P flow

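A port-based detector of this kind might look like the sketch below. The slide's exact rule is not recoverable, so the sketch assumes that both (IP, port) endpoints of a known P2P flow are treated as P2P ports, and that any TCP or UDP flow starting within T of that flow and touching one of those endpoints is labeled P2P. The names `endpoints` and `port_based` are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    sip: str   # source IP
    sp: int    # source port
    dip: str   # destination IP
    dp: int    # destination port
    pr: str    # protocol, "tcp" or "udp"
    st: float  # flow start time (seconds assumed)


def endpoints(f: Flow) -> Set[Tuple[str, int]]:
    """Both (IP, port) endpoints of a flow."""
    return {(f.sip, f.sp), (f.dip, f.dp)}


def port_based(p2p_flow: Flow, candidates: Iterable[Flow], T: float) -> List[Flow]:
    """Candidates within T of the P2P flow that share one of its endpoints."""
    p2p_ports = endpoints(p2p_flow)  # assumption: both endpoints count as P2P ports
    return [
        f for f in candidates
        if abs(f.st - p2p_flow.st) <= T and endpoints(f) & p2p_ports
    ]


seed = Flow("A", 6881, "X1", 40000, "tcp", 100.0)
others = [
    Flow("X2", 50000, "A", 6881, "udp", 130.0),  # same IP and port, inside T
    Flow("X3", 50001, "A", 80, "tcp", 120.0),    # same IP, different port
    Flow("X4", 50002, "A", 6881, "tcp", 500.0),  # same IP and port, outside T
]
matched = port_based(seed, others, T=60.0)
```

The protocol field is ignored when matching, in line with the figure's "TCP or UDP" labels.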
Class B-3 : Triggered P2P Detection
Nearby flows tend to be P2P flows
[Figure: flows between host A and host X along the time axis, with a 1 sec window.]

Class B-3 : Triggered P2P Detection
Detector given a P2P flow
P2P peers =

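Triggered detection relies on the observation that flows near a P2P flow tend to be P2P flows. The sketch below assumes that "nearby" means starting within 1 sec of a known P2P flow and involving the same host. The function name `triggered`, the `host` argument, and the symmetric window are assumptions.

```python
from typing import Iterable, List, NamedTuple


class Flow(NamedTuple):
    sip: str   # source IP
    dip: str   # destination IP
    st: float  # flow start time in seconds


def triggered(p2p_flow: Flow, host: str, candidates: Iterable[Flow],
              window: float = 1.0) -> List[Flow]:
    """Candidates that involve `host` and start within `window` of the P2P flow."""
    return [
        f for f in candidates
        if host in (f.sip, f.dip) and abs(f.st - p2p_flow.st) <= window
    ]


seed = Flow("A", "X", 10.0)
nearby = [
    Flow("A", "Y", 10.4),  # same host, 0.4 s after the seed
    Flow("Z", "A", 9.2),   # same host, 0.8 s before the seed
    Flow("A", "W", 12.0),  # same host, too late
    Flow("B", "Y", 10.1),  # close in time, but a different host
]
hits = triggered(seed, "A", nearby)
```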
Summary
Class A :
T : time window offset
R : threshold for # of peers connected
T ↓, R ↑ : Conservativeness ↑

Summary
Class A :
Class B :
: K-th iteration
: until convergence

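Iterating the Class B methods until convergence can be sketched as a fixed-point loop. This is an illustration only: `link` is a hypothetical stand-in for the Class B methods (given one P2P flow, it returns the flows it connects to), and the seed set stands for the flows found by the Class A methods.

```python
from typing import Callable, Hashable, Iterable, Set


def expand_until_convergence(
    seeds: Iterable[Hashable],
    link: Callable[[Hashable], Iterable[Hashable]],
) -> Set[Hashable]:
    """Grow the detected set from the seeds until an iteration adds nothing new."""
    detected = set(seeds)
    frontier = set(seeds)  # flows found in the previous iteration
    while frontier:
        reached = set()
        for flow in frontier:
            reached.update(link(flow))  # flows connected to a known P2P flow
        frontier = reached - detected   # keep only newly detected flows
        detected |= frontier
    return detected


# Toy flow graph: flow ids stand in for flows, edges for Class B links.
neighbors = {1: [2], 2: [3], 3: [], 4: [5], 5: []}
found = expand_until_convergence({1}, lambda f: neighbors[f])
```

Each pass only expands the flows added in the previous pass, so every flow is expanded once.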
Results : Number of P2P Flows Detected
[Figure: fraction of flows and # of flows (x 10^7) detected for each combination of methods (C1, C2, ...).]
KPF 480, 250 AC 15,100 GH ∞ TGH ∞

Results : Vertex Degree
[Figure: a single P2P flow F1 and flows F2 to F8 linked by GH 1; Degree = 8.]
type1 = any
type2 = UDP
type3 = TCP, DIP = internal IP
type4 = TCP, DIP = external IP

Results : Vertex Degree
[Figure: CCDF of vertex degree for type1 to type4.]
type1 = any
type2 = UDP
type3 = TCP, DIP = internal IP
type4 = TCP, DIP = external IP

Results : Vertex Degree
[Figure: a single P2P flow with endpoint ports :4226 and :6881.]

Results : Large Connected Component
[Figure: flows reached from a single P2P flow, with links labeled by GH 1 and by GH 2.]

Results : Large Connected Component
[Table: Type, Mean, and Median of # of flows reachable; the type 1 mean is 49,476,748.]
[Figure: CCDF of # of flows reachable for type1 to type4.]
type1 = any
type2 = UDP
type3 = TCP, DIP = internal IP
type4 = TCP, DIP = external IP

Visualization of P2P Flow Graph
TA link : small connected components
GH link : large connected component

Conclusion
Even when Class A methods detect only a small number of P2P flows because their parameters are set conservatively, the recursive Class B methods identify almost all of the remaining P2P flows.
The P2P flow graph contains a large connected component (LCC), so identifying a single P2P flow in the LCC leads to detection of all flows in the LCC.

Future Work
Real-time identification
Complexity analysis

Thanks
Port white list
well-known port
NFS
MMS
Symantec AntiVirus
msft-gc
World of Warcraft
Yahoo! Messenger
AOL Instant Messenger
NAT Port Mapping Protocol
HTTP alternate

Known P2P port
BitTorrent : 6881~6889, 6969, 2710
Gnutella : 6346~6349
Edonkey : 2323, 3306, 4242, 4500, 4501, 4661~4674, 4677, 4678, 7778
FastTrack : 1214, 1215, 1331
Freenet : 19114, 8081
Soulseek : 2234, 5534
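A known-port lookup over this list could be sketched as follows. The port numbers are copied from the slide; the helper names `parse_ports` and `known_p2p_app`, and the choice to check the destination port before the source port, are assumptions. The sketch ignores the port white list and the P2P active time, so it is not the full Class A-2 detector.

```python
from typing import Dict, Optional, Set

# Known P2P ports as listed on the slide ("a~b" denotes an inclusive range).
KNOWN_P2P_PORTS: Dict[str, str] = {
    "BitTorrent": "6881~6889, 6969, 2710",
    "Gnutella": "6346~6349",
    "Edonkey": "2323, 3306, 4242, 4500, 4501, 4661~4674, 4677, 4678, 7778",
    "FastTrack": "1214, 1215, 1331",
    "Freenet": "19114, 8081",
    "Soulseek": "2234, 5534",
}


def parse_ports(spec: str) -> Set[int]:
    """Expand a spec such as '6881~6889, 6969' into a set of port numbers."""
    ports: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "~" in part:
            low, high = part.split("~")
            ports.update(range(int(low), int(high) + 1))
        else:
            ports.add(int(part))
    return ports


# Reverse index: port number -> application name.
PORT_TO_APP: Dict[int, str] = {
    port: app for app, spec in KNOWN_P2P_PORTS.items() for port in parse_ports(spec)
}


def known_p2p_app(sp: int, dp: int) -> Optional[str]:
    """Return the application whose known port matches DP or SP, else None."""
    return PORT_TO_APP.get(dp) or PORT_TO_APP.get(sp)
```

For example, `known_p2p_app(51000, 6881)` returns `"BitTorrent"`, while a flow between ports 51000 and 443 returns `None`.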