Published by Gerald Latch. Modified over 9 years ago.
1
KISS: Stochastic Packet Inspection for UDP Traffic Classification
Dario Bonfiglio, Alessandro Finamore, Marco Mellia, Michela Meo, Dario Rossi
2
Traffic classification
Internet Service Provider Look at the packets… Tell me what protocol and/or application generated them
3
Deep Packet Inspection (DPI)
Typical approach: Deep Packet Inspection (DPI). The Internet Service Provider checks the port and payload of each flow against known signatures:
  PPLive: port ?, payload ?
  Bittorrent: port ?, payload “bittorrent”
  Gtalk: port ?, payload RTP protocol
  eMule: port 4662/4672, payload E4/E5
4
Deep Packet Inspection (DPI)
It fails more and more:
  P2P
  Encryption
  Proprietary solutions
  Many different flavours
(Same figure as on the previous slide: port and payload signatures for PPLive, Bittorrent, Gtalk and eMule.)
5
Possible Solution: Behavioral Classifier
Phase 1 (Feature): statistical characterization of traffic (given source).
Phase 2 (Decision): look for the behaviour of unknown traffic and assign the class that best fits it.
Phase 3 (Verify): check for possible classification mistakes.
Diagram labels: Traffic (Known), (Training), (Operation).
6
Phase 1 : Statistical characterization
Diagram: Traffic (Known), Feature, Decision, Verify.
Statistical characterization of bits in a flow: χ² test.
Do NOT look at the SEMANTIC and TIMING … but rather look at the protocol FORMAT.
7
Chunking and χ²
Skip the UDP header, take the first N payload bytes and split them into C chunks of b bits each. For every chunk, compare the observed distribution of its values with the expected (uniform) distribution by means of the χ² test. The result is a vector of statistics [χ²_1, …, χ²_C]. The χ² provides an implicit measure of entropy or randomness of the payload.
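The chunking step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `split_chunks` and `chi2_vector` are hypothetical, one observation per packet is assumed, and the defaults (12 bytes, 4-bit chunks) are borrowed from the example given later in the deck.

```python
from collections import Counter


def split_chunks(payload, n_bytes=12, bits=4):
    """Split the first n_bytes of a UDP payload into chunks of `bits` bits."""
    if 8 % bits != 0:
        raise ValueError("this sketch only handles chunk sizes that divide 8")
    if len(payload) < n_bytes:
        raise ValueError("payload shorter than n_bytes")
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    chunks = []
    for byte in payload[:n_bytes]:
        # Emit the most significant chunk of each byte first.
        for k in reversed(range(per_byte)):
            chunks.append((byte >> (k * bits)) & mask)
    return chunks


def chi2_vector(payloads, n_bytes=12, bits=4):
    """Return [chi2_1, ..., chi2_C] computed over the packets of one flow."""
    rows = [split_chunks(p, n_bytes, bits) for p in payloads]
    n_values = 1 << bits
    expected = len(rows) / n_values  # uniform expectation per chunk value
    vector = []
    # zip(*rows) walks the same chunk position across all packets.
    for column in zip(*rows):
        observed = Counter(column)
        chi2 = sum((observed.get(v, 0) - expected) ** 2 / expected
                   for v in range(n_values))
        vector.append(chi2)
    return vector
```

Calling `chi2_vector` on the payloads of a single flow returns one χ² value per chunk position, so 12 bytes with 4-bit chunks give a vector of 24 values.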
8
Chi square statistics
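The formula on this slide did not survive extraction. For reference, the standard Pearson χ² statistic of one chunk against a uniform expectation can be written as below. The symbols are assumptions introduced here: g indexes the chunk, n is the number of observed packets (one observation per packet), and O_i^(g) is how many times chunk g took the value i.

```latex
\chi^2_g \;=\; \sum_{i=0}^{2^b-1} \frac{\bigl(O^{(g)}_i - E\bigr)^2}{E},
\qquad E = \frac{n}{2^b},
\qquad g = 1, \dots, C
```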
9
Chi square statistics
24 chunks == 12 payload bytes, 4 bits per chunk.
Figure: χ² of each chunk over time. The chunks are labelled Deterministic, Counter or Random according to the behaviour of the underlying protocol field.
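A toy experiment shows why the χ² value separates these field types. It is a sketch under assumptions: the three fields are synthetic, `n = 160` packets is an arbitrary choice, and the counter is modelled by its low-order 4 bits only.

```python
import random
from collections import Counter


def chi2_uniform(values, bits=4):
    """Pearson chi-square of `values` against a uniform distribution."""
    n_values = 1 << bits
    expected = len(values) / n_values
    observed = Counter(values)
    return sum((observed.get(v, 0) - expected) ** 2 / expected
               for v in range(n_values))


n = 160                      # hypothetical number of observed packets
rng = random.Random(0)       # fixed seed so the run is repeatable

deterministic = [0x8] * n                          # constant field
counter_low = [i % 16 for i in range(n)]           # low nibble of a counter
random_field = [rng.randrange(16) for _ in range(n)]  # e.g. encrypted bits

chi2_det = chi2_uniform(deterministic)  # equals 15 * n for a constant field
chi2_cnt = chi2_uniform(counter_low)    # 0: every value appears n / 16 times
chi2_rnd = chi2_uniform(random_field)   # fluctuates around 2**4 - 1 = 15

print(chi2_det, chi2_cnt, chi2_rnd)
```

A constant 4-bit field gives χ² = 15·n, which grows with the number of packets, while the low bits of a counter are flatter than chance and a random field stays close to the number of degrees of freedom (15).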
10
Protocol format as seen from the χ²
Figure panels: RTP, eMule, DNS.
11
Phase 2 : Decision process
Diagram: Traffic (Known), Feature, Decision, Verify.
Statistical characterization of bits in a flow: χ² test.
Decision process: minimum distance / maximum likelihood.
12
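C-dimension space
The vector [χ²_1, …, χ²_C] is a point in a C-dimensional hyperspace (the figure shows the axes χ²_i and χ²_j), which is divided into classification regions, one per class. Which class does my point belong to? Two decision techniques are considered: Euclidean Distance and Support Vector Machine.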
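The minimum-distance option can be sketched as a nearest-centroid rule. This is a minimal illustration, not the authors' classifier: the slides do not say how each region is bounded, so the `max_distance` threshold, the use of per-class centroids and all function names are assumptions. The SVM variant is not sketched here.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two points of the same dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(training):
    """Map each class label to the centroid of its training vectors."""
    return {label: centroid(vectors) for label, vectors in training.items()}


def classify(point, centroids, max_distance):
    """Return the closest class, or "other" if it is farther than max_distance."""
    label, distance = min(
        ((lbl, euclidean(point, c)) for lbl, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if distance <= max_distance else "other"


# Toy 2-dimensional vectors; real vectors would have C components.
centroids = train({
    "rtp": [[100.0, 0.0], [120.0, 10.0]],
    "dns": [[0.0, 200.0], [10.0, 220.0]],
})
print(classify([105.0, 8.0], centroids, max_distance=50.0))
print(classify([500.0, 500.0], centroids, max_distance=50.0))
```

The threshold is what lets a point be rejected as "other" when it is far from every known class; without it, every point would be assigned to some known protocol.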
13
Example
14
Phase 3 : Performance
Diagram: Traffic (Known), Feature (Phase 1), Decision (Phase 2), Verify (Phase 3).
Statistical characterization of bits in a flow: χ² test.
Decision process: minimum distance / maximum likelihood.
Performance evaluation: how accurate is all this?
15
Real traffic traces
Fastweb trace: 1 day long, 20 GByte of UDP traffic.
Known traffic: RTP, eMule, DNS (> 90% of tot. volume), labelled by the oracle (manual DPI).
Other: complement of known traffic.
Training: known + other.
Known traffic is used to measure false negatives; unknown traffic is used to measure false positives.
16
Definition of false positive/negative
The oracle (DPI) splits the traffic into the known classes (eMule, RTP, DNS) and other.
Classifying “known” traffic with KISS yields true positives or false negatives.
Classifying “other” traffic with KISS yields true negatives or false positives.
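These definitions translate directly into a small counting routine. The sketch below is illustrative: the function name and label strings are hypothetical, and counting a known flow assigned to the wrong known class as a false negative is an assumption, since the slide does not cover that case.

```python
def error_rates(oracle, predicted, known=("eMule", "RTP", "DNS")):
    """Return (false-negative %, false-positive %) for paired label lists."""
    tp = fn = fp = tn = 0
    for truth, guess in zip(oracle, predicted):
        if truth in known:
            # Known traffic: either recognised correctly or missed.
            if guess == truth:
                tp += 1
            else:
                fn += 1  # assumption: a wrong known class also counts as FN
        else:
            # Other traffic: either rejected or wrongly accepted.
            if guess in known:
                fp += 1
            else:
                tn += 1
    fn_pct = 100.0 * fn / (tp + fn) if tp + fn else 0.0
    fp_pct = 100.0 * fp / (fp + tn) if fp + tn else 0.0
    return fn_pct, fp_pct


oracle = ["RTP", "RTP", "DNS", "eMule", "Other", "Other", "Other", "Other"]
kiss = ["RTP", "Other", "DNS", "eMule", "Other", "RTP", "Other", "Other"]
print(error_rates(oracle, kiss))
```

False negatives are normalised by the amount of known traffic and false positives by the amount of other traffic, which matches the two separate percentages reported in the results.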
17
Results (local)
Known traffic (False Neg.) [%]
         Euclidean Distance      SVM
         Case A    Case B        Case A    Case B
Rtp      0.08      0.23          -         0.05
Edk      13.03     7.97          0.98      0.54
Dns      6.57      19.19         0.12      2.14
Other (False Pos.) [%]
         Euclidean Distance      SVM
         Case A    Case B        Case A    Case B
other    13.6      17.01         -         0.18
18
Real traffic trace
RTP errors are oracle mistakes (the oracle does not identify RTP v1).
DNS errors are due to an impure training set (for the oracle, all port 53 traffic is DNS).
EDK errors are (maybe) Xbox Live (proper training for “other”).
With SVM, FN are always below 3%!!!
19
P2P-TV applications
P2P-TV applications are becoming popular.
They heavily rely on UDP as the transport protocol.
They are based on proprietary protocols.
They are evolving over time very quickly.
           Tot. Vectors   % FN
Joost      33514          1.9
PPLive     84452          -
SopCast    84473          0.1
Tvants     27184
           Tot. Vectors   % FP
Other      1.2M           0.3
20
Pros and Cons
KISS is good because…
  Blind approach
  Completely automated
  Works with many protocols
  Works even with small training
  Statistics can start at any point
  Robust w.r.t. packet drops
  Bypasses some DPI problems
but…
  Learn (other) properly
  Needs volumes of traffic
  May require memory (for now)
  Only UDP (for now)
  Only offline (for now)
21
Papers
D. Bonfiglio, M. Mellia, M. Meo, D. Rossi, P. Tofanelli, “Revealing skype traffic: when randomness plays with you”, ACM SIGCOMM Computer Communication Review, Vol. 37, No. 4, October 2007.
D. Rossi, M. Mellia, M. Meo, “Following Skype Signaling Footsteps”, IT-NEWS QoS-IP, The Fourth International Workshop on QoS in Multiservice IP Networks, Venice, February.
D. Rossi, M. Mellia, M. Meo, “A Detailed Measurement of Skype Network Traffic”, 7th International Workshop on Peer-to-Peer Systems (IPTPS '08), Tampa Bay, Florida, 25-26/2/2008.
D. Bonfiglio, M. Mellia, M. Meo, N. Ritacca, D. Rossi, “Tracking Down Skype Traffic”, IEEE Infocom, Phoenix, AZ, 15,17 April 2008.
D. Bonfiglio, A. Finamore, M. Mellia, M. Meo, D. Rossi, “KISS: Stochastic Packet Inspection for UDP Traffic Classification”, submitted to InfoCom09.