Towards Understanding Network Traffic through Whole Packet Analysis. Abdulrahman Hijazi, Hajime Inoue, Ashraf Matrawy, P.C. van Oorschot, Anil Somayaji.


1 Towards Understanding Network Traffic through Whole Packet Analysis
Abdulrahman Hijazi, Hajime Inoue, Ashraf Matrawy, P.C. van Oorschot, Anil Somayaji

2 Agenda
- Introduction
- Project in a nutshell
- ADHIC
- NetADHICT
  - Overview
  - In progress
- Results
  - Performance
  - Multimedia & encrypted traffic
  - P2P
  - No-headers
- Limitations
- Applications

3 Introduction
- Complexity of modern computer networks
- Common network analysis strategies
  - Predetermined classifiers (port, address, …)
  - Protocol dissectors (Wireshark, …)
- High-level view of network structure through packet clustering
  - Header information
  - Payload
    - Better distinguishes P2P, worms, …
    - Performance issue

4 Introduction
- We developed a packet clustering technique that:
  - finds semantically interesting clusters
  - adapts to the changing nature of traffic patterns
  - does not require explicit a priori information
  - does not rely on any specific fields in the packets
  - can run in time sub-linear in the packet length
- Two innovations:
  - (p,n)-grams: n-byte substrings at byte offset p
  - ADHIC (Approximate Divisive HIerarchical Clustering)
- Two key features:
  - Network traffic redundancy
  - Optimal clustering is not required
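The (p,n)-gram definition above can be illustrated with a short sketch. The names `pn_grams` and `matches` are hypothetical and chosen for this example only; this is not NetADHICT's code, just one plausible reading of "n-byte substring at byte offset p".

```python
from typing import Iterator, Tuple

# A (p,n)-gram is modelled here as (offset p, the n bytes found at that offset).
PNGram = Tuple[int, bytes]


def pn_grams(packet: bytes, n: int = 2) -> Iterator[PNGram]:
    """Yield every (p,n)-gram of a packet: the n bytes starting at offset p."""
    for p in range(len(packet) - n + 1):
        yield (p, packet[p:p + n])


def matches(packet: bytes, gram: PNGram) -> bool:
    """Return True if the packet carries the gram's bytes at the gram's offset."""
    p, pattern = gram
    # A slice past the end of a short packet is shorter than the pattern,
    # so the comparison is simply False in that case.
    return packet[p:p + len(pattern)] == pattern


if __name__ == "__main__":
    packet = bytes([0x45, 0x00, 0x2C, 0x06])
    print(list(pn_grams(packet)))
    print(matches(packet, (2, b"\x2c\x06")))
```

Note that `matches` reads only n bytes at a fixed offset, whatever the packet length; enumerating all grams with `pn_grams` is a different, linear-time operation.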

5 Project in a Nutshell
- NetADHICT: our implementation of ADHIC
- It can analyze data as it is received by a network interface, or offline using libpcap files.
- Observed data is used to generate and update a (p,n)-gram decision tree.
- This tree serves as a classifier tree reflecting the high-level structure of network traffic at a given time.
- The deduced structure corresponds to the typical divisions of network traffic (TCP vs. UDP; web vs. non-web), and is arrived at using automatically generated, context-related (p,n)-grams.

6 ADHIC
- Using a sampled measure of similarity, ADHIC recursively subdivides traffic into binary classes until the resulting traffic is:
  - below a certain threshold, or
  - too similar or too dissimilar
- The resulting binary tree consists of:
  - internal decision nodes with one (p,n)-gram per node
  - leaf nodes that constitute the final clusters
- The classification rule is based on matching (p,n)-grams.
- The traffic at each terminal cluster is the result of a Boolean equation constructed by following the path from root to leaf.
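The tree structure described on this slide can be sketched as follows. The `Node` class, the `classify` function, the leaf labels, and the two hand-picked grams are all assumptions made for illustration; in ADHIC the grams are learned from traffic and leaves are unlabeled clusters.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PNGram = Tuple[int, bytes]  # (offset p, n-byte pattern)


@dataclass
class Node:
    """Internal nodes hold one (p,n)-gram; leaves hold only a cluster label."""
    gram: Optional[PNGram] = None
    match: Optional["Node"] = None      # child taken when the gram matches
    no_match: Optional["Node"] = None   # child taken otherwise
    label: str = ""                     # hypothetical name, used at leaves only


def classify(root: Node, packet: bytes) -> Tuple[str, List[Tuple[PNGram, bool]]]:
    """Walk from root to leaf; return the leaf label and the Boolean path taken."""
    node, path = root, []
    while node.gram is not None:
        p, pattern = node.gram
        hit = packet[p:p + len(pattern)] == pattern
        path.append((node.gram, hit))
        node = node.match if hit else node.no_match
    return node.label, path


# Hand-built tree for illustration: EtherType 0x0800 at frame offset 12
# (IPv4 in Ethernet II), then IPv4 protocol byte 0x06 (TCP) at offset 23.
tree = Node(
    gram=(12, b"\x08\x00"),
    match=Node(gram=(23, b"\x06"),
               match=Node(label="ip-tcp"),
               no_match=Node(label="ip-other")),
    no_match=Node(label="non-ip"),
)

if __name__ == "__main__":
    tcp_frame = bytes(12) + b"\x08\x00" + bytes(9) + b"\x06" + bytes(10)
    print(classify(tree, tcp_frame))
```

The returned path is the conjunction of matched and unmatched grams from root to leaf, which corresponds to the Boolean equation for that cluster.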

7 ADHIC

8 ADHIC
- ADHIC adapts to changing traffic by performing two tree operations:
  - Splitting, when:
    - a leaf contains more than a preset threshold of traffic, and
    - there is a (p,n)-gram that matches a percentage of that traffic within a certain range (e.g. 40%-60%).
  - Deletion, when:
    - a subtree has not matched a minimum threshold of traffic.
- Both statistics are measured over a preset period of time called the maturation window.
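The splitting condition can be sketched as a search for a (p,n)-gram whose match rate over a leaf's packets falls in the 40%-60% band. This is a simplified illustration: `find_split_gram` is a hypothetical name, it counts every gram exhaustively, and it prefers the gram closest to 50%, whereas ADHIC is described as using a sampled (approximate) measure whose details are not given in these slides.

```python
from collections import Counter
from typing import List, Optional, Tuple

PNGram = Tuple[int, bytes]


def find_split_gram(packets: List[bytes], n: int = 2,
                    low: float = 0.40, high: float = 0.60) -> Optional[PNGram]:
    """Return a (p,n)-gram matching between `low` and `high` of the packets.

    Among the candidates, the one closest to a 50/50 split is chosen
    (an assumption of this sketch). Returns None if no gram qualifies.
    """
    if not packets:
        return None
    counts: Counter = Counter()
    for pkt in packets:
        # Each offset occurs once per packet, so a packet adds at most 1 per gram.
        for p in range(len(pkt) - n + 1):
            counts[(p, pkt[p:p + n])] += 1

    best, best_dist = None, None
    for gram, count in counts.items():
        frac = count / len(packets)
        if low <= frac <= high:
            dist = abs(frac - 0.5)
            if best is None or dist < best_dist:
                best, best_dist = gram, dist
    return best


if __name__ == "__main__":
    leaf = [b"\x01\x11", b"\x01\x11", b"\x01\x22", b"\x01\x33"]
    print(find_split_gram(leaf, n=1))
```

In the example, the byte at offset 0 is shared by every packet (100%) and so cannot split the leaf, while the byte 0x11 at offset 1 matches exactly half of the packets.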

9 NetADHICT: Overview
- Licensed under the GNU GPL
- It usually starts by separating IP from non-IP traffic; nodes lower in the tree then sequester specific protocols.
- NetADHICT segregates packets by protocol and by other characteristics (e.g. length).
- (p,n)-grams corresponding to special header or payload fields allow unconventional classification measures.
- NetADHICT was tested against four week-long traces from our lab (CCSL).

10 NetADHICT: Overview

11 NetADHICT: Overview

12 NetADHICT: In progress

13 NetADHICT: In progress

14 NetADHICT: In progress

15 NetADHICT: In progress
- Examples of interesting segregation through (p,n)-grams:
  - (51, 0x00 0x00): part of ARP’s Ethernet frame trailer
  - (64, 0x00 0x0f): part of EIGRP’s non-IP header
  - (22, 0x2c 0x06) and (54, 0x01 0x01): part of IMAPS’s TTL & protocol ID and “NOP, NOP” options field, respectively
  - (37, 0xc1 0x0c): HSRP’s 2nd byte of dest port & 1st byte of UDP length
  - (174, 0x00 0x00): part of NetBIOS-DGM’s payload
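To see why offset 22 lands on the TTL and offset 37 on the destination port, the offsets can be translated into per-layer positions. The sketch below assumes that offsets are counted from the start of an untagged Ethernet II frame (14-byte header) carrying IPv4 with no IP options (20-byte header); the function name `locate` is hypothetical, and the mapping does not apply to non-IP frames such as the ARP and EIGRP examples.

```python
ETH_HEADER_LEN = 14   # assumption: untagged Ethernet II
IPV4_HEADER_LEN = 20  # assumption: IPv4 header without options


def locate(p: int) -> tuple:
    """Map a frame offset to (layer, offset within that layer's header/data)."""
    if p < 0:
        raise ValueError("offset must be non-negative")
    if p < ETH_HEADER_LEN:
        return ("ethernet", p)
    if p < ETH_HEADER_LEN + IPV4_HEADER_LEN:
        return ("ipv4", p - ETH_HEADER_LEN)
    return ("transport", p - ETH_HEADER_LEN - IPV4_HEADER_LEN)


if __name__ == "__main__":
    for offset in (22, 23, 37, 38, 54):
        print(offset, locate(offset))
```

Under these assumptions, offsets 22 and 23 are bytes 8 and 9 of the IPv4 header (TTL and protocol), offsets 37 and 38 are bytes 3 and 4 of the UDP header (second byte of the destination port, first byte of the length), and offset 54 is the first byte after a 20-byte TCP header, where TCP options begin.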

16 Results: Performance
- Single-protocol cluster: a cluster that the traditional classifier reports as containing packets of only one protocol.

17 Results: Performance
- NetADHICT does well with most traffic types.
- Structured packets (e.g. non-IP, UDP, …) are segregated through header and/or payload (p,n)-grams.
- Unstructured packets (e.g. TCP) are segregated mostly through header (p,n)-grams, including fields such as the five-tuple and others (e.g. packet length, QoS field, TTL, options, padding, …).
- NetADHICT also clusters together packets of the same protocol running on different port numbers (e.g. HTTP on 80 and 8080).

18 Results: Multimedia & Encrypted Traffic
- In addition, multimedia (e.g. MS-Streaming) and encrypted (e.g. SSH, HTTPS, IMAPS) traffic are both:
  - Segregated from unencrypted traffic: NetADHICT either segregates them through header (p,n)-grams or shunts them to default clusters.
  - Distinguished from each other: NetADHICT finds suitable header (p,n)-grams to separate different kinds of encrypted traffic from each other.

19 Results: P2P
- Many P2P applications use constantly changing, non-standard port numbers within the same network session.
- In all the experiments done, NetADHICT was able to:
  - cluster the P2P UDP tracker packets together through a non-IP-header (p,n)-gram.
  - cluster all other related TCP packets (data and control) into the tree’s global default cluster and its adjacent cluster.
- Even when the port used by all the P2P packets was maliciously changed to the standard HTTP port number (i.e. 80), packets were clustered exactly as before.

20 Results: P2P

21 Results: P2P
- Two observations:
  - NetADHICT rarely uses ports to cluster traffic.
  - NetADHICT managed to segregate P2P traffic by characterizing other network traffic as having patterns that were absent in the P2P traffic.
- Conclusion: So long as most well-behaved traffic can be appropriately clustered, evasive protocols can be identified.

22 Results: No-Headers
- NetADHICT can also do semantically meaningful clustering even without looking at the IP header (first 38 bytes).
- Although performance is occasionally degraded, decision trees built with no header information are qualitatively similar to those built using all packet information.
- The main difference is NetADHICT’s inability to separate different kinds of encrypted traffic when headers are excluded.

23 Results: No-Headers

24 Limitations
- Analysis challenge: Difficulty (work and time) in analyzing clusters both manually and automatically
- Privacy issues: Our algorithm looks at both headers and payloads
- Sophisticated design: Large configuration space, making it difficult to choose an optimal set of parameters

25 Applications
- Network administration: understand the overall structure of network traffic and assist in monitoring its changes.
- Network security: isolate malicious traffic from normal traffic (with no outdated signatures, long training, or false alarms).
- Quality of Service: actively manage bandwidth by giving each leaf cluster an equal share of the bandwidth.
- Other applications: ADHIC has no built-in knowledge of networking!

26 Thank you

