Download presentation
Presentation is loading. Please wait.
Published byKimberly Dawson Modified over 9 years ago
1
11 Packet Sampling for Worm and Botnet Detection in TCP Connections Reporter: 林佳宜 Email: M98570015@mail.ntou.edu.tw 2010/10/25
2
References Lothar Braun, Gerhard Münz, Georg Carle. "Packet Sampling for Worm and Botnet Detection in TCP Connections." In Proc. of IEEE/IFIP Network Operations and Management Symposium (NOMS) 2010 2
3
3 Outline Introduction Sampling algorithm TCP connection Bloom filters Conclusion
4
Introduction Signature-based network intrusion detection systems NIDS Ex: Snort packet headers and packet payload are checked The permanent growth of network traffic single system are often not sufficient to analyse the entire traffic random packet is likely loss Present a new “sampling algorithm” selects packets carrying the first N payload bytes of every TCP connection can deployment in high-speed networks the required amount of memory is constant 4
5
Previous work Detect First 1000 bytes of payload Anomalous Payload-Based Network Intrusion Detection. (RAID) 2004 Classify a flow as belonging to a P2P application after looking at the first ten packets Accurate, Scalable In-Network Identification of P2P Traffic Using Application Signatures (www) 2004 Most application signatures appear within the first five packets of a flow A Hybrid Approach for Accurate Application Traffic Identification (E2EMon) 2006 5
6
TCP connections For NIDS performing session analysis the traffic volume in both direction is usually not symmetric separate limits for both directions result in an earlier cut-off of one direction IDS cannot properly continue session analysis as one direction of traffic is missing Beginning of a flow is that first packets sufficient information to classify the entire flow as harmful or benign 6
7
Analysis is based on traffic 93 different worms and bots run in a controlled environment The detection is done with two rule sets shipped with Snort 2.8.4.1 emergingthreats.net (on June 11th 2009) 4030 alarms are raised 3511 by TCP packets 451 by UDP packets 68 by ICMP packets The TCP alarms constitute the majority of alarms. 7
8
TCP Alarms 95 alarms are raised by the shipped 3416 by setemergingthreats rule set 8
9
Calculate the position We calculate the position within the TCP connection at which an alarm is raised using TCP sequence acknowledgement numbers TCP payload length 30% of all alarms are only need the handshake packets 96% are found within less than 3kB of payload The last alarms are found at about 682kB belong to a generic shellcode rule 9
10
Number of payload bytes 10
11
Accurate TCP Connection Tracking TCP connections are identified by the addresses and ports Start and end of a regular TCP connection are defined SYN, FIN, and RST packets that are exchanged a buffer is needed to withhold a packet Exceptional situations resulting in an increased number of TCP connections TCP port and network scans SYN flooding attacks 11
12
Simplified TCP Connection Tracking[1/2] The first simplification concerns the detection of connection establishments SYN packet shortly followed by a second packet without SYN flag Both packets have to be exchanged between the same endpoints identified by tuples of IP address and port number The two packets have to be observed within a small time interval determined three seconds to be an appropriate value 12
13
Simplified TCP Connection Tracking[2/2] The second simplification concerns the connection reassembly: do not perform any packet reordering nor do we remove duplicated packets Leave these tasks to the subsequent packet analysis step e.g., an NIDS 13
14
Selecting and dropping packets Sample those packets containing the first N bytes of payload They count the payload lengths in the order of packet arrival But without regarding sequence numbers In the presence of packet reordering and duplicates selecting packets which should not be sampled (false positives) dropping packets which should be sampled (false negatives). 14
15
Bloom filters Bloom filter is a probabilistic data structure composed of a bit array and a set of hash functions Initially, every bit in the array is set to zero If a new element is tobe inserted into the set hash functions must be calculated for this element All corresponding bits are then set to one False positives are possible due to collisions in the hash functions query id true, but not in the set 15
16
Memory consumption calculated depending on the number of used hash functions (l) the collision probability of the hash functions (p) the number of stored elements (k) 16
17
Bloom filters variants They use two Bloom filters variants summarized The first variant is called Time-out Bloom filter[1] Its array is composed of timestamps instead of bits. The second Bloom filter variant is called Count-Min Sketch (CMS)[2] 17
18
Sampling algorithm Selects the first packets of a TCP connection until a maximum of N bytes of payload has been exported uses two Time-out Bloom filters one CMS to store the required connection states Each TCP connection is identified source IP address (SA), destination IP address (DA), source port (SP), and destination port (DP) 18
19
The three filters The first Time-out Bloom filter stores the timestamps of all observed SYN packets The CMS stores the number of payload bytes which need to be exported for an established TCP connection The second Time-out Bloom filter stores the point in time because the maximum number of payload bytes was reached or because a FIN or RST packet was observed. 19
20
Traces Used in Evaluation Using two traffic traces a student residence at the University of Twente their research group at the Technische Universit¨at Munich The two traces will be called Twente trace and Munich trace 20
21
Amount of sampled traffic Twente trace, between 476,817 ( N = 1kB) and 512,334 (N = 25kB) packets are sampled Munich trace, between 314,063 and 326,742 TCP packets are sampled 21
22
Empirical Analysis of Sampling Errors The number of hash functions influences the computing cost a small number of hash functions is desired A large number of hash functions reduces the probability of collisions A good trade-off between computational complexity and collision probability found that l = 3 hash functions 22
23
Sampling errors Sampling of payload in the Twente trace 1kB only 30,130 (7.35%) and 30,694 (7.36%) error 25kB only 59,789 (11.6%) and 73,925 (14.4%) error Bloom filters increases the sampling errors by 0.1% to 24%, depending on the traffic 23
24
Evading detection Attacker to circumvent packet selection locate the malicious part of payload beyond the first N bytes of a TCP connection inserting packets with identical addresses and ports but invalid sequence numbers performing a TCP scan in order to poison the values stored sending empty TCP ACK packets To make evasion more difficult can dynamically vary N over time 24
25
25 Conclusion We demonstrate the usability of our sampling strategy for botnet Using of a simplified TCP connection tracking mechanism and Bloom filters to store the required connection states the algorithm can be efficiently implemented in software or hardware It would be possible to develop a similar sampling algorithm for UDP traffic
26
Questions 26
27
1. S. Kong, T. He, X. Shao, and X. Li, “Time-out Bloom Filter: A New Sampling Method for Recording More Flows,” in Proc. of International Conference on Information Networking (ICOIN) 2006, Sendai, Japan, Jan. 2006. 2. G. Cormode and S. Muthukrishnan, “An Improved Data Stream Summary: The Count-Min Sketch and its Applications,” Journal of Algorithms, vol. 55, no. 1, 2005. 27
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.