D 陳怡安, R 解巽評, R 高榮泰
Advisor: Prof. 林永松
Paper: Cristian Estan, George Varghese, Member, IEEE, and Michael Fisk, IEEE/ACM TRANSACTIONS ON NETWORKING, OCTOBER 2006
2015/12/17 Bitmap Algorithms for Counting Active Flows on High-Speed Links
Outline: Introduction, Related Work, Counting Algorithm & Analysis, Measurement Results, Conclusion
This paper presents a family of bitmap algorithms that address the problem of counting the number of distinct header patterns (flows) seen on a high-speed link. The authors' new probabilistic algorithms use little memory and are fast.
- Detect port/IP scans
- Identify DoS attacks
- Estimate the spreading rate of a worm
- Packet scheduling
Counting is especially hard when processing must be done within a packet arrival time.
- Naïve solution: use hash tables (like NetFlow)
- Best known prior algorithm: probabilistic counting
- This paper's approach: bitmaps and probabilistic algorithms
- General purpose: multiresolution bitmap
- A whole family of counting algorithms that further improve performance by taking advantage of particularities of the specific counting application:
  - Adaptive bitmap
  - Triggered bitmap
A flow is defined by an identifier given by the values of certain header fields. Example: define a flow by its source and destination IP addresses.
The problem we wish to solve is counting the number of distinct flow identifiers (flow IDs) seen in a specified measurement interval.
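The problem statement can be made concrete with an exact counter. This is a minimal sketch of the naïve hash-table style baseline, not one of the paper's algorithms; the `Packet` type and its field names are hypothetical. Its memory grows with the number of distinct flows, which is the cost the bitmap algorithms try to avoid.

```python
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet header record used only for this illustration."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def count_flows_exact(packets: Iterable[Packet]) -> int:
    """Count distinct flow IDs, with a flow defined by (source IP, destination IP)."""
    # A set stores every distinct flow ID, so memory is proportional to the flow count.
    return len({(p.src_ip, p.dst_ip) for p in packets})
```

Packets that differ only in their ports belong to the same flow under this definition, so they are counted once.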
An intrusion detection system looking for port scans could count, for each active source address, the flows defined by destination IP and port, and suspect any source IP that opens more than three flows in 12 s of scanning.
- Cost of large memory
- Power consumption
Need solutions that:
1. Use a small amount of memory
2. Have high accuracy
- Flajolet, Martin (1985): probabilistic counting; memory use similar to the multiresolution bitmap
- Whang et al. (1990): introduce the direct bitmap
- You, Chang (1996): use the virtual bitmap
- Duffield, Lund, Thorup (2002)
- Accurate solutions based on counting TCP SYN flags
Active Flow Counting Algorithms
- Direct Bitmap
- Virtual Bitmap
- Multiresolution Bitmap
- Adaptive Bitmap
- Triggered Bitmap
[Figure: direct bitmap; packets of flows shown in different colors (green, blue, violet) are hashed to bits of the bitmap]
- Set bits in the bitmap using the hash of the flow ID of each incoming packet.
- Different flows generally have different hash values.
- Packets from the same flow always hash to the same bit.
- Collisions are acceptable; the estimate compensates for them.
b is the bitmap size.
The probability that a flow hashes to a given bit is $1/b$.
With n active flows, the probability that no flow hashes to a given bit is $p_z = (1 - 1/b)^n \approx e^{-n/b}$.
The expected number of bits not set is $E[z] = b\,(1 - 1/b)^n \approx b\,e^{-n/b}$.
The estimate of the number of active flows is $\hat{n} = b \ln(b/z)$. (1)
Observation: the estimate degrades badly as z approaches 0.
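The direct bitmap and estimator (1), taken here to be $\hat{n} = b \ln(b/z)$ with z the number of unset bits, can be sketched as follows. The hash function (`blake2b`), the class name, and the clamping of z to 1 when the bitmap is full are illustrative assumptions, not details from the slides.

```python
import hashlib
import math


def flow_hash(flow_id: str) -> int:
    """64-bit hash of a flow ID (the hash function is an illustrative choice)."""
    digest = hashlib.blake2b(flow_id.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class DirectBitmap:
    """Bitmap of b bits; each flow ID sets exactly one bit."""

    def __init__(self, b: int):
        self.b = b
        self.bits = [False] * b

    def add(self, flow_id: str) -> None:
        # Packets of the same flow always hit the same bit, so repeats change nothing.
        self.bits[flow_hash(flow_id) % self.b] = True

    def estimate(self) -> float:
        z = self.bits.count(False)
        # Assumption: clamp z to 1 so the logarithm stays finite when the bitmap is full.
        z = max(z, 1)
        return self.b * math.log(self.b / z)
```

With b = 1024 and a few hundred flows the estimate should land within a few percent of the true count; as the bitmap fills up and z approaches 0, the estimate becomes unreliable, which matches the observation above.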
As the flow density n/b increases, the standard deviation of the estimate increases and z decreases.
Var($V_n$) is easy to obtain. A Taylor expansion combined with Var($V_n$) then yields the standard error of the estimate.
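The formulas on this slide did not survive extraction. A sketch of the result the derivation appears to lead to, assuming the standard linear-counting analysis of Whang et al., with $V_n = z/b$ the fraction of unset bits and $\rho = n/b$ the flow density (both symbols are assumptions here), and using $\hat{n} = -b \ln V_n$:

```latex
\begin{align*}
E[V_n] &\approx e^{-\rho}, \\
\mathrm{Var}(V_n) &\approx \frac{e^{-\rho}\left(1 - (1+\rho)\,e^{-\rho}\right)}{b}, \\
\mathrm{Var}(\hat{n}) &\approx \frac{b^2\,\mathrm{Var}(V_n)}{E[V_n]^2} = b\left(e^{\rho} - \rho - 1\right), \\
\mathrm{SD}\!\left(\frac{\hat{n}}{n}\right) &\approx \frac{\sqrt{e^{\rho} - \rho - 1}}{\rho\,\sqrt{b}} .
\end{align*}
```

The third line is the first-order Taylor (delta method) step: the derivative of $-b \ln V$ at $V = E[V_n]$ is $-b / E[V_n]$.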
When the number of flows grows far beyond the expected upper limit, the estimates become inaccurate.
Solution: use more bits.
Problem: memory scales with the number of flows.
Solution (virtual bitmap):
a) store only a portion of the bitmap
b) compute the estimate using a scaling factor
The analysis is similar to that of the direct bitmap.
n: the total number of active flows; m: the number of active flows that hash to the virtual bitmap; α: the sampling factor, i.e. the fraction of the hash space that the virtual bitmap covers.
m follows a binomial distribution with expected value $E[m] = \alpha n$.
We can use (1) to estimate m and obtain n by dividing the result by α.
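A minimal sketch of the virtual bitmap under these definitions: only b bits are stored for a hash space of size h, so the sampling factor is α = b/h, and the estimate is $\hat{n} = b \ln(b/z) / \alpha$. The parameter name `h`, the hash function, and the clamping of z are assumptions made for illustration.

```python
import hashlib
import math


def flow_hash(flow_id: str) -> int:
    """64-bit hash of a flow ID (the hash function is an illustrative choice)."""
    digest = hashlib.blake2b(flow_id.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class VirtualBitmap:
    """Physical bitmap of b bits covering the first b slots of a hash space of size h."""

    def __init__(self, b: int, h: int):
        if not 0 < b <= h:
            raise ValueError("need 0 < b <= h")
        self.b = b
        self.h = h
        self.alpha = b / h  # sampling factor
        self.bits = [False] * b

    def add(self, flow_id: str) -> None:
        slot = flow_hash(flow_id) % self.h
        if slot < self.b:  # flows hashing outside the stored portion are ignored
            self.bits[slot] = True

    def estimate(self) -> float:
        # Assumption: clamp z to 1 so the logarithm stays finite when the bitmap is full.
        z = max(self.bits.count(False), 1)
        m_hat = self.b * math.log(self.b / z)  # equation (1) applied to the sampled flows
        return m_hat / self.alpha
```

For example, `VirtualBitmap(b=1024, h=8192)` samples roughly one flow in eight, so it stays usable for flow counts that would saturate a 1024-bit direct bitmap, at the price of sampling error when few flows are active.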
The resulting error differs slightly from the one obtained for the direct bitmap.
Problem: the estimate is inaccurate when few flows are active.
Solution: use many bitmaps, each accurate for a different range.
Use the bitmap whose range matches the current traffic to estimate the number of flows.
Problem: must update up to three bitmaps per packet.
Solution: combine the bitmaps into one with OR (the multiresolution bitmap).
Select as the base component the coarsest component that has no more than setmax bits set. Add up the estimates for the base component and the components finer than it, then multiply by the scaling factor.
In short:
1. Find the most accurate component.
2. Estimate the number of flows hashing to it.
3. Apply the scaling factor.
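The estimation steps can be sketched as follows. This is a simplified illustration rather than the paper's exact algorithm: it assumes c components of b bits each (including the last one), where component i covers a fraction (k-1)/k^(i+1) of the hash space and the last component covers the remaining 1/k^(c-1); the class name, the hash function, and passing `rho_max` explicitly are all assumptions.

```python
import hashlib
import math


class MultiresolutionBitmap:
    """c components of b bits; each packet updates exactly one bit in one component."""

    def __init__(self, b: int, c: int, k: int, rho_max: float):
        self.b, self.c, self.k = b, c, k
        self.setmax = b * (1 - math.exp(-rho_max))  # fill level above which a component is skipped
        self.components = [[False] * b for _ in range(c)]

    def _hashes(self, flow_id: str):
        digest = hashlib.blake2b(flow_id.encode(), digest_size=16).digest()
        u = int.from_bytes(digest[:8], "big") / 2**64     # selects the component
        pos = int.from_bytes(digest[8:], "big") % self.b  # selects the bit inside it
        return u, pos

    def add(self, flow_id: str) -> None:
        u, pos = self._hashes(flow_id)
        i = 0
        # Component i (i < c-1) covers u in [k^-(i+1), k^-i); the last covers [0, k^-(c-1)).
        while i < self.c - 1 and u < self.k ** -(i + 1):
            i += 1
        self.components[i][pos] = True

    def estimate(self) -> float:
        # Base component: the coarsest one that is not too full.
        base = 0
        while base < self.c - 1 and sum(self.components[base]) > self.setmax:
            base += 1
        # Sum the direct-bitmap estimates of the base and all finer components.
        m_hat = 0.0
        for comp in self.components[base:]:
            z = max(comp.count(False), 1)  # assumption: clamp to keep the log finite
            m_hat += self.b * math.log(self.b / z)
        # Components base..c-1 together cover 1/k^base of the hash space.
        return self.k ** base * m_hat
```

With b = 512, c = 6, and k = 2, the same structure should give usable estimates for both about a hundred flows (base component 0, no scaling) and tens of thousands of flows (a finer base component, scaled up by a power of k).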
Any component can serve as the base component. If the error of a component is too large, switch to a finer component as the base component.
[Figure: direct bitmap, virtual bitmap, multiresolution bitmap]
Adaptive bitmap: combines the accuracy of a well-tuned virtual bitmap with the wide range of a multiresolution bitmap.
A small multiresolution bitmap estimates the order of magnitude of the number of active flows, and a large virtual bitmap counts them precisely.
The resolution of the virtual bitmap can be adjusted.
Two updates per packet? Instead, replace r adjacent components of the multiresolution bitmap with the virtual bitmap.
When the number of flows is large, replace the high-resolution components. When the number of flows is small, replace the low-resolution components.
When the number of flows is small: replace the high-resolution components with the virtual bitmap.
When the number of flows is large: replace the lower-resolution components with the virtual bitmap.
For port scan detection, should we keep a multiresolution bitmap for every active source? Each such bitmap has to be able to handle a large number of flows, yet most traffic is not a port scan, so most of that memory is wasted.
Triggered bitmap: a small direct bitmap plus a large multiresolution bitmap.
A small direct bitmap counts the active flows from a given source. Once the count exceeds the threshold, a large multiresolution bitmap is allocated for this source.
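The allocation idea can be sketched as below. Several details are assumptions made only to keep the example short and self-consistent, not the paper's method: a larger direct bitmap stands in for the multiresolution bitmap, the trigger fires when `trigger` bits of the small bitmap are set, after the trigger only flows that hash to unset bits of the small bitmap are recorded in the large one, and the large estimate is scaled by d/(d - g) for d small bits and g set bits. All names are hypothetical.

```python
import hashlib
import math


def _hash(flow_id: str, salt: bytes) -> int:
    """Salted 64-bit hash so the small and large bitmaps use independent hashes."""
    digest = hashlib.blake2b(flow_id.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


class TriggeredCounter:
    def __init__(self, small_bits: int = 4, trigger: int = 2, large_bits: int = 1024):
        if not 0 < trigger < small_bits:
            raise ValueError("need 0 < trigger < small_bits")
        self.small_bits, self.trigger, self.large_bits = small_bits, trigger, large_bits
        self.small = {}  # source -> small direct bitmap (always present)
        self.large = {}  # source -> large bitmap, allocated only after the trigger

    def add(self, source: str, flow_id: str) -> None:
        small = self.small.setdefault(source, [False] * self.small_bits)
        j = _hash(flow_id, b"small") % self.small_bits
        if source not in self.large:
            small[j] = True
            if sum(small) >= self.trigger:  # trigger reached: allocate the large bitmap
                self.large[source] = [False] * self.large_bits
        elif not small[j]:
            # Flows hitting an already-set small bit are treated as already counted.
            self.large[source][_hash(flow_id, b"large") % self.large_bits] = True

    def estimate(self, source: str) -> float:
        small = self.small.get(source)
        if small is None:
            return 0.0
        if source not in self.large:
            z = small.count(False)  # at least one bit is unset before the trigger
            return self.small_bits * math.log(self.small_bits / z)
        big = self.large[source]
        z = max(big.count(False), 1)
        m_hat = self.large_bits * math.log(self.large_bits / z)
        # The large bitmap only sees flows hashing to the unset small bits.
        scaled = m_hat * self.small_bits / (self.small_bits - self.trigger)
        return max(float(self.trigger), scaled)  # at least `trigger` flows were seen
```

A source that keeps sending packets of a single flow never triggers an allocation and costs only the few bits of the small bitmap, while a scanner opening many flows gets the large bitmap after its first few connections.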
N: the maximum number of flows we plan to measure
Sweet spot: the optimal flow density is ρ = 1.594, at which the fraction of unset bits z/b is 20.3%.
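These two numbers can be checked numerically under one assumption that the surviving slide text does not state: that the relative error of the virtual bitmap scales as $\sqrt{e^{\rho} - 1} / (\rho \sqrt{b})$. Minimizing over ρ gives the condition $\rho e^{\rho} = 2(e^{\rho} - 1)$, which the sketch below solves by bisection; the function names are hypothetical.

```python
import math


def relative_error_factor(rho: float) -> float:
    """Assumed error shape sqrt(e^rho - 1) / rho (the 1/sqrt(b) factor is omitted)."""
    return math.sqrt(math.exp(rho) - 1) / rho


def optimal_density(lo: float = 1.0, hi: float = 2.0, tol: float = 1e-9) -> float:
    """Solve rho * e^rho = 2 * (e^rho - 1) on [lo, hi] by bisection."""
    def f(r: float) -> float:
        return r * math.exp(r) - 2 * (math.exp(r) - 1)

    # f is negative at 1.0, positive at 2.0, and increasing in between.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


rho_opt = optimal_density()
zero_fraction = math.exp(-rho_opt)  # expected fraction of unset bits at that density
```

Under this assumption `rho_opt` comes out at about 1.594 and `zero_fraction` at about 0.203, in agreement with the values on the slide.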
Parameters: b, setmax, c, k
$b = f(k)/\epsilon^2$
$\mathit{setmax} = b\,(1 - e^{-\rho_{max}})$
$c = 2 + \log_k\!\left(N/(\rho_{max}\, b)\right)$ (N is the maximum number of flows we want to measure)
$f(k)/\ln(k)$ is an indicator of memory usage.
Measurement
- Data: packet traces (IP headers over a link)
- Measurement interval: 5 s
- Flow definition: 5-tuple of source and destination IP addresses, ports, and protocol
Virtual Bitmap
- Low density: sampling error dominates
- High density: collision error dominates
Virtual Bitmap (cont.): comparison
A counting method tailored to a specific problem such as threshold detection can significantly outperform a one-size-fits-all technique like probabilistic counting.
Multiresolution Bitmap
- Configured for an average error of 10%
- Configured for average errors of 3% and 1%
Adaptive Bitmap: comparison
The adaptive bitmap can achieve almost the same benefits as the virtual bitmap when the number of flows does not vary dramatically.
Figure annotations: overestimating; three times more accurate.
Triggered Bitmap: comparison
The algorithm reported 84.6% of the sources with four connections, 98.1% of those with five, and all (100%) of the sources that had at least eight connections.
Five times less memory.
Triggered Bitmap (cont.)
There is a trade-off between using significantly less memory and possibly missing port scanners. However, the probability that a port scanner goes undetected decreases exponentially with the number of connections it opens. For example, the probability is 1.87% at five connections, 0.23% at six, 0.03% at seven, and so on.
Triggered Bitmap (cont.)
Port scans frequently touch not just a handful of addresses, but an entire block of contiguous addresses.
Our algorithms reduce the memory usage by as much as an order of magnitude:
- Count more sources at a time
- Detect stealthy slow scans: counting sources with longer inter-arrival times
Conclusion
- The algorithms solve the flow counting problem using extremely small amounts of memory while producing satisfactory accuracy.
- Customizable counting algorithm for applications :