Published by Garry Eaton. Modified over 8 years ago.
1
Optimal XOR Hashing for a Linearly Distributed Address Lookup in Computer Networks
Christopher Martinez, Wei-Ming Lin, Parimal Patel
The University of Texas at San Antonio
October 28, 2005
2
Outline
Motivation
Hashing Background
Linear Distribution
Optimal Hashing
Simulation
Conclusion
3
Motivation
All network applications require some searching
Switches, routers, and intrusion detection systems must search for IP addresses or subnet IDs
Searching should be based on the distribution of the records in the database
For computer networks, searching needs to be real-time
4
Motivation (cont.)
A capture of network traffic shows a non-uniform distribution of class C IP addresses
Because the IP addresses entering the network are non-uniformly distributed, searching should take this into account
5
Hashing Background
Straightforward sequential searching is impractical for large databases
Hashing divides the database into small subsets
Searching a subset reduces search time
Predictable search time is needed for real-time applications
6
Hashing Background
Hashing algorithms are well researched; we aim to provide new insight based on the probability distribution of the keys
This work is not concerned with collisions: each hash key is assumed to have the same number of collisions, held in a linked list
Hashing informed by the probability distribution should limit the average number of searches in the linked list
7
Hashing: Non-uniform Distribution
8
Linear Distribution
From our captured network traffic, we can approximate the non-uniform distribution by a linear probability distribution function
9
XOR Hashing for Linear Distribution
We wanted a straightforward hashing scheme that can be used for any database size and hashing space
Define the hashing function as $P = (g_{m-1}, g_{m-2}, \ldots, g_0)$
Hashing functions are compared against each other by the value δ
δ measures how close to uniform the hashed distribution is
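To make the notation concrete, here is a minimal sketch of an XOR-folding hash driven by P. The function name `xor_hash` and the choice to assign contiguous bits to each group, starting from the most significant end, are assumptions made for illustration; the slides do not state which address bits belong to which group.

```python
def xor_hash(addr, n, groups):
    """Fold an n-bit address into len(groups) output bits.

    groups is P = (g_{m-1}, ..., g_0): g_i input bits are XORed together
    to form output bit i. Assumption (not from the slides): each group
    takes contiguous bits, starting from the most significant end.
    """
    if sum(groups) != n or min(groups) < 1:
        raise ValueError("group sizes must be positive and sum to n")
    out = 0
    shift = n
    for g in groups:
        shift -= g
        chunk = (addr >> shift) & ((1 << g) - 1)   # the g bits of this group
        parity = bin(chunk).count("1") & 1          # XOR of those bits
        out = (out << 1) | parity
    return out
```

Under this bit assignment, with P = (2,2) the 4-bit address 1101 hashes to 01: the top pair 11 XORs to 0 and the bottom pair 01 XORs to 1.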
10
XOR Hashing for Linear Distribution
4-bit to 2-bit Example: P = (2,2)
11
XOR Hashing for Linear Distribution
4-bit to 2-bit Example: P = (3,1)
12
XOR Hashing for Linear Distribution
4-bit to 2-bit Example: P = (1,3)
13
XOR Hashing Observations
$g_i > 1$: leads to equal partitioning
$g_i = 1$: leads to unequal partitioning
δ: the difference between the highest hash distribution density and the mean
To find δ, we need to determine the highest final hash distribution density
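These observations can be checked by brute force on the 4-bit to 2-bit examples. The sketch below assumes a linear distribution in which address x has weight x + 1, and assumes groups take contiguous bits from the most significant end; neither detail is given in the slides, so the resulting δ values illustrate the trend rather than reproduce the authors' figures. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def xor_hash(addr, n, groups):
    # Assumption: contiguous bit groups, most significant group first.
    out, shift = 0, n
    for g in groups:
        shift -= g
        chunk = (addr >> shift) & ((1 << g) - 1)
        out = (out << 1) | (bin(chunk).count("1") & 1)
    return out

def linear_pdf(n):
    # Assumption: probability grows linearly with the address value.
    weights = [x + 1 for x in range(1 << n)]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

def delta(n, groups):
    """Highest bin density after hashing minus the uniform mean 1/2^m."""
    m = len(groups)
    bins = [Fraction(0)] * (1 << m)
    for addr, p in enumerate(linear_pdf(n)):
        bins[xor_hash(addr, n, groups)] += p
    return max(bins) - Fraction(1, 1 << m)
```

Under these assumptions, `delta(4, (2, 2))` is exactly 0, while `delta(4, (3, 1))` is 1/68 and `delta(4, (1, 3))` is 2/17: any group of size 1 leaves a residual imbalance, and the imbalance is larger when the lone bit is the more significant one.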
14
Optimal XOR Hashing for Linear Distribution
Hashing consists of m steps (from step m-1 to step 0)
$p_i$: highest density value after step i
Derive $p_i$ from $p_{i+1}$ at step i
$p_m = A = 1/2^n$ (original mean before hashing)
$\delta = p_0 - 1/2^m$
15
Optimal XOR Hashing for Linear Distribution
16
δ vs. P for Linear Distribution
The optimal solution is obtained when every group XORs more than 1 bit
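For small n and m, the claim can be checked by enumerating every grouping P and keeping the one with the smallest δ. This is an exhaustive-search sketch, not the closed-form derivation from the slides; the helper names, the x + 1 linear weights, and the contiguous bit assignment are all assumptions.

```python
from fractions import Fraction

def xor_hash(addr, n, groups):
    # Assumption: contiguous bit groups, most significant group first.
    out, shift = 0, n
    for g in groups:
        shift -= g
        chunk = (addr >> shift) & ((1 << g) - 1)
        out = (out << 1) | (bin(chunk).count("1") & 1)
    return out

def delta(n, groups):
    # Assumption: address x has linear weight x + 1.
    size, m = 1 << n, len(groups)
    total = size * (size + 1) // 2
    bins = [Fraction(0)] * (1 << m)
    for addr in range(size):
        bins[xor_hash(addr, n, groups)] += Fraction(addr + 1, total)
    return max(bins) - Fraction(1, 1 << m)

def compositions(n, m):
    """Yield every way to split n bits into m ordered, non-empty groups."""
    if m == 1:
        yield (n,)
        return
    for first in range(1, n - m + 2):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest

def best_grouping(n, m):
    """Return (P, delta) for the grouping with the smallest delta."""
    return min(((p, delta(n, p)) for p in compositions(n, m)),
               key=lambda item: item[1])
```

For 6-bit addresses hashed to 3 bits, the only grouping with every group larger than 1 is (2,2,2), and under these assumptions it is the one the search returns, with δ equal to 0.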
17
Simulation
Goal: demonstrate that a lower δ leads to better search performance
Hashing: map from $2^n$ to $2^m$
Each simulation performs $2^m$ hash lookups
18
Simulation
Three performance measurements:
Number of Empty Bins (NEB)
Average maximum Search Length (ASL)
Maximum Search Length (MSL)
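A rough sketch of one such simulation run follows. The slides do not define the three measurements precisely, so the definitions here are assumptions: NEB counts bins that receive no lookup, MSL is the largest number of lookups landing in one bin, and ASL is approximated as the mean number of lookups per non-empty bin. The x + 1 linear weights and the contiguous bit grouping are also assumed.

```python
import random
from collections import Counter

def xor_hash(addr, n, groups):
    # Assumption: contiguous bit groups, most significant group first.
    out, shift = 0, n
    for g in groups:
        shift -= g
        chunk = (addr >> shift) & ((1 << g) - 1)
        out = (out << 1) | (bin(chunk).count("1") & 1)
    return out

def simulate(n, groups, rng):
    """Hash 2^m linearly distributed addresses; return (NEB, ASL, MSL)."""
    num_bins = 1 << len(groups)
    size = 1 << n
    weights = [x + 1 for x in range(size)]          # assumed linear weights
    addrs = rng.choices(range(size), weights=weights, k=num_bins)
    load = Counter(xor_hash(a, n, groups) for a in addrs)
    neb = num_bins - len(load)                      # bins never hit
    msl = max(load.values())                        # longest chain
    asl = num_bins / len(load)                      # mean non-empty chain length
    return neb, asl, msl
```

Calling `simulate(8, (2, 2, 2, 2), random.Random(0))` and `simulate(8, (1, 1, 1, 5), random.Random(0))` over many seeds and averaging the results is one way to compare a low-δ grouping against a high-δ one.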
19
Simulation
Improvement of the best δ over the worst δ:
NEB: 18%
ASL: 12%
MSL: 17%
20
Simulation
21
Future Work
Find the optimal XOR hashing for exponential and partially linear distributions
Investigate in more depth which applications exhibit a linear distribution
Measure the performance gain of using this hashing scheme in an intrusion detection system
22
Conclusion
Network applications exhibit non-uniform distributions, which makes known search techniques less than optimal
Linear distributions can benefit from the XOR folding property
The optimal XOR grouping can be easily identified to minimize the error in the hashing distribution
The theory for the linear case can be applied to other non-uniform distributions