Pipelined van Emde Boas Tree: Algorithms, Analysis, and Applications
Hao Wang and Bill Lin
University of California, San Diego
Introduction
- Priority queues are used in many network applications:
  - Per-flow weighted fair queueing
  - Management of per-flow packet buffers using DRAM
  - Maintenance of exact statistics counters for real-time network measurement
- Items in the priority queue are kept sorted at all times (e.g., smallest key first)
- Challenge: need to operate at high speeds (e.g., 10+ Gb/s)
Introduction
- The binary heap is a common data structure for priority queues
  - O(lg n) time per operation, where n is the number of items
  - In fine-grained per-flow weighted fair queueing, n can be very large (e.g., 1 million)
  - O(lg n) may be too slow for high line rates
- Pipelined heaps [Bhagwan, Lin 2000] [Ioannou, Katevenis 2001]
  - Reduce the time per operation to O(1)
  - At the expense of O(lg n) pipeline stages
This Talk
- Instead of pipelining binary heaps, we present a new approach based on pipelining van Emde Boas trees
- van Emde Boas (vEB) trees were introduced in 1975
  - Instead of maintaining a priority queue of sorted items, maintain a sorted dictionary of keys
  - In many applications, keys are w-bit integers, so the possible keys come from a fixed universe of u = 2^w values
  - Only O(lg lg u) time per operation vs. O(lg n) for heaps
- Main result: a pipelined vEB tree with O(1) time per operation and O(lg lg u) pipeline stages
van Emde Boas (vEB) Trees
- Goal: maintain a sorted subset S of the universe U = {0, 1, …, u − 1} of size u = 2^w, subject to INSERT, DELETE, EXTRACTMIN, SUCCESSOR, and PREDECESSOR
- e.g., INSERT inserts a new item into the queue; EXTRACTMIN removes the item with the smallest key
vEB Trees
- Whether the speedup matters depends on how the universe size u compares to n, which varies by application
- If u is only polynomial in n, i.e. u = O(n^c), then O(lg lg u) = O(lg lg n): an exponential speedup over O(lg n)
- Examples: per-flow fair queues with w = 24-bit keys and on the order of 4 × 10^6 flows (u ≈ n): roughly 5 levels for O(lg lg u) vs. roughly 20 for O(lg n); per-flow statistics counters (10^6 counters organized into 512 groups, 8-bit keys): roughly 3 levels vs. 8
vEB Trees
- Conceptually, think of the universe of w-bit keys, U = {0, 1, …, 2^w − 1}, as a binary tree of height w
- Split it into a top part H over the high-order w/2 bits, and 2^(w/2) bottom sub-trees L[0], L[1], …, L[2^(w/2) − 1], each over the low-order w/2 bits
- H and L[0], …, L[2^(w/2) − 1] are themselves recursively defined as vEB trees over w/2-bit keys
[Figure: top tree H (w/2 bits) above the bottom sub-trees L[0], L[1], …, L[2^(w/2) − 1] (w/2 bits each)]
vEB Trees
- Suppose w = 8 bits, and consider e.g. the key x = 31 = 00011111
- Split x into a high part x_h = 0001 (top w/2 bits) and a low part x_l = 1111 (bottom w/2 bits)
- Consider INSERT(x, S):
  - If x_h is already in H, recursively call INSERT(x_l, L[x_h])
  - Else, recursively call INSERT(x_h, H) and INSERT(x_l, L[x_h])
  - The 2nd recursion can be avoided by storing min[S]: inserting into the empty sub-tree L[x_h] only sets its minimum
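The key split is just bit manipulation. A minimal Python sketch (the helper name split and the width constant W are ours, for illustration only):

```python
W = 8  # key width in bits, as in the w = 8 example

def split(x, w):
    """Split a w-bit key into its high and low w/2-bit halves."""
    half = w // 2
    x_h = x >> half               # top w/2 bits
    x_l = x & ((1 << half) - 1)   # bottom w/2 bits
    return x_h, x_l

print(split(31, W))  # (1, 15): 00011111 -> x_h = 0001, x_l = 1111
```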
Representing vEB Trees
- min[S]: minimum key in S
- n[S]: number of elements in S
- H, L[0], …, L[2^(w/2) − 1]: just pointers to the corresponding vEB sub-trees
[Figure: a node storing min[S] and n[S], plus pointers to H, L[0], L[1], …, L[2^(w/2) − 1]]
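A minimal Python sketch of this node layout, assuming lazily created children (a software convenience; the field names follow the slide, the class name VEBNode is ours):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class VEBNode:
    """One vEB (sub-)tree over w-bit keys: min[S], n[S], and child pointers."""
    w: int                                   # key width of this universe
    min: Optional[int] = None                # min[S]; None when S is empty
    n: int = 0                               # n[S], number of elements
    H: Optional["VEBNode"] = None            # top tree over the high w/2 bits
    L: Dict[int, "VEBNode"] = field(default_factory=dict)  # bottom sub-trees
```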
INSERT Operation
INSERT(x, S) outline:
- If S is empty, set min[S] to x
- If x is smaller than min[S], swap them
- Increment the size n[S]
- Make a recursive call into either L[x_h] or H
Only one recursive call per level
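A hedged sketch of INSERT following this outline, on top of the VEBNode sketch above (the base case at w = 1 and the duplicate handling are our simplifications, not spelled out in the slides):

```python
def insert(x, S):
    """Sketch of vEB INSERT: constant work per level, one recursive call."""
    if S.min is None:                 # S empty: just record x as the minimum
        S.min, S.n = x, 1
        return
    if x == S.min:                    # duplicate of the cached minimum: no-op
        return
    if x < S.min:                     # keep the smallest key in min[S]
        x, S.min = S.min, x           # swap, then push the old minimum down
    S.n += 1
    if S.w == 1:                      # base case: min[S] and n[S] describe {0, 1}
        return
    half = S.w // 2
    x_h, x_l = x >> half, x & ((1 << half) - 1)
    if S.H is None:
        S.H = VEBNode(w=half)
    if x_h in S.L and S.L[x_h].min is not None:
        insert(x_l, S.L[x_h])         # high part already present: recurse into L[x_h]
    else:
        child = S.L.setdefault(x_h, VEBNode(w=half))
        child.min, child.n = x_l, 1   # empty sub-tree: set its min directly, O(1)
        insert(x_h, S.H)              # the single recursion goes into H instead
```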
INSERT Operation: Example
- Suppose w = 8 and x = 00011111, so x_h = 0001 and x_l = 1111
- At the root (w = 8): x_h = 0001 is already in H, so recursively call INSERT(x_l, L[x_h]) on the 4-bit sub-tree L[1]; otherwise we would call INSERT(x_h, H) and set min(L[x_h]) ← x_l
- Inside L[1] (now w = 4, x = 1111): split again and recurse the same way into a 2-bit sub-tree
- Overall O(lg w) = O(lg lg u) time
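Running the insert sketch above on keys that share the high half 0001 reproduces this walkthrough (the earlier keys 16 and 19 are our choice for illustration):

```python
S = VEBNode(w=8)
for key in (0b00010000, 0b00010011, 0b00011111):  # 16, 19, 31: all have x_h = 0001
    insert(key, S)

print(S.min, S.n)             # 16 3 -> smallest key stays cached at the root
print(S.H.min)                # 1    -> high half 0001 recorded in H
print(S.L[1].min, S.L[1].n)   # 3 2  -> low halves of 19 and 31 live in the 4-bit tree L[1]
print(S.L[1].L[3].min)        # 3    -> the recursion continued into a 2-bit sub-tree
```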
EXTRACTMIN Operation
EXTRACTMIN(S) outline:
- If S is empty, there is nothing to return
- Set the return value to the current min[S]
- Decrement the size n[S]
- Make a recursive call into either H or L[m_h] to find and remove the new minimum
- Set the new min[S] and return the old value
Again, only one recursive call per level
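A hedged sketch of EXTRACTMIN along these lines, consistent with the insert sketch above (the base case and the bookkeeping for emptied sub-trees are our reconstruction):

```python
def extract_min(S):
    """Sketch of vEB EXTRACTMIN: return and remove the smallest key, or None."""
    if S.min is None:                    # S empty: nothing to return
        return None
    old_min = S.min                      # return value
    S.n -= 1
    if S.w == 1:                         # base case: the only other element can be 1
        S.min = 1 if S.n == 1 else None
        return old_min
    if S.H is None or S.H.min is None:   # nothing stored below the cached minimum
        S.min = None
        return old_min
    half = S.w // 2
    m_h = S.H.min                        # high half of the new minimum
    m_l = S.L[m_h].min                   # its low half is cached in L[m_h]
    S.min = (m_h << half) | m_l          # promote it to min[S]
    if S.L[m_h].n == 1:                  # L[m_h] becomes empty:
        S.L[m_h].min, S.L[m_h].n = None, 0
        extract_min(S.H)                 # recurse into H to remove m_h
    else:
        extract_min(S.L[m_h])            # otherwise recurse into L[m_h]
    return old_min
```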
Basic Idea of Pipelining
- vEB operations are recursively defined
- Each operation makes only one recursive call at each level of recursion
- Each call descends to a sub-tree whose universe is defined by w/2 bits (then w/4 bits, w/8 bits, etc.)
- Therefore the recursion can intuitively be unrolled into lg w = lg lg u steps
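To make the unrolling concrete, here is a hedged iterative version of the insert sketch: each loop iteration does the constant work of one level, which is exactly what one pipeline stage would perform (the name insert_unrolled is ours):

```python
def insert_unrolled(x, S):
    """Iterative INSERT: one loop iteration == one level == one pipeline stage."""
    node = S
    while True:
        if node.min is None:               # empty (sub-)tree: record x and stop
            node.min, node.n = x, 1
            return
        if x == node.min:                  # duplicate of the cached minimum
            return
        if x < node.min:                   # keep the smaller key cached here
            x, node.min = node.min, x
        node.n += 1
        if node.w == 1:                    # base case reached
            return
        half = node.w // 2
        x_h, x_l = x >> half, x & ((1 << half) - 1)
        if node.H is None:
            node.H = VEBNode(w=half)
        if x_h in node.L and node.L[x_h].min is not None:
            node, x = node.L[x_h], x_l     # descend into L[x_h]: next stage's work
        else:
            child = node.L.setdefault(x_h, VEBNode(w=half))
            child.min, child.n = x_l, 1    # O(1) update at this level
            node, x = node.H, x_h          # descend into H: next stage's work
```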
Pipeline Structure
- Operations enter as (op, arg) pairs at stage 1; the current minimum is available at the output
- Stage i holds (op_i, arg_i) registers and the nodes (min, n, H, L[0], L[1], …) at that level of the recursion
- M1 = pointers to all sub-trees of width w/2, M2 = pointers to all sub-trees of width w/4, M3 = width w/8, M4 = width w/16, and so on
[Figure: four pipeline stages, each with op/arg registers and a node memory M1–M4]
Main Issues
- Want to initiate a new operation at every pipeline cycle (e.g., an INSERT or an EXTRACTMIN)
- Even with O(1) throughput, we don't want to wait lg lg u pipeline cycles to retrieve data (e.g., EXTRACTMIN should return its data in the same pipeline cycle)
- But previous INSERT and EXTRACTMIN operations may still be in the pipeline
  - Two pipeline stages may need to access the same data
  - Need to resolve memory access ordering
Basic Idea
- Operations proceed from the top stage to the bottom stage
- min[S] for the vEB tree rooted at each level is updated immediately as operations flow through the pipeline
- e.g., an INSERT at stage 1 updates min[S] immediately, so the next EXTRACTMIN arriving at stage 1 retrieves the correct min[S]
Other Operations
- In addition to INSERT and EXTRACTMIN, the paper describes:
  - SUCCESSOR(x, S): returns the next element in S larger than x in the universe U (sketched below)
  - PREDECESSOR(x, S): returns the next element in S smaller than x in the universe U
  - EXTRACTMAX(S): removes the item with the largest key
- These can also be defined in a similar pipelined fashion
- To facilitate these operations, we also:
  - Store max[S] for the corresponding S at each pipeline stage
  - Provide another pipeline memory structure for tracking "in-flight" SUCCESSOR and PREDECESSOR operations
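As an illustration of SUCCESSOR's recursive structure (not the paper's pipelined version), a hedged sketch on top of the VEBNode/insert sketches above; for simplicity it recomputes the sub-tree maximum on the fly, whereas the paper stores max[S] at each stage precisely so that one recursive call per level suffices:

```python
def tree_max(S):
    """Maximum of a non-empty vEB sketch (recomputed here; the paper caches max[S])."""
    if S.H is None or S.H.min is None:        # nothing stored below the cached min
        return 1 if (S.w == 1 and S.n >= 2) else S.min
    half = S.w // 2
    h = tree_max(S.H)                         # largest occupied high half
    return (h << half) | tree_max(S.L[h])     # combined with that sub-tree's max

def successor(x, S):
    """Smallest element of S strictly greater than x, or None (hedged sketch)."""
    if S.min is None:
        return None
    if x < S.min:                             # the cached minimum already answers it
        return S.min
    if S.w == 1:                              # base case: only 1 can still follow x
        return 1 if (x == 0 and S.min == 0 and S.n >= 2) else None
    if S.H is None or S.H.min is None:        # nothing stored below the cached min
        return None
    half = S.w // 2
    x_h, x_l = x >> half, x & ((1 << half) - 1)
    sub = S.L.get(x_h)
    if sub is not None and sub.min is not None and x_l < tree_max(sub):
        return (x_h << half) | successor(x_l, sub)   # answer lies inside L[x_h]
    h = successor(x_h, S.H)                   # otherwise find the next occupied high half
    if h is None:
        return None
    return (h << half) | S.L[h].min           # its smallest low half completes the key
```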
Conclusions
- Pipelined van Emde Boas trees achieve O(1) time per operation with O(lg lg u) pipeline stages
- Same O(1) time complexity as pipelined heaps, but exponentially fewer pipeline stages than the O(lg n) required by pipelined heaps when u is polynomial in n
- Can simultaneously support EXTRACTMIN and EXTRACTMAX, which is harder to do with heaps
- Can support other operations such as SUCCESSOR and PREDECESSOR, which have potential applications to various network problems
Thank You