Published by Jesse Cook. Modified over 9 years ago.
1
Range Queries in Non-blocking k-ary Search Trees ● Trevor Brown ● Hillel Avni
2
Problem Statement ● Want to store keys in a dynamic data structure supporting insertion, deletion, and: ● RangeQuery(a, b): returns all keys of the data structure in range [a, b]
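As a point of reference, the intended result of RangeQuery(a, b) can be written down sequentially. The sketch below is not the concurrent algorithm from the talk; it only illustrates the expected return value, using `java.util.TreeSet` as a stand-in for the data structure. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RangeQuerySpec {
    // Sequential reference semantics: all stored keys k with a <= k <= b, ascending.
    static List<Integer> rangeQuery(TreeSet<Integer> keys, int a, int b) {
        if (a > b) {
            return new ArrayList<>();          // empty range; subSet would throw here
        }
        return new ArrayList<>(keys.subSet(a, true, b, true));
    }

    public static void main(String[] args) {
        TreeSet<Integer> keys = new TreeSet<>(List.of(1, 3, 4, 8, 13, 25));
        keys.add(9);                            // insertion
        keys.remove(1);                         // deletion
        System.out.println(rangeQuery(keys, 3, 14));
    }
}
```

A concurrent implementation is expected to return what this sequential version would return at some single point in time during the call.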
3
Previous Solutions ● Software transactional memory ● Locks ● Persistence [Figure: Insert(c) on a persistent tree with nodes a, b, d, e, f and a root pointer]
4
k-ary Search Tree (k-ST) ● Add or remove keys by replacing node(s) ● Related to persistent data structures [Brown, Helga]
5
The Range Query Algorithm RangeQuery(a, b): – Traverse the tree, skipping sub-trees which cannot contain a key in [a, b] – During this traversal, save a pointer to each leaf that contains a key in [a, b] – […] – Problem: how to efficiently tell if a key was added or removed during this traversal?
6
Extending the Data Structure ● Add a dirty bit to each leaf ● Each leaf has its dirty bit set just before it is replaced – Consequence: If a leaf's dirty bit is not set, then it has not been replaced
7
The Range Query Algorithm RangeQuery(a, b): – Traverse the tree, skipping sub-trees which cannot contain a key in [a, b] – During this traversal, save a pointer to each leaf that contains a key in [a, b] – After this traversal, check the dirty bits of these leaves, one by one – If no dirty bit is set, then return “the result” – Otherwise, retry ● Reading dirty bit is far faster than re-traversing
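The traverse, validate, retry loop above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the class names, the `replaceLeaf` helper, and the node layout are assumptions, the root is never replaced, and the coordination and helping that make real k-ST updates non-blocking are omitted. It also saves every leaf whose key interval may intersect [a, b], which is a slight over-approximation of "contains a key in [a, b]".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReferenceArray;

public class DirtyBitRangeQuery {
    static abstract class Node {}

    // Leaves are immutable apart from the dirty bit; updates replace them wholesale.
    static final class Leaf extends Node {
        final int[] keys;            // sorted
        volatile boolean dirty;      // set just before the leaf is replaced
        Leaf(int... keys) { this.keys = keys; }
    }

    // Child i holds keys < routing[i] (and >= routing[i-1]); the last child holds the rest.
    static final class Internal extends Node {
        final int[] routing;
        final AtomicReferenceArray<Node> children;
        Internal(int[] routing, Node... kids) {
            this.routing = routing;
            this.children = new AtomicReferenceArray<>(kids);
        }
    }

    // Hypothetical update step: mark the old leaf dirty, then swing the child pointer.
    static boolean replaceLeaf(Internal parent, int slot, Leaf oldLeaf, Leaf newLeaf) {
        oldLeaf.dirty = true;
        return parent.children.compareAndSet(slot, oldLeaf, newLeaf);
    }

    // Save pointers to leaves that may hold keys in [a, b], skipping other sub-trees.
    static void collect(Node n, int a, int b, List<Leaf> out) {
        if (n instanceof Leaf) { out.add((Leaf) n); return; }
        Internal in = (Internal) n;
        for (int i = 0; i <= in.routing.length; i++) {
            boolean allBelowA = i < in.routing.length && in.routing[i] <= a;
            boolean allAboveB = i > 0 && in.routing[i - 1] > b;
            if (!allBelowA && !allAboveB) collect(in.children.get(i), a, b, out);
        }
    }

    static List<Integer> rangeQuery(Internal root, int a, int b) {
        while (true) {
            List<Leaf> saved = new ArrayList<>();
            collect(root, a, b, saved);
            boolean clean = true;
            for (Leaf l : saved) {               // validation: read dirty bits one by one
                if (l.dirty) { clean = false; break; }
            }
            if (!clean) continue;                // a saved leaf was replaced: retry
            List<Integer> result = new ArrayList<>();
            for (Leaf l : saved)
                for (int k : l.keys)
                    if (a <= k && k <= b) result.add(k);
            return result;
        }
    }

    public static void main(String[] args) {
        Leaf left = new Leaf(2, 4);
        Internal root = new Internal(new int[] {8, 13}, left, new Leaf(8, 9, 12), new Leaf(13, 15));
        System.out.println(rangeQuery(root, 3, 14));
        replaceLeaf(root, 0, left, new Leaf(2, 3, 4));   // plays the role of Insert(3)
        System.out.println(rangeQuery(root, 3, 14));
    }
}
```

Because the range query only reads the tree and the dirty bits, it never writes to shared memory. Note that in this sketch an updater that stalls between setting the dirty bit and replacing the leaf would make range queries retry indefinitely; the sketch does not attempt to address that.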
8
Example: RangeQuery(3, 14) [Figure: example k-ST with the range query's saved leaf pointers; a concurrent Insert(3) replaces the leaf containing 4] ● RangeQuery sees 4 is dirty… Retry!
9
Retrying RangeQuery(3, 14) [Figure: the same k-ST after Insert(3), with the range query's newly saved leaf pointers] ● Success! Return the result…
10
Ctrie ● Taking a “snapshot”: atomically replace root ● Old tree no longer changes ● Future searches and updates copy nodes from the old tree [Prokopec, Bronson, Bagwell, Odersky] [Figure: Ctrie with root pointer]
11
When our Algorithm is Good ● When workloads contain range queries over small ranges (i.e., where snapshots are bad) – Example: database applications such as airline database of flights When it might not be ● Very large ranges increase the chance that a range query will have to retry – Our experiments explore how much this matters – In extreme cases Ctrie or Snap might be better
12
Experiment: compare performance of ● k-ST: k=16, 32, 64 ● Snap ● Ctrie ● Java’s Concurrent Skip List (SL) – NOT LINEARIZABLE!
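To illustrate the skip list caveat: the slides do not say how the range query over Java's skip list was implemented, so the sketch below assumes the natural approach of iterating a `subMap` view of `java.util.concurrent.ConcurrentSkipListMap`. The class and method names are hypothetical. Iterators over this map are documented as weakly consistent, so a scan that overlaps with updates may reflect some of them and miss others, and the returned keys need not correspond to the map's contents at any single instant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListRangeScan {
    // Collects the keys in [a, b] by walking a live sub-map view.
    // This is a scan, not an atomic snapshot.
    static List<Integer> rangeScan(ConcurrentSkipListMap<Integer, Integer> map, int a, int b) {
        List<Integer> out = new ArrayList<>();
        if (a > b) {
            return out;                         // subMap would throw for an inverted range
        }
        for (Integer k : map.subMap(a, true, b, true).keySet()) {
            out.add(k);                         // concurrent inserts/deletes may or may not be seen
        }
        return out;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, Integer> map = new ConcurrentSkipListMap<>();
        for (int k : new int[] {1, 3, 4, 8, 13, 25}) {
            map.put(k, k);
        }
        System.out.println(rangeScan(map, 3, 14));
    }
}
```

Run from a single thread, the scan returns the same keys a linearizable range query would; the difference only appears when updates run concurrently with it.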
13
Experiment ● Throughput vs. number of concurrent threads ● Each thread repeatedly chooses a random operation (Search, Insert, Delete, RangeQuery) with arguments chosen uniformly randomly in [0, 10^6) ● Each experiment ran with a fixed amount of memory, for a fixed, sufficiently long amount of time
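The per-thread choice of operation described above might look like the following sketch. It is not the authors' benchmark harness: the names `WorkloadMix`, `Op`, `pick`, and `randomKey` are hypothetical, and the percentages are passed as parameters so that any of the mixes used in the experiments can be plugged in.

```java
import java.util.concurrent.ThreadLocalRandom;

public class WorkloadMix {
    enum Op { SEARCH, INSERT, DELETE, RANGE_QUERY }

    static final int KEY_RANGE = 1_000_000;     // arguments are drawn from [0, 10^6)

    // Maps a uniform draw u in [0, 100) to an operation. The three percentages
    // cover insert, delete, and range query; whatever remains is search.
    static Op pick(double u, double insertPct, double deletePct, double rangePct) {
        if (u < insertPct) return Op.INSERT;
        if (u < insertPct + deletePct) return Op.DELETE;
        if (u < insertPct + deletePct + rangePct) return Op.RANGE_QUERY;
        return Op.SEARCH;
    }

    // Uniformly random key in [0, KEY_RANGE).
    static int randomKey() {
        return ThreadLocalRandom.current().nextInt(KEY_RANGE);
    }

    public static void main(String[] args) {
        // Example mix: 5% insert, 5% delete, 40% range query, 50% search.
        for (int i = 0; i < 5; i++) {
            double u = ThreadLocalRandom.current().nextDouble() * 100.0;
            System.out.println(pick(u, 5, 5, 40) + " " + randomKey());
        }
    }
}
```

`ThreadLocalRandom` is used so that each worker thread draws from its own generator and the random number source does not become a point of contention in the measurement.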
14
Hardware ● Intel 4-chip, 40-core, 80-thread ● Sun 2-chip, 16-core, 128-thread
15
Many queries with small ranges ● Workload: 50% search, 5% insert, 5% delete, 40% range query (size 100) [Plot: throughput (millions) vs. number of threads; series: Snap, Ctrie, 16-ST, 32-ST, 64-ST, SL]
16
Many queries with bigger ranges ● Workload: 50% search, 5% insert, 5% delete, 40% range query (size 10,000) [Plot: throughput (hundred thousands) vs. number of threads; series: Snap, Ctrie, 16-ST, 32-ST, 64-ST, SL]
17
Few queries (with small ranges) ● Workload: 59% search, 20% insert, 20% delete, 1% range query (size 100) [Plot: throughput (ten millions) vs. number of threads; series: Snap, Ctrie, 16-ST, 32-ST, 64-ST, SL]
18
Throughput versus P(range query) [Plot: throughput (ten millions) vs. probability of range query; series: Snap, Ctrie, KST16, KST32, KST64, SL; annotation: 1 in 10,000 operations is a range query]
19
Throughput versus arity [Plot: throughput (ten millions) vs. degree of tree; series: 5i-5d-40r-size10000, 20i-20d-1r-size10000, 5i-5d-40r-size100, 20i-20d-1r-size100]
20
Conclusion ● Provably correct algorithm ● Searches can ignore concurrent updates ● Although dirty bits invalidate range queries, they do not invalidate searches ● Range queries are invisible ● No CAS, don’t change data structure ● Avoids excessive duplication of nodes ● Appears to be practical when workloads contain queries over small ranges
21
Future work ● Adding balance – (a, b)-tree, chromatic tree, relaxed AVL tree ● Wait-freedom?