SKIP LIST & SKIP GRAPH James Aspnes Gauri Shah

Slides:

Advertisements

Similar presentations

© 2004 Goodrich, Tamassia Skip Lists1 S0S0 S1S1 S2S2 S3S

Advertisements

Skip Lists. Outline and Reading What is a skip list (§9.4) – Operations (§9.4.1) – Search – Insertion – Deletion Implementation Analysis (§9.4.2) – Space.

The Dictionary ADT: Skip List Implementation

SKIP LISTS Amihood Amir Incorporationg the slides of Goodrich and Tamassia (2004)

Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.

Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK

SKIP GRAPHS Slides adapted from the original slides by James Aspnes Gauri Shah.

Common approach 1. Define space: assign random ID (160-bit) to each node and key 2. Define a metric topology in this space,  that is, the space of keys.

© 2004 Goodrich, Tamassia Skip Lists1  S0S0 S1S1 S2S2 S3S3    2315.

CSC401 – Analysis of Algorithms Lecture Notes 7 Multi-way Search Trees and Skip Lists Objectives: Introduce multi-way search trees especially (2,4) trees,

Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.

Fault-tolerant Routing in Peer-to-Peer Systems James Aspnes Zoë Diamadi Gauri Shah Yale University PODC 2002.

SkipNet: A Scaleable Overlay Network With Practical Locality Properties Presented by Rachel Rubin CS294-4: Peer-to-Peer Systems By Nicholas Harvey, Michael.

Skip Lists1 Skip Lists William Pugh: ” Skip Lists: A Probabilistic Alternative to Balanced Trees ”, 1990  S0S0 S1S1 S2S2 S3S3 

P2P Course, Structured systems 1 Skip Net (9/11/05)

E.G.M. PetrakisB-trees1 Multiway Search Tree (MST)  Generalization of BSTs  Suitable for disk  MST of order n:  Each node has n or fewer sub-trees.

SKIP GRAPHS James Aspnes Gauri Shah To appear in SODA Level 0 Level 1 Level 2.

Balanced Search Trees CS 3110 Fall Some Search Structures Sorted Arrays –Advantages Search in O(log n) time (binary search) –Disadvantages Need.

1 Distributed Data Structures for a Peer-to-peer system Advisor: James Aspnes Committee: Joan Feigenbaum Arvind Krishnamurthy Antony Rowstron [MSR, Cambridge,

Randomized Algorithms - Treaps

Other Structured P2P Systems CAN, BATON Lecture 4 1.

Introduction To Algorithms CS 445 Discussion Session 2 Instructor: Dr Alon Efrat TA : Pooja Vaswani 02/14/2005.

CCAN: Cache-based CAN Using the Small World Model Shanghai Jiaotong University Internet Computing R&D Center.

Compression.  Compression ratio: how much is the size reduced?  Symmetric/asymmetric: time difference to compress, decompress?  Lossless; lossy: any.

David Luebke 1 10/25/2015 CS 332: Algorithms Skip Lists Hash Tables.

1 Peer-to-Peer Networks. Overlay Network A logical network laid on top of the Internet A B C Internet Logical link AB Logical link BC.

Skip Lists 二○一七年四月二十五日

Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 2: Distributed Hash.

Chord Advanced issues. Analysis Search takes O(log(N)) time –Proof 1 (intuition): At each step, distance between query and peer hosting the object reduces.

BATON A Balanced Tree Structure for Peer-to-Peer Networks H. V. Jagadish, Beng Chin Ooi, Quang Hieu Vu.

2/19/2016 3:18 PMSkip Lists1  S0S0 S1S1 S2S2 S3S3    2315.

Skip Lists. Linked Lists Fast modifications given a pointer Slow traversals to random point.

Skip Lists S3   S2   S1   S0  

Sorted Maps © 2014 Goodrich, Tamassia, Goldwasser Skip Lists.

Skip Lists 5/10/2018 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M.

Point location and Persistent Data structures

Spatial Indexing I Point Access Methods.

B+ Trees What are B+ Trees used for What is a B Tree What is a B+ Tree

Temporal Indexing MVBT.

Peer-to-Peer and Social Networks

Skip Lists S3 + - S2 + - S1 + - S0 + -

Accessing nearby copies of replicated objects

SKIP GRAPHS James Aspnes Gauri Shah SODA 2003.

(2,4) Trees (2,4) Trees 1 (2,4) Trees (2,4) Trees

Skip Lists S3 + - S2 + - S1 + - S0 + -

(2,4) Trees /26/2018 3:48 PM (2,4) Trees (2,4) Trees

Chord Advanced issues.

Skip Lists S3 + - S2 + - S1 + - S0 + -

Sorted Maps © 2014 Goodrich, Tamassia, Goldwasser Skip Lists.

Dictionaries < > = /3/2018 8:58 AM Dictionaries

Indexing and Hashing Basic Concepts Ordered Indices

Chord Advanced issues.

(2,4) Trees (2,4) Trees (2,4) Trees.

Chord Advanced issues.

CSIT 402 Data Structures II With thanks to TK Prasad

(2,4) Trees 2/15/2019 (2,4) Trees (2,4) Trees.

Parasol Lab, Dept. CSE, Texas A&M University

(2,4) Trees /24/2019 7:30 PM (2,4) Trees (2,4) Trees

(2,4) Trees (2,4) Trees (2,4) Trees.

Proof of Kleinberg’s small-world theorems

Multiway Search Tree (MST)

Skip List: formally A skip list for a set S of distinct (key, element) items is a series of lists S0, S1 , … , Sh such that Each list Si contains the special.

(2,4) Trees /6/ :26 AM (2,4) Trees (2,4) Trees

CS210- Lecture 17 July 12, 2005 Agenda Collision Handling

Heaps & Multi-way Search Trees

Binary Search Trees < > = Dictionaries

CS210- Lecture 20 July 19, 2005 Agenda Multiway Search Trees 2-4 Trees

Lecture-Hashing.

SKIP GRAPHS (continued)

Presentation transcript:

SKIP LIST & SKIP GRAPH James Aspnes Gauri Shah Many slides have been taken from the original presentation by the authors

Definition of Skip List A skip list for a set L of distinct (key, element) items is a series of linked lists L0, L1 , … , Lh such that Each list Li contains the special keys +∞ and -∞ List L0 contains the keys of L in non-decreasing order Each list is a subsequence of the previous one, i.e., L0 ⊃ L1 ⊃ … ⊃ Lh List Lh contains only the two special keys +∞ and -∞

Skip List (Idea due to Pugh ’90, CACM paper) Dictionary based on a probabilistic data structure. Allows efficient search, insert, and delete operations. Each element in the dictionary typically stores additional useful information beside its search key. Example: <student id. Transcripts> [for University of Iowa] <date, news> [for Daily Iowan] Probabilistic alternative to a balanced tree.

Skip List HEAD TAIL J Level 2 -∞ +∞ A J M Level 1 Level 0 A G J M R W Each node linked at higher level with probability 1/2.

Another example L2 -∞ 31 +∞ L1 -∞ 23 31 34 64 +∞ L0 56 64 78 +∞ 31 34 44 -∞ 12 23 26 Each element of Li appears in L.i+1 with probability p. Higher levels denote express lanes.

Searching in Skip List Search for a key x in a skip: Start at the first position of the top list At the current position P, compare x with y ≥ key after P x = y -> return element (after (P)) x > y -> “scan forward” x < y -> “drop down” If we move past the bottom list, then no such key exists

Example of search for 78 L3 L2 L1 L0 +∞ -∞ L3 L2 -∞ 31 +∞ L1 -∞ 23 31 34 64 +∞ L0 -∞ 12 23 26 31 34 44 56 64 78 +∞ At L1 P is 64, the next element +∞ is bigger than 78, we drop down At L0, 78 = 78, so the search is over.

Insertion The insert algorithm uses randomization to decide in how many levels the new item <k> should be added to the skip list. After inserting the new item at the bottom level flip a coin. If it returns tail, insertion is complete. Otherwise, move to next higher level and insert <k> in this level at the appropriate position, and repeat the coin flip.

Insertion Example Suppose we want to insert 15 Do a search, and find the spot between 10 and 23 Suppose the coin come up “head” three times L3 +∞ -∞ p2 L2 L2 -∞ +∞ +∞ -∞ 15 p1 L1 L1 -∞ 23 +∞ +∞ -∞ 23 15 p0 L0 L0 -∞ 10 23 36 +∞ +∞ -∞ 10 36 23 15

Deletion Search for the given key <k>. If a position with key <k> is not found, then no such key exists. Otherwise, if a position with key <k> is found (it will be definitely found on the bottom level), then we remove all occurrences of <k> from every level. If the uppermost level is empty, remove it.

Deletion Example 1) Suppose we want to delete 34 2) Do a search, find the spot between 23 and 45 3) Remove all occurrences of the key p2 L2 L2 -∞ 34 +∞ -∞ +∞ p1 L1 L1 -∞ 23 34 +∞ -∞ 23 +∞ p0 L0 L0 -∞ 12 23 34 45 +∞ -∞ 12 23 45 +∞

O(1) pointers per node Average number of pointers per node = O(1) Total number of pointers = 2.n + 2. n/2 + 2. n/4 + 2. n/8 + … = 4.n So, the average number of pointers per node = 4

Number of levels The number of levels = O(log n) w.h.p Pr[a given element x is above level c log n] = 1/2c log n = 1/nc Pr[any element is above level c log n] = n. 1/nc = 1/nc-1 So the number of levels = O(log n) with high probability

Search time Consider a skiplist with two levels L0 and L1. To search a key, first search L1 and then search L0. Cost (i.e. search time) = length (L1) + n / length (L1) Minimum when length (L1) = n / length (L1). Thus length(L1) = (n) 1/2, and cost = 2. (n) 1/2 (Three lists) minimum cost = 3. (n)1/3 (Log n lists) minimum cost = log n. (n) 1/log n = 2.log n

Skip lists for P2P? Advantages O(log n) expected search time. Retains locality. Dynamic node additions/deletions. Disadvantages Heavily loaded top-level nodes. Easily susceptible to failures. Lacks redundancy.

A Skip Graph G W A J M R G R W A J M A G J M R W Level 2 A J M R 101 100 000 001 011 110 100 G R W Level 1 A J M 110 101 001 001 011 Membership vectors Level 0 A G J M R W 001 100 001 011 110 101 Link at level i to nodes with matching prefix of length i. Think of a tree of skip lists that share lower layers.

Properties of skip graphs Efficient Searching. Efficient node insertions & deletions. Independence from system size. Locality and range queries.

Searching: avg. O (log n) Restricting to the lists containing the starting element of the search, we get a skip list. G W Level 2 A J M R G R W Level 1 A J M Level 0 A G J M R W Same performance as DHTs.

Node Insertion – 1 buddy new node J 001 G W Level 2 A M R 100 101 000 011 110 G R W Level 1 A M 110 101 100 001 011 Level 0 A G M R W 001 100 011 110 101 Starting at buddy node, find nearest key at level 0. Takes O(log n) time on average.

Node Insertion - 2 Total time for insertion: O(log n) At each level i, find nearest node with matching prefix of membership vector of length i. G W Level 2 A J 001 M R 100 101 000 011 110 G R W Level 1 A J 001 M 110 101 100 001 011 A G J 001 M R W Level 0 001 100 011 110 101 Total time for insertion: O(log n) DHTs take: O(log2n)

Independent of system size No need to know size of keyspace or number of nodes. E Z 1 J 00 01 Level 0 Level 1 Level 2 J insert E Z Level 1 E Z Level 0 1 Old nodes extend membership vector as required with arrivals. DHTs require knowledge of keyspace size initially.

Locality and range queries Find key < F, > F. Find largest key < x. Find least key > x. Find all keys in interval [D..O]. Initial node insertion at level 0. A D F I A D F I L O S

Applications of locality Version Control e.g. find latest news from yesterday. find largest key < news: 02/13. Level 0 news:02/09 news:02/10 news:02/11 news:02/12 news:02/13 Data Replication e.g. find any copy of some Britney Spears song. Level 0 britney01 britney02 britney03 britney04 britney05 DHTs cannot do this easily as hashing destroys locality.

So far... Coming up...      Self-stabilization. Decentralization. Load balancing. Tolerance to faults. Self-stabilization. Random faults. Adversarial faults. Decentralization. Locality properties. O(log n) space per node. O(log n) search, insert, and delete time. Independent of system size.     

Load balancing Interested in average load on a node u. i.e. the number of searches from source s to destination t that use node u. Theorem: Let dist (u, t) = d. Then the probability that a search from s to t passes through u is < 2/(d+1). where V = {nodes v: u <= v <= t} and |V| = d+1.

Observation s Level 2 Nodes u Level 1 Level 0 Node u is on the search path from s to t only if it is in the skip list formed from the lists of s at each level.

Tallest nodes s t u s u is not on path.  u is on path. u u t Node u is on the search path from s to t only if it is in T = the set of k tallest nodes in [u..t]. Pr [u T] = Pr[|T|=k] • k/(d+1) = E[|T|]/(d+1). k=1 d+1 Heights independent of position, so distances are symmetric.

Load on node u Average load on a node is inversely proportional Start with n nodes. Each node goes to next set with prob. 1/2. We want expected size of T = last non-empty set. We show that: E[|T|] < 2. = T Asymptotically: E[|T|] = 1/(ln 2) ± 2x10-5 ≃ 1.4427… [Trie analysis] Average load on a node is inversely proportional to the distance from the destination.

Experimental result Load on node Node location Expected load Actual load Destination = 76542 76400 76450 76500 76550 76600 76650 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 0.0 Load on node Node location

Fault tolerance How do node failures affect skip graph performance? Random failures: Randomly chosen nodes fail. Experimental results. Adversarial failures: Adversary carefully chooses nodes that fail. Bound on expansion ratio.

Random faults 131072 nodes

Searches with random failures 131072 nodes 10000 messages

Adversarial faults Theorem: A skip graph with n nodes has dA = nodes adjacent to A but not in A. To disconnect A all nodes in dA must be removed Expansion ratio = min |dA|/|A|, 1 <= |A| <= n/2. A dA Theorem: A skip graph with n nodes has expansion ratio = (1/log n). f failures can isolate only O(f•log n ) nodes.

Need for repair mechanism G W Level 2 A J M R G R W Level 1 A J M Level 0 A G J M R W Node failures can leave skip graph in inconsistent state.

Let xRi (xLi) be the right (left) neighbor of x Ideal skip graph Let xRi (xLi) be the right (left) neighbor of x at level i. If xLi, xRi exist: k xRi = xRi-1. xLi = xLi-1. Successor constraints x Level i Level i-1 i xR i-1 1 2 ..00.. ..01.. xLi < x < xRi. xLiRi = xRiLi = x. Invariant

Basic repair If a node detects a missing neighbor, it tries to patch the link using other levels. 1 5 1 3 5 6 1 2 3 4 5 6 Also relink at other lower levels. Successor constraints may be violated by node arrivals or failures.

Constraint violation Neighbor at level i not present at level (i-1). x ..00.. ..01.. x Level i x x Level i-1 ..00.. ..01.. ..01.. ..01.. ..00.. ..01.. ..01.. ..01.. zipper Level i-1 Level i x

Conclusions Similarities with DHTs Decentralization. O(log n) space at each node. O(log n) search time. Load balancing properties. Tolerant of random faults.

Differences Property DHTs Skip Graphs Insert/Delete time O(log2n) Locality No Yes Repair mechanism ? Partial Tolerance of adversarial faults Keyspace size Reqd. Not reqd.

Open Problems Design efficient repair mechanism. Incorporate geographical proximity. Study multi-dimensional skip graphs. Evaluate performance in practice. Study effect of byzantine failures. ?