CSCI 104 Skip Lists Mark Redekopp.

Sources / Reading
Material for these slides was derived from the following sources:
- http://courses.cs.vt.edu/cs2604/spring02/Projects/1/Pugh.Skiplists.pdf
- http://www.cs.umd.edu/~meesh/420/Notes/MountNotes/lecture11-skiplist.pdf

Skip List Intro
- Another map/set implementation (storing keys or key/value pairs) supporting insert, remove, and find
- Remember the story of Goldilocks and the Three Bears
  - Father's porridge was too hot
  - Mother's porridge was too cold
  - Baby Bear's porridge was just right
- Compare set/map implementations
  - BSTs are easy to implement but can degenerate to O(n) operations with an adversarial sequence of keys (too hot?)
  - Balanced BSTs guarantee O(log(n)) operations but are more complex to implement and may require additional memory overhead (too cold?)
  - Skip lists are fairly simple to implement, fairly memory efficient, and offer "expected" O(log(n)) operations (just right?)
- Skip lists are a probabilistic data structure, so O(log(n)) is an expectation rather than a guarantee
  - The expectation of log(n) does not depend on the keys, only on the random number generator

Skip List Visual
- Think of a skip list as a sorted linked list with shortcuts (wormholes?)
- Given the skip list and its links (arrows) in the figure, what would be the fastest way to find out whether 28 is in the list?
  - Let p point to a node, starting at the head on the top level
  - Walk forward at level i while the desired search key is bigger than p->next->key, then descend to level i-1; repeat until you find the value or hit the NIL (end) node
  - The NIL node is a special node whose stored key is BIGGER than any key we might expect (i.e. MAXKEY+1 / +infinity)
[Figure: skip list running from HEAD through keys 2, 6, 7, 9, 13, 15, 18, 21, 25, 28 to NIL, with shortcut links above the base list]

Perfect Skip List
- How did we form this special linked list?
  - We started with a normal linked list (level 0)
  - Then we took every other node in level 0 (every 2nd node of the original list) and raised it to level 1
  - Then we took every other node in level 1 (every 4th node of the original list) and raised it to level 2
  - Then we took every other node in level 2 (every 8th node of the original list) and raised it to level 3
- There will be O(log2(n)) levels (we would have only 1 node at level log2(n))
[Figure: perfect skip list with HEAD on the left and NIL on the right; each key is listed at its top level]
  Level 3: 21
  Level 2: 9
  Level 1: 6, 15, 28
  Level 0: 2, 7, 13, 18, 25
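The promotion rule above can be sketched in a few lines of C++. This is only an illustration, not code from the slides: the helper names perfectLevel and perfectLevels are made up here, and positions are assumed to be 1-based so that "every 2nd node" means positions 2, 4, 6, and so on.

```cpp
#include <cstddef>
#include <vector>

// Top level of the node at 1-based position `pos` in a perfect skip list:
// the number of times 2 divides pos (every 2nd node reaches level 1,
// every 4th node reaches level 2, every 8th node reaches level 3, ...).
// Hypothetical helper, not part of the original slides.
int perfectLevel(std::size_t pos) {
    int level = 0;
    while (pos > 0 && pos % 2 == 0) {
        pos /= 2;
        ++level;
    }
    return level;
}

// Top levels for a list of n nodes at positions 1..n.
std::vector<int> perfectLevels(std::size_t n) {
    std::vector<int> levels;
    for (std::size_t pos = 1; pos <= n; ++pos) {
        levels.push_back(perfectLevel(pos));
    }
    return levels;
}
```

For the ten keys 2, 6, 7, 9, 13, 15, 18, 21, 25, 28 this gives top levels 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, which matches the figure: 21 is the 8th node and is the only one at level 3.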

Search Time for Perfect Skip List
- How long would it take us to find an item or determine it is not present in the list? O(log(n))
- Proof
  - At each level we visit at most 2 nodes: at any node x in level i, you sit between two nodes (p, q) at level i+1, so you need to visit at most one other node in level i before descending
  - There are O(log(n)) levels
  - So we visit at most O(2*log(n)) nodes = O(log(n))
[Figure: the perfect skip list (Level 3: 21; Level 2: 9; Level 1: 6, 15, 28; Level 0: 2, 7, 13, 18, 25) with a node x at level i marked between nodes p and q at level i+1]

The Problem w/ Perfect Skip Lists
- Remember, in a perfect skip list
  - Every 2nd node is raised to level 1
  - Every 4th node is raised to level 2
  - …
- What if I want to insert a new node or remove a node? How many nodes would need their levels adjusted to maintain the pattern described above?
  - In the worst case, all n-1 remaining nodes
- Inserting/removing may require n-1 nodes to adjust
[Figure: the same perfect skip list, with nodes p, q, and x marked]

Quick Aside
- Imagine a game where you flip a coin: if it comes up heads you get $1 and get to play again; if it comes up tails you stop
- What's the chance you win at least $1? $2? $3?
  - P($1) = 1/2, P($2) = 1/4, P($3) = 1/8
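The probabilities in the coin game follow from needing k heads in a row to win at least $k. A minimal sketch, assuming a possibly biased coin with heads probability pHeads (the function name probAtLeast is hypothetical):

```cpp
#include <cmath>

// Probability of winning at least `dollars` in the coin game: the first
// `dollars` flips must all come up heads, each with probability pHeads.
// Hypothetical helper for illustration only.
double probAtLeast(int dollars, double pHeads = 0.5) {
    return std::pow(pHeads, dollars);
}
```

With a fair coin, probAtLeast(1), probAtLeast(2), and probAtLeast(3) evaluate to 1/2, 1/4, and 1/8.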

Randomized Skip Lists
- Rather than strictly enforcing that every other node of level i be promoted to level i+1, we simply use probability to give an "expectation" that every other node is promoted
- Whenever a node is inserted, we promote it to the next level with probability p (= 1/2 for now), and we keep promoting it while we get heads
- What's the chance we promote to level 1, 2, 3?
- Given n insertions, how many would you expect to be promoted to each level?
  - Level 1 = n/2, Level 2 = n/4, Level 3 = n/8

Randomized Skip List
- As nodes are inserted, each one runs repeated trials with probability p (stopping when the first unsuccessful outcome occurs)
- This means we will not have an "every other node" promotion scheme, but the expected number of nodes at each level matches the non-randomized version
- Note: this scheme introduces the chance of some very high levels
  - We will usually cap the number of levels at some MAXIMUM value
  - However, the expected number of levels is still about log2(n)
[Figure: randomized skip list with HEAD on the left and NIL on the right; each key is listed at its top level]
  Level 4: 15
  Level 3: 21
  Level 2: 6
  Level 1: 2, 13, 28
  Level 0: 7, 9, 18, 25
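To see that the expected level sizes match the perfect skip list, one can simulate the promotion trials. The sketch below is an illustration under assumptions that are not in the slides: it uses std::mt19937 and std::bernoulli_distribution as the coin, and the names randomTopLevel and simulateLevelCounts are hypothetical.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw one node's top level: keep promoting while the coin comes up heads
// (probability p), stopping at the cap maxLevel.
int randomTopLevel(std::mt19937& gen, double p, int maxLevel) {
    std::bernoulli_distribution heads(p);
    int level = 0;
    while (level < maxLevel && heads(gen)) {
        ++level;
    }
    return level;
}

// Returns counts where counts[i] is the number of simulated nodes that
// appear in level i (a node with top level t appears in levels 0..t).
std::vector<std::size_t> simulateLevelCounts(std::size_t n, double p,
                                             int maxLevel, unsigned seed) {
    std::mt19937 gen(seed);
    std::vector<std::size_t> counts(maxLevel + 1, 0);
    for (std::size_t node = 0; node < n; ++node) {
        int top = randomTopLevel(gen, p, maxLevel);
        for (int i = 0; i <= top; ++i) {
            ++counts[i];
        }
    }
    return counts;
}
```

Calling simulateLevelCounts(100000, 0.5, 16, 42) should give counts close to n, n/2, n/4, n/8 for levels 0 through 3; the exact numbers depend on the random stream.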

Worst Case
- What might a worst case skip list look like?
  - All nodes the same height
  - Or nodes in strictly ascending or descending order of height
- These are all highly unlikely
[Figure: skip list with keys 2, 4, 7, 9, 13, 15, 18, 24, 25, 28 whose node heights decrease from left to right (2 at level 4, 4 at level 3, 7 at level 2)]

Analysis
- To analyze the search time with this randomized approach, let's start at the target node and walk backwards to the head node, counting our expected number of steps
- Recall that if we can move up a level we do, so that we take the "faster" path, and we only move left if we can't move up
  - Option A: if we can move up, we do
  - Option B: no higher level, so move left
[Figure: the randomized skip list (Level 4: 15; Level 3: 21; Level 2: 6; Level 1: 2, 13, 28; Level 0: 7, 9, 18, 25), with Option A and Option B illustrated at nodes 21 and 28]

Analysis
- Probability of Option A (if we can move up, we do): p
  - Recall we added each level independently with probability p
- Probability of Option B (no higher level, move left): 1-p
- For this analysis, let us define the top level as level 0 and the level where we found our search node as level k (expected max k = log2(n))
[Figure: the same skip list with levels renumbered from the top (Level 0: 15; Level 1: 21; Level 2: 6; Level 3: 2, 13, 28; Level 4: 7, 9, 18, 25)]

Analysis
- Define a recurrence relation for the cost of walking back from level k to level 0
- Base case: C(0) = O(1)
  - We only expect 1 node + the head node at level 0
- Recursive case: C(k) = (1-p)(1+C(k)) + p(1+C(k-1))
  - 1+C(k) = Option B (no higher level, move left), and its probability is (1-p)
  - 1+C(k-1) = Option A (move up), and its probability is p

Analysis
- Solve C(k) = (1-p)(1+C(k)) + p(1+C(k-1))
    C(k) = (1-p) + (1-p)C(k) + p + pC(k-1)
    pC(k) = 1 + pC(k-1)
    C(k) = 1/p + C(k-1)
         = 1/p + 1/p + C(k-2)
         = 1/p + 1/p + 1/p + C(k-3)
         = k/p + C(0)
         = log2(N)/p + O(1)
         = O(log2(N))
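As a quick numeric sanity check of the algebra, the sketch below unrolls C(k) = 1/p + C(k-1) and confirms that the result also satisfies the original recurrence. The function names and the explicit base-case parameter c0 are assumptions made for this illustration.

```cpp
#include <cmath>

// Unrolls C(k) = 1/p + C(k-1) down to the base case C(0) = c0.
double expectedCost(int k, double p, double c0 = 0.0) {
    double cost = c0;
    for (int level = 1; level <= k; ++level) {
        cost += 1.0 / p;
    }
    return cost;
}

// Checks that the unrolled value satisfies the original recurrence
// C(k) = (1-p)(1 + C(k)) + p(1 + C(k-1)) for a given k >= 1.
bool satisfiesRecurrence(int k, double p, double c0 = 0.0) {
    double ck = expectedCost(k, p, c0);
    double ckMinus1 = expectedCost(k - 1, p, c0);
    double rhs = (1.0 - p) * (1.0 + ck) + p * (1.0 + ckMinus1);
    return std::fabs(ck - rhs) < 1e-9;
}
```

With p = 1/2 and k = 10 levels, expectedCost gives 20 steps, i.e. k/p, which is the 2*log2(N) behavior seen in the perfect skip list.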

Node & Class Definition
- Each node has an array of "forward" ("next") pointers
- Head's key doesn't matter as we'll never compare it
- End's forward pointers don't matter since its key value is +INF

    template < class K, class V >
    struct SkipNode{
      K key;
      V value;
      SkipNode** forward; // array of ptrs, one per level 0..level
      SkipNode(const K& k, const V& v, int level){
        key = k; value = v;
        forward = new SkipNode*[level+1]();
      }
    };

    template < class K, class V >
    class SkipList{
      int maxLevel; // data members
      SkipNode<K,V>* head;
    public:
      SkipList(int max){
        maxLevel = max;
        // dummy = placeholder key/value; INFINITY = key larger than any real key
        head = new SkipNode<K,V>(dummy,dummy,maxLevel);
        SkipNode<K,V>* end = new SkipNode<K,V>(INFINITY,dummy,maxLevel);
        // levels run 0..maxLevel, so initialize all maxLevel+1 pointers
        for(int i=0; i <= maxLevel; i++){
          head->forward[i] = end;
        }
      }
    };

[Figure: empty skip list in which every HEAD pointer leads directly to NIL]

Search Pseudocode

    template < class K, class V >
    SkipNode<K,V>* SkipList<K,V>::search(const K& key){
      SkipNode<K,V>* current = head;
      for(int i=maxLevel; i >= 0; i--){
        while( current->forward[i]->key < key){
          current = current->forward[i];
        }
      }
      // will always stop on level 0 w/ current=node
      // just prior to the actual target node or End node
      current = current->forward[0];
      if(current->key == key) return current;
      else return NULL; // key is not in the list
    }

- search(28) would stop the for loop with current pointing at node 25, then take one more step
[Figure: the randomized skip list (Level 4: 15; Level 3: 21; Level 2: 6; Level 1: 2, 13, 28; Level 0: 7, 9, 18, 25)]

Insert Pseudocode
- As we walk, we fill in an "update" array of the last node we walked through at each level, since these nodes will need to have their pointers updated
- Example: insert(25)

    template < class K, class V >
    void SkipList<K,V>::insert(const K& key, const V& v){
      SkipNode<K,V>* current = head;
      vector<SkipNode<K,V>*> update(maxLevel+1);
      // perform typical search but fill in update array
      ...
      current = current->forward[0];
      if(current->key == key) {
        current->value = v; // key already present: overwrite its value
        return;
      }
      else {
        int height = randomLevel();
        // Allocate new node, x
        for(int i=0; i < height; i++){
          x->forward[i] = update[i]->forward[i];
          update[i]->forward[i] = x;
        }
      }
    }

[Figure: the skip list before insert(25), with the update-array entries at each level marked]

Insert Pseudocode
- insert() is the same as on the previous slide; randomLevel() picks the height of the new node

    template < class K, class V >
    int SkipList<K,V>::randomLevel() {
      int height = 1;
      // assume rand() returns double in range [0,1)
      while(rand() < p && height < maxLevel) height++;
      return height;
    }

- randomLevel returns a height > h with probability p^h (1/2^h when p = 1/2), as long as h < maxLevel
[Figure: the skip list after insert(25), with the new node 25 linked in after the update-array nodes]
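The node definition, search, insert, and randomLevel pieces can be combined into one compilable sketch. It deliberately differs from the slides in a few places, all of them assumptions made here: std::vector replaces the raw forward array, the end of each level is a nullptr instead of a +infinity NIL node (so no maximum key is needed), levels are counted as maxHeight levels numbered 0..maxHeight-1, the coin is std::mt19937 with a fixed default seed, and K and V must be default-constructible for the head node. Removal is not shown.

```cpp
#include <random>
#include <vector>

template <class K, class V>
class SkipList {
    struct Node {
        K key;
        V value;
        std::vector<Node*> forward;  // forward[i] is the next node at level i
        Node(const K& k, const V& v, int height)
            : key(k), value(v), forward(height, nullptr) {}
    };

    int maxHeight_;     // number of levels (levels 0 .. maxHeight_-1)
    double p_;          // promotion probability
    Node* head_;        // head's key is never compared
    std::mt19937 gen_;  // assumption: stands in for the slides' rand()

    // Keep promoting while the coin comes up heads, capped at maxHeight_.
    int randomHeight() {
        std::bernoulli_distribution heads(p_);
        int height = 1;
        while (height < maxHeight_ && heads(gen_)) ++height;
        return height;
    }

    // Walk down from the top level; update[i] is the last node at level i
    // whose key is smaller than `key` (or the head).
    std::vector<Node*> findPredecessors(const K& key) const {
        std::vector<Node*> update(maxHeight_);
        Node* current = head_;
        for (int i = maxHeight_ - 1; i >= 0; --i) {
            while (current->forward[i] && current->forward[i]->key < key) {
                current = current->forward[i];
            }
            update[i] = current;
        }
        return update;
    }

public:
    explicit SkipList(int maxHeight = 16, double p = 0.5, unsigned seed = 104)
        : maxHeight_(maxHeight), p_(p),
          head_(new Node(K(), V(), maxHeight)), gen_(seed) {}

    SkipList(const SkipList&) = delete;
    SkipList& operator=(const SkipList&) = delete;

    ~SkipList() {
        // Every node is on level 0, so walking level 0 frees them all.
        for (Node* n = head_; n != nullptr;) {
            Node* next = n->forward[0];
            delete n;
            n = next;
        }
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    V* search(const K& key) const {
        Node* candidate = findPredecessors(key)[0]->forward[0];
        return (candidate && candidate->key == key) ? &candidate->value : nullptr;
    }

    // Inserts the pair, or overwrites the value if the key already exists.
    void insert(const K& key, const V& value) {
        std::vector<Node*> update = findPredecessors(key);
        Node* candidate = update[0]->forward[0];
        if (candidate && candidate->key == key) {
            candidate->value = value;
            return;
        }
        int height = randomHeight();
        Node* x = new Node(key, value, height);
        for (int i = 0; i < height; ++i) {
            x->forward[i] = update[i]->forward[i];
            update[i]->forward[i] = x;
        }
    }
};
```

For example, after inserting the keys 2, 6, 7, 9, 13, 15, 18, 21, 28 into a SkipList<int, int>, search(25) returns nullptr; after insert(25, 250) it returns a pointer to 250. Reusing findPredecessors in search keeps the sketch short at the cost of allocating a vector per lookup; a tuned version would walk the levels directly as in the search pseudocode.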

Summary
- Skip lists are a randomized data structure
- They provide "expected" O(log(n)) insert, remove, and search
- Compared to the complexity of the code for structures like an RB-Tree, they are fairly easy to implement
- In practice they perform quite well, even compared to more complicated structures like balanced BSTs