Skip Lists The non-list List Copyright © 2016 Curt Hill
List Problems
Lists are linear. This does not sound bad, but it makes most operations on a list O(N). For lookups this is poor when better options exist: a hash table is O(1) if the table is not too full, binary search is O(log2 N), and search in a balanced tree is O(log2 N).
What is the real problem?
The single link can only move us forward a small amount. Recall Bubble Sort, where we always move items a small amount. There is also no indexing. One potential solution to this problem is the skip list. In such a list we have multiple pointers that move multiple distances forward.
History
The skip list is a comparatively recent data structure. Most of the data structures we have covered date to the 1970s or earlier; this one is from 1990. It was conceived by William (Bill) Pugh. This is a probabilistic data structure. Most operations are O(log2 N) on average. We start with a sorted linked list and then enhance it.
List
A singly-linked straight list: 2 -> 6 -> 8 -> 11 -> 12 -> 16 -> 21 -> NULL
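As a baseline, here is a minimal sketch of the plain sorted list and a linear search over it. The names `Node`, `build_list`, and `linear_search` are illustrative and not from the slides; the search also reports how many nodes it examined.

```python
class Node:
    """One node of a singly linked list."""
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def build_list(keys):
    """Link the keys in the order given and return the first node."""
    head = None
    for key in reversed(keys):
        head = Node(key, head)
    return head


def linear_search(head, target):
    """Return (found, nodes_examined) for a sorted singly linked list."""
    examined = 0
    node = head
    while node is not None:
        examined += 1
        if node.key == target:
            return True, examined
        if node.key > target:
            # The list is sorted, so the target cannot appear further on.
            break
        node = node.next
    return False, examined


head = build_list([2, 6, 8, 11, 12, 16, 21])
print(linear_search(head, 16))  # (True, 6)
```

Every lookup walks the list one node at a time, which is where the O(N) cost comes from.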
Sorted
If we are really interested in good search performance then a sorted list is required. Yet sorting alone will not help our search times in the list. In a binary search we can use array subscripting to move quickly. Since we cannot do this with a list, we have to add another set of links. This is the express lane.
List With Express Lane
Regular links: 2 -> 6 -> 8 -> 11 -> 12 -> 16 -> 21 -> NULL
Express links: 2 -> 8 -> 12 -> 21 -> NULL
Search
Now a search can walk along the express lane. When it overshoots, it can back up one and take the non-express lane. The links are one-directional, so we keep a leading and a trailing pointer. Let's consider the search for 16.
Search List (in the figures, red is the leading pointer and green is the trailing pointer)
Step 1: 2 < 16, so advance both links on the express lane.
Step 2: 8 < 16, so advance both links on the express lane.
Step 3: 12 < 16, so advance both links on the express lane.
Step 4: 21 > 16, so leading has overshot; now use the regular links.
Step 5: 16 = 16, so we have found the target.
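The walkthrough can be sketched in code. This is only one possible reading of the slides: it assumes the express lane links every other node starting from the first, and the names `build_express_list` and `express_search` are made up for illustration. The search returns the keys it examined so the steps can be compared with the walkthrough.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None      # regular link to the following node
        self.express = None   # express link, set only on every other node


def build_express_list(keys):
    """Build a sorted list whose express lane skips every other node."""
    nodes = [Node(k) for k in keys]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    lane = nodes[::2]
    for a, b in zip(lane, lane[1:]):
        a.express = b
    return nodes[0]


def express_search(head, target):
    """Return (found, keys_examined) using leading and trailing pointers."""
    examined = [head.key]
    trailing = leading = head
    # Ride the express lane while the leading key is still too small.
    while leading.key < target and leading.express is not None:
        trailing = leading
        leading = leading.express
        examined.append(leading.key)
    if leading.key == target:
        return True, examined
    if leading.key < target:
        # The express lane ran out: continue on regular links after leading.
        node, stop = leading.next, None
    elif trailing is leading:
        # The very first key is already too large: nothing to scan.
        node, stop = None, None
    else:
        # Overshot: back up to trailing and scan the regular links up to leading.
        node, stop = trailing.next, leading
    while node is not None and node is not stop:
        examined.append(node.key)
        if node.key == target:
            return True, examined
        if node.key > target:
            break
        node = node.next
    return False, examined


head = build_express_list([2, 6, 8, 11, 12, 16, 21])
print(express_search(head, 16))  # (True, [2, 8, 12, 21, 16])
```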
Commentary
If we searched the original linked list for 16 we would have examined 6 nodes. Here we looked at 5: half of the 6, plus 2. Should the list be longer, the savings would be greater. The best we can hope for is about N/2 node visits, which is still O(N), that is, linear.
Why Stop There?
We can add other sets of links and skip farther into the list. This will reduce the search time even more.
Multiple Express Lanes
Regular links: 2 -> 6 -> 8 -> 11 -> 12 -> 16 -> 21 -> NULL
Skip-two links: 2 -> 8 -> 12 -> 21 -> NULL
Skip-four links: 2 -> 12 -> NULL
Commentary
This list now has a link that skips forward two and another that skips forward four. If it were longer, another set of links could be added that skipped forward eight or more. A search takes the fastest set of links until it overshoots. It then uses the second fastest, and so on. We can reduce the search time to O(log2 N).
Terminology
We now end up with something like a two-dimensional grid of lists. Each list of links we call a level. The original linked list is level 0. The one that skips every other node is level 1, etc. A tower is the group of links that are related vertically.
Perfect Skip List
A perfect skip list has log2 N levels. The shortest one has only two nodes. Each level skips twice as many nodes as the next longest. We need at most two comparisons at each level. Then we go to the next longer level. What is the problem with perfection?
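A perfect skip list can be sketched by building it once from sorted keys. The code below is an illustration under several assumptions that are not in the slides: each node stores a vector of links called `forward`, a key-less head node starts every level, and the node at position i (counting from 0) appears on level k exactly when i is a multiple of 2^k.

```python
class Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height   # forward[k] is the link on level k


def build_perfect(sorted_keys):
    """Build a perfect skip list and return its key-less head node."""
    n = len(sorted_keys)
    levels = max(1, (n - 1).bit_length())   # about log2 N levels
    head = Node(None, levels)
    last = [head] * levels                  # last node linked so far on each level
    for i, key in enumerate(sorted_keys):
        height = 1
        while height < levels and i % (2 ** height) == 0:
            height += 1
        node = Node(key, height)
        for k in range(height):
            last[k].forward[k] = node
            last[k] = node
    return head


def search(head, target):
    """Return (found, key_comparisons), starting on the fastest level."""
    comparisons = 0
    node = head
    for level in range(len(head.forward) - 1, -1, -1):
        while node.forward[level] is not None:
            comparisons += 1
            if node.forward[level].key >= target:
                break                        # would overshoot: drop a level
            node = node.forward[level]
    candidate = node.forward[0]
    return candidate is not None and candidate.key == target, comparisons


head = build_perfect([2, 6, 8, 11, 12, 16, 21])
print(search(head, 16))  # (True, 4)
```

In this layout the search makes at most two key comparisons per level, so the total stays proportional to the number of levels.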
Problems, Problems
Perfect skip lists suffer the same problems as perfectly balanced trees. Insertions and deletions require too much work to keep the structure perfect. We lose the O(log2 N) performance that we want. Deletion is not much of a problem: we delete the node from every list it is on. Insertion is more interesting.
Insertion
Generally skip lists take a probabilistic approach. Half the insertions only touch level 0. Of the other half, half insert into levels 0 and 1. Of the remaining quarter, half insert into levels 0, 1 and 2, etc. We do not have the perfect skip list, but we end with something that is still O(log2 N) on average.
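The halving rule amounts to flipping a fair coin until it comes up tails. Here is a minimal sketch; the function name `random_level`, the `max_level` cap of 16, and the fixed seed are assumptions for illustration. It also tallies how often each top level is chosen.

```python
import random
from collections import Counter


def random_level(rng, max_level=16):
    """Return the highest level a new node reaches (0 means level 0 only)."""
    level = 0
    # Each successful coin flip promotes the node one more level.
    while level < max_level and rng.random() < 0.5:
        level += 1
    return level


TRIALS = 100_000
rng = random.Random(2016)   # fixed seed so the run is repeatable
counts = Counter(random_level(rng) for _ in range(TRIALS))
for level in sorted(counts):
    print(level, counts[level] / TRIALS)
```

Roughly half of the trials should stop at level 0, a quarter at level 1, an eighth at level 2, and so on.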
Implementation
There are many different ways to implement a skip list. Like any list, it may be singly or doubly linked. Each level can be a list in its own right, with additional links to higher or lower levels. Alternatively, make the link in each node not a single link but a vector of links. There is also the possibility of a lattice of linked lists. Often there is a tower of infinite end values at each end.
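One of these options, the singly linked vector-of-links layout, might look like the sketch below. It is an illustration rather than a reference implementation: the class and method names, the `MAX_HEIGHT` cap of 16, and the key-less head node (used here instead of infinite end values) are all assumptions.

```python
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height   # the vector of links, one per level


class SkipList:
    MAX_HEIGHT = 16   # assumed cap on tower height

    def __init__(self, seed=None):
        self.head = Node(None, self.MAX_HEIGHT)   # key-less head tower
        self.height = 1                           # levels currently in use
        self.rng = random.Random(seed)

    def _random_height(self):
        height = 1
        while height < self.MAX_HEIGHT and self.rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key):
        """For each level, find the last node whose key is less than key."""
        update = [self.head] * self.MAX_HEIGHT
        node = self.head
        for level in range(self.height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
            update[level] = node
        return update

    def contains(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        """Insert key; return False if it was already present."""
        update = self._predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            return False
        height = self._random_height()
        self.height = max(self.height, height)
        node = Node(key, height)
        for level in range(height):      # splice into every level of the tower
            node.forward[level] = update[level].forward[level]
            update[level].forward[level] = node
        return True

    def delete(self, key):
        """Unlink key from every level it is on; return False if absent."""
        update = self._predecessors(key)
        target = update[0].forward[0]
        if target is None or target.key != key:
            return False
        for level in range(len(target.forward)):
            update[level].forward[level] = target.forward[level]
        return True

    def __iter__(self):
        node = self.head.forward[0]      # level 0 holds every key in order
        while node is not None:
            yield node.key
            node = node.forward[0]


sl = SkipList(seed=1990)
for key in [12, 2, 21, 8, 16, 6, 11]:
    sl.insert(key)
sl.delete(11)
print(list(sl))         # [2, 6, 8, 12, 16, 21]
print(sl.contains(16))  # True
```

Tower heights depend on the random draws, but the sorted order on level 0 does not, so the printed results are the same for any seed.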
Pictures
[Figure: four implementation layouts: vector of links; semi-independent lists; end values (-inf, inf) and doubly linked lists; lattice of links.]
Finally
This is clearly not a linear structure. It is made up of many linear structures. Most processing is O(log2 N). Insertion is easier than in a sorted array. Traversal or iteration is easier than in a tree. Growth is easier than in a hash table.