Skip Lists – Why?. BSTs Worse case insertion, search O(n) Best case insertion, search O(log n) Where your run fits in O(n) – O(log n) depends on the order of the data Hard to keep balanced 2-3 trees Guaranteed to be balanced Complicated to implement. Skip List.
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
Skip Lists – Why? • BSTs • Worse case insertion, search O(n) • Best case insertion, search O(log n) • Where your run fits in O(n) – O(log n) depends on the order of the data • Hard to keep balanced • 2-3 trees • Guaranteed to be balanced • Complicated to implement
Skip List • Still no guarantee to be O(log n) for insertion, search, deletion, in every situation. • Will give O(log n) performance with high probability (in most runs). • Good compromise between performance and difficulty of implementation.
What is it? • Recall the ordered linked list with one pointer to the next item. (worst case O(n), avg O(n/2) ) • Add a pointer to every other node that points to the second next (current+2). (worst case O(n/2) ) 37 49 2 15 21 29
What (cont.) • Now add in a third level so that every 4th item points 4 beyond it • • Add a (log n) level so that every (n/2)th item points n/2 away from it. • Now search is O(log n) • But insertion (and keeping “balanced”) is VERY difficult.
Probabilistic Structure • Rather than worrying about being perfect, just be good in the long run (especially if it is difficult to be perfect. • The number of pointers is the level of the node. • Randomly pick a level. • Want an exponential distribution with 50% being level 1, 25% being level 2, …
Sidebar – Implementing this Random Function • Flipping a coin gives us a 50% chance of getting a head. • If we get a tail on the first flip, then flip again. The chance of getting a TH is 25% • P(TTH) = 12.5% • Use a random number generator that generates uniformly distributed random numbers 0 or 1) • Count the number of times it takes to get a 0 • This will be the level and it will occur with the proper frequency (in the long run, i.e. with big enough data sets.
Operations • Search – easy • Insert • Pick level • Find position • Keep track of previous node for each level used in the search. • Update next pointers in “previous nodes” from level “picked level” to 0 (as well as the next in the inserted node). i.e., splice the node into the lists that it is part of.