1 / 28

A Non-Blocking, Contention-Friendly Skip List

A Non-Blocking, Contention-Friendly Skip List. Joint work with Tyler Crain (IRISA) and Michel Raynal (U. Rennes 1, IUF). School of Information Technologies. Dr Vincent Gramoli | Lecturer. Context. Multi-cores (dual-core, core-duo, quad-core…) are everywhere.

gypsy
Download Presentation

A Non-Blocking, Contention-Friendly Skip List

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Non-Blocking, Contention-FriendlySkip List Joint work with Tyler Crain (IRISA) and Michel Raynal (U. Rennes 1, IUF) School of Information Technologies Dr Vincent Gramoli | Lecturer

  2. Context Multi-cores (dual-core, core-duo, quad-core…) are everywhere • 2011, concurrency in production (tablets, phones) • All appsmust beconcurrent! Vincent Gramoli

  3. Context Many-cores (the only affordable way of getting higher performance) • Pollack’s law [Borkar, CACM 2011] Vincent Gramoli

  4. Motivations Why Skip List? • Data structures are the bottleneck • B-trees are cool to batch disk, not to main-memory • Hash tables do not support single-access range queries • Binary trees per-key load expectation is not uniform • Skip lists are good candidates for in-memory database • Single-access range queries are supported • Per-key load distribution is expected to be uniform • In-production databases use it (e.g., memsql, arangodb) Vincent Gramoli

  5. Skip List • Sequential skip list [Bill Pugh, CACM 1990] • similar to a linked list with shortcuts (plus next pointers) • provide logarithmic insert/delete/contains in expectation • a tower of high height is more likely accessed • all shortcuts must be updated (in addition to next pointers) -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  6. Skip List • Sequential skip list [Bill Pugh, CACM 1990] • similar to a linked list with shortcuts (plus next pointers) • provide logarithmic insert/delete/contains in expectation • a tower of high height is more likely accessed • all shortcuts must be updated (in addition to next pointers) contains(23)? -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  7. Skip List • Sequential skip list [Bill Pugh, CACM 1990] • similar to a linked list with shortcuts (plus next pointers) • provide logarithmic insert/delete/contains in expectation • a tower of high height is more likely accessed • all shortcuts must be updated (in addition to next pointers) remove(62) -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  8. Skip List • Sequential skip list [Bill Pugh, CACM 1990] • similar to a linked list with shortcuts (plus next pointers) • provide logarithmic insert/delete/contains in expectation • a tower of high height is more likely accessed • all shortcuts must be updated (in addition to next pointers) remove(62) -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  9. Related work • Non-blocking skip lists [Fomitchev, Ruppert, PODC’04], [Sundell, Tsigas, SAC’04], [Lea, JSR166]. • provide logarithmic complexity in expectation (always) • are not contention-friendly • Typical hot spots are at the top (frequently accessed) Contention scale Hot spots! Vincent Gramoli

  10. Related work • Non-blocking skip lists [Fomitchev, Ruppert, PODC’04], [Sundell, Tsigas, SAC’04], [Lea, JSR166]. • provide logarithmic complexity in expectation (always) • are not contention-friendly • Typical hot spots are at the top (frequently accessed) Contention scale Vincent Gramoli

  11. Contention-friendly, non-blocking skip list • Key ideas: • Ensuring O(log n) complexity in the absence of contention • Diminishing contention in case of contention bursts by relaxing O(log n) • By means of update decoupling: • Eager abstract modification: • Update: returns after updating the bottom level • Lazy and selective adaptation: • Update: postpone the adaptation at higher levels • Remove: chooses the least likely contended towers Vincent Gramoli

  12. Contention-friendly, non-blocking skip list Example: inserting 12 • Eager abstract modification at bottom level insert(12) -∞ 36 +∞ -∞ 36 +∞ -∞ 5 23 36 62 118 +∞ Vincent Gramoli

  13. Contention-friendly, non-blocking skip list Example: inserting 12 • Eager abstract modification at bottom level insert(12) -∞ 36 +∞ -∞ 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  14. Contention-friendly, non-blocking skip list Example: inserting 12 • Eager abstract modification at bottom level • Operation is done, client gets response insert(12) -∞ 36 +∞ -∞ 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  15. Contention-friendly, non-blocking skip list Example: inserting 12 • Eager abstract modification at bottom level • Operation is done, client gets response • Lazy update of the higher shortcuts insert(12) -∞ 36 +∞ -∞ 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  16. Contention-friendly, non-blocking skip list Example: inserting 12 • Eager abstract modification at bottom level • Operation is done, client gets response • Lazy update of the higher shortcuts insert(12) -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  17. Contention-friendly, non-blocking skip list Example: removing 36 • Eager abstract modification marks the tower • Operation is done, client gets response • Selective adaptation may decide not to remove it remove(36) -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  18. Contention-friendly, non-blocking skip list Example: removing 36 • Eager abstract modification marks the tower • Operation is done, client gets response • Selective adaptation may decide not to remove it remove(36) -∞ 36 +∞ -∞ 12 36 +∞ -∞ 5 12 23 36 62 118 +∞ Vincent Gramoli

  19. Contention-friendly, non-blocking skip list Non-blocking: the system as a whole always makes progress • Fault-tolerance: • If one core crashes, the others can still progress • If the adaptation thread crashes, then performance might degrade but safety and progress is preserved • Heterogeneous architecture: • one slow core does not necessarily affect the execution of other cores Vincent Gramoli

  20. Actual Java Implementation • Head: dummy node, tail: null node • Leaves are nodes, internal are IndexItems, bottom pointers allow to fetch value in O(1) • Logical deletion mark: node.value = ⊥ (null) • Backward pointer to backtrack in case of deleted nodes found • Only CAS are used for synchronization (never blocks) • Doubly linked list at the bottom to allow backtrack in case a deleted node is found Vincent Gramoli

  21. Actual Java Implementation Implementation of insert/delete/contains • Traversing the data structure (insert/delete/contains) is • From left to right • From top to bottom • If a removed item is encountered, then backtrack to last non-removed item • If a logically deleted node is found, then help-remove it • Help-removal: • If node already marked (concurrent help-removal) then give up help-removal • Else mark it (CAS), give it a dummy marked successor (as in j.u.c.ConcurrentSkipListMap) • Insertion: • Logical: If logically deleted tower found at right position, then reset the value (1 CAS) • Physical: If no such tower is found, insert as usual if not in between two marked nodes (1 CAS) to avoid lost update scenarios • Deletion: Only done logically (1 CAS), further traversal may physically remove it Vincent Gramoli

  22. Actual Java Implementation Implementation of adaptation • A single thread traverses repeatedly the structure • Unlink physically deleted nodes that have height of 1. • Deterministic level adjustment: • If 3 consecutive towers have the same heights, raise the one in the middle by 1 • When too many tall towers are logically deleted as all towers of heights 1 are removed, then • remove the bottom most level of the index items (setting the down pointer to ⊥ (null)) Vincent Gramoli

  23. Actual Java Implementation • Our skip list • provides logarithmic complexity (w/o contention) • is contention-friendly (sequential at the top, almost not contended at the bottom) Contention scale No hot spots, almost no contention Vincent Gramoli

  24. Experiments Settings • Two 12-core AMD processors for a total of 24 hardware threads • #executed operations per millisecond averaged over 5 runs of 5 seconds each • Size (5000 elements, 10000 keys, 50% of update success, size expectation is fixed) • Java • Java SE 1.6.0 12-ea in server mode • HotSpot JVM 11.2-b01 • java.util.concurrent.ConcurrentSkipListMap vs. Contention-friendly non-blocking skip list • SPECjbb: in-memory database Java server benchmark • Emulation of a three-tier client/server system • Key-value store (with Map interfaced collections) • We replace (order and order histories) thread-local collections into a shared one [Carlstrom et al. PPoPP’07] • In addition to these accesses: • Java BigDecimal • XML processing • Thread-local collections accesses Vincent Gramoli

  25. Experiments Microbenchmark Attempted update: 0-100% (Effective update: 10-50%) 24 threads • Gain at 0%: half of the nodes have multiple level in the contention-friendly skip list while 25% only in the j.u.c.ConcurrentSkipListMap • Gain at >0% update: due to the contention-friendliness • Up to 2.5x speedup Vincent Gramoli

  26. Experiments Microbenchmark (con’t) Attempted update: 20% (Effective update: 10%) • Up to 1.8x speedup Vincent Gramoli

  27. Experiments SPECjbb 2005 (modified) • High scalability in both cases: • 23 threads: 12.3x speedup over sequential for j.u.c.CSLM • 23 threads: 14.6x speedup over sequential for CF-SkipList • Performance: • 23 threads: 747730 business ops/sec for j.u.c.CSLM • 23 threads: 777662 business ops/sec for CF-SkipList • J.u.c.CSLM has some advantage at 1 core: 7000 bops more (due to memory optimization?) • 24 threads: the adapting thread compete for processor time Vincent Gramoli

  28. Conclusion Contention-Friendly Non-Blocking Skip List • Data structures are the bottleneck • Skip list is appealing for in-memory database • Non-blocking-ness gives fault tolerance and heterogeneity support • Contention-friendliness is the most effective • At high level of concurrency • Under high contention • Next steps: optimizations • Open questions: • What else is used in in-memory DB? What are the observed workloads? • Can we think of other data structures benefiting from the same decoupling? • Hash table? Vincent Gramoli

More Related