1 / 19

Succinct Priority Indexing Structures for the Management of Large Priority Queues

Succinct Priority Indexing Structures for the Management of Large Priority Queues. Hao Wang and Bill Lin University of California, San Diego IEEE IWQoS 2009 Charleston, South Carolina July 13-15, 2009. Introduction. Priority queues used many network applications

orien
Download Presentation

Succinct Priority Indexing Structures for the Management of Large Priority Queues

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Succinct Priority Indexing Structures for the Management of Large Priority Queues Hao Wang and Bill Lin University of California, San Diego IEEE IWQoS 2009 Charleston, South Carolina July 13-15, 2009

  2. Introduction • Priority queues used many network applications • Per-flow advanced QoS scheduling • Management of per-flow DRAM packet buffers • Maintenance of per-flow statistics counters for real-time network measurements • Items in priority queue sorted at all times(e.g. smallest key first) • Common operations: INSERT, FINDMIN, DELETE • Challenge: Need to operate at high speeds(e.g. 40+ Gb/s)

  3. Introduction • Binary heap common structure for priority queues • HasQ(log2n) time complexity, where n is # items • e.g. in fine-grained per-flow scheduling, n can be very large (e.g. 1 million) • But Q(log2n) may be too slow for high line rates • Pipelined heaps[Bhagwan, Lin 2000][Ioannou 2001][Wang, Lin 2006] • Reduced amortized time complexity to constant time • At the expense of Q(log2n) pipeline stages

  4. Introduction • van Emde Boas (vEB) trees • Instead of maintaining priority queue of sorted items, maintain sorted dictionary of keys • In many applications, since keys are represented by a k-bit integer, possible keys can only be from a fixed universeofU = 2kvalues • Only Q(log2log2U) complexity vs. Q(log2n) for heaps • Pipelined vEB trees[Wang, Lin 2007] • Reduced amortized time complexity to constant time • At the expense of Q(log2log2U)pipeline stages

  5. This Talk • Propose 3 related Priority Indexing (PI) structures that leverage built-in hardware optimized instructions in modern 64-bit x86 processors (both Intel and AMD) • Specifically, given a W=64 bit word, the instructions BSR (bit-scan-reverse) and BSF (bit-scan-forward) return the positions of the most-significant and least-significant bits, respectively 0 0 1 0 1 1 … 0 0 1 0 0 0 BSR BSF

  6. This Talk • Most-significant (least-significant) bit positions can also be easily implemented using efficient priority encoder designs in custom hardware

  7. Basic Priority Indexing Structure • Essentially a W-way tree. Maintains sorted subset S of N elements from a fixed universe of size U = Wh, where N ≤ U. • Each element i of the universe is associated with a binary bit bi • Leaf node contains W bits of bi: bi = 1 if element i is in the set • Non-leaf node serves as summary of child nodes: bit in non-leaf node set to 1 if its child node has at least one non-zero bit Data Structure of PI with h = 3

  8. Example Operations • TEST(i): Just check bi • INSERT(i): Start at leaf, set bi. Set corresponding bit in parent. Repeat until root. • FINDMIN(): Start at root. Find MSB (most-significant-bit) and traverse sub-tree. Repeat until leaf. • DELETE(i): Start at leaf, clear bi. If word = 0 (no more bits set), clear corresponding bit in parent. Repeat until root. Data Structure of PI with h = 3

  9. Time/Memory Complexity of PI • TEST(i) takes constant time. All other operations take Q(logwU) time, which is asymptotically not as “good” as the Q(log2log2U) time complexity of a van Emde Boas tree • However, for W = 64, PI requires fewer or same number of operations for U ≤ 64 billion (h ≤ 6), but much simpler • For PI of size U, memory size only 1.016U bits, Q(U) space

  10. Motivation for Modified Structures • PI is fast, but Q(logwU) time may still not be fast enough for high-performance applications • Want constant time operations (issue new operation every cycle) • But PI cannot be readily pipelined • Some PI operations are top-down (e.g. FINDMIN), but others are bottom-up (e.g. DELETE) • Propose 2 modified structures • Counting Priority Index (CPI) • Pipelined Counting Priority Index (Pipelined CPI)

  11. Counting-Priority-Index • In addition to having a bit set to indicate a child node has at least one bit set, add counter to keep track of “how many” bits in a child node are set. Enables all top-down operations. Data Structure of CPI with h = 3

  12. Example CPI Operations • FINDMIN(): Start at root. Find MSB (most-significant-bit) and traverse sub-tree. Repeat until leaf. Same as before. • DELETE(i): Start at root. Decrement counter. If count = 0, clear bit. Go down corresponding sub-tree. Repeat until leaf. Data Structure of CPI with h = 3

  13. Time/Memory Complexity of CPI • TEST(i) takes constant time. All other operations take Q(logwU) time, same as basic PI structure. • But all operations supported in top-down fashion • For CPI of size U, memory size only 1.11U bits, still Q(U) space

  14. Pipelined Counting-Priority-Index • Reduced amortized time complexity to constant time • At the expense of Q(logwU)pipeline stages • Memory size also only 1.11U bits, Q(U) space Data Structure of Pipelined CPI with h = 3

  15. Operations Supported • Operations supported by all 3 priority indexing structures TEST(i) Test if index i is in set S INSERT(i) Insert a new index i to set S DELETE(i) Delete index i from set S FINDMIN Find the smallest index in set S FINDMAXFind the largest index in set S EXTRACTMIN Delete the smallest index in set S EXTRACTMAX Delete the largest index in set S SUCCESSOR(i) Find the successor of index i in set S PREDECESSOR(i) Find the predecessor of index i in set S EXTRACTSUCC(i) Delete the successor of index i in set S EXTRACTPRED(i) Delete the predecessor of index i in set S

  16. Comparison Time Hardware Memory PI Q(logwU) constant 1.016 U CPI Q(logwU) constant 1.11 U Pipelined CPI constant Q(logwU) 1.11 U

  17. Hardware Complexity of Pipelined CPI Number of Pipeline Stages in the Data Structures

  18. Summary Fast sorting data structures Fast and scalable succinct data structures for the implementation of priority queues The Pipelined CPI supports constant time priority management operations The hardware complexity is only Q(logwU) with Q(U) memory space

  19. Thank You

More Related