
LHT : A Low-Maintenance Indexing Scheme over DHTs

Yuzhe Tang and Shuigeng Zhou (湯宇哲, 周水庚), Fudan University (復旦大學). The 28th International Conference on Distributed Computing Systems. 89721001, first-year PhD student 張睿元.


Presentation Transcript


  1. LHT: A Low-Maintenance Indexing Scheme over DHTs. Yuzhe Tang and Shuigeng Zhou (湯宇哲, 周水庚), Fudan University (復旦大學). The 28th International Conference on Distributed Computing Systems. 89721001, first-year PhD student 張睿元

  2. Outline • Introduction • Preliminaries • Algorithms • Experimental Results • Conclusion

  3. Introduction • DHTs have several outstanding advantages: --(1) Scalability and efficiency. --(2) Robustness. --(3) Load balance.

  4. Introduction • Basic DHTs do not support complex query processing. • However, complex queries are highly desired and actually gaining popularity in many P2P applications.

  5. Introduction • The popularity of complex queries poses an urgent need for DHT-based indexing schemes. • Generally speaking, when designing an indexing structure, query efficiency comes first. • As a result, P2P systems have to pay a high maintenance cost to keep adjusting their index structures.

  6. Introduction • This problem, however, has not been effectively resolved in the existing P2P indexing schemes. • Instead, they focus on improving query efficiency and, as a trade-off, sacrifice maintenance efficiency. • For example, in Prefix Hash Tree (PHT), each leaf knows its neighboring leaves.

  7. Introduction • Based on this observation, this paper proposes LHT, a Low-maintenance Hash Tree for data indexing in DHT-based P2P systems. • LHT requires no modification of the underlying DHT and can be easily adapted to any DHT substrate.

  8. Preliminaries

  9. Preliminaries

  10. Preliminaries

  11. Preliminaries

  12. Algorithms • Incremental Tree Growth: --When an insertion makes the number of data items in a leaf bucket exceed θsplit, that leaf bucket is split. --Example (figure): θsplit = 2; inserting 0.72 into leaf #010, which already holds 0.62 and 0.70, splits it into leaves #0100 and #0101.
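
The split in the example above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that label #0 covers [0, 1) and that each further bit selects the lower (0) or upper (1) half of the interval, which is consistent with the labels shown on the slides. The names `interval_of`, `insert` and `THETA_SPLIT` are hypothetical, and the sketch ignores how LHT maps labels to DHT keys.

```python
from typing import Dict, List, Tuple

THETA_SPLIT = 2  # split threshold, taken from the slide's example


def interval_of(label: str) -> Tuple[float, float]:
    """Interval covered by a label.

    ASSUMPTION: '#0' covers [0, 1) and every further bit picks the
    lower (0) or upper (1) half of the current interval.
    """
    lo, hi = 0.0, 1.0
    for bit in label[2:]:  # skip the '#0' root prefix
        mid = (lo + hi) / 2
        if bit == "0":
            hi = mid
        else:
            lo = mid
    return lo, hi


def insert(leaves: Dict[str, List[float]], label: str, key: float) -> None:
    """Add key to a leaf bucket; split it once if it exceeds THETA_SPLIT."""
    bucket = leaves[label]
    bucket.append(key)
    if len(bucket) <= THETA_SPLIT:
        return
    lo, hi = interval_of(label)
    mid = (lo + hi) / 2
    del leaves[label]
    # Redistribute the keys over the two child leaves.
    leaves[label + "0"] = [k for k in bucket if k < mid]
    leaves[label + "1"] = [k for k in bucket if k >= mid]


# Hypothetical in-memory stand-in for the leaf buckets stored in the DHT.
leaves = {"#010": [0.62, 0.70]}
insert(leaves, "#010", 0.72)
print(leaves)
```

Under the stated assumption, #010 covers [0.5, 0.75) and is halved at 0.625, so 0.62 lands in #0100 while 0.70 and 0.72 land in #0101.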

  13. Algorithms • LHT Lookup: Consider a lookup for 0.9 with depth 14, i.e. full label λ = #01110011001100. (Figure: search over prefixes of λ using fn(λ); labels shown: #0111001, #011100, #01111, #01110, #011, #0; DHT-lookup outcomes shown: NULL, Not Exist.)
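
As a rough illustration of the lookup example, the sketch below derives the depth-14 label of 0.9 and then locates the covering leaf. It assumes a label is "#0" followed by the leading binary-fraction bits of the key, which reproduces the label on the slide. The function names and the leaf set are hypothetical, and the prefix scan is a simplified stand-in: it tries prefixes one by one and does not model LHT's binary search or its naming function fn(λ).

```python
from typing import Optional, Set


def full_label(key: float, depth: int) -> str:
    """Label of a key in [0, 1) at the given depth.

    ASSUMPTION: the label is '#0' followed by the first depth-1 bits
    of the key's binary fraction.
    """
    bits = []
    x = key
    for _ in range(depth - 1):
        x *= 2
        bit = int(x)  # next binary digit of the fraction
        bits.append(str(bit))
        x -= bit
    return "#0" + "".join(bits)


def find_leaf(leaf_labels: Set[str], key: float, depth: int) -> Optional[str]:
    """Naive prefix scan: return the leaf label that is a prefix of the key's label."""
    lam = full_label(key, depth)
    for length in range(2, len(lam) + 1):  # shortest prefix is '#0'
        if lam[:length] in leaf_labels:
            return lam[:length]
    return None


# Hypothetical set of existing leaves, for illustration only.
example_leaves = {"#00", "#010", "#0110", "#01110", "#01111"}
print(full_label(0.9, 14))
print(find_leaf(example_leaves, 0.9, 14))
```

A binary search over the prefix length, as in the slide's figure, would need fewer DHT-lookups than this linear scan.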

  14. Algorithms • Range Queries: --Simple case: --General case:

  15. Algorithms • Min/Max Queries: --Min: A DHT-Lookup of # returns the result. --Max: A DHT-Lookup of #0 returns the result.

  16. Experimental Results • LHT needs no periodic maintenance to preserve index integrity and consistency. • LHT's maintenance cost is paid only for tree structure adjustment, which is incurred by data insertion/deletion. • This structural adjustment consists of leaf splits and merges. • Compared with PHT, LHT saves at least 50% and up to 75% of the maintenance cost.

  17. Experimental Results • Both uniform and Gaussian datasets were used. • Maintenance Cost: --Data-Movement Cost: LHT is about 50% lower than PHT. --DHT-Lookup Cost: LHT is about 25% lower than PHT.

  18. Experimental Results • Lookup Performance: --For uniform data: LHT is about 20% lower than PHT. --For Gaussian data: LHT is about 30% lower than PHT. • When the data size equals certain special values (e.g. 2^12, 2^16, 2^20, which make the tree depth equal to D/2, D/4, 3D/8), the binary search can be resolved in the first DHT-lookup or in fewer DHT-lookups.

  19. Experimental Results • Range Query Performance: --Bandwidth: PHT(parallel) > PHT(sequential) ≈ LHT --Latency: PHT(parallel) > PHT(sequential) ≈ LHT; LHT is about 18% lower than PHT(parallel). • PHT(sequential)'s near-optimal bandwidth consumption is due to its B+-tree-like leaf links, which incur extra maintenance cost. • PHT(parallel) can achieve competitive latency, which however deteriorates when the data distribution is skewed (as in the Gaussian data).

  20. Conclusion • This paper proposes LHT for data indexing over DHTs. • Compared with PHT, LHT saves up to 75% (and at least 50%) of the maintenance cost, and achieves better performance in exact-match and range query processing. • LHT is adaptable to any generic DHT, and is easy to implement and deploy.
