
Paper Survey of DHT



Presentation Transcript


  1. Paper Survey of DHT (Distributed Hash Table)

  2. Uses • Directory service • Small amounts of information, such as URIs, metadata, … • Storage • Data, such as files, … • Immutable, download only • Database • Each entry is small, but there are many entries • Mutable • Special operations for queries

  3. Challenges • Immutable • Latency • Availability • Query Consistency • Mutable • Object Consistency

  4. Latency • Query • Different routing architectures • Chord, Tapestry, Pastry, Kademlia, CAN, … • Recursive, iterative • Proximity Neighbor Route • Parallel • Routing table size • Fetch • Transport Protocol • Proximity Neighbor Selection • Cache • Distributed Object

  5. Query: Routing Architectures • Routing Complexity • O(log n), O(d), O(1), … • Principle • Each peer has a unique digest • Each object has a digest • Put the object on the peer with the closest digest • The well-known designs are O(log n) • O(1) • cache
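
The placement principle on this slide (store each object on the peer whose digest is closest to the object's digest) can be sketched as follows. This is a minimal illustration, not any specific system's code: it assumes SHA-1 digests and the XOR distance used by Kademlia. Chord, for instance, uses the clockwise successor on a ring instead. The peer names and the helper `responsible_peer` are hypothetical.

```python
import hashlib

def digest(name):
    """Map a name to a 160-bit integer digest (SHA-1 is an assumption here)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")

def xor_distance(a, b):
    """Kademlia-style distance between two digests."""
    return a ^ b

def responsible_peer(object_key, peers):
    """Return the peer whose digest is closest to the object's digest."""
    key = digest(object_key)
    return min(peers, key=lambda p: xor_distance(digest(p), key))

# Hypothetical peers; a real overlay would discover these through routing,
# because no single node knows the full membership list.
peers = ["peer-%d" % i for i in range(8)]
owner = responsible_peer("some-file.txt", peers)
```

The sketch scans all peers, which costs O(n). The routing architectures listed on the slide exist to find the same peer in O(log n) hops or fewer without global knowledge.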

  6. Query: Recursive or Iterative • Query is forwarded recursively • Theoretically twice as fast as iterative • Primary parameters • Base • # of successors • Persistent problem

  7. Query: Recursive or Iterative • Query is forwarded iteratively • Not much slower in practice • Primary parameters • # of parallel queries • Routing table tree • Learns new neighbors easily • Exchange information with other peers • Flexible
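
An iterative lookup keeps control at the querying node: it asks the closest known nodes for even closer contacts and repeats until nothing new turns up. The sketch below simulates that loop synchronously, so the "parallel" queries are just a batch of size `alpha`. The XOR metric, the parameter names `alpha` and `k`, and the hypercube test topology are all assumptions made for illustration.

```python
def iterative_lookup(start, target, neighbors, alpha=3, k=4):
    """Find the node closest to `target`, driven entirely by the querier.

    neighbors: dict mapping node id -> list of node ids that node knows.
    alpha:     number of nodes queried per round (the "# of parallel queries").
    k:         number of closest contacts kept in the shortlist.
    """
    shortlist = {start}
    queried = set()
    while True:
        # Pick the closest contacts that have not been asked yet.
        pending = sorted(shortlist - queried, key=lambda n: n ^ target)[:alpha]
        if not pending:
            break
        for node in pending:
            queried.add(node)
            # Simulated RPC: the node returns the k contacts it knows
            # that are closest to the target.
            reply = sorted(neighbors[node], key=lambda n: n ^ target)[:k]
            shortlist.update(reply)
        # Keep only the k closest contacts seen so far.
        shortlist = set(sorted(shortlist, key=lambda n: n ^ target)[:k])
    return min(shortlist, key=lambda n: n ^ target)

# Hypothetical 64-node overlay: each node knows the 6 nodes whose id
# differs from its own in exactly one bit.
neighbors = {n: [n ^ (1 << i) for i in range(6)] for n in range(64)}
found = iterative_lookup(start=0, target=45, neighbors=neighbors)
```

Because every reply passes through the querier, it learns new contacts as a side effect of each lookup, which matches the "learns new neighbors easily" point above.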

  8. Query: Proximity Neighbor Route • Route via a node with smaller delay • Small delay -> small timeout • TCP > Vivaldi > fixed
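
The "small delay -> small timeout" point can be illustrated with a TCP-style timeout estimator kept per neighbor. The sketch below follows the standard smoothed-RTT and RTT-variance update used for TCP retransmission timers. The class name, the initial timeout, and the minimum timeout are assumptions for illustration and are not taken from the surveyed papers.

```python
class RttEstimator:
    """Per-neighbor timeout from smoothed RTT samples (TCP-style)."""

    def __init__(self, alpha=0.125, beta=0.25, initial_rto=1.0, min_rto=0.01):
        self.alpha = alpha              # gain for the smoothed RTT
        self.beta = beta                # gain for the RTT variance
        self.initial_rto = initial_rto  # assumed timeout before any sample
        self.min_rto = min_rto          # assumed lower bound on the timeout
        self.srtt = None
        self.rttvar = None

    def observe(self, rtt):
        """Fold one measured round-trip time (in seconds) into the estimate."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2.0
        else:
            # Update the variance first, using the previous smoothed RTT.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - rtt)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt

    def timeout(self):
        """Timeout to use for the next query sent to this neighbor."""
        if self.srtt is None:
            return self.initial_rto
        return max(self.min_rto, self.srtt + 4 * self.rttvar)

near, far = RttEstimator(), RttEstimator()
for _ in range(5):
    near.observe(0.02)   # hypothetical nearby neighbor, 20 ms
    far.observe(0.25)    # hypothetical distant neighbor, 250 ms
```

A nearby neighbor ends up with a much shorter timeout than a distant one, so a failed hop through it is detected sooner than with a single fixed timeout.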

  9. Query: Proximity Neighbor Route • Measurement methods • Global sampling • Neighbors’ neighbors • Neighbors’ inverse neighbors • Recursive sampling

  10. Query: others • Parallel query • Faster • With partial PNS property • Persistent • More traffic • Large routing table • Easy to find a closer node locally

  11. Fetch: Cache • Cache objects on nodes close to the primary one • # of caching nodes depends on the popularity of the object • Average query hops can be reduced to a constant number (O(1)) • Hard to apply to mutable objects • Consider churn -> more bandwidth consumption

  12. Fetch: Distributed Object • Split an object into small pieces and place them on different nodes • Faster recovery • Faster download • Hard to maintain • Only for immutable data

  13. Fetch: Transport Protocol • Striped Transport Protocol • UDP • Window control • Retransmission

  14. Availability • Replicate • Reactive / Proactive • Eager / lazy repair • Erasure coding • Load balance is broken • High correlation between uptime and storage • Maintenance traffic problem

  15. Availability: Replicate • Reactive • Duplicate when a copy is lost • Consumes a lot of bandwidth in a short time • When churn is low, reactive is better • Proactive • Duplicate continually • Consumes a small, constant amount of bandwidth • Needs availability prediction and redundancy management • Bandwidth usage is predictable

  16. Availability: Replicate • Temporary / Permanent churn • Availability <-> Durability • Achieve 100% availability and/or durability? • Eager repair • Duplicate immediately • Lazy repair • Duplicate after a timeout • Needs a good choice of timeout • Reintegrating returning replicas
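
The difference between eager and lazy repair comes down to how long a replica holder may stay silent before it is treated as lost. The sketch below is a hypothetical bookkeeping structure, not taken from any of the surveyed systems: `timeout=0` behaves like eager repair, a larger timeout like lazy repair, and a returning node is counted again as soon as it sends a heartbeat (reintegration).

```python
from dataclasses import dataclass, field

@dataclass
class ReplicaTracker:
    """Tracks which replica holders are considered alive for one object."""
    target: int      # desired number of replicas
    timeout: float   # silence (seconds) tolerated before a replica counts as lost
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node, now):
        """Record that `node` still holds a replica at time `now`."""
        self.last_seen[node] = now

    def live_replicas(self, now):
        """Holders heard from within the timeout window."""
        return [n for n, t in self.last_seen.items() if now - t <= self.timeout]

    def copies_to_create(self, now):
        """How many new replicas a repair pass would create right now."""
        return max(0, self.target - len(self.live_replicas(now)))

lazy = ReplicaTracker(target=3, timeout=10.0)
for node in ("a", "b", "c"):
    lazy.heartbeat(node, now=0.0)
lazy.heartbeat("a", now=5.0)
lazy.heartbeat("b", now=5.0)
pending_at_8 = lazy.copies_to_create(now=8.0)    # "c" is silent but not timed out
pending_at_12 = lazy.copies_to_create(now=12.0)  # "c" has now timed out
```

With a well-chosen timeout, a node that only left temporarily returns before any copy is made, which saves the bandwidth that eager repair would have spent.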

  17. Availability: Erasure Coding • Matters more for larger objects • Saves storage and bandwidth • Under high churn, the bandwidth consumption is still not acceptable • Complex maintenance • Download latency is heterogeneous • Only for immutable data
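
The storage saving can be made concrete with a simple availability model. The sketch below assumes each node is online independently with probability `p`, which is a simplification (real node failures are often correlated). An object replicated `r` times is available if at least one copy is online. An object erasure-coded into `n` fragments, any `m` of which suffice, is available if at least `m` fragments are online. The function names are hypothetical.

```python
from math import comb

def replication_availability(p, r):
    """P(at least one of r full replicas is online)."""
    return 1 - (1 - p) ** r

def erasure_availability(p, m, n):
    """P(at least m of n fragments are online), binomial model."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(m, n + 1))

# Same 2x storage overhead in both cases: 2 replicas vs. an (m=4, n=8) code.
p = 0.9
repl = replication_availability(p, r=2)
coded = erasure_availability(p, m=4, n=8)
```

Under this model, with `p = 0.9`, the two replicas give 0.99 availability while the (4, 8) code gives about 0.9996 for the same storage. The model ignores the maintenance cost listed on the slide: repairing one lost fragment typically requires fetching `m` fragments.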

  18. Query Consistency • If a digest-object mapping exists, a query for that digest must return it • Weakly consistent KBR (key-based routing) • Eventual consistency • Most existing DHTs • Strongly consistent KBR • Causal consistency • Strong consistency • Solution • Route by W-KBR to a group • S-KBR within the group

  19. Mutable DHT • Object stored in DHT is mutable • Insert, update, delete • Churn -> Replica • New Challenge …

  20. Object Consistency • For immutable data • Still relevant for security reasons • Merkle tree • For mutable data • Consensus algorithm • Distributed algorithm for data consistency • Quorum algorithm • Read / write locks
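
For immutable data, a Merkle tree lets a downloader verify blocks fetched from untrusted peers against a single trusted root hash. The sketch below computes such a root. It assumes SHA-256 and duplicates the last node on odd-sized levels, which is one common convention and not necessarily what any surveyed system does.

```python
import hashlib

def _h(data):
    """SHA-256 of raw bytes (the hash function is an assumption)."""
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Return the Merkle root of a non-empty list of byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [_h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last one with itself.
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block-0", b"block-1", b"block-2"]
root = merkle_root(blocks)
tampered_root = merkle_root([b"block-0", b"block-X", b"block-2"])
```

Any change to a block changes the root, so a peer that serves corrupted data is detected when the recomputed root does not match the trusted one.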

  21. Pitfalls • Different kinds of P2P systems have different properties • Lack of recent real traces • Standard simulation platform

  22. References • Efficient Replica Maintenance for Distributed Storage Systems • Proactive replication for data durability • On object Maintenance in Peer-to-Peer systems • Enforcing Routing Consistency in Structured Peer-to-peer Overlays: Should We and Could We? • High Availability in DHTs: Erasure Coding vs. Replication • Toward Fault-tolerant Atomic Data Access in Mutable Distributed Hash Tables • Kademlia: A Peer-to-peer Information System Based on the XOR Metric • Total Recall: System Support for Automated Availability Management • Designing a DHT for low latency and high throughput

  23. References • Fallacies in evaluating decentralized systems • Anatomy of a P2P Content Distribution system with Network Coding • Comparing the performance of distributed hash tables under churn • EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State management • Bandwidth-efficient management of DHT routing tables • Improving Lookup Performance over a Widely-Deployed DHT • Failure Recovery for Structured P2P Networks: Protocol Design and Performance Evaluation • Handling Churn in a DHT
