
Pastry And Squirrel


  1. Pastry And Squirrel Presented by Eirik T. Laberg, Håvard Semundseth, Orri G. Pálsson

  2. What is the Pastry System? • Overlay network that handles: • Routing between nodes • Object localization • Each node is assigned a unique nodeId. • 128-bit • SHA-1 hash of either the node's public key or IP address
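A minimal sketch of deriving a nodeId as described above. SHA-1 produces 160 bits, so this sketch assumes the 128-bit nodeId is the leading 128 bits of the digest; the slides do not say how the digest is shortened. The function names `node_id` and `to_hex` are illustrative only.

```python
import hashlib

def node_id(identifier: bytes) -> int:
    """Return a 128-bit nodeId for a public key or IP address given as bytes."""
    digest = hashlib.sha1(identifier).digest()   # 20 bytes = 160 bits
    # Assumption: keep only the first 16 bytes (128 bits) of the digest.
    return int.from_bytes(digest[:16], "big")

def to_hex(nid: int) -> str:
    """Render a nodeId as 32 hexadecimal digits (base 16, zero padded)."""
    return format(nid, "032x")

if __name__ == "__main__":
    print(to_hex(node_id(b"10.0.0.1")))
```

Because the hash output is effectively uniform, nodeIds obtained this way are spread evenly over the 128-bit space regardless of where the nodes are physically located.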

  3. Pastry Node • Leaf set (L): • The set of nodes whose nodeIds are numerically closest to the local node's nodeId. • Half have larger nodeIds and half have smaller nodeIds than the local node. • The leaf set is mainly used when routing messages. • Routing table (R): • The routing table consists of a number of rows, where row i contains nodes that share the first i digits of their nodeId with the local node.
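The two structures above can be sketched as follows. This is an illustration under simplifying assumptions: nodeIds are short hexadecimal strings of equal length rather than 128-bit values, the nodeId space is treated as a line (no wrap-around), and the helper names are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two nodeIds have in common.

    This is also the routing table row in which node b would be stored
    at node a (row i holds nodes sharing the first i digits).
    """
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def leaf_set(local: str, others: list[str], size: int = 4) -> list[str]:
    """Pick size/2 numerically closest smaller and size/2 closest larger nodeIds.

    Assumes size >= 2 and ignores wrap-around of the nodeId space.
    """
    loc = int(local, 16)
    ordered = sorted(others, key=lambda s: int(s, 16))
    smaller = [s for s in ordered if int(s, 16) < loc][-(size // 2):]
    larger = [s for s in ordered if int(s, 16) > loc][: size // 2]
    return smaller + larger

if __name__ == "__main__":
    nodes = ["1000", "6500", "65a0", "65b0", "7000", "f000"]
    print(leaf_set("65a1", nodes))
    print(shared_prefix_len("65a1", "65b0"))
```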

  4. Pastry Node (2) • Neighborhood set (M): • Contains nodeIds and IP addresses of the |M| nodes that are closest (according to the proximity metric) to the local node. • The neighborhood set is used as a starting point when constructing the routing table and for maintaining locality properties.

  5. Routing • A message with key D arrives at the node with nodeId A. • The node checks whether the key falls within the range of nodeIds in its leaf set. If yes, it forwards the message to the destination node (the leaf set member numerically closest to D). • If no, it uses the routing table: forward to a node whose nodeId shares a prefix with D that is at least one digit longer than the prefix D shares with A. • If the table entry is empty or the node is unreachable: forward to a node whose shared prefix with D is at least as long as A's, and whose nodeId is numerically closer to D than A's.
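The three routing cases above can be sketched as a single next-hop decision. This is a simplified illustration, not the authors' implementation: nodeIds are equal-length hexadecimal strings, the routing table is assumed to be a dictionary keyed by (row, digit), the nodeId space does not wrap around, and unreachable nodes are not modeled.

```python
def shared_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def dist(a: str, b: str) -> int:
    """Numerical distance between two ids in the nodeId space."""
    return abs(int(a, 16) - int(b, 16))

def next_hop(local: str, key: str, leaves: list[str],
             table: dict[tuple[int, str], str]) -> str:
    """Return the node the message should go to next (local means deliver here)."""
    # Case 1: the key lies within the leaf set range, so pick the closest node.
    candidates = leaves + [local]
    values = [int(n, 16) for n in candidates]
    if min(values) <= int(key, 16) <= max(values):
        return min(candidates, key=lambda n: dist(n, key))
    # Case 2: routing table entry sharing one more digit with the key.
    row = shared_prefix_len(local, key)
    entry = table.get((row, key[row]))
    if entry is not None:
        return entry
    # Case 3: any known node with an equally long prefix that is numerically closer.
    known = leaves + list(table.values())
    better = [n for n in known
              if shared_prefix_len(n, key) >= row and dist(n, key) < dist(local, key)]
    return min(better, key=lambda n: dist(n, key)) if better else local

if __name__ == "__main__":
    leaves = ["6500", "65a0", "65b0", "7000"]
    table = {(0, "d"): "d13d", (1, "3"): "63aa"}
    for key in ("65a8", "d46a", "9000"):
        print(key, "->", next_hop("65a1", key, leaves, table))
```

In the usage example, key `65a8` falls inside the leaf set range, key `d46a` is resolved through the routing table, and key `9000` has no matching table entry and therefore exercises the fallback case.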

  6. Routing (2) • Example:

  7. Node Arrival (1/3) • The new node has nodeId X, and A is a nearby Pastry node. • It is assumed that the new node X initially knows about the nearby Pastry node A. • Node X asks A to route a “join” message with a key equal to X. • Pastry routes the join message to the existing node Z whose nodeId is numerically closest to X in the nodeId space. • Nodes A, Z and all nodes encountered on the path send their state tables to X.

  8. Node Arrival (2/3)

  9. Node Arrival (3/3) • The new node X initializes its own state tables. • X’s neighborhood set is initialized with the neighborhood set of A (A is close to X in the proximity metric). • Since Z is numerically closest to X: • X’s leaf set is initialized with Z’s leaf set. • Row 0 (R0) of A’s routing table is used to initialize row 0 of X. • Row 1 (R1) of node B’s routing table is used to initialize row 1 of X, where B is the next node on the path. • … • Node X transmits a copy of its resulting state to all nodes in its neighborhood set (M), leaf set (L) and routing table (R). • Nodes in the Pastry network update their own state based on the information received.
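The initialization of X's state can be sketched as below. This is a simplified model under assumptions: the `NodeState` class and `init_state` function are hypothetical names, routing table rows are dictionaries from digit to nodeId, and the later steps (X adding A and Z themselves, then announcing its state) are left out.

```python
from dataclasses import dataclass, field

@dataclass
class NodeState:
    node_id: str
    leaf_set: list[str] = field(default_factory=list)
    neighborhood: list[str] = field(default_factory=list)
    rows: list[dict[str, str]] = field(default_factory=list)  # rows[i][digit] -> nodeId

def init_state(new_id: str, path: list[NodeState]) -> NodeState:
    """Build X's initial state from the nodes on the join path.

    path[0] is A (near X in the proximity metric) and path[-1] is Z
    (numerically closest to X). The i-th node on the path supplies row i.
    """
    a, z = path[0], path[-1]
    x = NodeState(new_id)
    x.neighborhood = list(a.neighborhood)   # A is physically close to X
    x.leaf_set = list(z.leaf_set)           # Z is numerically close to X
    for i, node in enumerate(path):
        # Copy row i of the i-th node if it has one, else start with an empty row.
        x.rows.append(dict(node.rows[i]) if i < len(node.rows) else {})
    return x

if __name__ == "__main__":
    A = NodeState("1000", neighborhood=["1001", "2000"],
                  rows=[{"6": "6abc"}, {"0": "1000"}])
    B = NodeState("6000", rows=[{"1": "1000"}, {"5": "6500"}])
    Z = NodeState("65a0", leaf_set=["6500", "65b0"], rows=[{}, {}, {"a": "65a0"}])
    print(init_state("65a1", [A, B, Z]))
```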

  10. Locality • What do we mean by locality? • We mean the ability to exploit “local” resources over “global” ones whenever possible. • The route chosen for a message is likely to be “good” with respect to the proximity metric. • How can we maintain this property when a new node X joins? • Discuss the locality property regarding: • Locality in the routing table • Route locality • Locating the nearest among k nodes

  11. Locality in the routing table • Goal: • We want all routing table entries to refer to a node that is near the present node, • that is, near according to the proximity metric, among all live nodes with the appropriate prefix for that entry. • We want to maintain this property when a new node X arrives. • Stage one: Require that node A is near X and that the entries in A’s R0 are close to A according to the proximity metric. • We also assume that node B’s R1 entries are a reasonable choice for R1 of X. • The new node X initializes its state in this fashion. • X’s routing table (R) and neighborhood set (M) then approximate the desired locality property. • Problem: The quality of the approximation must be improved • => to avoid cascading errors that could lead to poor route locality. • Solution: Use a second stage. • Node X requests the state from each of the nodes in its routing table and neighborhood set, and updates its entries to closer nodes where possible. • The neighborhood set contributes valuable information.

  12. Route locality • Each routing step moves the message closer to the destination: • In the nodeId space. • While traveling the least possible distance in the proximity space. • Given that: • A message routed from A to B at a distance d from A cannot subsequently be routed to a node at a distance of less than d from A. • Only “local” information is used: Pastry minimizes the distance of the next routing step, with no sense of global direction. • This does not guarantee that the shortest path from source to destination is chosen.

  13. Locating the nearest among k nodes • Goal: • Peer-to-peer applications may want to use Pastry to replicate information on k Pastry nodes. • The k Pastry nodes are those numerically closest to a given key in the Pastry nodeId space. • A message routed from a client application (CA) should first reach a node near the CA among the k nodes numerically closest to the key. • Problem: • Since Pastry routing is based primarily on nodeId prefixes, it may miss nearby nodes with a different prefix than the key. • Solution: • Pastry uses a heuristic to overcome prefix mismatch issues. • It detects when a message approaches the set of k nodes and switches to routing based on the numerically nearest address. • The heuristic is based on estimating the density of nodeIds.
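As a small illustration of what "the k nodes numerically closest to a key" means, the sketch below selects a replica set from a global list of nodeIds. It only shows the target set: it assumes integer ids in a nodeId space without wrap-around, it does not model the density heuristic, and `replica_set` is a hypothetical name.

```python
def replica_set(key: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k nodeIds numerically closest to key, closest first."""
    return sorted(node_ids, key=lambda n: abs(n - key))[:k]

if __name__ == "__main__":
    print(replica_set(50, [10, 45, 60, 90, 52], 3))
```

A real Pastry node has no such global list; the point of the heuristic is to recognize, from local state only, when a message is already close to this set.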

  14. Node departure • A node is considered failed when its immediate neighbors in the nodeId space can no longer communicate with it. • Leaf set repair: the failed node’s neighbor in the nodeId space contacts the node with the largest index in L on the side of the failed node and asks for its leaf set. • Routing table repair: the node contacts another node in the same row and asks for that node’s entry for the failed position. If it cannot find a suitable node in the same row, it tries the nodes in the next row.

  15. Arbitrary node failures • Node continues to be responsive, but behaves incorrectly or even maliciously. • Repeated queries fail each time since they normally take the same route. • Solution: Routing can be randomized • The choice among multiple nodes that satisfy the routing criteria should be made randomly

  16. Some experimental results • Quad-processor Compaq AlphaServer ES40 • 500 MHz 21264 Alpha CPUs • 6 GBytes of main memory • Tru64 UNIX • Implemented in Java • Pastry nodes were configured to run in a single Java VM

  17. Routing performance

  18. Routing performance

  19. Maintaining the network (1/2)

  20. Maintaining the network (2/2) • Shows the quality of the routing tables • with respect to the locality property. • Shows how information exchange during the join operation affects the quality. • Optimal: the entry is the best one (closest according to the proximity metric). • Sub-Optimal: the entry was not the closest, or was missing. • SL: considers only the appropriate row from each node along the route. • WT: fetches the entire state of each node along the path (omitting the second stage of update). • WTF: WT + the second stage of update. • Result: => Pastry’s method of node integration (“WTF”) is effective in initializing the routing tables with good locality. • Less information exchange during the join operation => lower quality with respect to locality.

  21. Conclusion • Pastry is a generic peer-to-peer content location and routing system • Scales well • Used for applications like global file sharing, file storage etc. • Takes into account locality when routing messages

  22. What is Squirrel? • Squirrel is an alternative to caches that are deployed on dedicated machines at the boundaries of corporate LANs. • Client desktop machines cooperate in a p2p fashion inside the LAN to provide the functionality of a proxy.

  23. Traditional approach • One machine that has to be capable of handling peak traffic loads. • Expensive hardware and administrative costs. • Growth in the number of users requires hardware upgrades. • Single point of failure.

  24. Web caching • If the object is found in the cache server, it is tested for freshness (TTL). • If it is fresh, the object is returned; otherwise a cGET (conditional GET) request is generated by the browser. • Two types of cGET: • If-Modified-Since (uses a timestamp) • If-None-Match (uses an ETag, e.g. a hash of the web content) • The response to a cGET either includes the content or is a not-modified message.
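The freshness test and the two kinds of cGET can be sketched as follows. This is an illustration only: `CachedObject`, `is_fresh` and `conditional_headers` are hypothetical names, and the TTL is assumed to be stored in seconds next to the fetch time. The header names `If-None-Match` and `If-Modified-Since` are the standard HTTP request headers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CachedObject:
    body: bytes
    fetched_at: float                      # seconds since some epoch
    ttl: float                             # freshness lifetime in seconds
    etag: Optional[str] = None
    last_modified: Optional[str] = None    # HTTP date string as received

def is_fresh(obj: CachedObject, now: float) -> bool:
    """An object is fresh while its age is below its TTL."""
    return now - obj.fetched_at < obj.ttl

def conditional_headers(obj: CachedObject) -> dict[str, str]:
    """Headers that turn a plain GET into a conditional GET (cGET)."""
    headers: dict[str, str] = {}
    if obj.etag is not None:
        headers["If-None-Match"] = obj.etag
    if obj.last_modified is not None:
        headers["If-Modified-Since"] = obj.last_modified
    return headers

if __name__ == "__main__":
    obj = CachedObject(b"<html>", fetched_at=100.0, ttl=60.0, etag='"abc"')
    print(is_fresh(obj, 130.0), is_fresh(obj, 161.0), conditional_headers(obj))
```

A stale object is not discarded: the cGET lets the origin server answer with a not-modified message, in which case the cached body can be reused.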

  25. Pastry • Squirrel uses Pastry to locate web objects by mapping the URL to a node. • It hashes the URL and uses the hash as a key. • If the web browser does not find the requested object in its own cache, Squirrel tries to locate a copy on another node.

  26. Models • Home store model • Directory model

  27. Home store model • The home node of an object is the node whose nodeId is numerically closest to the object's objectId. • All external requests are routed through the home node.
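A minimal sketch of how a URL maps to a home node. It assumes the objectId is the leading 128 bits of the SHA-1 hash of the URL and that the nodeId space does not wrap around; `object_id` and `home_node` are illustrative names. In Squirrel the lookup is performed by Pastry routing, not by scanning a list of all nodes as done here.

```python
import hashlib

def object_id(url: str) -> int:
    """128-bit objectId derived from the URL (assumed: truncated SHA-1)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")

def home_node(url: str, node_ids: list[int]) -> int:
    """The node whose nodeId is numerically closest to the objectId."""
    oid = object_id(url)
    return min(node_ids, key=lambda n: abs(n - oid))

if __name__ == "__main__":
    nodes = [2**120, 2**125, 2**127]
    print(hex(home_node("http://example.com/", nodes)))
```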

  28. Scenario Home store

  29. Directory model • A home node holds a small directory of pointers to nodes (called delegates) that have recently accessed the object. • Additionally, it stores metadata about the object such as ETag, fetch time, last modified time, TTL etc. • Requests are forwarded to a randomly chosen delegate that is known to have the object.
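The directory kept by a home node can be sketched as below. The class name, the field names and the limit of four delegates are assumptions made for this illustration; the slides only say that the directory is small and that a delegate is chosen at random.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Directory:
    etag: str                                   # metadata kept by the home node
    delegates: list[str] = field(default_factory=list)
    max_delegates: int = 4                      # assumed size limit

    def add_delegate(self, node: str) -> None:
        """Record that node recently fetched the object, evicting the oldest entry."""
        if node in self.delegates:
            self.delegates.remove(node)
        self.delegates.append(node)
        while len(self.delegates) > self.max_delegates:
            self.delegates.pop(0)

    def pick_delegate(self, rng=random):
        """Choose a random delegate to serve the request, or None if there is none."""
        return rng.choice(self.delegates) if self.delegates else None

if __name__ == "__main__":
    d = Directory(etag='"abc"')
    for n in ["n1", "n2", "n3", "n4", "n5"]:
        d.add_delegate(n)
    print(d.delegates, d.pick_delegate())
```

Choosing the delegate at random spreads the load of a popular object across the nodes that hold it, at the cost of one extra LAN hop compared with the home store model.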

  30. Directory model scenario 1

  31. Directory model scenario 2

  32. Directory model scenario 3

  33. Directory model scenario 4

  34. Arrival, failure and departure • Arrival • A new node automatically becomes the home node for the objects whose objectIds are numerically closest to its nodeId. • Its two neighboring nodes transfer to it the objects and directories whose objectIds are now numerically closest to the newly joined node. • Failure • Future requests will be routed to the node that has become numerically closest to the objectId. • Departure • Nodes that are able to announce their intention to leave the system can transfer their content to their neighbors.

  35. External bandwidth consumption • A disk donation of 100 MB from each client to Squirrel lowers the external bandwidth consumption to the level of a dedicated cache.

  36. Latency • Latency depends on the number of LAN hops. • In traditional proxy caching, LAN hops = 2 • In Squirrel, LAN hops = 4-7 • LAN communication is fast, so the user-perceived latency is minimal.

  37. Load on each node (Directory model) • * 36,782 clients • ** 105 clients

  38. Load on each node (Home store model) • * 36,782 clients • ** 105 clients • The average load in any given minute is 0.31 objects/min (for Redmond) for both models. • Squirrel performs web caching at low cost.

  39. Fault tolerance • It is possible to lose the connection to the Internet due to a router failure. • An internal link or router failure partitions the network. Squirrel would partition itself into two separate systems. • Individual nodes can fail. Most nodes leave voluntarily.
