Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing
Tapestry is an innovative overlay location and routing infrastructure designed for wide-area networks, providing fault tolerance, scalability, and reliability in an environment dominated by dynamic, untrusted systems. Built upon the IP framework, Tapestry supports nomadic data management, enabling data access anytime, anywhere. It efficiently handles complex requirements such as naming, locating, and routing in a decentralized manner using a self-organizing structure. This presentation outlines Tapestry's operational challenges, solutions, routing mechanisms, data management strategies, and evaluation of its performance, aimed at achieving effective data utility in ubiquitous computing.
Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph, et al. Computer Science Division, University of California, Berkeley Presenter: Chunyuan Liao March 6, 2002
Outline • Challenges • System overview • Operations, concerned issues & solutions • Route • Locate • Publish • Insert • Delete • Move • Evaluation & Conclusion • Implementation • Summary & Comments
Project background • Driving force : Ubiquitous Computing • OceanStore – A data utility infrastructure • Goals: • Based on the current untrusted Infrastructure • Achieve Nomadic Data • Anytime, Anywhere • Highly scalable, reliable and fault-tolerant • Basic issues: • Data Location • Routing
Challenges • How to achieve naming, location and routing in a complex & chaotic computing environment • Dynamic nature • Mobile and replicated data & services • Complex interaction between components, even while in motion • Traditional approaches • fail to address the extremely dynamic nature
Tapestry : An infrastructure for Fault-tolerant wide-area Location and Routing • An overlay location & routing infrastructure built on IP • Features • Highly scalable : decentralized, point-to-point, self-organizing • Highly fault-tolerant : redundancy, adaptation • Good locality : content-based routing & location • Highly durable
Basic Model of Tapestry • Originates from the Plaxton scheme • Basic components: • Nodes : servers, routers, clients • Objects : data or services • Links : point-to-point links
Operations in Tapestry • Naming • Routing • Object Location • Publishing Objects • Inserting/Deleting Objects • Mobile Objects
Tapestry - Naming • Node ID/Object ID • A fixed-length bit string (4 bits per level), e.g. 84F8, 9098 • Global • Randomly generated • Location-independent • Evenly distributed • Not unique (an object ID is shared by its replicas)
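The naming properties above (fixed length, location-independent, evenly distributed) can be illustrated with a small sketch. It assumes IDs are derived by hashing a name with SHA-1 and keeping four hex digits, which matches the four-digit examples on the slide; the slide itself only says IDs are randomly generated, and `make_id` and the sample names are hypothetical.

```python
import hashlib

DIGITS = 4  # assumption: 4 hex digits (4 bits per level), as in 84F8 or 9098


def make_id(name: str, digits: int = DIGITS) -> str:
    """Derive a fixed-length, location-independent hex ID from a name.

    Hypothetical helper: hashing is one common way to obtain evenly
    distributed IDs; it is not necessarily how the authors generate them.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest().upper()
    return digest[:digits]


# The same name always maps to the same ID, so all replicas of an
# object share one object ID regardless of where they are stored.
replica_a = make_id("report.pdf")
replica_b = make_id("report.pdf")
print(replica_a, replica_a == replica_b)
```

Because the ID depends only on the name, it says nothing about where the object lives, which is what makes the naming location-independent.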
Routing : Rules • Suffix matching (similar to Plaxton) • Incremental routing, digit by digit • Maximum hops : log_b(N) • Figure: a message routed to node 4598; nodes shown: 7598, B4F8, 4598, 9098, 6789, B437
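The suffix-matching rule can be sketched as follows. This is a minimal simulation, not the authors' implementation: it assumes a global list of node IDs instead of per-node neighbor maps, and it breaks ties with `max` purely as a deterministic stand-in for "closest neighbor in network distance". The node IDs are the ones that appear on the slide; `route` and `shared_suffix_len` are hypothetical names.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing digits that two IDs have in common."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(source: str, dest: str, nodes: list[str]) -> list[str]:
    """Route digit by digit: each hop matches one more suffix digit of dest."""
    path = [source]
    current = source
    while current != dest:
        level = shared_suffix_len(current, dest)
        wanted = dest[len(dest) - level - 1:]  # suffix one digit longer
        candidates = [n for n in nodes if n.endswith(wanted)]
        if not candidates:
            raise LookupError(f"no node with suffix {wanted}")
        current = max(candidates)  # arbitrary tie-break (assumption)
        path.append(current)
    return path


nodes = ["7598", "B4F8", "4598", "9098", "6789", "B437"]
print(route("B437", "4598", nodes))
# ['B437', 'B4F8', '9098', '7598', '4598']
```

Each hop fixes at least one more digit, so a message needs at most one hop per digit, which is the log_b(N) bound on the slide for b = 16 and four-digit IDs.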
Routing : Neighbor maps • A table with b*log_b(N) entries • The i-th level neighbors share a suffix of (i-1) digits with the node • Entry(i, j) • Pointer to the neighbor whose ID ends with • “j” + the (i-1)-digit suffix • Secondary neighbors • Back pointers • Create bidirectional links • Figure: neighbor map of example node 0642
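A neighbor map with the Entry(i, j) layout described above might be built like this. The sketch assumes hex digits (b = 16), uses a global node list to fill the table, leaves out the node's own entries, and picks the lexicographically smallest match where a real node would keep the closest neighbor. `build_neighbor_map` is a hypothetical name and node 1642 is invented for illustration.

```python
HEX = "0123456789ABCDEF"


def build_neighbor_map(me: str, nodes: list[str]) -> list[dict[str, str]]:
    """Return one row per level; row[j] is a node ending in j + shared suffix."""
    table = []
    for level in range(1, len(me) + 1):
        suffix = me[len(me) - (level - 1):]  # the (level-1) shared digits
        row = {}
        for digit in HEX:
            pattern = digit + suffix
            matches = sorted(n for n in nodes if n != me and n.endswith(pattern))
            if matches:
                row[digit] = matches[0]  # stand-in for "nearest" (assumption)
        table.append(row)
    return table


# Slide node IDs plus one hypothetical node (1642) that shares a long suffix.
nodes = ["0642", "B4F8", "9098", "7598", "4598", "6789", "B437", "1642"]
for level, row in enumerate(build_neighbor_map("0642", nodes), start=1):
    print(level, row)
```

With four hex digits the table has at most 16 x 4 = 64 entries, which is the b*log_b(N) size quoted on the slide for N = 16^4.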
Routing : Fault tolerance • Detect server/link failures • TCP timeout (ping) • Periodic “heartbeat” messages along back pointers • Resist faults • Secondary neighbors • Recover • Probing messages • Second chance
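The role of secondary neighbors can be shown with a very small sketch. It assumes each neighbor-map entry holds an ordered list (primary first, then secondaries) and that liveness comes from some failure detector such as the timeouts or heartbeats listed above; `next_hop` and `is_alive` are hypothetical names.

```python
from typing import Callable, Optional


def next_hop(entry: list[str], is_alive: Callable[[str], bool]) -> Optional[str]:
    """Return the first live neighbor in an entry, or None if all have failed.

    `entry` lists the primary neighbor first, followed by its secondaries.
    """
    for neighbor in entry:
        if is_alive(neighbor):
            return neighbor
    return None


# Suppose the primary 9098 has been marked failed by the failure detector.
failed = {"9098"}
print(next_hop(["9098", "7598", "4598"], lambda n: n not in failed))
```

Routing continues through a secondary neighbor without waiting for the primary to be repaired; recovery of the failed entry can then proceed separately through probing.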
Locating : basic procedure • Four-phase location • Map the object ID to a “virtual” node ID • Route the request toward that node • Arrive at the surrogate, or “root”, for the object • Redirect to the server • Figure: client B4F8 locates object 1234 held by server B346 via surrogate routing; nodes shown: 6234, B234, F734, 8724; location pointer <O:1234, S:B346>
Locating : Surrogate Routing (1) • Given clients at different places, how do they all find the same “root”? • Plaxton • Find the nodes with the longest matching suffix (stop at an empty entry in the neighbor map) • Order them using global knowledge • Choose the first one • Tapestry • Go further than Plaxton (choose an alternate entry when the desired one is empty) • Stop at a neighbor map in which the only non-empty entry points to a node R • R is the root
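Locating : Surrogate Routing (2) • Conclusions: 1. A root can always be found 2. The expected number of surrogate-routing hops is 2 • Assumptions: 1. Every node is reachable (ensures that all clients see the same “patterns”) 2. IDs are evenly distributed (ensures fewer and fewer nodes in the neighbor map at each level) • Figure: locating object 12345 held by server B3467; nodes shown: 51145, E1145, B1145, F3145, 92145, B7645, B3945; location pointer <O:12345, S:B3467>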
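Surrogate routing can be sketched as a deterministic search that every client resolves the same way. The version below is a simulation under stated assumptions: it consults a global node list where a real node would consult its own neighbor map at each hop, and it resolves an empty entry by trying the next higher digit (wrapping around), which is one possible "alternate entry" rule and is assumed here. `surrogate_root` is a hypothetical name; the node IDs are taken from the slides.

```python
HEX = "0123456789ABCDEF"


def surrogate_root(object_id: str, nodes: list[str]) -> str:
    """Find the unique root for object_id by suffix matching with fallback."""
    if not nodes:
        raise ValueError("empty network")
    suffix = ""
    matches = list(nodes)
    for level in range(len(object_id)):
        want = object_id[len(object_id) - 1 - level]
        start = HEX.index(want)
        for step in range(len(HEX)):
            digit = HEX[(start + step) % len(HEX)]  # desired digit first
            if any(n.endswith(digit + suffix) for n in nodes):
                suffix = digit + suffix
                break
        matches = [n for n in nodes if n.endswith(suffix)]
        if len(matches) == 1:  # only one candidate left: it is the root
            return matches[0]
    return matches[0]


nodes = ["6234", "B234", "F734", "8724", "B346", "B4F8"]
print(surrogate_root("1234", nodes))  # 6234
```

The result depends only on the object ID and the set of nodes, not on where the search starts, which is the property that lets every client reach the same root.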
Publishing • Similar to locating • The server sends a message as if it were locating the object • The message finds the surrogate node that acts as the “root” for the object • The location information, such as <O,S>, is saved there • Figure: server B4F8 publishes object 1234 via surrogate routing; nodes shown: 6234, B234, F734, 8724; location pointer <O:1234, S:B4F8>
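Publishing and locating can be sketched as two walks toward the same root. This is a simplified illustration: the hop sequences are supplied by hand and are hypothetical (they are not read off the figure), and the sketch assumes the <O,S> pointer is deposited at every hop on the publish path, with the root as the last hop. `publish` and `locate` are hypothetical names.

```python
from typing import Optional


def publish(pointers: dict, object_id: str, server: str, path_to_root: list[str]) -> None:
    """Deposit an <O,S> location pointer at each hop, ending at the root."""
    for node in path_to_root:
        pointers.setdefault(node, {})[object_id] = server


def locate(pointers: dict, object_id: str, path_to_root: list[str]) -> tuple[Optional[str], Optional[str]]:
    """Walk toward the root; stop at the first node holding a pointer."""
    for node in path_to_root:
        server = pointers.get(node, {}).get(object_id)
        if server is not None:
            return server, node
    return None, None


pointers: dict = {}
# Hypothetical hop sequences toward root 6234 for object 1234.
publish(pointers, "1234", "B4F8", ["8724", "F734", "B234", "6234"])
print(locate(pointers, "1234", ["B346", "F734", "B234", "6234"]))
```

In this run the query is answered at F734, before it reaches the root, because the publish path and the query path converge there.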
Locating/Publishing : Fault Tolerance & Locality • Multiple “roots” (better than Plaxton) • Map the object ID to several “roots” • Publish/locate can be executed on all of them simultaneously • Cache the 2-tuple <O,S> • Clients can get the <O,S> on the way to the root • Intermediate nodes can receive multiple <O,S> pointers for the same object; the nearest one is chosen
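One way to map an object ID to several roots is to hash it together with a small set of salt values. The sketch below assumes SHA-1, integer salts 0..k-1 and four-digit hex IDs; `root_ids` and these parameters are illustrative assumptions rather than values given on the slide.

```python
import hashlib


def root_ids(object_id: str, num_roots: int = 3, digits: int = 4) -> list[str]:
    """Derive several root IDs for one object by hashing it with salts."""
    ids = []
    for salt in range(num_roots):
        data = f"{object_id}:{salt}".encode("utf-8")
        ids.append(hashlib.sha1(data).hexdigest().upper()[:digits])
    return ids


# Every client computes the same list, so publish and locate requests
# can be sent toward all of these roots in parallel.
print(root_ids("1234"))
```

If one root is unreachable, a request routed toward any of the other roots can still find the object, which removes the single point of failure of a lone root.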
Insert a new node: basic procedure • Get a node ID • Begin with a “gateway node” G • The new node pretends to route to itself • Build a nearly optimal neighbor map during this “pseudo routing” by copying entries and choosing the nearest ones • Go back and notify the neighbors • Figure: new node 1234 joins through gateway node B4F8 via surrogate routing; nodes shown: 6234, B234, F734, 8724
Delete a node • The simplest operation • Explicitly notify the neighbors using back pointers • Or rely on soft state: stop sending “heartbeat” and republish messages
Maintain System Consistency • Components in a Tapestry node • Neighbor map • Back pointers • Object-location pointers <Object, Node> • Hotspot monitor <Object, Node, Freq> • Object store • Maintaining correct state • Soft state • Proactive explicit update
Soft state • Advantages • Easy to implement • Suited to slowly changing systems • Disadvantages • Tradeoff between bandwidth overhead and level of consistency • Not suited to fast-changing systems • Example : republishing traffic for a single server can reach 1400 MB (!) in a single interval.
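The soft-state approach can be sketched as a table whose entries expire unless they are republished. The class name, the `ttl` parameter and the explicit `now` argument are assumptions made for illustration; the slides do not give concrete timer values.

```python
class SoftStateTable:
    """Location pointers that disappear unless they are periodically refreshed."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self._expires: dict[tuple[str, str], float] = {}

    def republish(self, object_id: str, server: str, now: float) -> None:
        """Insert or refresh an <O,S> pointer; it stays valid for ttl time units."""
        self._expires[(object_id, server)] = now + self.ttl

    def live_entries(self, now: float) -> list[tuple[str, str]]:
        """Drop expired pointers and return the ones that are still valid."""
        self._expires = {k: t for k, t in self._expires.items() if t > now}
        return sorted(self._expires)


table = SoftStateTable(ttl=10)
table.republish("1234", "B346", now=0)
print(table.live_entries(now=5))   # pointer still valid
print(table.live_entries(now=10))  # pointer expired: server stopped republishing
```

A shorter `ttl` removes stale pointers sooner but forces servers to republish more often, which is the bandwidth versus consistency tradeoff noted above.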
Proactive explicit update (PEU) • Proactive explicit updates • Epoch number • Sequence number of the update round • Expanded 3-tuple • <Obj. ID, Server ID, LastHopID> • Soft state : kept as a backup, last-resort mechanism
PEU : Node Mobility • Example : move object 123 from A to B • Republishing (123,B) toward the root • Deleting (123,A) with “LastHopID” • Figure: nodes A, B, C, D, E, F and the root
PEU : Recover location pointers • Server exiting notification • Reconstruction (O,S,B) • Deleting old data • Figure: nodes A, B, C, D, E, F and the root
Introspective Optimization : Adapting to the changing environment • Load balance • Periodic pings by a refresher thread • Update neighbor pointers • Hotspot • Find the source of the heavy traffic, the “hotspot” • Publish the desired data near the hotspot
Evaluation • Gain • Good locality • Low location latency • High stability • High fault tolerance • Cost • Bandwidth overhead linear in the number of replicas
Implementation • Packet-level simulators have been implemented in C • Used to support other applications • such as OceanStore • Bayeux, an application-level multicast protocol • Future work • Security issues • Mobile-IP-like functionality
Summary • Urgent need for a new location/routing scheme • Features of Tapestry • Location-independent naming • Integration of location and routing • Content-based routing • Support for a dynamic environment : inserting/deleting/moving nodes and objects
Comments and Questions • Paradox or discrepancy? The underlying IP layer scales poorly, so how can Tapestry achieve high scalability? Just for demo! • What is the relation between IP and Tapestry? Tapestry does not intend to replace IP; it tries to establish a higher-level location and routing infrastructure that supports content-based operations. • How can we achieve the same goal without IP?