1 / 34

Flat Identifiers

Flat Identifiers. Jennifer Rexford Advanced Computer Networks http://www.cs.princeton.edu/courses/archive/fall08/cos561/ Tuesdays/Thursdays 1:30pm-2:50pm. Outline. Distributed Hash Tables (DHTs) Mapping key to value Flat names Semantic Free Referencing DHT as replacement for DNS

carl
Download Presentation

Flat Identifiers

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Flat Identifiers Jennifer Rexford Advanced Computer Networks http://www.cs.princeton.edu/courses/archive/fall08/cos561/ Tuesdays/Thursdays 1:30pm-2:50pm

  2. Outline • Distributed Hash Tables (DHTs) • Mapping key to value • Flat names • Semantic Free Referencing • DHT as replacement for DNS • Flat addresses • Routing On Flat Labels • DHT as an aid in routing

  3. Distributed Hash Tables http://pdos.csail.mit.edu/chord/

  4. Hash Table • Name-value pairs (or key-value pairs) • E.g,. “Jen” and jrex@cs.princeton.edu • E.g., “www.cnn.com/foo.html” and the Web page • E.g., “BritneyHitMe.mp3” and “12.78.183.2” • Hash table • Data structure that associates keys with values value lookup(key) key value

  5. Distributed Hash Table • Hash table spread over many nodes • Distributed over a wide area • Main design goals • Decentralization: no central coordinator • Scalability: efficient even with large # of nodes • Fault tolerance: tolerate nodes joining/leaving • Two key design decisions • How do we map names on to nodes? • How do we route a request to that node?

  6. Hash Functions • Hashing • Transform the key into a number • And use the number to index an array • Example hash function • Hash(x) = x mod 101, mapping to 0, 1, …, 100 • Challenges • What if there are more than 101 nodes? Fewer? • Which nodes correspond to each hash value? • What if nodes come and go over time?

  7. Consistent Hashing • Large, sparse identifier space (e.g., 128 bits) • Hash a set of keys x uniformly to large id space • Hash nodes to the id space as well 2128-1 0 1 Id space represented as a ring. Hash(name)  object_id Hash(IP_address)  node_id

  8. Where to Store (Key, Value) Pair? • Mapping keys in a load-balanced way • Store the key at one or more nodes • Nodes with identifiers “close” to the key • Where distance is measured in the id space • Advantages • Even distribution • Few changes as nodes come and go… Hash(name)  object_id Hash(IP_address)  node_id

  9. Nodes Coming and Going • Small changes when nodes come and go • Only affects mapping of keys mapped to the node that comes or goes Hash(name)  object_id Hash(IP_address)  node_id

  10. Joins and Leaves of Nodes • Maintain a circularly linked list around the ring • Every node has a predecessor and successor pred node succ

  11. Joins and Leaves of Nodes • When an existing node leaves • Node copies its <key, value> pairs to its predecessor • Predecessor points to node’s successor in the ring • When a node joins • Node does a lookup on its own id • And learns the node responsible for that id • This node becomes the new node’s successor • And the node can learn that node’s predecessor (which will become the new node’s predecessor)

  12. How to Find the Nearest Node? • Need to find the closest node • To determine who should store (key, value) pair • To direct a future lookup(key) query to the node • Strawman solution: walk through linked list • Circular linked list of nodes in the ring • O(n) lookup time when n nodes in the ring • Alternative solution: • Jump further around ring • “Finger” table of additional overlay links

  13. Links in the Overlay Topology • Trade-off # of hops vs. # of neighbors • E.g., log(n) for both, where n is number of nodes • E.g., overlay links 1/2, 1/4 1/8, … around the ring • Each hop traverses at least half of the remaining distance 1/2 1/4 1/8

  14. Semantic-Free Referencing(DHT as a DNS Replacement) http://nms.lcs.mit.edu/projects/sfr/

  15. Motivation for Flat Identifiers • Stable references • Shouldn’t have to change when object moves • Object replication • Store object at many different locations • Avoid fighting over names • Avoid cyber squatting, typo squatting, … Proposed Current <A HREF= http://isp.com/dog.jpg >my friend’s dog</A> <A HREF= http://f0120123112/ >my friend’s dog</A>

  16. Separate References and User-level Handles • Let people fight over handles • Do not fight over references • Allow multiple handle-to-reference services • Flat identifiers • Do not embed object or location semantics • Are intentionally human-unfriendly User Handles (AOL Keywords, New Services, etc.) Human-unfriendly References Object Location

  17. Semantic-Free Referencing GET(0xf012c1d) <A HREF= http://f012c1d/ >Spot</A> Managed DHT-based Infrastructure o-record (10.1.2.3, 80, /pics/dog.gif) orec HTTP GET: /pics/dog.gif 10.1.2.3 API • orec = get(tag); • put(tag, orec); Anyone can put() or get() /pics/dog.gif Web Server

  18. Resilient Linking • Tag: abstracts object reachability information • Object granularity: files, directories, hosts, … HTTP GET: /docs/pub.pdf 10.1.2.3 <A HREF= http://f012012/pub.pdf >here is a paper</A> /docs/ HTTP GET: /~user/pubs/pub.pdf 20.2.4.6 (10.1.2.3,80, /docs/) /~user/pubs/ (20.2.4.6,80, /~user/pubs/) SFR

  19. Flexible Object Replication o-record (Doesn’t address massive replication) SFR (IP1, port1, path1), (IP2, port2, path2), (IP3, port3, path3), . . . 0xf012012 • Grass-roots replication • People replicate each other’s content • Does not require control over Web servers

  20. Reference Management • Requirements • No collisions, even under network partition • References must be human-unfriendly • Only authorized updates to o-records • Approach: randomness and self-certification • tag = hash(pubkey, salt) • o-record has pubkey, salt, signature • Anyone can check if tag and o-record match

  21. Reducing Latency • Look-ups must be fast • Solution: extensive caching • Clients and DHT nodes cache o-records • DHT nodes cache each other’s locations

  22. Routing On Flat Labels(DHT to Help in Routing)

  23. How Flat Can You Get? • Flat names • DHT as a replacement for DNS • Stable references, simple replication, avoid fighting • Still route based on hierarchical addresses • For scalability of the global routing system • Flat addresses • Avoid translating name to an address • Route directly on flat labels • Questions • Is it useful? • Can it scale?

  24. Area 1 Area 2 1.1 1.2 2.1 2.2 Area 4 Area 3 4.1 B K 3.2 4.2 3.3 Q 3.1 V A S F X J Topology-Based Addressing • Disadvantages: complicates • Access control • Topology changes • Multi-homing • Mobility • Advantage • Scalability • Scalability • Scalability • …

  25. Virtual topology K A F X F J A F J K Q S V X V J S K Q Routing on Abstract Graph: Know Your Neighbors K Q V 1. Write down sorted list of IDs 2. Build paths between neighbors in list S X F J A Network topology

  26. A X F K Q V J F J S K Send(K,F) Q Routing on Abstract Graph: Forwarding Packets Virtual topology K Q V S X F J A Network topology

  27. Resulting path length: 10 hops A X F V J V X A F J S K Send(J,V) Q Shortest path length: 3 hops Routing on Abstract Graph: Stretch Problem Virtual topology K Q V S X F J A Network topology

  28. Resulting path length: 4 hops A X F V J V X F J A X S K Send(J,V) Q Shortest path length: 3 hops Routing on Abstract Graph: Short-cutting Virtual topology K Q V S X F J A Network topology

  29. Identifiers • Identity tied to public/private key pair • Everyone can know the public key • Only authorized parties know the private key • Self-certifying identifier: hash of public key • Host associates with a hosting router • Proves it knows private key, to prevent spoofing • Router joins the ring on the host’s behalf • Anycast • Multiple nodes have the same identifier

  30. Basic Mechanisms behind ROFL • Goal #1: Scale to Internet topologies • Mechanism: DHT-style routing, maintain source-routes to successors (fingers) • Provides: Scalable network routing without aggregation • Goal #2: Support for BGP policies • Mechanism: Intelligently choose successors (fingers) to conform to ISP relationships • Provides: Support for policies, operational model of BGP

  31. Successor list: 0x3F6C0 4. intermediate routers may cache pointers 2. hosting routers participate in ROFL on behalf of hosts ISP ISP Pointer cache: 0x3B57E 0x3F6C0 0x3BAC8 0x3B57E 0x3F6C0 0x3B57E (joining host) 0xFA291 Pointer list: 0x3F6C0 0x3BAC8 0x3BAC8 5. external pointers provide reachability across domains 3. hosting routers maintain pointers with source-routes to attached hosts’ successors/fingers How ROFL Works 1. hosts are assigned topology-independent “flat” identifiers

  32. hierarchy #1 hierarchy #2 hierarchy #3 provider routes must not be exported to peers prefer customer over peer routes peer link customer link Source Destination Internet Policies Today • Economic relationships: peer, provider/customer • Isolation: routing contained within hierarchy • Economic relationships: peer, provider/customer • Isolation: routing contained within hierarchy

  33. Joining host Internal Successor External Successor External Successor Source Destination Isolation in ROFL  Traffic between two hosts traverses no higher than their lowest common provider in the AS hierarchy

  34. Discussion • How flat should the world be? • Flat names vs. flat addresses? • What should be given a name? • Objects? • Hosts? • Networks? • What separation to have? • Human-readable names • Machine-readable references • Network location

More Related