
CSCI 599: Beyond Web Browsers



Presentation Transcript


  1. CSCI 599: Beyond Web Browsers Professor Shahram Ghandeharizadeh Computer Science Department Los Angeles, CA 90089

  2. QUIZ 1: • When you register your email with Google, Google emails you a key that must be included with each request to their web methods. (True) • The Google web service API can be invoked using ASP.NET (True) • Once a client caches the results of a Google search, Google will invalidate this client’s cache when it detects updates to its information system. (False) • Napster employed a central server to store the index of all files available for download by a client. (True) • CAN assumes nodes that insert (key,value) pairs will periodically refresh their inserted entries. (True)

  3. A Scalable Content Addressable Network (CAN) by S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker.

  4. CAN • CAN is composed of individual nodes. • CAN employs a hash function to insert, lookup, and delete (key,value) pairs. • A node stores a chunk, termed a zone, of the entire hash table. • A node maintains information about its neighboring nodes.

  5. EXAMPLE HASH FUNCTION • A two dimensional hash function: • h(K) = a 6 bit unsigned integer • The low three and high three bits form the 2 dimensions of a hash index. • e.g., h(“Thriller”) = 111011 • Low 3 bits = 011 • High 3 bits = 111 • Three bits range in value from 0 (000) to 7 (111)
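
The split on slide 5 can be sketched in Python as follows. This is a minimal illustration that assumes the 6-bit hash value is already available as an integer; the name `split_hash` is hypothetical, and the hash function itself is not specified in the slides.

```python
def split_hash(h: int) -> tuple[int, int]:
    """Split a 6-bit hash value into (high, low) 3-bit coordinates."""
    if not 0 <= h < 64:
        raise ValueError("expected a 6-bit unsigned integer")
    low = h & 0b111           # low three bits
    high = (h >> 3) & 0b111   # high three bits
    return high, low

# h("Thriller") = 111011 in the slide's example.
print(split_hash(0b111011))  # (7, 3): high = 111, low = 011
```

Each coordinate falls in the range 0 to 7, matching the 3-bit range stated on the slide.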

  6. ADDRESS SPACE • A 2 dimensional address space can be partitioned across 64 nodes. [Figure: 8 x 8 grid; y-axis: high bits, 000 to 111; x-axis: low bits, 000 to 111.]

  7. EXAMPLE • A 2 dimensional space partitioned across six nodes. [Figure: the 8 x 8 address space (y-axis: high bits, x-axis: low bits) divided into zones for nodes 1 through 6.]

  8. EXAMPLE (CONT…) • h(“Thriller”) = 111011 = (111, 011) = node 6. [Figure: the six-node partition of the address space (y-axis: high bits, x-axis: low bits).]

  9. NEIGHBORS • Two nodes are neighbors if their coordinate spans overlap along d-1 dimensions and abut along one dimension. [Figure: the six-node partition, nodes 1 through 6.]

  10. NEIGHBORS (CONT…) • 3’s neighbors: [Figure: the six-node partition, nodes 1 through 6.]

  11. NEIGHBORS (CONT…) • Node 5 is not node 3’s neighbor because their zones do not overlap along any dimension; they only abut, along both dimensions. [Figure: the six-node partition, nodes 1 through 6.]
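
The neighbor rule from slides 9 to 11 can be written as a small Python check. This is a sketch under stated assumptions: zones are represented as one half-open span [lo, hi) per dimension, the zone coordinates below are hypothetical (they are not the layout of the slides' figure), and wrap-around at the edges is ignored.

```python
from typing import List, Tuple

Zone = List[Tuple[int, int]]  # one half-open [lo, hi) span per dimension

def overlaps(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if the two spans share some interior range."""
    return a[0] < b[1] and b[0] < a[1]

def abuts(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if one span ends exactly where the other begins."""
    return a[1] == b[0] or b[1] == a[0]

def are_neighbors(z1: Zone, z2: Zone) -> bool:
    """Overlap along d-1 dimensions and abut along exactly one."""
    d = len(z1)
    n_overlap = sum(overlaps(a, b) for a, b in zip(z1, z2))
    n_abut = sum(abuts(a, b) for a, b in zip(z1, z2))
    return n_overlap == d - 1 and n_abut == 1

# Hypothetical zones in a 2-d space (illustration only).
zone_a = [(0, 4), (0, 4)]
zone_b = [(4, 8), (0, 4)]   # shares an edge with zone_a
zone_c = [(4, 8), (4, 8)]   # touches zone_a only at a corner

print(are_neighbors(zone_a, zone_b))  # True
print(are_neighbors(zone_a, zone_c))  # False: abuts along both dimensions
```

The corner case mirrors slide 11: two zones that abut along both dimensions are not neighbors.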

  12. NEIGHBORS (CONT…) • The coordinate space is a d-torus: it wraps around at the edges. Example, 5’s neighbors: [Figure: the six-node partition, nodes 1 through 6.]
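
Because the space wraps, both the "abut" test and distances have to be computed modulo the size of each dimension. The following Python sketch assumes 3-bit coordinates (size 8) and spans narrower than the full dimension; the function names are hypothetical.

```python
SIZE = 8  # assumption: each dimension ranges over 0..7 (3-bit coordinates)

def abuts_on_torus(a, b, size=SIZE):
    """True if half-open spans a and b touch, treating coordinate `size` as 0."""
    return a[1] % size == b[0] % size or b[1] % size == a[0] % size

def torus_distance(p, q, size=SIZE):
    """Sum over dimensions of the shorter way around the ring."""
    return sum(min(abs(x - y), size - abs(x - y)) for x, y in zip(p, q))

print(abuts_on_torus((0, 2), (6, 8)))    # True: 8 wraps around to 0
print(torus_distance((7, 0), (0, 0)))    # 1: one step across the edge
```

With wrapping, a zone on the top edge of the space can be a neighbor of a zone on the bottom edge.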

  13. NEIGHBORS (CONT…) • A node maintains information about its neighbors in order to route a lookup, insert, and delete. [Figure: the six-node partition, nodes 1 through 6.]

  14. NODE ADDRESSING • CAN has an associated DNS domain name that resolves to the IP address of one or more CAN bootstrap nodes. • A bootstrap node maintains a partial list of CAN nodes it believes are currently in the system. • A request is routed to one of these nodes. • The contacted node applies the hash function and routes the request towards its target destination (using information about its neighbors).

  15. EXAMPLE • A client looks up “Fragile”, h(“Fragile”) = 100010, (4,2) by contacting N5 (7,0). • Reduce the y-value (high bits) from 7 to 4, increase the x-value (low bits) from 0 to 2. [Figure: the six-node partition of the address space (y-axis: high bits, x-axis: low bits).]

  16. EXAMPLE • A client looks up “Fragile”, h(“Fragile”) = 100010, by contacting N5. [Figure: the six-node partition of the address space (y-axis: high bits, x-axis: low bits).]

  17. EXAMPLE • A client looks up “Fragile”, h(“Fragile”) = 100010, by contacting N5. [Figure: the six-node partition of the address space (y-axis: high bits, x-axis: low bits).]
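
The "Fragile" lookup can be traced with a simplified greedy router. This sketch does not use the six-node layout from the figure, which the transcript does not preserve; it assumes the 64-node case of slide 6, where every (high, low) cell is owned by its own node, and forwards each hop to the grid neighbor closest to the destination. All function names are hypothetical.

```python
SIZE = 8  # assumption: 3-bit coordinates, one node per cell (64 nodes)

def torus_distance(p, q, size=SIZE):
    """Sum over dimensions of the shorter way around the ring."""
    return sum(min(abs(a - b), size - abs(a - b)) for a, b in zip(p, q))

def neighbors(p, size=SIZE):
    """The 2d grid neighbors of point p on the torus (d = len(p))."""
    result = []
    for i in range(len(p)):
        for step in (-1, 1):
            q = list(p)
            q[i] = (q[i] + step) % size
            result.append(tuple(q))
    return result

def greedy_route(src, dst, size=SIZE):
    """Forward to the neighbor nearest the destination until it is reached."""
    path = [src]
    while path[-1] != dst:
        nxt = min(neighbors(path[-1], size),
                  key=lambda q: torus_distance(q, dst, size))
        path.append(nxt)
    return path

# Start at (high, low) = (7, 0); h("Fragile") = 100010 maps to (4, 2).
path = greedy_route((7, 0), (4, 2))
print(path)
print(len(path) - 1)  # 5 hops: three down in the high bits, two up in the low bits
```

In a real CAN each node owns a zone rather than a single cell, so the same greedy rule is applied over neighboring zones instead of neighboring cells.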

  18. EXAMPLE • A client looks up “Hey bebe!”, h(“Hey bebe!”) = 110101, by contacting N5; how is the request routed? [Figure: the six-node partition of the address space (y-axis: high bits, x-axis: low bits).]

  19. OBSERVATIONS • Observation 1: • In a d-dimensional space, each node has 2d neighbors. • A node maintains information about its neighbors. • Thus, one may grow the number of nodes without increasing the node state. • Observation 2: • The average path length grows as O(n^(1/d)) as a function of the number of nodes, n. • Observation 3: • The path length is O(d n^(1/d)) hops for d dimensions and n nodes.
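
To get a feel for these growth rates, the short Python sketch below tabulates per-node state and average path length. It assumes equal-sized zones and uses the constant d/4 that the CAN paper reports for the average path length, (d/4) n^(1/d); the slides themselves give only the asymptotic forms. The function names are hypothetical.

```python
def neighbors_per_node(d: int) -> int:
    """Per-node routing state: 2d neighbors, independent of n."""
    return 2 * d

def avg_path_length(n: int, d: int) -> float:
    """Average hops, (d/4) * n^(1/d), assuming n equal-sized zones."""
    return (d / 4) * n ** (1 / d)

n = 4096
for d in (2, 4):
    print(d, neighbors_per_node(d), round(avg_path_length(n, d), 1))
```

For n = 4096, moving from d = 2 to d = 4 doubles the neighbor count from 4 to 8 while cutting the average path from about 32 hops to about 8.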

  20. NUMBER OF DIMENSIONS • Figure 4 • Substantial improvement with going from d=2 to 4. Beyond 4, the percentage improvement levels off. • This same observation is shown in Figure 6.

  21. NEW NODE • CAN incorporates a new node, say N7, as follows: • N7 must find a CAN node. • N7 randomly chooses a point P that maps to a node, say N1, and sends it a join request. N1’s zone will be partitioned between N7 and N1. • N1 splits its zone in half, retains one half, and hands the other half to N7. • N7 identifies its neighbors. • Neighbors of N1 are notified to include N7 for routing.

  22. NEW NODE (Cont…) • Zone belonging to N1 is partitioned between N1 and N7. [Figure: the address space after the join, with nodes 1 through 7; N7 occupies half of N1’s former zone.]
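
The zone split at the heart of the join can be sketched in Python. The assumptions here are mine: zones are lists of half-open (lo, hi) spans in (high, low) order, the zone coordinates are hypothetical, the stored entries are keyed by their hashed point, and handing over the entries that fall in the transferred half is an illustrative detail not stated on the slides.

```python
def split_zone(zone, dim=0):
    """Split a zone in half along `dim`; return (kept_half, handed_over_half)."""
    lo, hi = zone[dim]
    mid = (lo + hi) // 2
    kept, given = list(zone), list(zone)
    kept[dim] = (lo, mid)
    given[dim] = (mid, hi)
    return kept, given

def in_zone(point, zone):
    """True if every coordinate lies inside the zone's span for that dimension."""
    return all(lo <= c < hi for c, (lo, hi) in zip(point, zone))

def join(n1_zone, n1_store, dim=0):
    """N1 halves its zone and hands the other half, with its entries, to N7."""
    kept, given = split_zone(n1_zone, dim)
    n7_store = {p: v for p, v in n1_store.items() if in_zone(p, given)}
    n1_left = {p: v for p, v in n1_store.items() if in_zone(p, kept)}
    return (kept, n1_left), (given, n7_store)

# Hypothetical zone for N1: high bits 2..3, low bits 0..3.
n1, n7 = join([(2, 4), (0, 4)], {(3, 1): "a", (3, 3): "b"}, dim=1)
print(n1)  # ([(2, 4), (0, 2)], {(3, 1): 'a'})
print(n7)  # ([(2, 4), (2, 4)], {(3, 3): 'b'})
```

After the split, both nodes would still need to update their neighbor lists, which this sketch leaves out.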

  23. Questions & Answers

  24. NODE REMOVAL (FAILURE)

  25. IMPROVEMENTS • Categories of improvements: • Replication: reduce path length • Multiple realities: additional state information (Sec 3.2) • Multiple hash functions (Sec 3.5) • MAX replica based: additional state information (Sec 3.4) • Routing of requests: reduce path latency • Route requests to a candidate neighbor with minimum RTT: additional state information (Sec 3.3) • Assignment of nodes to zones • Uniform partitioning of space (Sec 3.7): load balancing • Topologically close nodes are assigned to the same zone (Sec 3.6): reduce path latency

  26. IMPROVEMENTS • A matrix perspective (missing: data caching) • One may consider a combination, e.g., data placement & replication. [Matrix: rows are Replication, Better routing, Data placement; columns are Reduce path length, Reduce path latency, Load balancing.]

  27. REPLICATION: MULTIPLE REALITIES • Maintain multiple, independent coordinate spaces. • Each node is assigned a different zone in each coordinate space. • Here are two realities: [Figure: two partitions of the coordinate space among nodes 1 through 6, labeled Reality-1 and Reality-2, with a different zone assignment in each.]

  28. MULTIPLE REALITIES • Replication increases availability of data in the presence of failures • Figure 5 • The benefit (percentage improvement) with 2 and 3 realities is substantial. It levels off with 4 or more realities. • Figure 6 • Number of neighbors is fixed on the x-axis with both (a) d=2&r=varying, and (b) d=varying&r=2 • To improve routing efficiency, increasing the number of dimensions is more beneficial than increasing the number of realities (given the same amount of space). • Qualitatively: additional realities provide a higher degree of data availability (in the presence of failures). • Notice the knee of both curves in Figure 6 (impact of realities & dimensions is marginal beyond a certain point).

  29. REPLICATION: MULTIPLE HASH FUNCTIONS • Use k different hash functions to map a single key to k different points in the coordinate space. • This results in 0 to k replicas of a single key. When several points collide in a single zone, do not construct additional replicas. • With a lookup, retrieve the entry from the closest node. (Alternatively, query all k potential targets, consuming more bandwidth.) • Figure 7
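
One way to picture the k hash functions is to salt a single hash with an index. The Python sketch below is an assumption-laden illustration: it uses SHA-256 with an index prefix as the family of hash functions (the slides do not say which functions CAN uses), treats each point as its own zone so that colliding points stand in for colliding zones, and measures "closest" with wrap-around grid distance. All names are hypothetical.

```python
import hashlib

SIZE = 8  # assumption: 3-bit coordinates in each of two dimensions

def point_for(key: str, i: int, size: int = SIZE) -> tuple[int, int]:
    """i-th hash function: salt the key with i and derive two coordinates."""
    digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
    return digest[0] % size, digest[1] % size

def replica_points(key: str, k: int) -> list[tuple[int, int]]:
    """Distinct points among the k hashes; collisions yield fewer replicas."""
    return sorted({point_for(key, i) for i in range(k)})

def torus_distance(p, q, size=SIZE):
    """Sum over dimensions of the shorter way around the ring."""
    return sum(min(abs(a - b), size - abs(a - b)) for a, b in zip(p, q))

def closest_replica(key: str, k: int, here: tuple[int, int]) -> tuple[int, int]:
    """Pick the replica point nearest to the requesting node's position."""
    return min(replica_points(key, k), key=lambda p: torus_distance(p, here))

print(replica_points("Thriller", 3))
print(closest_replica("Thriller", 3, (0, 0)))
```

Querying only the closest replica saves bandwidth; querying all k points in parallel trades bandwidth for the chance of a faster answer.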

  30. CONTROLLED REPLICATION • Multiple nodes share the same zone. • These nodes are termed peers. • MAXPEERS is a system parameter to control the number of replicas. • Logically, a node has 2d(MAXPEERS) neighbors • To maintain a fixed amount of state information per node: • A node selects one neighbor from amongst the peers in each of its neighboring zones.

  31. CONTROLLED REPLICATION • Neighbor selection: • Periodically, a node requests its coordinate neighbor to transmit its peer list • Measure the RTT to all nodes in its neighboring zone • Retain the node with the lowest RTT as its neighbor • Replication: • Increases data availability • Improves performance: path length, path latency, and load balancing • Increases update overhead
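
The neighbor-selection step reduces to picking the minimum of a set of RTT measurements. A minimal Python sketch follows, assuming the measurements are already collected into a dictionary; the peer names and RTT values are made up for illustration.

```python
def select_neighbor(peer_rtts: dict[str, float]) -> str:
    """Return the peer in a neighboring zone with the lowest measured RTT."""
    if not peer_rtts:
        raise ValueError("peer list is empty")
    return min(peer_rtts, key=peer_rtts.get)

# Hypothetical peers sharing one neighboring zone, with RTTs in milliseconds.
peers = {"peer-a": 42.0, "peer-b": 17.5, "peer-c": 30.1}
print(select_neighbor(peers))  # peer-b
```

Repeating this once per neighboring zone keeps the per-node state fixed regardless of MAXPEERS.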

  32. CONTROLLED REPLICATION • Table 2: per-hop latency. [Chart axes: Relative % Improvement, MAXPEERS.]

  33. Questions & Answers

  34. BETTER ROUTING • Standard routing metric: Progress towards the destination in terms of the Cartesian distance. • Better routing: • Each node measures the network-level round-trip-time (RTT) to each of its neighbors • For a given destination, a message is forwarded to the neighbor with the max(progress / RTT) • Table 1 (20-40% improvement)
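
The max(progress / RTT) rule can be sketched as follows in Python. The sketch assumes each neighbor is represented by a single point and an RTT in milliseconds, measures progress as the reduction in Cartesian (Euclidean) distance to the destination, and ignores wrap-around; the neighbor names and numbers are hypothetical.

```python
import math

def pick_next_hop(here, dest, neighbors):
    """Choose the neighbor maximizing progress / RTT.

    neighbors maps a name to (point, rtt_ms). Returns None if no neighbor
    makes progress towards the destination.
    """
    best, best_ratio = None, 0.0
    current = math.dist(here, dest)
    for name, (point, rtt) in neighbors.items():
        progress = current - math.dist(point, dest)
        if progress > 0 and progress / rtt > best_ratio:
            best, best_ratio = name, progress / rtt
    return best

here, dest = (7, 0), (4, 2)
nbrs = {"a": ((6, 0), 50.0), "b": ((7, 1), 10.0)}
print(pick_next_hop(here, dest, nbrs))  # b: less progress, but a much lower RTT
```

With equal RTTs the rule falls back to the standard metric and picks the neighbor that makes the most progress.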

  35. Questions & Answers

  36. Topologically sensitive routing • Assign zones to nodes in a manner that assigns neighboring zones to nodes that have a minimum RTT • USC’s neighboring node should be UCLA (instead of Cornell) assuming the RTT to UCLA is smaller. • How? • Identify a well-known set of m machines as landmarks. • Every CAN node measures its RTT to these machines and maintains a vector listing them from closest to farthest. • m! orderings of the landmarks are possible • Partition the coordinate space into m! portions • When a new node joins, it is mapped to a portion with a matching landmark ordering

  37. Topologically sensitive routing • Assuming 3 landmarks, the Cartesian space is divided into six portions: m portions along the x-axis, each split into m-1 portions along the y-axis. [Figure: the 8 x 8 address space (y-axis: high bits, x-axis: low bits) divided into six portions.]
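
The landmark scheme can be sketched in Python: sort the landmarks by measured RTT, then use the resulting ordering to select one of the m! portions. Numbering the portions by the lexicographic rank of the ordering is an assumption made for illustration, as are the function names and the RTT values.

```python
from itertools import permutations

def landmark_ordering(rtts):
    """Landmark indices sorted from closest to farthest by measured RTT."""
    return tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))

def portion_index(rtts):
    """Rank of this node's ordering among the m! possible orderings."""
    m = len(rtts)
    all_orderings = list(permutations(range(m)))  # lexicographic order
    return all_orderings.index(landmark_ordering(rtts))

# Hypothetical RTTs (ms) from one node to landmarks 0, 1, and 2.
rtts = [30.0, 10.0, 20.0]
print(landmark_ordering(rtts))  # (1, 2, 0)
print(portion_index(rtts))      # 3, one of the 3! = 6 portions
```

Nodes with the same ordering land in the same portion, so nodes that are close in the network tend to own nearby zones.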

  38. Topologically sensitive routing • Figure 8: • 4 landmarks with a minimum hop distance of 5 • Latency stretch = CAN network latency / average IP network latency • (2-d with landmark ordering) outperforms (4-d without landmark ordering)

  39. Questions & Answers
