ECE 6102 Qiyu Liu Ethan Trewhitt

PeerCluster: A Cluster-Based Peer-to-Peer System
Xin-Mao Huang, Cheng-Yue Chang, and Ming-Syan Chen, Fellow, IEEE

Agenda
• Background
• Structure
• Functional Protocols
• Structural Protocols
• Scaling
• Performance

Background – Existing P2P Systems
• Centralized: Napster
  • Pro: Low cost to resolve queries
  • Con: Single point of failure
• Decentralized/unstructured: Gnutella
  • Pro: Fault-tolerant, resilient to joins/leaves
  • Con: Search mechanism scales poorly
• Decentralized/structured: PeerCluster
  • Same benefits as decentralized/unstructured
  • Cluster structure reduces broadcast flooding

Background – PeerCluster
• Principle of interest grouping
  • A given user has few interests
  • A user's queries relate to those interests
• How to exploit this?
  • Logically group users with similar interests
  • Increases query efficiency

Background – Query Resolution
• A node receives a query:
    if (query topic == present cluster’s interest topic) {
        broadcast to all nodes in present cluster    // intracluster broadcasting
    } else {
        route to responsible node in corresponding interest cluster    // intercluster routing
    }
• Intracluster broadcasting and intercluster routing are the main operations in query resolution
• How to implement them?

Structure – Hypercube
• Three interests can be implemented with a 5-D hypercube
• Nodes and edges are virtual
• One hypercube address → one computer
• However, one computer → multiple hypercube addresses

Structure – Clusters
• Interest-based
• Realized as hypercubes within the overall system hypercube
• Initial size based on popularity, via Huffman coding

Structure – Tree Creation
• Assume an n-dimensional hypercube with k different interest topics
• $I_j$: the jth interest topic, where $0 \le j \le k - 1$
• $\mathrm{pop}[I_j]$: popularity of $I_j$
• $0 < \mathrm{pop}[I_j] < 1$ and $\sum_{j=0}^{k-1} \mathrm{pop}[I_j] = 1$
• Construct a Huffman tree based on $\mathrm{pop}[I_j]$
• Cluster size $= 2^{\,n - \mathrm{length}(\mathrm{prefix}[I_j])}$
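The sizing rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the topic names and popularity values are made up, the helper names `huffman_prefixes` and `cluster_sizes` are hypothetical, and the sketch assumes every Huffman prefix is no longer than n bits.

```python
import heapq
from itertools import count


def huffman_prefixes(pop):
    """Map each topic to a binary prefix from a Huffman tree built on popularity."""
    if len(pop) == 1:
        # A single topic owns the whole hypercube (empty prefix).
        return {next(iter(pop)): ""}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {topic: ""}) for topic, p in pop.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least popular subtrees, prepending one bit to each side.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {t: "0" + code for t, code in left.items()}
        merged.update({t: "1" + code for t, code in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def cluster_sizes(pop, n):
    """Cluster size = 2^(n - length(prefix)) for each topic."""
    return {t: 2 ** (n - len(code)) for t, code in huffman_prefixes(pop).items()}


# Illustrative popularities only (assumed, not from the paper).
pop = {"music": 0.5, "movies": 0.25, "books": 0.25}
sizes = cluster_sizes(pop, n=5)
print(sizes)  # {'music': 16, 'movies': 8, 'books': 8}
```

With these assumed popularities the three clusters exactly fill the 32 addresses of a 5-D hypercube, and the most popular topic receives the largest cluster.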

Structure – Routing Table
• A routing table is created for each computer
• Each computer must keep track of which computers own its neighboring addresses in order to send messages
• addr(A): hypercube addresses owned by computer A
• NH(A): neighboring hypercube addresses of A,
  $NH(A) = \Bigl(\bigcup_{a_i \in \mathrm{addr}(A)} Ne(a_i)\Bigr) - \mathrm{addr}(A)$
  where $Ne(a_i)$ is the set of hypercube addresses adjacent to address $a_i$
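As a small worked example of the NH(A) definition, the sketch below computes the neighboring addresses of a computer that owns two addresses in a 3-D hypercube. The function names and the example addresses are illustrative assumptions; addresses are represented as integers, and two addresses are adjacent when they differ in exactly one bit.

```python
def ne(addr, n):
    """Ne(a): the n hypercube addresses adjacent to addr (one bit flipped)."""
    return {addr ^ (1 << i) for i in range(n)}


def neighbor_addresses(owned, n):
    """NH(A): union of Ne(a) over the owned addresses, minus the owned addresses."""
    adjacent = set()
    for a in owned:
        adjacent |= ne(a, n)
    return adjacent - set(owned)


# Hypothetical computer A owning addresses 000 and 001 in a 3-D hypercube.
owned_by_a = {0b000, 0b001}
nh = neighbor_addresses(owned_by_a, n=3)
print(sorted(format(x, "03b") for x in nh))  # ['010', '011', '100', '101']
```

The routing table for A would then need one entry per address in this set, naming the computer that currently owns it.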

Structure – Assigned Tree
• The assigned tree records the number of free addresses in every cluster
• The root address is the lowest address
• Parent and child addresses differ by exactly 1 bit
• Child address is longer than parent address
• The present address manages the assignment of its child addresses
• Every address records the number of free addresses of all its children. Initial number of free addresses of children = total number of subtrees
• When a parent address wants to assign a free address to a join request, it checks the numbers of free addresses starting from the lowest address

Functional Protocol – Broadcast
Proc_Broadcast(subq, msg, node_addr, step)
    for (i = step to subq - 1) {
        dest_addr = node_addr xor 2^i;        // neighbor across dimension i
        send(subq, msg, dest_addr, i + 1);    // receiver continues from step i + 1
    }
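The broadcast procedure can be simulated in Python to check that every address is reached exactly once. This is a sketch under stated assumptions: `subq` is read as the dimension of the cluster's sub-hypercube, the cluster is assumed to occupy the low-order `subq` bits of the address, and `send` is modeled as a direct recursive call that records delivery in a hypothetical `delivered` list.

```python
def proc_broadcast(subq, msg, node_addr, step, delivered):
    """Simulate Proc_Broadcast; a network send is modeled as a recursive call."""
    delivered.append(node_addr)  # this node has now received msg
    for i in range(step, subq):
        dest_addr = node_addr ^ (1 << i)  # neighbor across dimension i
        # The receiver only forwards along dimensions above i,
        # which prevents duplicate deliveries.
        proc_broadcast(subq, msg, dest_addr, i + 1, delivered)


delivered = []
proc_broadcast(subq=3, msg="query", node_addr=0b000, step=0, delivered=delivered)
print(len(delivered), len(set(delivered)))  # 8 8
```

In this 3-D example all 8 addresses receive the message once, using 7 sends in total, which is why passing `i + 1` (rather than `i`) to the receiver matters.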

Functional Protocol – Route
Proc_Route(msg, dest_addr, node_addr)
    if (dest_addr != node_addr) {
        i = Compare(dest_addr, node_addr);          // index of a bit in which the addresses differ
        send(msg, dest_addr, node_addr xor 2^i);    // forward to the neighbor across dimension i
    }
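A minimal Python sketch of the routing step follows. The slides do not say which differing bit `Compare` returns, so this version assumes the lowest one; the loop stands in for the chain of computers each running Proc_Route once, and the returned `path` list is a hypothetical addition for inspection.

```python
def compare(dest_addr, node_addr):
    """Index of the lowest bit in which the two addresses differ (assumed choice)."""
    diff = dest_addr ^ node_addr
    return (diff & -diff).bit_length() - 1


def proc_route(msg, dest_addr, node_addr):
    """Follow Proc_Route hop by hop and return the list of addresses visited."""
    path = [node_addr]
    while node_addr != dest_addr:
        i = compare(dest_addr, node_addr)
        node_addr ^= 1 << i  # forward to the neighbor across dimension i
        path.append(node_addr)
    return path


path = proc_route("query", dest_addr=0b110, node_addr=0b011)
print([format(a, "03b") for a in path])  # ['011', '010', '110']
```

Each hop fixes one differing bit, so the number of hops equals the Hamming distance between source and destination addresses.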

JOIN Protocol
• Joining computer A finds any computer B in the system
• A asks B to find a computer C with the same major interest
• A asks C to find a computer D that holds an available alias address*
• A takes the available address and notifies its neighbors
• D notifies its parent nodes that one fewer address is available
* If there are no available addresses, a cluster expansion must be performed

LEAVE Protocol
• Leaving computer A finds the root node B (smallest address) of its cluster
• A donates its address (and aliases) to computer B
• B notifies its neighbors that A has left

SEARCH Protocol
• Searching computer A issues a query
• A sends the query to computer B, the node in the corresponding interest cluster that has the same postfix as A
• B broadcasts the query to its cluster
• Computers in the queried cluster respond directly to A with relevant results
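One way to read "same postfix" is sketched below. This is an interpretation, not something the slides spell out: it assumes a cluster is the set of n-bit addresses that begin with the topic's Huffman prefix, and that the entry point B keeps A's trailing bits after that prefix. The function name `responsible_address` and the example values are hypothetical.

```python
def responsible_address(own_addr: int, target_prefix: str, n: int) -> int:
    """Address in the target cluster that shares own_addr's trailing bits.

    Assumption: the target cluster is every n-bit address starting with
    target_prefix, and "same postfix" means identical low-order bits.
    """
    free_bits = n - len(target_prefix)           # bits not fixed by the prefix
    postfix = own_addr & ((1 << free_bits) - 1)  # keep A's trailing bits
    prefix_value = int(target_prefix, 2) if target_prefix else 0
    return (prefix_value << free_bits) | postfix


# A holds address 01011 in a 5-D system; the target cluster has prefix "10".
entry = responsible_address(own_addr=0b01011, target_prefix="10", n=5)
print(format(entry, "05b"))  # 10011
```

Under this reading, different searchers enter the target cluster at different addresses, which would spread the entry load across the cluster.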

Cluster Expansion
• Runs whenever a computer wants to join but its cluster is full
• Query the utilization rates of neighboring clusters
• Choose a neighboring cluster
• The chosen cluster splits and loans the upper half of its addresses
• Computers holding upper-half addresses rejoin in the lower half

Cluster Expansion Issues
• Expansion and splitting cause partitions
• Clusters are no longer a single hypercube
• System restoration consolidates clusters
• If the cluster can’t be expanded or the system is full, the system must be expanded

System Expansion
• Easier than cluster expansion
• Addresses gain an additional bit, so the entire system doubles in size
• Each node becomes two
• Each cluster doubles in size
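That each cluster doubles follows from the cluster-size formula on the Tree Creation slide, assuming (the slides do not state this explicitly) that every topic keeps its Huffman prefix when n grows to n + 1:

```latex
\[
2^{(n+1) - \mathrm{length}(\mathrm{prefix}[I_j])}
  = 2 \cdot 2^{\,n - \mathrm{length}(\mathrm{prefix}[I_j])}
\]
```

Summing over all clusters gives a system of $2^{n+1}$ addresses, twice the original $2^n$.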

Performance Setup
• Uses data from the Open Directory Project
• Compares Gnutella and PeerCluster
• Determined the “query efficiency”, which is the ratio of files found to query messages sent
• Varied the Search Limit (SL), which acts like a TTL value
• Also varied the number of interest clusters
• Base 4 vs. base 2

Performance

Questions?
