Constructing Scalable Overlays for Pub/Sub With Many Topics


### Constructing Scalable Overlays for Pub/Sub With Many Topics

Problems, Algorithms, and Evaluation

G. Chockler, R. Melamed, Y. Tock, IBM Haifa Research Lab

R. Vitenberg, University of Oslo

Publish/Subscribe (Pub/Sub)

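[Figure: nodes N1 to N5 attached to a message bus. Subscription(N1) = {B,C,D}; the other subscriptions shown are {A,B,C,E}, {A,D}, {A,X}, and {A,B,X}. Publish(M1, A) sends message M1 on topic A, and the bus delivers M1 to nodes subscribed to A.]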

Scalability of Pub/Sub

- Most traditional pub/sub systems are geared towards small-scale deployment
  - E.g., Isis MDS, TIB, MQSeries, Gryphon
- New generation of applications…
  - Large data centers: Amazon, Google, Yahoo, eBay, …
  - RSS, feed/news readers, on-line stock trading and banking
  - Web 2.0, Second Life
- …drive dramatic growth in scale
  - 10,000s of nodes, 1000s of topics, Internet-wide distribution
- Emerging systems address this trend using P2P techniques

Overlay-Based Pub/Sub

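[Figure: overlay over nodes N1 to N5 with subscriptions {A,B,C,E}, {B,C,D}, {A,D}, {A,X}, and {A,B,X}. The publication (M1, A) is forwarded hop by hop along overlay links; one node is labeled "Relay".]

- SCRIBE
- Corona
- Feedtree
- Sub-2-Sub
- TERA
- ...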

Overlay Topologies for Pub/Sub

- A “good” overlay allows for efficient and simple publication routing
  - Small routing tables, low load on relays, low latency
- Ideally, the overlay is topic-connected, i.e., there is one connected component for each topic-induced sub-graph
  - Most existing implementations construct topic-connected overlays
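The topic-connectivity condition can be checked mechanically. The following is a minimal sketch, not taken from the slides: the function name `is_topic_connected`, the dictionary representation of subscriptions, and the assignment of subscription sets to nodes in the usage note are illustrative assumptions.

```python
from collections import defaultdict


def is_topic_connected(interest, edges):
    """Return True if, for every topic, the subscribers of that topic
    form one connected component of the topic-induced sub-graph.

    interest: dict mapping node -> set of topics (hypothetical representation)
    edges:    iterable of (node, node) pairs, treated as undirected
    """
    topics = set().union(*interest.values())
    for topic in topics:
        members = {v for v, ts in interest.items() if topic in ts}
        # Keep only edges whose two endpoints both subscribe to this topic.
        adj = defaultdict(set)
        for u, v in edges:
            if u in members and v in members:
                adj[u].add(v)
                adj[v].add(u)
        # Depth-first search from an arbitrary subscriber.
        start = next(iter(members))
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if seen != members:
            return False
    return True
```

As a usage example, assume `interest = {"N1": {"B","C","D"}, "N2": {"A","B","C","E"}, "N3": {"A","D"}, "N4": {"A","B","X"}, "N5": {"A","X"}}`. With edges N1-N2, N2-N3, N4-N5 the check fails, because the subscribers of A split into two components and the two subscribers of D are not linked. Adding N1-N3 and N2-N4 makes every topic connected.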

Topic-Connectivity

[Figure: overlay over nodes N1 to N5 with subscriptions {A,B,C,E}, {B,C,D}, {A,D}, {A,X}, and {A,B,X}. Annotation: topics A and D are disconnected.]

- Node degree grows linearly with the subscription size
  - Roughly twice the average subscription size for rings/trees
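To see where the factor of two comes from, here is a small sketch that assumes the simple solution links the subscribers of each topic in their own ring. That reading, and the helper names `per_topic_ring_edges` and `degrees`, are assumptions made for illustration. In a ring every subscriber has at most two neighbors per topic, so a node's degree is at most twice its subscription size; it is lower only when rings of different topics happen to share an edge.

```python
from collections import Counter


def per_topic_ring_edges(interest):
    """Connect the subscribers of each topic in a ring (hypothetical helper).

    interest: dict mapping node -> set of topics
    Returns a set of undirected edges, each a frozenset of two nodes.
    """
    edges = set()
    topics = set().union(*interest.values())
    for topic in topics:
        members = sorted(v for v, ts in interest.items() if topic in ts)
        if len(members) < 2:
            continue  # a single subscriber needs no links for this topic
        for i, u in enumerate(members):
            v = members[(i + 1) % len(members)]
            edges.add(frozenset((u, v)))  # duplicates across topics collapse
    return edges


def degrees(edges):
    """Count the number of distinct neighbors of every node."""
    deg = Counter()
    for edge in edges:
        for node in edge:
            deg[node] += 1
    return deg
```

For any input, `degrees(per_topic_ring_edges(interest))[v]` is at most `2 * len(interest[v])`, which is the linear growth the slide refers to.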

[Figure: overlay over nodes N1 to N5 with subscriptions {A,B,C,E}, {B,C,D}, {A,D}, {A,X}, and {A,B,X}.]

Scalability of the Simple Solution

- Negative impact on performance due to
  - CPU load: neighbor monitoring, message processing
  - Connection maintenance and header overhead
  - Memory overhead: per-link state associated with routing and/or compression schemes being used, etc.
- Scalability barrier for large systems offering a wide range of subscription choices

Can we do better?

The Min-TCO Problem

- Minimum Topic-Connected Overlay (Min-TCO) problem:
  - For a set of nodes V, a set of topics T, and Interest: V × T → {true, false}
  - Construct a topic-connected overlay G with the minimum possible number of edges (or average degree)
- TCO (decision version):
  - Decide whether there is a topic-connected overlay consisting of k edges (for a given k)
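The problem statement can be written more formally as follows. The notation $V_t$ for the subscribers of topic $t$ and $G[V_t]$ for the sub-graph induced by them is introduced here for convenience and does not appear on the slides.

```latex
\[
V_t = \{\, v \in V : \mathrm{Interest}(v, t) = \mathrm{true} \,\}, \qquad t \in T
\]
\[
\text{Min-TCO:} \quad \min_{E \subseteq V \times V} |E|
\quad \text{subject to} \quad G[V_t] \text{ is connected for every } t \in T,
\ \text{where } G = (V, E)
\]
\[
\text{average degree of } G = \frac{2\,|E|}{|V|}
\]
```

Since $|V|$ is fixed by the input, minimizing the number of edges and minimizing the average degree are the same objective.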

Complexity of TCO

[Figure: example instance with nodes N1 to N5 and subscriptions {B,C,D}, {A,B}, {A,D}, {A,B,C,D}, and {A,C}.]

Lemma: TCO(V, T, Interest, k) ∈ NP

Proof: Topic-connectivity is verifiable in polynomial time

Lemma: TCO(V, T, Interest, k) is NP-hard

Proof:

- Define an auxiliary problem, Single Node TCO (SN-TCO): decide whether there is a topic-connected overlay in which the degree of a single given node is at most d
- Set Cover is polynomially reducible to SN-TCO
- SN-TCO is polynomially reducible to TCO

Theorem: TCO is NP-complete

Approximating Min-TCO

- The idea: exploiting subscription overlaps
  - Connecting the nodes with overlapping interests improves connectivity of several topics at once
- Greedy Merge (GM) algorithm:
  - Start from a singleton connected component for each (v, t) ∈ V × T
  - At each iteration: add an edge that reduces the number of connected components for the biggest number of topics
  - Stop once there is a single connected component for each topic
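The three GM steps can be sketched in Python as follows. This is an unoptimized illustration under stated assumptions, not the authors' implementation: the function name `greedy_merge`, the dictionary input, the union-find bookkeeping, and the tie-breaking rule (first best pair in sorted order) are all choices made here. Components are created only for pairs (v, t) where v subscribes to t.

```python
from itertools import combinations


def greedy_merge(interest):
    """Greedy Merge sketch: repeatedly add the edge that merges connected
    components for the largest number of topics.

    interest: dict mapping node -> set of topics (hypothetical representation)
    Returns the list of chosen edges in the order they were added.
    """
    nodes = sorted(interest)
    # One singleton component per (node, topic) pair the node subscribes to.
    parent = {(v, t): (v, t) for v in nodes for t in interest[v]}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def gain(u, v):
        # Number of topics whose component count drops if edge (u, v) is added.
        return sum(1 for t in interest[u] & interest[v]
                   if find((u, t)) != find((v, t)))

    candidates = list(combinations(nodes, 2))
    edges = []
    while candidates:
        best = max(candidates, key=lambda e: gain(*e))
        if gain(*best) == 0:
            break  # every topic already has a single component
        u, v = best
        for t in interest[u] & interest[v]:
            parent[find((u, t))] = find((v, t))  # merge per-topic components
        edges.append(best)
    return edges
```

Each pass rescans all node pairs and all shared topics, which matches the unoptimized cost structure rather than the weight-maintenance variant.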

Greedy Merge

[Figure: Greedy Merge illustrated on nodes N1 to N5 with subscriptions {B,C,D}, {A,B,C,E}, {A,D}, {A,X}, and {A,B,X}.]

GM Running Time

- O(|V|⁴|T|)
  - At most |V|² iterations
  - At most |V|² edges inspected at each iteration
  - At most |T| steps to inspect an edge
- Can be optimized to run in O(|V|²|T|)
  - For each e ∈ V × V, weight(e) = the number of connected components merged by e
  - At each iteration, output the heaviest edge and adjust the other edge weights accordingly
  - Stop once there are no more edges with weight > 0
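The unoptimized bound is the product of the three per-level limits listed above, written out here as a worked step:

```latex
\[
\underbrace{|V|^2}_{\text{iterations}}
\cdot
\underbrace{|V|^2}_{\text{edges inspected per iteration}}
\cdot
\underbrace{|T|}_{\text{steps per edge}}
= |V|^4\,|T|
\]
```

The iteration count is bounded by $|V|^2$ because every iteration adds a new edge and a graph on $|V|$ nodes has fewer than $|V|^2$ edges.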

Approximability Results

Lemma:

- The number of edges in the overlay constructed by GM is ≤ log(|V||T|) · OPT

Proof: Similar to that of the approximation ratio of the greedy algorithm for Set Cover

- There exists an input on which GM’s output meets this ratio

Theorem: No polynomial-time algorithm can approximate Min-TCO within a constant factor (unless P=NP)

Proof: The existence of such an algorithm would imply the existence of a constant-factor approximation for Set Cover, which is known to be impossible (unless P=NP)
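For readers unfamiliar with it, the greedy algorithm for Set Cover that the approximation proof refers to can be sketched as follows. This is the standard textbook heuristic, shown only for comparison with Greedy Merge; the function name `greedy_set_cover` and the list-of-sets input are illustrative assumptions.

```python
def greedy_set_cover(universe, subsets):
    """Standard greedy heuristic for Set Cover.

    universe: iterable of elements to cover
    subsets:  list of sets
    Returns the indices of the chosen subsets, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements.
        best = max(range(len(subsets)),
                   key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen
```

The parallel with GM is that both pick, at every step, the item with the largest immediate gain: uncovered elements here, merged per-topic components there.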

More Overlay Design Problems

- Filtering: Given an upper bound d on the node degree, minimize the number of relays used to connect each topic
  - Captures the cases when full topic-connectivity is infeasible because of resource constraints
- Diameter: Given an upper bound d on the node degree, minimize the diameter of each topic in the overlay
  - Latency-optimal routing under resource constraints
- …

Conclusions

- Initiated formal study of the problem of designing efficient and scalable overlay topologies for pub/sub
- Defined a representative problem (Min-TCO) capturing the cost of constructing topic-connected overlays
- NP-Completeness, polynomial approximation, inapproximability results

- Empirical evaluation showed effectiveness of our approximation algorithm on practical inputs

Future Directions

- Study the dynamic case
- Investigate other overlay design problems
- Study the distributed case
  - Partial knowledge of other nodes' interests
  - Dynamically changing interest assignments
