Scalable Matching Algorithm in Publish/Subscribe Systems

Practical Theory Perspectives CS598ig – Fall 04 Presented by: Mayssam Sayyadian

Publish/Subscribe System • Event notification system • Producer publishes messages • Consumer waits for certain types of events by placing subscriptions • Basic components to be defined: • Information space • Subscriptions • Events (event schema) • Notifications • Many applications and examples: stock information delivery, auction systems, air traffic control, news feeds, network monitoring, etc.

Research Issues • System architecture • Matching and dispatching • Routing • Reliable message delivery • Security • Special application issues • Mobile environments…

Pub/Sub Systems: Examples • IBM – Gryphon • Stanford – SIFT and more… • CU-Boulder – Siena • France – Le Subscribe • Technische Universität Darmstadt – REBECA • Microsoft – Herald • MIT • Others – XMLBlaster, Elvin4, TIB, Keryx

Earlier Classification • Subject based (channel based) • System contains many channels • Subscriptions and notifications belong to a specific channel • Simple and straightforward matching • Restrictive • Content based • No channels • Notifications are sent to subscribers based on their content • More generic • Matching suffers from a scaling problem (addressed in this paper)

Content Based Matching Problem • Naïve solution: • Test each incoming event against every subscription • Linear in the number of subscriptions • Not practical • Requirement: • Matching and dispatching should be sub-linear in the number of subscriptions • Intuition: • Combine parts of subscriptions to reduce the number of tests per event
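The naïve solution can be sketched as follows. This is a minimal illustration only: the tuple encoding of predicates, the names `OPS`, `satisfies` and `naive_match`, and the sample subscriptions are all assumptions made for this sketch and do not come from any of the systems discussed.

```python
import operator

# Hypothetical encoding: a subscription is a list of (attribute, op, value)
# predicates that must all hold (a conjunction).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def satisfies(event, subscription):
    """True if the event satisfies every predicate of the subscription."""
    return all(
        attr in event and OPS[op](event[attr], value)
        for attr, op, value in subscription
    )

def naive_match(event, subscriptions):
    """Test the event against every subscription: one pass over all N of them."""
    return [name for name, sub in subscriptions.items() if satisfies(event, sub)]

# Illustrative data only.
subscriptions = {
    "s1": [("city", "=", "LA"), ("temperature", "<", 40)],
    "s2": [("city", "=", "NYC")],
    "s3": [("temperature", ">", 30)],
}
event = {"city": "LA", "temperature": 35}
matched = naive_match(event, subscriptions)
```

Every event costs one pass over all subscriptions, which is the linear behaviour the slide calls impractical.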

Event Forwarding Algorithms • Decision trees • Use a tree structure to describe the event matching information • Forwarding an event means walking it through the tree structure • Example: Gryphon • Hash functions • Use hash functions to index the components of notifications • Use other efficient ways to find matching notifications • Example: Le Subscribe

The Big Picture: The Information Bus [Figure: picture from “The Information Bus – An Architecture for Extensible Distributed Systems”, Brian M. Oki et al., SOSP 1993]

A Scalable Matching Algorithm • “Matching Events in a Content-based Subscription System”, M. K. Aguilera et al. – IBM • Addresses the scalability of matching algorithms • Sub-linear in the number of subscriptions • Space complexity: linear • Preprocesses the subscriptions • Assumes subscriptions are updated (almost) infrequently

Matching Algorithm • Classification? • Consider a decision tree classifier with the subscriptions as the set of possible classes • Analyze subscriptions • $sub := pr_1 \wedge pr_2 \wedge pr_3$ • Conjunction of elementary predicates $pr_i = test_i(e) \rightarrow res_i$ • e.g. (city = LA) and (temperature < 40) • $pr_1 = test_1(\ldots) \rightarrow$ LA • $pr_2 = test_2(\ldots) \rightarrow$ “<” • $test_1$ = “examine attribute city” • $test_2$ = “examine attribute temperature, compared with 40”

Matching Algorithm (Cont’d.) • Preprocess to build the matching tree • Each non-leaf node is a test • Each edge from a test node is a possible result • Each leaf node is a subscription • Preprocess each of the subscriptions and combine the information to prepare the tree • On receiving an event, follow the sequence of test nodes and edges until a leaf node is reached

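Matching Tree • Don’t care tests • Related tests: $sub_3 = (test_1 \rightarrow res_1) \wedge (test_2 \rightarrow res_2)$, $sub_4 = (test_3 \rightarrow res_3) \wedge (test_4 \rightarrow res_4)$, where $(test_3 \rightarrow res_3) \Rightarrow (test_1 \rightarrow res_1)$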

Matching Tree (Equality Tests) • Conjunction of equality tests: $sub_1 = (attr_1 = v_1) \wedge (attr_2 = v_2) \wedge (attr_3 = v_3)$; $sub_2 = (attr_1 = v_1) \wedge (attr_2 = *) \wedge (attr_3 = v_3')$; $sub_3 = (attr_1 = v_1') \wedge (attr_2 = v_2) \wedge (attr_3 = v_3)$
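A matching tree for equality tests might be sketched as below, using sub1, sub2 and sub3 from this slide. This is a simplified illustration, not the paper's implementation: the `Node` class, the fixed attribute order, and the function names are assumptions, and none of the optimizations are included.

```python
STAR = "*"  # "don't care" edge label

class Node:
    def __init__(self):
        self.edges = {}   # test result (attribute value or STAR) -> child Node
        self.subs = []    # subscriptions stored at a leaf

def build_tree(subscriptions, attributes):
    """Preprocess: insert each subscription as a root-to-leaf path."""
    root = Node()
    for name, sub in subscriptions.items():
        node = root
        for attr in attributes:            # one test level per attribute
            node = node.edges.setdefault(sub.get(attr, STAR), Node())
        node.subs.append(name)
    return root

def match(root, event, attributes):
    """Follow the edge for the event's value and the '*' edge at each level."""
    frontier = [root]
    for attr in attributes:
        next_frontier = []
        for node in frontier:
            for key in {event[attr], STAR}:
                child = node.edges.get(key)
                if child is not None:
                    next_frontier.append(child)
        frontier = next_frontier
    return sorted(name for node in frontier for name in node.subs)

attributes = ["attr1", "attr2", "attr3"]
subscriptions = {
    "sub1": {"attr1": "v1", "attr2": "v2", "attr3": "v3"},
    "sub2": {"attr1": "v1", "attr2": STAR, "attr3": "v3'"},
    "sub3": {"attr1": "v1'", "attr2": "v2", "attr3": "v3"},
}
tree = build_tree(subscriptions, attributes)
result = match(tree, {"attr1": "v1", "attr2": "v2", "attr3": "v3'"}, attributes)
```

Subscriptions that share a prefix of test results share a path in the tree, so one test on an event serves all of them at once.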

Complexity • Assumptions: • All attributes have the same value set • Only equality tests are performed • No related tests in the tree • Events come from a uniform distribution • Pre-processing: • Time complexity: $O(NK)$, for $K$ attributes and $N$ subscriptions • Space complexity: $O(NK)$ • Matching complexity: • Expected time to match a random event: $O(N^{1-\lambda})$, sub-linear • $\lambda = \ln V / (\ln V + \ln K')$; note $0 < \lambda < 1$ • $V$: number of possible values for each attribute • $K'$: number of attributes in the schema + 1, i.e. $K' = K + 1$ • What about the worst case?
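To get a feel for the bound, the exponent $1 - \lambda$ can be evaluated numerically. The sketch below uses V = 3 and K = 30 purely as illustrative inputs; the function name is an assumption, and since O-notation hides constant factors the printed numbers only indicate growth, not real operation counts.

```python
import math

def matching_exponent(num_values, num_attributes):
    """Return 1 - lambda, with lambda = ln V / (ln V + ln K') and K' = K + 1."""
    k_prime = num_attributes + 1
    lam = math.log(num_values) / (math.log(num_values) + math.log(k_prime))
    return 1 - lam

exponent = matching_exponent(num_values=3, num_attributes=30)

# Compare N (naive, linear) with N^(1 - lambda) for a few subscription counts.
for n in (1_000, 25_000, 1_000_000):
    print(n, round(n ** exponent))
```

Because $\lambda$ shrinks as $K'$ grows relative to $V$, the exponent approaches 1 for schemas with many attributes and few values per attribute.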

Optimizations • Collapse a chain of * edges (60% gain) • Example: collapse B to A • Statically pre-compute successor nodes (20% gain) • Separate sub-trees for attributes that rarely have don’t care in subscriptions

Performance • Operations per event • Space per event = edges + successor nodes • Latency: 4 ms for 25,000 subscriptions • Attributes vary in popularity, following Zipf’s distribution • Tests for 30 attributes with 3 possible values • The distribution always yielded 100 matches per event [Plots: operations per event; space (thousands of cells)]

Discussion Points • Topology matters! • What about non-equality based subscriptions? • If content based subscriptions are used with equality tests only, are there other ways to achieve sub-linear matching times? • Exact vs. approximate results • What if: • Subscriptions change frequently over time • Stream of subscriptions • Multi-dimensional events

“Computation in Networks of Passively Mobile Finite-State Sensors”, Dana Angluin, James Aspnes, Zoe Diamadi, Michael Fischer, Rene Peralta, PODC 2004.

The Problem … A Flock of Birds! • Birds: finite state agents (sensors with states) • Resources are limited • Passive mobility (no control) • Communication: how much? • Problems • Is there a solution? • What are the probable solutions?

A Wider View • Question: • What computations are possible in a cooperative network of passively mobile finite-state sensors? • Assumptions: • Mobility is passive (not under the sensors’ control) • Sufficiently rapid and unpredictable (no stable routing strategy) • Complete communication • Identical sensors: no identifiers

Formal Model: Population Protocols • Population protocol (A): • Finite input and output alphabets: $X$, $Y$ • A finite set of states: $Q$ • An input function: $I : X \to Q$ • An output function: $O : Q \to Y$ • A transition function: $\delta : Q \times Q \to Q \times Q$ • Transitions: $(p, q) \to (p', q')$ if $\delta(p, q) = (p', q')$

Formal Model (Cont’d) • A population protocol runs in a population of any finite size $n$ • Population P: • A set $A$ of $n$ agents with an irreflexive relation $E \subseteq A \times A$, interpreted as the directed edges of an interaction graph • Population configuration: • A mapping $C : A \to Q$ • Specifies the state of each member of the population • Computation: • A finite or infinite sequence of population configurations $C_0, C_1, C_2, \ldots$ such that $\forall i : C_i \to C_{i+1}$
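The model can be exercised with a small simulation. The sketch below is an illustration under assumptions: it uses the complete interaction graph, a uniformly random scheduler (a choice made here, not part of the formal definition), and a hypothetical OR protocol in which Q = {False, True}; the names `run_protocol` and `or_delta` are invented for this example.

```python
import random

def run_protocol(inputs, input_fn, delta, output_fn, steps, seed=0):
    """Simulate a population protocol on the complete interaction graph."""
    rng = random.Random(seed)
    states = [input_fn(x) for x in inputs]       # initial configuration C_0
    for _ in range(steps):
        i, j = rng.sample(range(len(states)), 2) # ordered pair of distinct agents
        states[i], states[j] = delta(states[i], states[j])  # C_i -> C_{i+1}
    return [output_fn(q) for q in states]

def or_delta(p, q):
    """If either agent has seen a 1, both remember it afterwards."""
    seen = p or q
    return seen, seen

outputs = run_protocol([0, 0, 1, 0, 0], input_fn=bool, delta=or_delta,
                       output_fn=int, steps=2000)
```

Once every agent holds True no transition changes any output, which is the kind of output-stable configuration the model relies on; with all-zero input the configuration never changes at all.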

Formal Models: Computation • No halting, but stabilizing! • Stabilizing is a global property of the population • Individual agents do not know whether they have stabilized • Under some stochastic assumptions, the number of interactions before the outputs stabilize can be bounded • To model computation: • What is the input assignment? • What should the output assignment be? • Definition of an output-stable configuration • Formally define: stably computing an input-output relation by a population protocol • Writing $F_A(x) = y$ for $R(x, y)$, A stably computes the partial function $F_A : X \to Y$

Functions • Population protocols compute partial functions from X to Y • Suitable input and output encodings are needed for functions on other domains • Functions with multiple arguments • Predicates on X • Integer functions

A Stably Computable Expression Language • Closure properties: • If f and g are stably computable, then so are $\neg f$, $f \wedge g$ and $f \vee g$ • Parity (whether there is an odd number of 1’s in the input) • Majority • Arithmetic functions • Stably computable expression language • An upper bound on the set of stably computable predicates: all predicates stably computable in the model with all pairs enabled are in the class NL → an exact characterization is an open problem
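Parity gives a concrete feel for stable computation. The sketch below uses one common leader-based construction, which is an assumption here and not necessarily the protocol given in the paper: each agent's state is a (leader, bit) pair, two leaders merge by XOR-ing their bits, and followers copy the bit of any leader they meet. The random scheduler and all names are likewise illustrative.

```python
import random

def parity_delta(p, q):
    """Transition on states (is_leader, bit)."""
    (p_leader, p_bit), (q_leader, q_bit) = p, q
    if p_leader and q_leader:
        merged = p_bit ^ q_bit                  # XOR over all leaders is invariant
        return (True, merged), (False, merged)  # second agent becomes a follower
    if p_leader:
        return p, (False, p_bit)                # follower copies the leader's bit
    if q_leader:
        return (False, q_bit), q
    return p, q                                 # two followers: nothing happens

def run_parity(bits, steps=20000, seed=1):
    rng = random.Random(seed)
    states = [(True, b) for b in bits]          # every agent starts as a leader
    for _ in range(steps):
        i, j = rng.sample(range(len(states)), 2)
        states[i], states[j] = parity_delta(states[i], states[j])
    return [bit for _, bit in states]

outputs = run_parity([1, 0, 1, 1, 0, 0, 1])     # four 1's, so parity is 0
```

Once a single leader remains its bit equals the input parity and never changes again, so every follower eventually copies the correct value and the outputs stabilize, even though no agent can tell that this has happened.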

Other Issues • Restricted interactions • Some interaction graphs permit powerful computations • E.g. a population whose interaction graph is a directed line → linear-space Turing machine • The complete graph (discussed so far) is the weakest structure for computing predicates • ⇒ Any weakly connected graph can simulate it

Randomized Interactions • Measures other than stability • Add probabilistic assumptions on interactions • Consider computations that are correct with high probability • Questions about expected resource use • Benefits of a leader • Simulating counters: the model can simulate $O(1)$ counters of capacity $O(n)$ • How to elect a leader → use ideas from the majority and parity functions • The set of predicates accepted by a randomized population protocol with probability $\frac{1}{2} + \epsilon$ is contained in $P \cap RL$

Discussion Points • So what?! • Theoretical fundamentals always help • Consider the interaction graph as input → what interesting properties of the underlying interaction graph could be stably computed? → applications in analyzing the structure of sensor nets • Consider one-way communication • Assume sampling models other than uniform: where does this help? • Formal methods + methodology • Remember converting differential equations into distributed protocols • What do you THINK? • Formalizing computation → apply methodology

“Performance Evaluation of a Communication Round over the Internet”, Omar Bakr, Idit Keidar, PODC’02. Some slides taken from Omar Bakr’s presentation

Communication Round • Exchange of information from all hosts to all hosts • Part of many distributed algorithms and systems • consensus, atomic commit, replication, ... • Evaluation → some metric • Number of rounds (or steps) required • How long is it going to take? • Local running time of one host engaged • Overall running time • What is the best way to implement it? • Centralized vs. decentralized

Example Implementations • All-to-all • Leader • Secondary leader [Figure: panels (a), (b), (c) illustrating the implementations]

Experiment I • 10 hosts: Taiwan, Korea, US academia, ISPs • TCP/IP (connections always up) • Algorithms: • All-to-all • Leader (initiator) • Secondary leader (not initiator) • Periodically initiated at each host • 650 times over 3.5 days

Overall Running Time: • Elapsed time from initiation (at initiator) until all hosts terminate • Requires estimating clock differences • Clocks are not synchronized and drift • We compute the difference over short intervals • Computed 3 different ways • Achieves accuracy within 20 ms on 90% of runs • Overall Running Times From MIT • Ping-measured latencies (IP): • Longest link latency: 240 milliseconds • Longest link to MIT: 150 milliseconds

Measured Running Times Runs Initiated at MIT / Taiwan

What’s going on? • Loss rates on two links are very high • 42% and 37% • Taiwan to two ISPs in the US • Loss rates on other links are up to 8% • Upon loss, TCP’s timeout is large • More than the round-trip time • All-to-all sends messages on lossy links • Often delayed by loss

Distribution of Running Times Up to 1.3 sec.

Removing Taiwan • Overall running times much better • For every initiator and algorithm, less than 10% over 2 seconds (as opposed to 55% previously) • All-to-all overall still worse than the others! • Either Leader or Secondary Leader is best, depending on the initiator • Loss rates of 2%–8% are not negligible • All-to-all sends $O(n^2)$ messages and suffers • But all-to-all has the best local running times

Probability of Delay due to Loss • Suppose all links had the same latency • Assume 1% loss on all links; 10 hosts ($n = 10$) • Leader sends $3(n-1) = 27$ messages • Probability of at least one loss: $1 - 0.99^{27} \approx 24\%$ • All-to-all sends $n(n-1) = 90$ messages • Probability of at least one loss: $1 - 0.99^{90} \approx 60\%$ • In reality, links don’t have the same latency • Only loss on long links matters • Each communication has a cost!
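The arithmetic on this slide can be reproduced directly. The sketch assumes, as the slide implicitly does, that each message is lost independently with the same probability; the function name is hypothetical.

```python
def p_at_least_one_loss(messages, loss_rate):
    """Probability that at least one of `messages` independent sends is lost."""
    return 1 - (1 - loss_rate) ** messages

n = 10                       # hosts
loss_rate = 0.01             # 1% loss on every link

leader_msgs = 3 * (n - 1)    # 27 messages for the leader-based round
all_to_all_msgs = n * (n - 1)  # 90 messages for all-to-all

p_leader = p_at_least_one_loss(leader_msgs, loss_rate)
p_all_to_all = p_at_least_one_loss(all_to_all_msgs, loss_rate)
print(f"leader: {p_leader:.1%}, all-to-all: {p_all_to_all:.1%}")
```

Since the message count of all-to-all grows quadratically in n while the leader-based count grows linearly, the gap between the two probabilities widens as hosts are added.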

Discussion Points and Lessons Learned • The Internet is A VERY SPECIAL distributed system (not an ideal one!) • Message loss causes high variation in TCP link latencies • Latency distribution has high variance and a heavy tail • Latency distribution determines the expected time for receiving $O(n)$ concurrent messages • Secondary leader helps • No triangle inequality, especially for loss • Different for overall vs. local running times • Number of rounds/steps is not a sufficient metric • One-to-all and all-to-all have different costs
