Count / Top-k Continuous Queries on P2P Networks

Count / Top-k Continuous Queries on P2P Networks 01/11/2006

Outline • Problem Definition • P2P Architecture • Count • Top-K • Experiment Setup • Future Work

Streaming Data in P2P • P2P • Dynamically changing topology, large scale, … • Streaming data • Continuous, unbounded, rapid, time-varying, noisy • P2P + Streaming data • Dynamic in both data and topology

Objective and Goal • Objective • Issue a continuous query to estimate count and top-K • Goal • Reduce communication cost • Lightweight maintenance • Approximate answers • An adaptive and progressive approach

Naïve approach • Continuously flood the overlay • Pros • Closer to the exact answer • Cons • Network congestion • Still not real-time

The State-of-the-Art • Count • Focuses on one-time answers in P2P • Or deals with streaming data only • Top-K • P2P environments without streaming data • Distributed environments that are not P2P

P2P architecture • Assumption • Hierarchical P2P (the focus) • Super-peer hierarchical structure • Query issuer is a super-peer • Super-peers connect with other super-peers • Each peer belongs to only one super-peer • Pure unstructured P2P

Big picture • Group • Accumulate information within a group based on the constraint and statistics • Report changes • SetConstraint • Approximated answer

Group in hierarchical P2P [Figure, three builds: hierarchy showing Issuer, Coordinator, and Peer roles; labels 1 to 4 appear in the later builds]

After partition • Assume we have N objects and K groups after partition [Figure: Group1, Group2, Group3]

User-specified Epsilon • User-specified ε (precision) [Figure: Group1, Group2, Group3]

Consider a group [Figure: objects O1, O2, O3; nodes P1 to P4; one coordinator]

Each node maintains the distribution information of the objects it owns [Figure: count (#) per object at each node P1 to P4, with rates R1 to R4]

Initially: polling [Figure: coordinator and nodes P1 to P4]

Information at coordinator after polling [Figure: count (#) per object across P1 to P4; values shown: 26, 33, 22]

Statistics information
        P1     P2     P3     P4     Δ
O1      1/1    6/6    10/10  5/5    22
O2      11/11  13/13  5/5    9/7    36
O3      15/15  6/6    3/3    9/9    33
R       0.3    0.2    -0.05  0.6
T       15     15     17     13
Callouts on the slide: estimated value; latest real value; change value for each object; R = maximum changing rate (+/-) of objects in each peer; T = updated time stamp.

Update to Coordinator [Figure: peers send the change vectors (Δ11, Δ21, Δ31), (Δ12, Δ22, Δ32), and (Δ13, Δ23, Δ33) to the coordinator; label: T2]

Calculate Count
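The formula on this slide did not survive extraction. In the table on the Statistics information slide, however, the last column equals the row sum of the second number in each cell. The following minimal sketch reproduces that arithmetic; reading the second number as the per-peer estimate is an assumption, and `group_counts` is a hypothetical helper name.

```python
# Second number of each cell in the statistics table, one list per object,
# ordered P1..P4. Treating it as the per-peer estimate is an assumption.
estimates = {
    "O1": [1, 6, 10, 5],
    "O2": [11, 13, 5, 7],
    "O3": [15, 6, 3, 9],
}


def group_counts(per_peer_estimates):
    """Return the estimated group-wide count of each object (sum over peers)."""
    return {obj: sum(values) for obj, values in per_peer_estimates.items()}


counts = group_counts(estimates)
print(counts)  # {'O1': 22, 'O2': 36, 'O3': 33}
```

The printed totals match the last column of the table (22, 36, 33).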

Redistribute Epsilon • $w_i = \max(\Delta_i) / C_{x,0}$, where $x$ is the $i$-index of $\max(\Delta_i)$ • $\delta_i = w_i \, \varepsilon \, C_{x,0} / \sum w_i$
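One way to read the two formulas on this slide is sketched below. Several things are assumptions, not stated on the slide: that Δi is peer i's list of per-object changes (given as non-negative magnitudes), that x is the object with the largest change at that peer, and that C_{x,0} is a baseline count for object x at that peer. The function name `redistribute_epsilon` and its parameters are hypothetical.

```python
def redistribute_epsilon(deltas, base_counts, epsilon):
    """Split the precision budget epsilon across peers.

    deltas[i][k]      -- change of object k at peer i (assumed non-negative)
    base_counts[i][k] -- assumed meaning of C_{k,0}: baseline count of object k at peer i
    Returns the list of per-peer budgets delta_i.
    """
    weights = []
    bases = []
    for peer_deltas, peer_bases in zip(deltas, base_counts):
        # x: index of the object with the largest change at this peer
        x = max(range(len(peer_deltas)), key=lambda k: peer_deltas[k])
        weights.append(peer_deltas[x] / peer_bases[x])  # w_i = max(delta_i) / C_{x,0}
        bases.append(peer_bases[x])
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero; the formula is undefined")
    # delta_i = w_i * epsilon * C_{x,0} / sum(w)
    return [w * epsilon * c / total for w, c in zip(weights, bases)]


budgets = redistribute_epsilon(
    deltas=[[1, 4], [2, 1]],
    base_counts=[[10, 20], [10, 10]],
    epsilon=0.1,
)
print(budgets)
```

With these made-up inputs both peers get the same weight (0.2), so their budgets differ only through the baseline count: roughly 1.0 for the first peer and 0.5 for the second.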

Visiting sequence • Pick the peers that would violate δ [Figure: nodes P1 to P4]
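The slide does not say how a violation of δ is predicted. The sketch below shows one possible rule, assuming the drift of a peer is its rate R multiplied by the time elapsed since its time stamp T. The R and T values come from the Statistics information slide; the per-peer δ values, the query time, and the names `PeerStats` and `peers_to_visit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    rate: float        # R: maximum changing rate reported by the peer
    updated_at: float  # T: time stamp of the peer's last update
    delta: float       # per-peer error budget (made-up values below)


def peers_to_visit(stats, now):
    """Return the peers whose predicted drift exceeds their budget (assumed rule)."""
    return [
        name
        for name, s in stats.items()
        if abs(s.rate) * (now - s.updated_at) > s.delta
    ]


stats = {
    "P1": PeerStats(rate=0.3, updated_at=15, delta=6.0),
    "P2": PeerStats(rate=0.2, updated_at=15, delta=2.0),
    "P3": PeerStats(rate=-0.05, updated_at=17, delta=1.0),
    "P4": PeerStats(rate=0.6, updated_at=13, delta=5.0),
}
to_visit = peers_to_visit(stats, now=30)
print(to_visit)  # ['P2', 'P4']
```

The δ values were chosen so that P2 and P4 are selected, which are the two peers whose time stamps change on the following slide.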

Update information
        P1     P2     P3     P4     Δ
O1      1/1    6/6    10/10  8/8    -
O2      11/11  11/11  5/5    6/6    -
O3      15/15  5/5    3/3    11/11  -
R       0.3    0.4    -0.05  0.2
T       15     30     17     33
[Figure label: Group]

For those nodes not being visited
        P1     P2     P3     P4     Δ
O1      1/2    6/6    10/9   8/8    25
O2      11/13  11/11  5/4    6/6    34
O3      15/18  5/5    3/2    11/11  36
R       0.3    0.4    -0.05  0.2
T       15     30     17     33
[Figure label: Group]

Un-notified Leave • Ping: P1 is dead • Remove P1’s information [Figure: nodes P1 to P4]
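The clean-up step can be sketched as follows. The liveness check is passed in as a callback because the slide does not describe the ping mechanism; `prune_dead_peers` and `is_alive` are hypothetical names, and the per-peer records reuse the R and T values from the Update information slide.

```python
def prune_dead_peers(stats, is_alive):
    """Return a copy of the coordinator's statistics without unreachable peers."""
    return {peer: info for peer, info in stats.items() if is_alive(peer)}


coordinator_stats = {
    "P1": {"R": 0.3, "T": 15},
    "P2": {"R": 0.4, "T": 30},
    "P3": {"R": -0.05, "T": 17},
    "P4": {"R": 0.2, "T": 33},
}

# Stand-in for a real ping: here P1 simply never answers.
remaining = prune_dead_peers(coordinator_stats, is_alive=lambda peer: peer != "P1")
print(sorted(remaining))  # ['P2', 'P3', 'P4']
```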

Experiment Setup • Generate synthetic data sets from statistical distributions for • Streaming data • Peer lifetimes • Metrics • Message size • Communication cost • Response latency • Result accuracy

Top-K • Use regression to predict the trend of changes • When an updated result is required, the super-peer only needs to ask the doubtful peers about the doubtful objects • Update its counting list and return the top-k objects
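The regression and ranking steps could look roughly like the sketch below. It assumes an ordinary least-squares line per object, which the slide does not specify, and it leaves out the step of querying doubtful peers. The function names and the history data are hypothetical.

```python
import heapq


def linear_trend(times, values):
    """Least-squares line through (times, values); returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        return 0.0, mean_v  # a single time point gives no trend
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def predicted_top_k(history, now, k):
    """Rank objects by their count extrapolated to time `now`; return the top k."""
    predictions = {}
    for obj, (times, values) in history.items():
        slope, intercept = linear_trend(times, values)
        predictions[obj] = slope * now + intercept
    return heapq.nlargest(k, predictions, key=predictions.get)


history = {
    "O1": ([0, 1, 2], [10, 12, 14]),  # rising
    "O2": ([0, 1, 2], [20, 19, 18]),  # falling
    "O3": ([0, 1, 2], [5, 5, 5]),     # flat
}
top2 = predicted_top_k(history, now=6, k=2)
print(top2)  # ['O1', 'O2']
```

In this made-up history O2 leads at the last observation (18 versus 14), but the extrapolated trend puts O1 first at time 6, which is the kind of change the regression is meant to anticipate.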

Future Work • Connect and recommend latent good friends for each user • Good friends: the ones with the same interests (behaviors) • Exploit currently connected peers to discover good friends bit by bit • Design a system that builds clusters reflecting the current interests of individual peers and connects them based on their similarity, using each user’s social network

Advantages • Reduce search time and query traffic by using the friends list • By exploiting the varying strength of their ties (arcs/edges), i.e. "friendshipness", social networks outperform random-walk networks at quickly finding target objects

Example [Figure: Level 1, Level 2]

Example [Figure: one element has larger weight than another; labels: Similarity, Score(Ni)]
