
The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks

The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks. D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava. Cathy Wang, 05-04-2006. Outline: Introduction, Problem Definition, The TJA Algorithm, Conclusion.





Presentation Transcript


  1. The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks. D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava. Cathy Wang, 05-04-2006

  2. Outline • Introduction • Problem Definition • The TJA Algorithm • Conclusion

  3. Introduction • Works for distributed sensor networks • Finds the k highest-ranked answers • Minimizes the number of tuples transferred • Resolves queries inside the network, minimizing bandwidth consumption and delay

  4. Problem Definition • R: n attributes (sensors), each featuring m objects • G(V, E): the network graph that interconnects the n vertices in V using the edge set E [Figure: the local scores of five objects o1..o5 located at nodes v1..v5, and the network graph G over V1..V5]

  5. Problem Definition • Q = (q1, q2, ..., qn): a top-k query with n attributes • Score function (monotone): Score(oi) = Σ_{j=1..n} wj * sim(qj, oij), where wj is a weight factor and sim(qj, oij) is the percentage of similarity to the most similar object in dimension j • Ex: for o1:(s1=100F, s2=90F, s3=80F) and o2:(s1=100F, s2=70F, s3=80F) with wj=1, the top-1 object for the query Q=(max(temp), max(temp), max(temp)) is o1, because Score(o1)=3.0 (i.e. 1*1.0 + 1*1.0 + 1*1.0) and Score(o2)=2.77 (i.e. 1*1.0 + 1*0.77 + 1*1.0)
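The arithmetic in this example can be checked with a short sketch. It assumes the score is the weighted sum that the worked numbers imply, and that for a max() query the similarity is the reading divided by the best reading in that dimension. The function name `scores` and the dictionary layout are illustrative choices, not part of the presentation.

```python
def scores(objects, weights):
    """Weighted-sum score per object.

    Assumption: sim = value / best value in the same dimension,
    which matches the slide's max(temp) example.
    """
    n = len(weights)
    # Best (largest) reading seen in each dimension j.
    best = [max(vals[j] for vals in objects.values()) for j in range(n)]
    return {
        oid: sum(weights[j] * vals[j] / best[j] for j in range(n))
        for oid, vals in objects.items()
    }


# Readings from the slide: o1 = (100F, 90F, 80F), o2 = (100F, 70F, 80F), wj = 1.
readings = {"o1": (100, 90, 80), "o2": (100, 70, 80)}
result = scores(readings, weights=(1, 1, 1))
top1 = max(result, key=result.get)
print(top1, result)
```

Under these assumptions o1 scores 3.0 and o2 scores 1 + 70/90 + 1, about 2.78; the slide truncates the middle term to 0.77, which gives its 2.77.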

  6. The TJA Algorithm • Three phases: • Lower Bound phase – construct a threshold • Hierarchical Joining phase – each node eliminates objects below the lower bound, and joins qualifying objects from children nodes • Clean-Up phase – actual top-k results are identified

  7. Lower Bound Phase • Identify a set of objects that are used to construct a threshold • list(vi): the elements of node vi in descending order of similarity • listk(vi): the k highest-ranked local objects of list(vi) • L(vi): partial lower bound, the union (∪) of listk(vi) with the partial lower bounds received from the children of vi • Complete lower bound: LqueryNode=Ltotal={l1, l2,…, lo}, o ≥ k • Ex: Find the time moment with the highest average temperature [Figure: tree over nodes V1..V5 in which each node unions (∪) the object IDs received from its children; legend: empty Oij, occupied Oij; at the root, Ltotal = {1,3}]
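A minimal sketch of the lower bound phase, assuming the partial lower bound of a node is the union of its own k best object IDs with the partial lower bounds of its children. The three-node tree, the scores, and the name `lower_bound` are hypothetical and chosen only for illustration.

```python
def lower_bound(node, children, local_lists, k):
    """Return L(node): the node's k best object ids unioned with its children's L."""
    # list(vi): object ids in descending order of local score.
    ranked = sorted(local_lists[node], key=local_lists[node].get, reverse=True)
    ids = set(ranked[:k])  # listk(vi)
    for child in children.get(node, []):
        ids |= lower_bound(child, children, local_lists, k)
    return ids


# Hypothetical network: v1 is the query node with children v2 and v3.
children = {"v1": ["v2", "v3"]}
local_lists = {
    "v1": {"o1": 0.90, "o2": 0.50, "o3": 0.40},
    "v2": {"o3": 0.80, "o1": 0.70, "o2": 0.10},
    "v3": {"o1": 0.95, "o3": 0.60, "o2": 0.30},
}
l_total = lower_bound("v1", children, local_lists, k=1)
print(sorted(l_total))
```

With k=1 each node contributes a single ID, so only object identifiers travel up the tree in this phase, not full score lists.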

  8. Hierarchical Joining Phase • Propagate Ltotal to all nodes in the network • Each node vi searches list(vi) and identifies the lowest-ranked object (idx) that belongs to Ltotal • The objects above idx are the candidates, listidx(vi) • If vi is a leaf node, it forwards listidx(vi) to its parent; otherwise it receives listidx(vj) from its children and joins (+) them with its own candidates to get a partial result • Superset of the final top-k result: RqueryNode=Rtotal={r1, r2,…, ro}, o ≥ k, ex: Rtotal={(O1, 3.63),(O3, 4.05),(O’4, 3.54)} [Figure: tree over nodes V1..V5 in which each node joins (+) the candidate objects received from its children; legend: empty Oij, occupied Oij; at the root, Rtotal = {1,3,4}]
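The joining step can be sketched as follows. This is an illustration under stated assumptions: every node holds a score for every object, the candidate list includes the object at idx itself, and the join adds partial scores while counting how many nodes contributed. The tree, the scores, and the name `hierarchical_join` are hypothetical.

```python
def hierarchical_join(node, children, local_lists, l_total):
    """Return {object id: (partial score, number of contributing nodes)}."""
    ranked = sorted(local_lists[node].items(), key=lambda kv: kv[1], reverse=True)
    # idx: position of the lowest-ranked local object that belongs to Ltotal.
    idx = max(i for i, (oid, _) in enumerate(ranked) if oid in l_total)
    # Candidates: everything ranked down to idx (inclusive, by assumption).
    partial = {oid: (score, 1) for oid, score in ranked[: idx + 1]}
    for child in children.get(node, []):
        child_result = hierarchical_join(child, children, local_lists, l_total)
        for oid, (score, count) in child_result.items():
            s, c = partial.get(oid, (0.0, 0))
            partial[oid] = (s + score, c + count)  # the "+" join
    return partial


children = {"v1": ["v2", "v3"]}
local_lists = {
    "v1": {"o1": 0.90, "o2": 0.50, "o3": 0.40},
    "v2": {"o3": 0.80, "o1": 0.70, "o2": 0.10},
    "v3": {"o1": 0.95, "o3": 0.60, "o2": 0.30},
}
r_total = hierarchical_join("v1", children, local_lists, l_total={"o1", "o3"})
print(r_total)
```

In this toy run o1 and o3 arrive with contributions from all three nodes, while o2 is reported only by v1, so its score is incomplete.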

  9. Clean-Up Phase • If any object has an upper bound higher than the score of the k-th complete result, compute the exact scores of those objects: • request the exact scores from the children • objectR’(vi): fetch all objects in R’ • join the lists from the children to get the full score of each object in R’, giving Ctotal • from Ctotal, compute the final top-k answers
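A rough sketch of the clean-up decision at the query node. The slide does not say how the upper bound is computed, so this example assumes a single hypothetical value `missing_bound` that caps every local score the query node has not yet seen. The partial results, the `fetch_exact` callback, and all names are illustrative.

```python
def clean_up(partial, n_nodes, k, missing_bound, fetch_exact):
    """Return the final top-k as a list of (object id, score) pairs."""
    # Objects scored by every node are complete.
    complete = {o: s for o, (s, c) in partial.items() if c == n_nodes}
    ordered = sorted(complete.values(), reverse=True)
    kth = ordered[k - 1] if len(ordered) >= k else float("-inf")
    for oid, (score, count) in partial.items():
        if count == n_nodes:
            continue
        # Assumed upper bound: each missing local score is at most missing_bound.
        upper = score + (n_nodes - count) * missing_bound
        if upper > kth:
            complete[oid] = fetch_exact(oid)  # second round trip to the children
    return sorted(complete.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical partial results: (partial score, number of contributing nodes).
partial = {"o1": (2.55, 3), "o3": (1.8, 3), "o2": (0.5, 1)}
exact_scores = {"o2": 0.9}  # what the children would report for o2
requested = []


def fetch_exact(oid):
    requested.append(oid)
    return exact_scores[oid]


top = clean_up(partial, n_nodes=3, k=2, missing_bound=0.7, fetch_exact=fetch_exact)
print(top, requested)
```

Here the bound for o2 exceeds the second-best complete score, so its exact score is fetched. It then turns out to be too low to displace o1 and o3.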

  10. Conclusion This paper • studies the problem of finding the k highest-ranked answers to a user query in a sensor network environment • uses a fixed number of phases • deploys in-network aggregation to minimize the utilization of the network

  11. Thank You! Have a great break!
