
Influence Zone: Efficiently Processing Reverse k Nearest Neighbors Queries



Presentation Transcript


  1. Influence Zone: Efficiently Processing Reverse k Nearest Neighbors Queries. Presented by: Muhammad Aamir Cheema. Joint work with Xuemin Lin, Wenjie Zhang, Ying Zhang. University of New South Wales, Australia.

  2. Introduction
     • Nearest Neighbor Query: find the user that is closest to the query facility.
     • Reverse Nearest Neighbor Query (RNN): find every user for which the query facility is the closest facility.
     • Reverse k Nearest Neighbor Query (RkNN): find every user for which the query facility is one of the k closest facilities.
     • Monochromatic RkNN queries are also supported.
     • Example (figure with query facility q, facilities f1 to f3 and users u1 to u4): u3 is the nearest neighbor of q; u1 and u2 are RNNs of q; u1, u2 and u3 are R2NNs of q.
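The RkNN definition above can be checked directly with a brute-force scan. The following is only a sketch of the definition, not the method presented in the slides; the function name `rknn_brute_force` and the sample coordinates are made up for illustration, and `facilities` is assumed to exclude the query facility q itself.

```python
from math import dist

def rknn_brute_force(q, users, facilities, k):
    """Return the users for which q is one of their k closest facilities."""
    result = []
    for u in users:
        d_q = dist(u, q)
        # Count the facilities that are strictly closer to u than q is.
        closer = sum(1 for f in facilities if dist(u, f) < d_q)
        if closer < k:
            result.append(u)
    return result

# Hypothetical data: one query facility, two other facilities, three users.
q = (0.0, 0.0)
facilities = [(4.0, 0.0), (0.0, 5.0)]
users = [(1.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
print(rknn_brute_force(q, users, facilities, k=1))
print(rknn_brute_force(q, users, facilities, k=2))
```

This scan costs one distance computation per user-facility pair, which is the cost the pruning techniques on the following slides aim to avoid.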

  3. Preliminaries: Half-space Pruning [VLDB04] (RNN query)
     • For a user u in the half-space of f2, dist(u,f2) < dist(u,q), so u cannot be an RNN of q. The half-space that contains f2 can therefore be pruned.
     • Filtering: repeat until there is no facility in the unpruned area: find a nearby facility in the unpruned space, then prune by using its half-space.
     • Containment: the users that are contained in the unpruned space are the candidates.
     • Verification: for each candidate user u, confirm u as an RNN if no facility is within range dist(u,q) of u.

  4. Preliminaries: Half-space Pruning [VLDB04] (R2NN query)
     • The space that is contained in k (or more) half-spaces can be pruned.
     • Filtering: repeat until there is no facility in the unpruned area: find a nearby facility in the unpruned space, then prune by using its half-space.
     • Containment: the users that are in the unpruned space are the candidate objects.
     • Verification: for each candidate user u, confirm u as an RkNN if fewer than k facilities are within range dist(u,q) of u.
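The half-space test itself reduces to one linear inequality: a point p is closer to f than to q exactly when (p - m) · (f - q) > 0, where m is the midpoint of f and q. A minimal sketch, with the helper names `in_half_space` and `is_pruned` chosen here for illustration:

```python
def in_half_space(p, f, q):
    """True if p lies strictly on f's side of the bisector between f and q."""
    ax, ay = f[0] - q[0], f[1] - q[1]              # normal of the bisector
    mx, my = (f[0] + q[0]) / 2, (f[1] + q[1]) / 2  # midpoint, lies on the bisector
    return ax * (p[0] - mx) + ay * (p[1] - my) > 0

def is_pruned(p, q, facilities, k):
    """True if p is contained in at least k half-spaces, so p cannot be an RkNN of q."""
    return sum(in_half_space(p, f, q) for f in facilities) >= k
```

For example, with q = (0, 0) and a facility at (4, 0), the point (3, 0) lies in the pruned half-space while (1, 0) does not.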

  5. Preliminaries (R2NN query)
     • The filtering, containment and verification steps are the same as on slide 4.
     • FINCH [VLDB08] approximates the unpruned region by a convex polygon.

  6. Preliminaries (example with k = 2)
     • Notations: for any point p, Cp denotes the circle centered at p with radius r = dist(p,q); |Cp| denotes the number of facilities inside Cp.
     • Influence Zone Zk: an area such that |Cp| ≥ k for every point p not in Zk, and |Cp'| < k for every point p' in Zk.
     • RkNN query: return every user u for which |Cu| < k, or equivalently every user u in Zk.

  7. Advantage
     • Existing algorithms: Pruning (prune the data space); Containment (candidates = objects in the unpruned space); Verification (for each candidate object, verify whether q is one of its k nearest neighbors).
     • Our algorithm: Pruning (compute the influence zone); Containment (result = objects that are inside the influence zone); no verification phase.

  8. Naïve Approach
     • For each facility f, draw the half-space between f and q.
     • Zk is the space that is pruned by at most k-1 half-spaces.
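For the special case k = 1 the naïve approach can be sketched with plain polygon clipping, because the space pruned by no half-space is the intersection of half-planes and is therefore convex. The sketch below assumes a rectangular data universe and k = 1 only; for larger k the zone is generally not convex and this clipping does not apply directly. All function names are illustrative.

```python
def bisector_half_plane(f, q):
    """Return (a, b) so that a . p <= b holds for points at least as close to q as to f."""
    a = (f[0] - q[0], f[1] - q[1])
    b = (f[0] ** 2 + f[1] ** 2 - q[0] ** 2 - q[1] ** 2) / 2
    return a, b

def clip(polygon, a, b):
    """Keep the part of a convex polygon that satisfies a . p <= b."""
    out = []
    n = len(polygon)
    for i in range(n):
        p, nxt = polygon[i], polygon[(i + 1) % n]
        sp = a[0] * p[0] + a[1] * p[1] - b
        sn = a[0] * nxt[0] + a[1] * nxt[1] - b
        if sp <= 0:
            out.append(p)
        if (sp < 0 < sn) or (sn < 0 < sp):
            # The edge crosses the bisector: add the crossing point.
            t = sp / (sp - sn)
            out.append((p[0] + t * (nxt[0] - p[0]), p[1] + t * (nxt[1] - p[1])))
    return out

def polygon_area(polygon):
    """Shoelace formula for the area of a simple polygon."""
    n = len(polygon)
    s = sum(polygon[i][0] * polygon[(i + 1) % n][1]
            - polygon[(i + 1) % n][0] * polygon[i][1] for i in range(n))
    return abs(s) / 2

def naive_zone_k1(q, facilities, universe):
    """Influence zone for k = 1: clip the universe by every facility's bisector."""
    zone = list(universe)
    for f in facilities:
        a, b = bisector_half_plane(f, q)
        zone = clip(zone, a, b)
    return zone

# Hypothetical data: four facilities placed symmetrically around q.
universe = [(0, 0), (10, 0), (10, 10), (0, 10)]
zone = naive_zone_k1((5, 5), [(9, 5), (1, 5), (5, 9), (5, 1)], universe)
print(zone, polygon_area(zone))
```

Every facility is processed here, which is what makes the approach naïve; the observations on the following slides identify facilities whose half-space cannot change the zone.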

  9. Observation 1
     • A facility f can be ignored if it lies outside Cp for every point p inside the current unpruned polygon.
     • Intuition: if a facility f lies outside Cp, the half-space between f and q cannot prune p. If f lies outside every Cp, the half-space of f cannot prune any p.

  10. Observation 2
     • A facility f can be ignored if it lies outside Cp for every point p on the boundary of the current unpruned polygon.
     • Intuition: for every p' inside the polygon, there exists a p on the boundary such that Cp contains Cp'.

  11. Observation 3
     • A facility f can be ignored if it lies outside Cv for every vertex v of the current unpruned polygon.
     • Intuition: CA ∪ CB contains CC (A, B and C are points labeled in the slide's figure).

  12. Observation 4
     • A facility f can be ignored if it lies outside Cv for every convex vertex v of the current unpruned polygon.
     • Intuition: any vertex v' that is not a convex vertex lies inside the convex polygon and therefore does not need to be checked.
     • The above pruning condition is tight.
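The final pruning condition amounts to a handful of distance comparisons. In the sketch below the caller is assumed to supply the convex vertices of the current polygon; how they are found is not shown. The name `can_ignore` is illustrative, and a facility lying exactly on a circle Cv is conservatively not ignored.

```python
from math import dist

def can_ignore(f, q, convex_vertices):
    """True if facility f lies outside Cv for every supplied convex vertex v.

    Cv is the circle centered at v with radius dist(v, q).
    """
    return all(dist(v, f) > dist(v, q) for v in convex_vertices)

# Hypothetical zone: the square [3, 7] x [3, 7] around q = (5, 5).
q = (5, 5)
vertices = [(3, 3), (7, 3), (7, 7), (3, 7)]
print(can_ignore((20, 5), q, vertices))  # far facility, its bisector x = 12.5 misses the square
print(can_ignore((8, 5), q, vertices))   # near facility, its bisector x = 6.5 cuts the square
```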

  13. Algorithm
     • Initialize Zk as the data universe.
     • Insert the root of the R-tree into the heap.
     • While the heap is not empty:
         • Deheap an entry e.
         • If e cannot be pruned:
             • If e is an intermediate node, insert the children of e into the heap.
             • Else, use the half-space of e to update Zk.
     • Pruning rule: if e lies outside Cv for every convex vertex v of Zk, then e can be pruned.
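The control flow of this loop can be sketched independently of the geometry. In the sketch below the R-tree is replaced by a hypothetical in-memory `Node` class, the heap key is assumed to be something like the minimum distance from the entry to q (the slide does not state the ordering), and the pruning test and zone update are passed in as callbacks rather than implemented.

```python
import heapq
from itertools import count

class Node:
    """Hypothetical stand-in for an R-tree entry."""
    def __init__(self, key, children=None, point=None):
        self.key = key                  # heap priority (assumed: min distance to q)
        self.children = children or []  # non-empty for intermediate nodes
        self.point = point              # facility location for leaf entries

def compute_zone(root, zone, can_prune, update_zone):
    """Traverse the tree through a heap, updating the zone with unpruned facilities."""
    tie = count()  # tie-breaker so heapq never has to compare Node objects
    heap = [(root.key, next(tie), root)]
    while heap:
        _, _, e = heapq.heappop(heap)
        if can_prune(e, zone):
            continue
        if e.children:
            for child in e.children:
                heapq.heappush(heap, (child.key, next(tie), child))
        else:
            zone = update_zone(zone, e.point)
    return zone

# Toy run: the "zone" is just the list of facilities that were used to update it.
leaf1, leaf2, leaf3 = Node(2, point="p1"), Node(7, point="p2"), Node(6, point="p3")
root = Node(0, children=[Node(1, children=[leaf1, leaf2]), Node(6, children=[leaf3])])
used = compute_zone(root, [], lambda e, z: e.key > 5, lambda z, p: z + [p])
print(used)
```

Pruning an intermediate node skips its whole subtree, which is where the savings over the naïve approach come from.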

  14. Other highlights of our algorithm
     • Observations to efficiently prune certain entries
     • Efficient determination of convex vertices
     • Proof that the influence zone is always a star-shaped polygon
     • Efficient containment checks are possible for star-shaped polygons
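One plausible way to exploit the star shape for containment checks is an angular binary search. The sketch below is not taken from the slides: it assumes that the polygon is star-shaped with respect to q (q sees every vertex), that the vertices have distinct angles around q, and the function names are illustrative. After sorting the vertices by angle once, each check finds the wedge containing the point and tests a single edge.

```python
from bisect import bisect_right
from math import atan2

def build_zone(q, vertices):
    """Sort the zone's vertices counterclockwise by their angle around q."""
    ordered = sorted(vertices, key=lambda v: atan2(v[1] - q[1], v[0] - q[0]))
    angles = [atan2(v[1] - q[1], v[0] - q[0]) for v in ordered]
    return ordered, angles

def contains(q, zone, p):
    """True if p is inside (or on the boundary of) the star-shaped zone."""
    ordered, angles = zone
    theta = atan2(p[1] - q[1], p[0] - q[0])
    i = bisect_right(angles, theta) - 1  # -1 wraps around to the last vertex
    a, b = ordered[i], ordered[(i + 1) % len(ordered)]
    # p is inside if it is on the left of (or on) the counterclockwise edge a -> b.
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross >= 0

# Hypothetical non-convex, star-shaped zone around q = (0, 0).
q = (0, 0)
star = build_zone(q, [(4, 0), (1, 1), (0, 4), (-1, 1), (-4, 0), (-1, -1), (0, -4), (1, -1)])
print(contains(q, star, (2, 0.5)), contains(q, star, (3, 0.5)), contains(q, star, (2, 2)))
```

Each check costs a binary search over the vertices plus one cross product, compared with a scan over all edges for a general point-in-polygon test.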

  15. RkNN Processing
     • Static RkNN queries: Pruning Phase (compute Zk), then Containment Phase (return the users inside Zk).
     • Continuous Bichromatic RkNN queries: compute Zk; the users that enter Zk become RkNNs, and the users that leave it are no longer RkNNs.
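The continuous case can be sketched as bookkeeping on a result set. The sketch below assumes that Zk stays fixed between user location updates and that a containment predicate `in_zone` is available; both the predicate and the function name `update_results` are illustrative, not part of the slides.

```python
def update_results(results, user_id, new_position, in_zone):
    """Update the RkNN result set after one user location update.

    Returns "entered", "left", or None if the user's membership is unchanged.
    """
    if in_zone(new_position):
        if user_id not in results:
            results.add(user_id)
            return "entered"
        return None
    if user_id in results:
        results.discard(user_id)
        return "left"
    return None

# Hypothetical zone: the square [3, 7] x [3, 7].
in_zone = lambda p: 3 <= p[0] <= 7 and 3 <= p[1] <= 7
results = set()
events = [update_results(results, "u1", (5, 5), in_zone),
          update_results(results, "u1", (6, 4), in_zone),
          update_results(results, "u1", (9, 9), in_zone)]
print(events, results)
```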

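  16. Theoretical Analysis
     • Area of influence zone = k / |F|
     • Number of RkNNs = |U| · k / |F|
     • IO cost of computing Zk = (expression missing from the transcript), where S = number of facilities (i.e., |F|) and f = fanout of the R-tree; the definition of r is also missing.
     • IO cost of RkNN queries = IO cost of computing Zk + (expression missing from the transcript), where S = number of users (i.e., |U|) and f = fanout of the R-tree; the definition of r is also missing.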
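As a worked instance of the first two formulas, take the uniform setting used later in the experiments, |F| = |U| = 100,000, with a hypothetical k = 4. The area formula is read here as assuming a data space of unit area, which the slide does not state explicitly.

```latex
\[
\text{Area}(Z_k) = \frac{k}{|F|} = \frac{4}{100{,}000} = 4 \times 10^{-5},
\qquad
\text{Number of RkNNs} = \frac{|U| \cdot k}{|F|} = \frac{100{,}000 \cdot 4}{100{,}000} = 4 .
\]
```

When the two sets have equal size, the expected number of RkNNs is simply k.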

  17. Experiments
     • Snapshot RkNN Queries
     • Compared against FINCH [1]; page size is 4 KB and the number of buffers is 10.
     • Real dataset containing 175,812 locations in North America
     • Half of the points, chosen at random, form the set of facilities; the remaining half form the set of users.
     [1] W. Wu, F. Yang, C. Y. Chan, K. L. Tan. FINCH: Evaluating Reverse k Nearest Neighbor Queries on Location Data. VLDB, 2008.

  18. Experiments
     • Snapshot RkNN Queries
     • The facilities are from the real dataset.
     • The users follow a Normal distribution.

  19. Experiments
     • Verification of Theoretical Analysis
     • 100,000 facilities following a Uniform distribution
     • 100,000 users following a Uniform distribution

  20. Experiments
     • Verification of Theoretical Analysis
     • 100,000 facilities following a Uniform distribution
     • 100,000 users following a Uniform distribution

  21. Experiments
     • Continuous RkNN Queries
     • Moving objects and queries are generated using the Brinkhoff Generator [1] on the road map of Texas.
     • Data space is 1000 km × 1000 km.
     • Our algorithm (InfZone) is compared with LazyUpdates [2].
     [1] T. Brinkhoff. A framework for generating network-based moving objects. GeoInformatica, 2002.
     [2] M. A. Cheema, X. Lin, Y. Zhang, W. Wang, W. Zhang. Lazy Updates: An Efficient Technique to Continuously Monitoring Reverse kNN. PVLDB, 2009.

  22. Experiments

  23. Thanks

  24. Experiments • Snapshot RkNN Queries

  25. Experiments • Snapshot RkNN Queries

  26. Relationship with Voronoi
