
Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments

Xiao Zhang1, Wang-Chien Lee1, Prasenjit Mitra1, 2, Baihua Zheng3

1 Department of Computer Science and Engineering

2 College of Information Science and Technology

The Pennsylvania State University

3 School of Information Systems, Singapore Management University

EDBT, Nantes, France, 03/28/2008



Outline

  • Background

  • Problem Analysis

  • New TNN Algorithms

  • Optimization

  • Experiments

  • Conclusions & Future Work



Background – TNN

  • What is TNN?

    • S is a set of banks

    • R is a set of restaurants

    • TNN distance = 5 + 1 = 6 (the two legs of the example trip: query point → bank → restaurant)



Background – TNN

  • What is TNN?

  • Given a query point p and two datasets S and R, TNN returns a pair of objects (s, r) such that ∀(s’, r’)∈S×R,

    dis(p, s) + dis(s, r) ≤ dis(p, s’) + dis(s’, r’)

    where dis(p, s) is the Euclidean distance between p and s (a brute-force sketch follows this slide).

  • First proposed by Zheng, Lee and Lee [1].

[1] B. Zheng, K. C. Lee, and W.-C. Lee. Transitive Nearest Neighbor Search in Mobile Environments. SUTC 2006.
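
The definition above can be illustrated with a brute-force baseline. The Python sketch below is not the broadcast algorithm from the paper; the dis helper and the sample coordinates are assumptions for illustration only.

```python
import math
from itertools import product

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tnn_brute_force(p, S, R):
    """Return the pair (s, r) in S x R minimizing dis(p, s) + dis(s, r)."""
    return min(product(S, R), key=lambda sr: dis(p, sr[0]) + dis(sr[0], sr[1]))

# Example: p visits a bank (S) first, then a restaurant (R).
p = (0.0, 0.0)
S = [(1.0, 0.0), (4.0, 3.0)]       # banks
R = [(1.0, 2.0), (10.0, 10.0)]     # restaurants
print(tnn_brute_force(p, S, R))    # -> ((1.0, 0.0), (1.0, 2.0))
```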



Background - broadcast

  • The server holds all the data and broadcasts it as radio signals over broadcast channels.

  • Mobile clients (cell phones and PDAs) tune in to the broadcast channels, download the necessary data, and process queries.

  • Broadcast vs. on-demand:

    • Supports simultaneous access by an arbitrary number of mobile devices

    • Efficient use of limited bandwidth

    • Light workload on the server side



Background - motivation

  • Assumption:

    • Zheng, Lee and Lee assumed a single broadcast channel.

    • Based on existing technology (dual-mode, dual-standby cell phones), we assume multiple channels.

    • A mobile client can access information in multiple channels simultaneously

  • Challenges:

    • How to utilize the parallel processing ability of mobile clients to facilitate query processing?

    • How to reduce access time?

    • How to reduce energy consumption?



Our contributions:

  • 1. We developed two new algorithms for TNN queries in multi-channel access environments.

  • 2. We proposed two new distance metrics (MinTransDist and MinMaxTransDist) that allow the new algorithms to reduce the search cost.

  • 3. We proposed an optimization technique to reduce energy consumption.



Background – settings

  • 1. Two broadcast channels, one for S and one for R

  • 2. 2-dimensional points

  • 3. Air indexing: R-tree [2]

  • 4. Broadcast in depth-first order to avoid backtracking (a serialization sketch follows this slide)

  • 5. (1, m) interleaving [3]

  • 6. Performance metrics (in number of pages):

    • Access time

    • Tune-in time

[2] A. Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching. SIGMOD 1984.

[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on Air: Organization and Access. IEEE TKDE 1997.
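
Setting 4 above (broadcast in depth-first order) can be sketched as follows. The RTreeNode structure is hypothetical; the paper's actual on-air page format and the (1, m) interleaving are not modeled here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical in-memory R-tree node, used only to illustrate the ordering.
@dataclass
class RTreeNode:
    mbr: Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax)
    children: List["RTreeNode"] = field(default_factory=list)
    objects: List[Tuple[float, float]] = field(default_factory=list)  # leaf payload

def broadcast_order(root: RTreeNode) -> List[RTreeNode]:
    """Serialize the tree in depth-first order: the order in which nodes
    would be placed on the channel so a client can process the index in a
    single forward scan, without backtracking to earlier pages."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so they are broadcast left-to-right.
        stack.extend(reversed(node.children))
    return order
```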



Problem Analysis

  • Randomly choose any pair of objects (s’, r’) and use its transitive distance as a search range

  • The range is guaranteed to enclose the answer pair (s, r)



Problem Analysis

  • Theorem [1]:

    • The transitive distance determined by any pair of objects (s, r) is an upper bound on that of the answer pair.

  • General idea of answering TNN queries:

    • Estimate: find a search range from the query point p by searching the index

    • Filter: filter out unqualified data objects within the search range to find the pair of objects with minimum transitive distance (a generic sketch follows this slide)
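
A generic sketch of this Estimate/Filter pattern. It is not one of the on-air algorithms: the seed pair is simply passed in, relying on the theorem above that any pair yields an upper bound, and the filter is a plain scan rather than an index traversal.

```python
import math

def dis(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tnn_estimate_filter(p, S, R, s_seed, r_seed):
    """Estimate a search range from a seed pair, then filter candidates."""
    # Estimate: the seed pair's transitive distance bounds the answer.
    bound = dis(p, s_seed) + dis(s_seed, r_seed)
    # Filter: only objects of S within 'bound' of p can be part of the answer.
    S_cand = [s for s in S if dis(p, s) <= bound]
    best, best_pair = bound, (s_seed, r_seed)
    for s in S_cand:
        for r in R:
            d = dis(p, s) + dis(s, r)
            if d <= best:
                best, best_pair = d, (s, r)
    return best_pair, best
```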



Problem Analysis

  • Deficiencies of existing algorithms:

    • Approximate-TNN-Search:

      • Uses an equation to estimate the search range in the first step

      • The estimated search range may be too large or too small

    • Window-Based-TNN-Search:

      • Performs two sequential NN searches in the estimation step

      • The search-range estimation is done sequentially

      • This leads to a large access time



New TNN Algorithms – Algorithm 1

  • Algorithm 1: Double-NN-Search

    • Issues two NN queries in the estimation step, one per channel (a sequential sketch follows this slide)

    • p’s NN in S and p’s NN in R

    • (s1, r2)
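
A sequential sketch of the estimation step. In the actual algorithm the two NN queries run in parallel, one per broadcast channel, and are answered over the air index; here they are replaced by linear scans and run one after the other, and the variable names are illustrative rather than the labels from the slide figure.

```python
import math

def dis(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nn(q, pts):
    """Plain nearest-neighbor search; stands in for the NN query a client
    would run over one broadcast channel."""
    return min(pts, key=lambda x: dis(q, x))

def double_nn_estimate(p, S, R):
    """Estimation step of Double-NN-Search, sketched sequentially.
    Any pair's transitive distance upper-bounds the answer, so the pair
    built from the two NN results gives an initial search range."""
    s1 = nn(p, S)     # p's NN in S (channel 1)
    r1 = nn(p, R)     # p's NN in R (channel 2)
    upper_bound = dis(p, s1) + dis(s1, r1)
    return (s1, r1), upper_bound
```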



New TNN Algorithms – Algorithm 2

  • Hybrid-NN-Search

    • Increases the interaction between the two channels

    • Uses the result of the finished NN search to guide the unfinished one, in order to reduce its search range

    • Uses new distance metrics to perform branch-and-bound

    • Treats the TNN distance as a whole



New TNN Algorithms – Algorithm 2 (cont.)

  • NN in Channel 1 finishes first

  • Already found s=p.NN(S)

  • Looking for r2, instead of r1



New TNN Algorithms – Algorithm 2 (cont.)

  • NN in channel 2 finishes first

  • Already found r=p.NN(R)

  • Looking for s2 instead of s1

  • Use new criteria when searching the index

  • Need new distance metrics for branch-and-bound



New TNN Algorithms – Algorithm 2 (cont.)

  • MinTransDist:

    • Lower bound on the transitive distance from p, through an MBR, to r

  • MinMaxTransDist:

    • Upper bound on the transitive distance from p, through an MBR, to r

  • Details are given in the paper (see also the backup slides).



New TNN Algorithms - Hybrid

  • Algorithm description:

    • If the NN searches in both channels are still unfinished, follow the Double-NN algorithm

    • If the NN search in Channel 1 (dataset S) finishes first, let s = p.NN(S), use s as the new query point, and perform an NN search on the remaining portion of the R-tree for dataset R

    • If the NN search in Channel 2 (dataset R) finishes first, change the distance metrics: use MinTransDist and MinMaxTransDist to perform branch-and-bound and find an s that minimizes the transitive distance



New TNN Algorithms - Hybrid

  • Updating and pruning strategy

    • Use a queue to keep the candidate MBRs, sorted by their arrival time

    • Case 2 (s = p.NN(S) finishes first):

      • Switch the NN query point to s

      • Initial upper-bound update:

        • If there is an intermediate result r’, update the upper bound with dis(p, s) + dis(s, r’)

        • Scan the queue of MBRs and apply the distance metrics of traditional NN queries



New TNN Algorithms - Hybrid

  • Updating and pruning strategy (cont.)

    • Case 3 (r = p.NN(R) finishes first):

      • If there is an intermediate result s’, use

        dis(p, s’) + dis(s’, r) as the new upper bound

      • Then scan all the MBRs in the queue and use

        z = min{ MinMaxTransDist(p, M_i, r) : M_i in the MBR queue } to update the upper bound

      • During traversal, use MinMaxTransDist to update the upper bound and MinTransDist for pruning (a sketch of this pass follows this slide)
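
A sketch of this Case-3 pruning pass, assuming the two metrics are available as functions (illustrative sketches of them follow the backup slides at the end). The queue handling and arrival-time bookkeeping of the on-air algorithm are not reproduced.

```python
def prune_mbr_queue(p, r, mbr_queue, best_so_far,
                    min_trans_dist, min_max_trans_dist):
    """Tighten the upper bound with MinMaxTransDist, then prune with
    MinTransDist.  'mbr_queue' holds candidate MBRs; 'best_so_far' is the
    current upper bound (e.g. from an intermediate result s')."""
    # Each MBR is guaranteed to contain an object no worse than its
    # MinMaxTransDist, so the smallest such value tightens the bound.
    z = best_so_far
    for M in mbr_queue:
        z = min(z, min_max_trans_dist(p, M, r))
    # Prune every MBR whose optimistic bound already exceeds z.
    survivors = [M for M in mbr_queue if min_trans_dist(p, M, r) <= z]
    return z, survivors
```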



New TNN Algorithms - Hybrid

  • Example for pruning:



Optimization

  • Goal: reduce energy consumption

  • Analysis:

    • Previous algorithms minimize the search range in the Estimate Step by issuing an “exact” search

    • Energy consumption in the Filter Step is low

    • Energy consumption in the Estimate Step is high

  • Approach:

    • Use an “approximate” search in the Estimate Step to save energy in this step



Optimization

  • Approximate Search:

    • Relax the pruning condition

    • Use the ratio of overlapping area to estimate the probability

    • Compare the ratio with a threshold α (one possible pruning test is sketched after this slide)
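
One possible form of the relaxed pruning test. Which two areas enter the ratio, and the use of an axis-aligned search box, are assumptions made here for illustration; the paper's exact probability model is not reproduced.

```python
def overlap_ratio(mbr, search_box):
    """Fraction of the MBR's area that overlaps an axis-aligned search box.
    Both rectangles are given as (xmin, ymin, xmax, ymax)."""
    ox = max(0.0, min(mbr[2], search_box[2]) - max(mbr[0], search_box[0]))
    oy = max(0.0, min(mbr[3], search_box[3]) - max(mbr[1], search_box[1]))
    area = (mbr[2] - mbr[0]) * (mbr[3] - mbr[1])
    return (ox * oy) / area if area > 0 else 0.0

def keep_for_approximate_search(mbr, search_box, alpha):
    """Relaxed pruning test: visit the MBR only if the overlap ratio
    (a proxy for the probability that it contributes to the answer)
    exceeds alpha.  With alpha = 0 every overlapping MBR is kept,
    i.e. the exact-search end of the scale."""
    return overlap_ratio(mbr, search_box) > alpha
```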



Optimization

  • How to determine α?

  • Factors:

    • R-tree height and node depth

      • Use a small α at the root and a larger α toward the leaves

    • Difference in the densities of the two datasets involved

      • Use a small α, or 0, on the dataset with the smaller density (a hypothetical α schedule is sketched below)

[Figure: the threshold α on a scale from 0 (exact search) to 1 (approximate search)]
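
A hypothetical schedule combining the two rules above. The value of alpha_max, the linear ramp over depth, and the fallback to 0 on the sparser dataset are assumptions for illustration, not parameters taken from the paper.

```python
def choose_alpha(depth, tree_height, density, other_density, alpha_max=0.5):
    """Pick a threshold alpha for a node at the given depth:
    small at the root, larger toward the leaves, and 0 (exact search)
    on the sparser of the two datasets."""
    if density < other_density:
        return 0.0                      # sparser dataset: keep the search exact
    if tree_height <= 1:
        return alpha_max                # degenerate one-level tree
    return alpha_max * depth / (tree_height - 1)   # 0 at root, alpha_max at leaves
```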



Performance Evaluation - settings

  • Dataset 1:

    • 39,000 × 39,000 square region

    • Densities: 10^-7.0, 10^-6.6, 10^-6.2, 10^-5.8, 10^-5.4, 10^-5.0, 10^-4.6, 10^-4.2

    • Number of points: 152, 382, 960, 2411, 6055, 15210, 38206, 95969

  • Dataset 2:

    • 39,000 × 39,000 square region

    • Number of points: 2,000 to 30,000, in increments of 2,000



Performance Evaluation - settings

  • R-tree as air index

  • Broadcast in depth-first order

  • STR packing algorithm [4]

  • (1, m) interleaving [3]

  • 1,000 query points generated for each experiment

[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on Air: Organization and Access. IEEE TKDE 1997.

[4] S. Leutenegger, M. Lopez, and J. Edgington. STR: A Simple and Efficient Algorithm for R-tree Packing. ICDE 1997.



Performance Evaluation

  • Algorithms with exact search:

    • Access time: Double-NN and Hybrid-NN have the same access time, which is smaller than that of Window-Based

    • This holds for 1/40 ≤ size(S) / size(R) ≤ 1.8



Performance Evaluation

  • Algorithms with exact search:

    • Tune-in time: when 0.01 ≤ size(S) / size(R) ≤ 0.4, Hybrid-NN gives the best tune-in time



Performance Evaluation

  • ANN vs. eNN (approximate vs. exact NN search in the Estimate Step)

    • The improvement in tune-in time ranges from 11% to 20%



Performance Evaluation

  • Hybrid algorithm with ANN:



Conclusions

  • Double-NN and Hybrid-NN effectively reduce access time

  • The cases in which our algorithms reduce tune-in time are identified and discussed

  • The optimization technique effectively reduces the tune-in time of all three algorithms



Future Work

  • Generalized TNN queries in broadcast environments:

    • More than two datasets are involved

    • The visiting order is not specified

    • Complete-route queries

  • Applying the new distance metrics in disk-based environments



Thank you!

  • Any questions?



New TNN Algorithms – distance metrics (backup slides)

  • Def 1: (MinTransDist)

    • Given two points p and r and an MBR M_S, MinTransDist(p, M_S, r) finds a point s ∈ M_S such that MinTransDist(p, M_S, r) = dis(p, s) + dis(s, r), and for any point s’ ∈ M_S with s’ ≠ s,

      dis(p, s’) + dis(s’, r) ≥ MinTransDist(p, M_S, r)  (a numeric approximation is sketched after this slide)
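
Def 1 is easy to approximate numerically. The grid-sampling sketch below is only an illustration of the definition; the paper computes this lower bound geometrically rather than by sampling.

```python
import math

def dis(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def min_trans_dist_approx(p, mbr, r, steps=200):
    """Approximate MinTransDist(p, M_S, r): minimize dis(p, s) + dis(s, r)
    over sample points s of the MBR (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    best = float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1):
            s = (xmin + (xmax - xmin) * i / steps,
                 ymin + (ymax - ymin) * j / steps)
            best = min(best, dis(p, s) + dis(s, r))
    return best
```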



New TNN Algorithms – distance metrics (backup slides)

  • Def 2: (MaxDist)

    • Given two points p and r, and a line segment ℓ, MaxDist(p, ℓ, r) = max_{i=1,2} { dis(p, v_i) + dis(v_i, r) }, where v_i (i = 1, 2) are the two endpoints of ℓ

    • MaxDist(p, ℓ, r) gives a tight upper bound on the transitive distance from p, through any point on ℓ, to r




New TNN Algorithms – distance metrics (backup slides)

  • Def 3: (MinMaxTransDist)

    • Given two points p and r, and an MBR M_S, MinMaxTransDist(p, M_S, r) = min_{1≤i≤4} { MaxDist(p, ℓ_i, r) }, where ℓ_i (1 ≤ i ≤ 4) are the four sides of the MBR M_S

  • Lemma:

    • Given a starting point p, an ending point r, and an MBR M_S enclosing a point dataset S, ∃ s ∈ S such that dis(p, s) + dis(s, r) ≤ MinMaxTransDist(p, M_S, r)  (a sketch of Defs 2 and 3 follows this slide)
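
Defs 2 and 3 translate directly into code. The sketch below follows the definitions above, with the MBR given as (xmin, ymin, xmax, ymax); by the lemma, the returned value can be used to tighten the upper bound during branch-and-bound.

```python
import math

def dis(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def max_dist(p, segment, r):
    """Def 2: MaxDist(p, l, r) for a segment given by its two endpoints."""
    v1, v2 = segment
    return max(dis(p, v1) + dis(v1, r), dis(p, v2) + dis(v2, r))

def min_max_trans_dist(p, mbr, r):
    """Def 3: the minimum of MaxDist over the four sides of the MBR."""
    xmin, ymin, xmax, ymax = mbr
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    sides = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    return min(max_dist(p, side, r) for side in sides)
```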

