Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments




Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments. Xiao Zhang 1 , Wang-Chien Lee 1 , Prasenjit Mitra 1, 2 , Baihua Zheng 3 1 Department of Computer Science and Engineering 2 College of Information Science and Technology The Pennsylvania State University


Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments

Xiao Zhang¹, Wang-Chien Lee¹, Prasenjit Mitra¹,², Baihua Zheng³

¹ Department of Computer Science and Engineering

² College of Information Science and Technology

The Pennsylvania State University

³ School of Information Systems, Singapore Management University

EDBT, Nantes, France, 03/28/2008

Outline
  • Background
  • Problem Analysis
  • New TNN Algorithms
  • Optimization
  • Experiments
  • Conclusions & Future Work
Background – TNN
  • What is TNN?
    • S is a set of banks
    • R is a set of restaurants
    • TNN distance = 5+1 = 6
Background – TNN
  • What is TNN?
  • Given a query point p and two datasets S and R, TNN returns a pair of objects (s, r) ∈ S×R such that, ∀(s’, r’) ∈ S×R,

dis(p, s) + dis(s, r) ≤ dis(p, s’) + dis(s’, r’)

where dis(p, s) is the Euclidean distance between p and s.

  • First proposed by Zheng, Lee and Lee [1].

[1] B. Zheng, K. C. Lee, and W.-C. Lee. Transitive nearest neighbor search in mobile environments. SUTC 2006
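
The definition above can be checked against a brute-force reference that tries every pair in S×R. This is only a minimal sketch for illustration, not one of the broadcast algorithms discussed later; the function names `dis` and `tnn_brute_force` and the sample coordinates are assumptions made for this example.

```python
import math
from itertools import product


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tnn_brute_force(p, S, R):
    """Return ((s, r), distance) minimizing dis(p, s) + dis(s, r) over S x R."""
    best = min(product(S, R), key=lambda sr: dis(p, sr[0]) + dis(sr[0], sr[1]))
    return best, dis(p, best[0]) + dis(best[0], best[1])


# Hypothetical data: S could be banks, R restaurants.
p = (0, 0)
S = [(3, 4), (1, 0)]
R = [(3, 5), (9, 9)]
pair, d = tnn_brute_force(p, S, R)
print(pair, d)
```

Note that the nearest bank to p, (1, 0), is not part of the answer here: the pair through (3, 4) has the smaller total distance.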

Background - broadcast
  • The server holds all the data and broadcasts it as radio signals over channels.
  • Mobile clients (cell phones and PDAs) tune in to the broadcast channels, download the necessary data, and process queries locally.
  • Broadcast vs. on-demand
    • Supports simultaneous access by an arbitrary number of mobile devices
    • Efficient use of limited bandwidth
    • Light workload on the server side
Background - motivation
  • Assumption:
    • Zheng, Lee and Lee assumed a single broadcast channel.
    • Based on existing technology (dual-mode, dual-standby cell phone), we assume multiple channels.
    • A mobile client can access information in multiple channels simultaneously
  • Challenges:
    • How to utilize the parallel processing ability of mobile clients to facilitate query processing?
    • How to reduce access time?
    • How to reduce energy consumption?
Our contributions:
  • 1. We developed two new algorithms for TNN queries in multi-channel access environments.
  • 2. We proposed two new distance metrics (MinTransDist and MinMaxTransDist) so that our new algorithms efficiently reduce search cost.
  • 3. We proposed an optimization technique to reduce energy consumption.
Background – settings
  • 1. Two broadcast channels, one for S and one for R
  • 2. Two-dimensional points
  • 3. Air indexing: R-tree [2]
  • 4. Broadcast in depth-first order, to avoid back-tracking
  • 5. (1, m) interleaving [3]
  • 6. Performance metrics (in # of pages):
    • Access time
    • Tune-in time

[2] A. Guttman. R-trees: a dynamic index structure for spatial searching. SIGMOD 1984

[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on air: organization and access. TKDE 1997

Problem Analysis
  • Choose ANY pair of objects (s’, r’) at random and use its transitive distance as the search range
  • This range is guaranteed to enclose the answer pair (s, r)
Problem Analysis
  • Theorem [1]:
    • The transitive distance determined by any pair of objects (s, r) is an upper bound on the TNN distance.
  • General ideas of answering TNN queries:
    • Estimate: find a search range from the query point p by searching the index
    • Filter: filter unqualified data objects in the search range determined earlier to find the pair of objects with minimum transitive distance.
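
The Estimate and Filter steps can be sketched in memory as follows. This is a simplified illustration under stated assumptions: plain Python lists stand in for the broadcast R-tree index, and the names `estimate_filter_tnn`, `seed_pair`, and `eps` are hypothetical. It relies on the theorem above plus the triangle inequality: every object of the answer pair lies within the seed pair's transitive distance of p.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def transitive_dist(p, s, r):
    return dis(p, s) + dis(s, r)


def estimate_filter_tnn(p, S, R, seed_pair, eps=1e-9):
    # Estimate: any pair yields an upper bound on the TNN distance.
    bound = transitive_dist(p, *seed_pair)
    # Filter: objects farther than `bound` from p cannot be in the answer
    # (eps guards against floating-point rounding on the boundary).
    cand_S = [s for s in S if dis(p, s) <= bound + eps]
    cand_R = [r for r in R if dis(p, r) <= bound + eps]
    best = min(((s, r) for s in cand_S for r in cand_R),
               key=lambda sr: transitive_dist(p, *sr))
    return best, transitive_dist(p, *best), len(cand_S), len(cand_R)


p = (0, 0)
S = [(3, 4), (1, 0), (50, 50)]
R = [(3, 5), (9, 9), (-40, 0)]
result = estimate_filter_tnn(p, S, R, seed_pair=((1, 0), (9, 9)))
print(result)
```

A tighter seed pair shrinks the candidate sets, which is why the quality of the Estimate step matters for the cost of the Filter step.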
Problem Analysis
  • Deficiencies of existing algorithms:
    • Approximate-TNN-Search:
      • Uses an equation to estimate the search range in the first step
      • Search range may be too large or too small
    • Window-Based-TNN-Search:
      • Two sequential NN searches in estimation step
      • Search range estimation is done in sequential order
      • Large access time
New TNN algorithms – algo1
  • Algo 1: Double-NN-Search
    • Issue two NN queries in estimation step
    • p’s NN in S, and p’s NN in R
    • (s1, r2)
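
The estimation step of Double-NN-Search can be sketched like this. It is a rough in-memory illustration, assuming linear scans in place of the two R-tree NN searches that a client would run concurrently on the two channels; `nearest` and `double_nn_range` are hypothetical names, while s1 and r2 follow the slide's labels.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest(p, points):
    """Linear-scan stand-in for an NN search on one broadcast channel."""
    return min(points, key=lambda q: dis(p, q))


def double_nn_range(p, S, R):
    s1 = nearest(p, S)  # channel 1: p's NN in S
    r2 = nearest(p, R)  # channel 2: p's NN in R (independent of s1)
    # The pair (s1, r2) is a valid pair, so its transitive distance
    # is an upper bound usable as the search range.
    return s1, r2, dis(p, s1) + dis(s1, r2)


p = (0, 0)
S = [(3, 4), (1, 0)]
R = [(3, 5), (0, -2)]
s1, r2, search_range = double_nn_range(p, S, R)
print(s1, r2, search_range)
```

Because neither search depends on the other's result, the two can overlap in time, which is the source of the access-time saving over two sequential NN searches.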
New TNN Algorithms – algo2
  • Hybrid-NN-Search
    • Increases interaction between two channels
    • Uses result of the finished NN to guide the unfinished NN in order to reduce search range
    • Uses new distance metrics to perform branch-and-bound
    • Treats the TNN distance as a whole
New TNN Algorithms – algo 2
  • NN in Channel 1 finishes first
  • Already found s=p.NN(S)
  • Looking for r2, instead of r1
New TNN Algorithms – algo 2
  • NN in channel 2 finishes first
  • Already found r=p.NN(R)
  • Looking for s2 instead of s1
  • Use new criteria when searching the index
  • Need new distance metrics for branch&bound
New TNN Algorithms – algo 2
  • MinTransDist:
    • Lower bound on the transitive distance from p, via any point in an MBR, to r.
  • MinMaxTransDist:
    • Upper bound on the transitive distance from p, via some object in an MBR, to r.
  • Details given in the paper.
New TNN Algorithms - Hybrid
  • Algorithm description:
    • While neither NN search has finished, follow the Double-NN algorithm.
    • If the NN search in Channel 1 (dataset S) finishes first, let s = p.NN(S), use s as the new query point, and perform an NN search on the remaining portion of the R-tree for dataset R.
    • If the NN search in Channel 2 (dataset R) finishes first, switch distance metrics: use MinTransDist and MinMaxTransDist to perform branch-and-bound, and find the s that minimizes the transitive distance.
New TNN Algorithms - Hybrid
  • Updating and pruning strategy
    • Use queue to keep potential MBRs, sorted based on their arrival time
    • Case 2 (s=p.NN(S) finishes first):
      • Switch the NN query point to s
      • Initial upper bound update
        • If there is an intermediate result r’, update the upper bound with dis(p, s) + dis(s, r’)
        • Scan the queue of MBRs and prune with the distance metrics used in traditional NN queries.
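
A possible reading of Case 2 is sketched below. The slide only says that traditional NN distance metrics are used; this example assumes MINDIST (the smallest distance from a point to a rectangle) as the pruning metric, and the names `mindist` and `case2_update_and_prune` as well as the `(xmin, ymin, xmax, ymax)` MBR layout are hypothetical.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def mindist(q, mbr):
    """Smallest distance from point q to rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)


def case2_update_and_prune(p, s, r_prime, mbr_queue):
    # Initial upper bound from the intermediate result r' of the R search.
    upper = dis(p, s) + dis(s, r_prime)
    # With s fixed, an MBR of R can only help if some point in it could
    # give a transitive distance no larger than the current upper bound.
    kept = [m for m in mbr_queue if dis(p, s) + mindist(s, m) <= upper]
    return upper, kept


p, s, r_prime = (0, 0), (1, 0), (4, 4)
queue = [(2, 0, 3, 1), (10, 10, 12, 12)]
upper, kept = case2_update_and_prune(p, s, r_prime, queue)
print(upper, kept)
```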
New TNN Algorithms - Hybrid
  • Updating and pruning strategy (cont.)
    • Case 3 (r=p.NN(R) finishes first):
      • If there is an intermediate result s’, use dis(p, s’) + dis(s’, r) as the new upper bound
      • Then scan all the MBRs in the queue and use z = min_{M_i ∈ MBR_queue} MinMaxTransDist(p, M_i, r) to update the upper bound.
      • In traversal, use MinMaxTransDist to update the upper bound; use MinTransDist for pruning
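
The Case 3 bookkeeping can be sketched independently of how the two metrics are computed. In this minimal example each queued MBR is represented by a hypothetical tuple `(name, min_trans, minmax_trans)` holding precomputed MinTransDist and MinMaxTransDist values; the numbers are made up for illustration.

```python
import math


def case3_update_and_prune(upper, queue):
    """Tighten the upper bound with MinMaxTransDist, prune with MinTransDist.

    queue: list of (name, min_trans, minmax_trans) tuples (assumed layout).
    """
    # z = min over queued MBRs of MinMaxTransDist(p, M_i, r)
    z = min((minmax for _, _, minmax in queue), default=math.inf)
    upper = min(upper, z)
    # An MBR whose lower bound exceeds the upper bound cannot hold the answer.
    kept = [entry for entry in queue if entry[1] <= upper]
    return upper, kept


# upper starts at dis(p, s') + dis(s', r) for some intermediate result s'.
queue = [("M1", 10.0, 11.0), ("M2", 11.5, 14.0), ("M3", 10.5, 13.0)]
upper, kept = case3_update_and_prune(12.0, queue)
print(upper, [name for name, _, _ in kept])
```

Here M1 tightens the bound from 12.0 to 11.0, after which M2 (lower bound 11.5) can be skipped without being downloaded.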
New TNN Algorithms - Hybrid
  • Example for pruning:
Optimization
  • Goal: reduce energy consumption
  • Analysis:
    • Previous algorithms minimize the search range in the Estimate Step by issuing “exact” search
    • Energy consumption in Filter Step is low
    • Energy consumption in Estimate Step is high
  • Approach:
    • use “approximate” search in Estimate Step to save energy in this step
Optimization
  • Approximate Search:
    • Relax the pruning condition
    • Use ratio of overlapping area to estimate the probability
    • Compare the ratio with a threshold α
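
One way the relaxed pruning test might look is sketched below. The slides do not define the ratio precisely, so this example makes several assumptions: the circular search range is approximated by its bounding square, the ratio is the overlap area divided by the MBR's area, and an MBR is skipped when the ratio is at most α. The names `overlap_ratio` and `should_visit` are hypothetical.

```python
def overlap_ratio(center, radius, mbr):
    """Fraction of the MBR's area covered by the search range's bounding square."""
    xmin, ymin, xmax, ymax = mbr
    w = min(xmax, center[0] + radius) - max(xmin, center[0] - radius)
    h = min(ymax, center[1] + radius) - max(ymin, center[1] - radius)
    if w < 0 or h < 0:
        return 0.0  # no overlap at all
    area = (xmax - xmin) * (ymax - ymin)
    if area == 0:
        return 1.0  # degenerate MBR lying inside the window
    return (w * h) / area


def should_visit(center, radius, mbr, alpha):
    # alpha = 0 keeps every MBR with a non-zero overlap (exact-style pruning);
    # a larger alpha also skips MBRs that overlap the range only slightly.
    return overlap_ratio(center, radius, mbr) > alpha


mbr = (1, 1, 5, 5)
ratio = overlap_ratio((0, 0), 2, mbr)
print(ratio, should_visit((0, 0), 2, mbr, 0.0), should_visit((0, 0), 2, mbr, 0.1))
```

Skipping a slightly overlapping MBR saves tune-in time but may miss a better candidate, so the estimated search range can only get looser, never invalid.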
Optimization
  • How to determine α?
  • factors:
    • R-tree height and node depth
      • Use small α on the root and large α on leaves
    • Difference in densities of the two datasets involved
      • Small α or 0 on the dataset with smaller density

[Figure: threshold α on a scale from 0 to 1, with the labels "exact search" and "approximate search"]

Performance Evaluation - settings
  • Dataset 1:
    • 39,000 × 39,000 square region
    • Densities: 10^-7.0, 10^-6.6, 10^-6.2, 10^-5.8, 10^-5.4, 10^-5.0, 10^-4.6, 10^-4.2
    • # of points: 152, 382, 960, 2411, 6055, 15210, 38206, 95969
  • Dataset 2:
    • 39,000 × 39,000 square region
    • # of points: 2,000 to 30,000 in increments of 2,000
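
The point counts of Dataset 1 follow from density × area. The short check below is only a worked arithmetic example; the variable names are chosen for illustration.

```python
side = 39_000
area = side * side  # 1,521,000,000 square units

exponents = [-7.0, -6.6, -6.2, -5.8, -5.4, -5.0, -4.6, -4.2]
counts = [round(area * 10 ** e) for e in exponents]
print(counts)
```

Rounding each product to the nearest integer reproduces the eight point counts listed for Dataset 1.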
Performance Evaluation - settings
  • R-tree as air index
  • Broadcast in depth-first order
  • STR packing algorithm [4]
  • (1, m) interleaving [3]
  • 1,000 query points generated for each of the experiments

[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on air: organization and access. TKDE 1997

[4] S. Leutenegger, M. Lopez, and J. Edgington. STR: a simple and efficient algorithm for R-tree packing. ICDE 1997

Performance Evaluation
  • Algorithms with exact search:
    • Access time: Double-NN and Hybrid-NN have the same access time, which is shorter than that of Window-Based
    • 1/40 ≤ size(S) / size(R) ≤ 1.8
Performance Evaluation
  • Algorithms with exact search:
    • Tune-in time: when 0.01 ≤ size(S) / size(R) ≤ 0.4, Hybrid-NN gives the best tune-in time
Performance Evaluation
  • ANN vs. eNN
    • Improvement in tune-in time ranges from 11% to 20%
Performance Evaluation
  • Hybrid algorithm with ANN:
Conclusions
  • Double-NN and Hybrid-NN effectively reduce access time
  • Cases in which our algorithms reduce tune-in time are stated and discussed
  • Optimization technique effectively reduces tune-in time of all three algorithms
Future Work
  • Generalized TNN queries in broadcast environment:
    • More than 2 datasets are involved
    • Visiting order not specified
    • Complete route query
  • Using new distance metrics in disk based environment
Thank you!
  • Any questions?
New TNN Algorithms – distance metrics (backup slides)
  • Def 1: (MinTransDist)
    • Given two points p and r, and an MBR M_S, MinTransDist(p, M_S, r) = dis(p, s) + dis(s, r) for a point s on M_S such that, for any point s’ ≠ s, s’ ∈ M_S,

dis(p, s’) + dis(s’, r) ≥ MinTransDist(p, M_S, r)
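
Def 1 can be approximated numerically as a sanity check. The sketch below is not the closed-form computation from the paper; it assumes that s may be any point of the rectangle M_S and exploits the fact that dis(p, s) + dis(s, r) is convex in s, so a nested ternary search over x and y converges to the minimum. The names `min_trans_dist` and `_ternary_min` and the `(xmin, ymin, xmax, ymax)` layout are hypothetical.

```python
import math


def _ternary_min(f, lo, hi, iters=60):
    """Approximate minimum value of a convex function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2)


def min_trans_dist(p, mbr, r):
    """Numerical lower bound on dis(p, s) + dis(s, r) for s in the MBR."""
    xmin, ymin, xmax, ymax = mbr

    def cost(x, y):
        return math.hypot(p[0] - x, p[1] - y) + math.hypot(r[0] - x, r[1] - y)

    def best_over_y(x):
        return _ternary_min(lambda y: cost(x, y), ymin, ymax)

    return _ternary_min(best_over_y, xmin, xmax)


p, r = (0, 0), (10, 0)
through = min_trans_dist(p, (4, -1, 6, 1), r)  # segment pr crosses the MBR
above = min_trans_dist(p, (4, 2, 6, 4), r)     # MBR lies off the segment
print(through, above)
```

When the segment from p to r crosses the MBR, the bound degenerates to dis(p, r); otherwise the minimizing point lies on the side of the MBR facing the segment.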

New TNN Algorithms – distance metrics (backup slides)
  • Def 2: (MaxDist)
    • Given two points p and r, and a line segment ℓ, MaxDist(p, ℓ, r) = max_{i=1,2} { dis(p, v_i) + dis(v_i, r) }, where v_i (i = 1, 2) are the two end points of ℓ
    • MaxDist(p, ℓ, r) gives a tight upper bound on the transitive distances from p, via any point on ℓ, to r.

New TNN Algorithms – distance metrics (backup slides)
  • Def 3: (MinMaxTransDist)
    • Given two points p and r, and an MBR M_S, MinMaxTransDist(p, M_S, r) = min_{1≤i≤4} { MaxDist(p, ℓ_i, r) }, where ℓ_i (1 ≤ i ≤ 4) are the four sides of MBR M_S
  • Lemma:
    • Given a starting point p, an ending point r, and an MBR M_S enclosing a point dataset S, ∃s ∈ S such that dis(p, s) + dis(s, r) ≤ MinMaxTransDist(p, M_S, r)
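
Defs 2 and 3 translate directly into code, and the lemma can be spot-checked on a small dataset. This is a minimal sketch; the function names, the `(xmin, ymin, xmax, ymax)` MBR layout, and the sample points are assumptions made for illustration.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def max_dist(p, segment, r):
    """Def 2: largest transitive distance over the two end points of a segment."""
    return max(dis(p, v) + dis(v, r) for v in segment)


def min_max_trans_dist(p, mbr, r):
    """Def 3: smallest MaxDist over the four sides of the MBR."""
    xmin, ymin, xmax, ymax = mbr
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    sides = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    return min(max_dist(p, side, r) for side in sides)


# Lemma spot check: the tight MBR of S is (4, 2, 6, 4).
p, r = (0, 0), (10, 0)
S = [(4, 3), (5, 2), (6, 3.5), (5, 4)]
mbr = (min(x for x, _ in S), min(y for _, y in S),
       max(x for x, _ in S), max(y for _, y in S))
bound = min_max_trans_dist(p, mbr, r)
best_in_S = min(dis(p, s) + dis(s, r) for s in S)
print(bound, best_in_S)
```

The lemma holds because every side of a minimum bounding rectangle touches at least one data point, and by convexity that point's transitive distance cannot exceed the larger of the side's two end-point values.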