
# Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments - PowerPoint PPT Presentation

Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments. Xiao Zhang 1 , Wang-Chien Lee 1 , Prasenjit Mitra 1, 2 , Baihua Zheng 3 1 Department of Computer Science and Engineering 2 College of Information Science and Technology The Pennsylvania State University



### Processing Transitive Nearest-Neighbor Queries in Multi-Channel Access Environments

Xiao Zhang¹, Wang-Chien Lee¹, Prasenjit Mitra¹,², Baihua Zheng³

¹ Department of Computer Science and Engineering

² College of Information Science and Technology

The Pennsylvania State University

³ School of Information Systems, Singapore Management University

EDBT, Nantes, France, 03/28/2008

Outline

• Background

• Problem Analysis

• New TNN Algorithms

• Optimization

• Experiments

• Conclusions & Future Work

Background – TNN

• What is TNN?

• S is a set of banks

• R is a set of restaurants

• TNN distance = 5+1 = 6

Background – TNN

• What is TNN?

• Given a query point p and two datasets S and R, TNN returns a pair of objects (s, r) ∈ S × R such that, for all (s’, r’) ∈ S × R,

dis(p, s) + dis(s, r) ≤ dis(p, s’) + dis(s’, r’)

where dis(p, s) is the Euclidean distance between p and s.

• First proposed by Zheng, Lee and Lee [1].

[1] B. Zheng, K. C. Lee, and W.-C. Lee. Transitive nearest neighbor search in mobile environments. SUTC 2006.
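
The definition above can be checked with a brute-force sketch that simply enumerates every pair in S × R. This is only a reference implementation for small inputs, not one of the algorithms in the talk; the coordinates are hypothetical and chosen so that the result matches the 5 + 1 = 6 example from the previous slide.

```python
import math
from itertools import product


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tnn_brute_force(p, S, R):
    """Return (transitive distance, s, r) minimizing dis(p, s) + dis(s, r)."""
    return min(
        ((dis(p, s) + dis(s, r), s, r) for s, r in product(S, R)),
        key=lambda t: t[0],
    )


# Hypothetical data: S is a set of banks, R is a set of restaurants.
p = (0.0, 0.0)
banks = [(3.0, 4.0), (10.0, 0.0)]
restaurants = [(3.0, 5.0), (0.0, 1.0)]
best = tnn_brute_force(p, banks, restaurants)
print(best)  # (6.0, (3.0, 4.0), (3.0, 5.0))
```

The enumeration costs |S| × |R| distance evaluations, which is what the estimate and filter steps discussed later try to avoid.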

Background - broadcast

• The server holds all the data and broadcasts it as radio signals over channels.

• Mobile clients (cell phones and PDAs) tune in to broadcast channels, download necessary data and process queries.

• Supports simultaneous access by an arbitrary number of mobile devices

• Efficient use of limited bandwidth

• Light workload on the server side

Background - motivation

• Assumption:

• Zheng, Lee and Lee assumed a single broadcast channel.

• Based on existing technology (dual-mode, dual-standby cell phones), we assume multiple channels.

• A mobile client can access information in multiple channels simultaneously

• Challenges:

• How to utilize the parallel processing ability of mobile clients to facilitate query processing?

• How to reduce access time?

• How to reduce energy consumption?

Our contributions:

• 1. We developed two new algorithms for TNN queries in multi-channel access environments.

• 2. We proposed two new distance metrics (MinTransDist and MinMaxTransDist) that allow our new algorithms to reduce search cost efficiently.

• 3. We proposed an optimization technique to reduce energy consumption.

Background – settings

• 1. Two broadcast channels, for S and R

• 2. Two-dimensional points

• 3. Air-indexing: R-tree[2]

• 4. Broadcast in depth-first order, in order to avoid back-tracking

• 5. (1, m) interleaving [3]

• 6. Performance metrics (in # of pages):

• Access time

• Tune-in time

[2] A. Guttman. R-trees: a dynamic index structure for spatial searching. SIGMOD 1984.

[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on air: organization and access. TKDE 1997.
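
As a rough illustration of the (1, m) interleaving setting listed above: in that scheme the whole index is broadcast m times per cycle, once in front of each 1/m fraction of the data. The sketch below builds one such cycle from page labels; the function name and the page labels are hypothetical, and the real page layout used in the experiments is not given in the slides.

```python
def one_m_cycle(index_pages, data_pages, m):
    """Build one broadcast cycle under (1, m) interleaving.

    The full index is repeated in front of each of the m data segments.
    """
    seg = -(-len(data_pages) // m)  # ceiling division: pages per data segment
    cycle = []
    for k in range(m):
        cycle.extend(index_pages)                       # full copy of the index
        cycle.extend(data_pages[k * seg:(k + 1) * seg])  # k-th data segment
    return cycle


cycle = one_m_cycle(["I1", "I2"], ["D1", "D2", "D3", "D4"], m=2)
print(cycle)  # ['I1', 'I2', 'D1', 'D2', 'I1', 'I2', 'D3', 'D4']
```

A larger m shortens the wait for the next index copy but lengthens the cycle, since the cycle length is m × (index size) + (data size) pages.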

Problem Analysis

• Choose ANY pair of objects (s’, r’) at random and use its transitive distance as the search range

• The range is guaranteed to enclose the answer pair (s, r)

Problem Analysis

• Theorem[1]:

• The transitive distance determined by any pair of objects (s, r) is an upper bound on the TNN distance.

• General ideas of answering TNN queries:

• Estimate: find a search range from the query point p by searching the index

• Filter: filter out unqualified data objects in the search range determined earlier and find the pair of objects with the minimum transitive distance.
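
The estimate and filter idea can be sketched as follows. This is a minimal in-memory illustration, assuming plain Python lists instead of a broadcast R-tree; the estimate step here just takes the first object of each list as its arbitrary pair, which is enough for the upper-bound theorem to apply. The function names are hypothetical.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def estimate(p, S, R):
    """Estimate step: any pair (s0, r0) yields an upper bound on the TNN distance."""
    s0, r0 = S[0], R[0]
    return dis(p, s0) + dis(s0, r0)


def filter_step(p, S, R, bound):
    """Filter step: keep only objects inside the circle of radius `bound`
    around p, then pick the pair with the minimum transitive distance."""
    cand_s = [s for s in S if dis(p, s) <= bound]
    cand_r = [r for r in R if dis(p, r) <= bound]
    best = None
    for s in cand_s:
        for r in cand_r:
            d = dis(p, s) + dis(s, r)
            if best is None or d < best[0]:
                best = (d, s, r)
    return best


def tnn_estimate_filter(p, S, R):
    return filter_step(p, S, R, estimate(p, S, R))
```

The circle is a safe filter because, for the answer pair, dis(p, s) ≤ bound and, by the triangle inequality, dis(p, r) ≤ dis(p, s) + dis(s, r) ≤ bound.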

Problem Analysis

• Deficiencies of existing algorithms:

• Approximate-TNN-Search:

• Uses an equation to estimate the search range in the first step

• Search range may be too large or too small

• Window-Based-TNN-Search:

• Two sequential NN searches in estimation step

• Search range estimation is done in sequential order

• Large access time

New TNN algorithms – algo1

• Algo 1: Double-NN-Search

• Issue two NN queries in estimation step

• p’s NN in S, and p’s NN in R

• (s1, r2)
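
A minimal sketch of the Double-NN estimation step, assuming in-memory point lists and using two threads only to mimic the two broadcast channels being searched at the same time. The helper names are hypothetical, and a real client would traverse the air index rather than scan a list.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest(q, points):
    """Linear-scan nearest neighbor of q (stand-in for an index-based NN search)."""
    return min(points, key=lambda x: dis(q, x))


def double_nn_range(p, S, R):
    """Run p.NN(S) and p.NN(R) concurrently and derive the search range."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fut_s = pool.submit(nearest, p, S)  # channel 1: dataset S
        fut_r = pool.submit(nearest, p, R)  # channel 2: dataset R
        s, r = fut_s.result(), fut_r.result()
    # The pair (s, r) is a valid pair, so its transitive distance is an upper bound.
    return s, r, dis(p, s) + dis(s, r)
```

Because neither NN search waits for the other, the estimation step takes about as long as the slower of the two searches instead of their sum.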

New TNN Algorithms – algo2

• Hybrid-NN-Search

• Increases interaction between two channels

• Uses result of the finished NN to guide the unfinished NN in order to reduce search range

• Uses new distance metrics to perform branch-and-bound

• Treat TNN distance as a whole

New TNN Algorithms – algo 2

• NN in Channel 1 finishes first

• Looking for r2, instead of r1

New TNN Algorithms – algo 2

• NN in channel 2 finishes first

• Looking for s2 instead of s1

• Use new criteria when searching the index

• Need new distance metrics for branch-and-bound

New TNN Algorithms – algo 2

• MinTransDist:

• Lower bound on the transitive distance from p through an MBR to r.

• MinMaxTransDist:

• Upper bound on the transitive distance from p through an MBR to r.

• Details given in the paper.

New TNN Algorithms - Hybrid

• Algorithm description:

• Case 1: while neither NN search has finished, follow the Double-NN algorithm

• Case 2: if the NN search in Channel 1 (dataset S) finishes first, let s = p.NN(S), use s as the new query point, and perform the NN search on the remaining portion of the R-tree for dataset R

• Case 3: if the NN search in Channel 2 (dataset R) finishes first, switch distance metrics: use MinTransDist and MinMaxTransDist to perform branch-and-bound and find an s that minimizes the transitive distance

New TNN Algorithms - Hybrid

• Updating and pruning strategy

• Use a queue to keep potential MBRs, sorted by their arrival time

• Case 2 (s=p.NN(S) finishes first):

• Switch the NN query point to s

• Initial upper bound update

• If there is an intermediate result r’, update the upper bound with dis(p, s)+dis(s, r’ )

• Scan the queue of MBRs using the distance metrics of traditional NN queries
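
The Case 2 update can be sketched as below. This assumes that the "traditional" metric meant here is the usual minimum distance from a point to a rectangle, and that MBRs are stored as (xmin, ymin, xmax, ymax) tuples; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def mindist(q, mbr):
    """Minimum distance from point q to rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)


def case2_update(p, s, r_intermediate, mbr_queue):
    """After s = p.NN(S) is known: refresh the upper bound, then prune the queue."""
    upper = math.inf
    if r_intermediate is not None:
        upper = dis(p, s) + dis(s, r_intermediate)
    # dis(p, s) is now fixed, so only dis(s, r) can still shrink.
    nn_bound = upper - dis(p, s)
    kept = [m for m in mbr_queue if mindist(s, m) <= nn_bound]
    return upper, kept
```

With no intermediate result the bound stays infinite and every queued MBR is kept, which matches an NN search that has not yet seen a candidate.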

New TNN Algorithms - Hybrid

• Updating and pruning strategy (cont.)

• Case 3 (r=p.NN(R) finishes first):

• If there is an intermediate result s’, use dis(p, s’) + dis(s’, r) as the new upper bound

• Then scan all the MBRs in the queue and use $z = \min_{M_i \in \mathrm{MBR\_queue}} \mathrm{MinMaxTransDist}(p, M_i, r)$ to update the upper bound

• During traversal, use MinMaxTransDist to update the upper bound and MinTransDist for pruning

New TNN Algorithms - Hybrid

• Example for pruning:

Optimization

• Goal: reduce energy consumption

• Analysis:

• Previous algorithms minimize the search range in the Estimate Step by issuing an “exact” search

• Energy consumption in Filter Step is low

• Energy consumption in Estimate Step is high

• Approach:

• Use an “approximate” search in the Estimate Step to save energy

Optimization

• Approximate Search:

• Relax the pruning condition

• Use ratio of overlapping area to estimate the probability

• Compare the ratio with a threshold α
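
One possible reading of the relaxed pruning condition is sketched below. The slide does not say which two areas overlap, so this assumes the ratio is the fraction of an MBR's area that falls inside the current circular search range, and it estimates that fraction by grid sampling rather than exact geometry. All names and the sampling resolution are hypothetical.

```python
import math


def mindist(q, mbr):
    """Minimum distance from point q to rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)


def overlap_ratio(center, radius, mbr, n=50):
    """Approximate fraction of the MBR's area lying inside the search circle."""
    xmin, ymin, xmax, ymax = mbr
    inside = 0
    for i in range(n):
        x = xmin + (i + 0.5) * (xmax - xmin) / n
        for j in range(n):
            y = ymin + (j + 0.5) * (ymax - ymin) / n
            if math.hypot(x - center[0], y - center[1]) <= radius:
                inside += 1
    return inside / (n * n)


def should_visit(center, radius, mbr, alpha):
    """Exact search visits any MBR touching the range; the approximate search
    additionally requires the overlap ratio to reach the threshold alpha."""
    if mindist(center, mbr) > radius:
        return False  # pruned by exact and approximate search alike
    return overlap_ratio(center, radius, mbr) >= alpha
```

Under this reading alpha = 0 reproduces the exact search, and a larger alpha skips MBRs that only graze the search range, trading some accuracy of the estimate for fewer downloaded pages.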

Optimization

• How to determine α?

• factors:

• R-tree height and node depth

• Use small α on the root and large α on leaves

• Difference in densities of the two datasets involved

• Small α or 0 on the dataset with smaller density

[Figure: α axis from 0 to 1, annotated with "exact search" and "approximate search"]

Performance Evaluation - settings

• Dataset 1:

• 39,000 × 39,000 square region

• Densities: $10^{-7.0}, 10^{-6.6}, 10^{-6.2}, 10^{-5.8}, 10^{-5.4}, 10^{-5.0}, 10^{-4.6}, 10^{-4.2}$

• # of points: 152, 382, 960, 2411, 6055, 15210, 38206, 95969

• Dataset 2:

• 39,000 × 39,000 square region

• # of points: 2,000 to 30,000 in increments of 2,000
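
The point counts listed for Dataset 1 follow from density × area. A short check, assuming each density is points per unit area and the count is rounded to the nearest integer:

```python
side = 39_000  # side length of the square region
exponents = [-7.0, -6.6, -6.2, -5.8, -5.4, -5.0, -4.6, -4.2]

# Number of points = density * area, rounded to the nearest integer.
counts = [round(10 ** e * side * side) for e in exponents]
print(counts)
```

The rounded values reproduce the list on the slide: 152, 382, 960, 2411, 6055, 15210, 38206, 95969.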

Performance Evaluation - settings

• R-tree as air index

• STR packing algorithm [4]

• (1, m) interleaving [3]

• 1,000 query points generated for each of the experiments

[3] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on air: organization and access. TKDE 1997.

[4] S. Leutenegger, M. Lopez, and J. Edgington. STR: a simple and efficient algorithm for R-tree packing. ICDE 1997.

Performance Evaluation

• Algorithms with exact search:

• Access time: Double-NN and Hybrid-NN have the same access time, which is smaller than that of Window-Based

• 1/40 ≤ size(S)/size(R) ≤ 1.8

Performance Evaluation

• Algorithms with exact search:

• Tune-in time: when 0.01 ≤ size(S)/size(R) ≤ 0.4, Hybrid-NN gives the best tune-in time

Performance Evaluation

• ANN (approximate NN search) vs. eNN (exact NN search)

• Improvement in tune-in time ranges from 11% to 20%

Performance Evaluation

• Hybrid algorithm with ANN:

Conclusions

• Double-NN and Hybrid-NN effectively reduce access time

• Cases in which our algorithms reduce tune-in time are stated and discussed

• Optimization technique effectively reduces tune-in time of all three algorithms

Future Work

• Generalized TNN queries in broadcast environments:

• More than 2 datasets are involved

• Visiting order not specified

• Complete route query

• Using the new distance metrics in disk-based environments

Thank you!

• Any questions?

New TNN Algorithms – distance metrics (backup slides)

• Def 1: (MinTransDist)

• Given two points p and r and an MBR $M_S$, $\mathrm{MinTransDist}(p, M_S, r)$ finds a point s on $M_S$ such that $\mathrm{MinTransDist}(p, M_S, r) = \mathrm{dis}(p, s) + \mathrm{dis}(s, r)$ and, for any point $s' \in M_S$ with $s' \neq s$,

$$\mathrm{dis}(p, s') + \mathrm{dis}(s', r) \ge \mathrm{MinTransDist}(p, M_S, r)$$
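
A numerical sketch of Def 1, assuming an MBR given as (xmin, ymin, xmax, ymax). It is not the closed-form construction from the paper: it relies on the fact that dis(p, s) + dis(s, r) is convex in s, so a ternary search along each side of the rectangle finds the boundary minimum, and it returns dis(p, r) directly when p or r lies inside the MBR. Function names are hypothetical.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def _inside(q, mbr):
    xmin, ymin, xmax, ymax = mbr
    return xmin <= q[0] <= xmax and ymin <= q[1] <= ymax


def _min_on_segment(p, a, b, r, iters=100):
    """Ternary search for min of dis(p, s) + dis(s, r) over s on segment ab."""
    def f(t):
        s = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        return dis(p, s) + dis(s, r)

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2)


def min_trans_dist(p, mbr, r):
    """Lower bound on dis(p, s) + dis(s, r) for any point s in the MBR."""
    if _inside(p, mbr) or _inside(r, mbr):
        return dis(p, r)  # s can coincide with p or r
    xmin, ymin, xmax, ymax = mbr
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    sides = zip(corners, corners[1:] + corners[:1])
    return min(_min_on_segment(p, a, b, r) for a, b in sides)
```

When the segment from p to r crosses the MBR, the crossing point lies on a side and the search returns dis(p, r), which is the smallest value any detour can achieve.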

New TNN Algorithms – distance metrics (backup slides)

• Def 2: (MaxDist)

• Given two points p and r and a line segment ℓ, $\mathrm{MaxDist}(p, \ell, r) = \max_{i=1,2}\{\mathrm{dis}(p, v_i) + \mathrm{dis}(v_i, r)\}$, where $v_1$ and $v_2$ are the two endpoints of ℓ

• MaxDist(p, ℓ, r) gives a tight upper bound on the transitive distance from p through any point on ℓ to r.


New TNN Algorithms – distance metrics (backup slides)

• Def 3: (MinMaxTransDist)

• Given two points p and r and an MBR $M_S$, $\mathrm{MinMaxTransDist}(p, M_S, r) = \min_{1 \le i \le 4}\{\mathrm{MaxDist}(p, \ell_i, r)\}$, where $\ell_i$ ($1 \le i \le 4$) are the four sides of $M_S$

• Lemma:

• Given a starting point p, an ending point r, and an MBR $M_S$ enclosing a point dataset S, $\exists s \in S$ such that $\mathrm{dis}(p, s) + \mathrm{dis}(s, r) \le \mathrm{MinMaxTransDist}(p, M_S, r)$
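
Defs 2 and 3 translate directly into code. The sketch below assumes MBRs stored as (xmin, ymin, xmax, ymax) and uses a small hypothetical dataset to illustrate the lemma; the helper `mbr_of` is not from the slides.

```python
import math


def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def max_dist(p, segment, r):
    """Def 2: the larger transitive distance via the two endpoints of the segment."""
    return max(dis(p, v) + dis(v, r) for v in segment)


def min_max_trans_dist(p, mbr, r):
    """Def 3: minimum of MaxDist over the four sides of the MBR."""
    xmin, ymin, xmax, ymax = mbr
    c = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    sides = [(c[i], c[(i + 1) % 4]) for i in range(4)]
    return min(max_dist(p, side, r) for side in sides)


def mbr_of(points):
    """Minimum bounding rectangle of a point set."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical dataset S with starting point p and ending point r.
S = [(1.0, 2.0), (3.0, 1.0), (2.0, 4.0), (0.0, 3.0)]
p, r = (-2.0, 0.0), (6.0, 5.0)
bound = min_max_trans_dist(p, mbr_of(S), r)
best = min(dis(p, s) + dis(s, r) for s in S)
print(best <= bound)  # True, as the lemma states
```

The lemma holds because every side of a minimum bounding rectangle touches at least one data point, and the transitive distance is convex along a side, so it never exceeds its value at the worse endpoint.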