Finding probabilistic nearest neighbors for query objects with imprecise locations
Download
1 / 35

Finding Probabilistic Nearest Neighbors for Query Objects with Imprecise Locations - PowerPoint PPT Presentation


  • 99 Views
  • Uploaded on

Finding Probabilistic Nearest Neighbors for Query Objects with Imprecise Locations. Yuichi Iijima and Yoshiharu Ishikawa Nagoya University, Japan. Outline. Background and Problem Formulation Related Work Query Processing Strategies Experimental Results Conclusions.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Finding Probabilistic Nearest Neighbors for Query Objects with Imprecise Locations' - romeo


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Finding probabilistic nearest neighbors for query objects with imprecise locations

Finding Probabilistic Nearest Neighbors for Query Objects with Imprecise Locations

Yuichi Iijima and Yoshiharu IshikawaNagoya University, Japan


Outline
Outline with Imprecise Locations

  • Background and Problem Formulation

  • Related Work

  • Query Processing Strategies

  • Experimental Results

  • Conclusions


Imprecise location information
Imprecise with Imprecise LocationsLocation Information

  • Sensor Environments

    • Measurement Noise

    • Frequent updates may not be possible

      • GPS-based positioning consumes batteries

  • Robotics

    • Localization using sensing andmovement histories

    • Probabilistic approach hasvagueness

  • Privacy

    • Location Anonymity


Nearest neighbor queries
Nearest Neighbor Queries with Imprecise Locations

  • Nearest Neighbor Queries

    • Example: Find the closest bus stop

    • Traditional problem in spatial databases

      • Efficient query processing using spatial indices

      • Extensible to multi-dimensional cases (e.g., image retrieval)

  • What happen if the location of query object is uncertain?

NN of q

q


Example scenario 1
Example Scenario (1) with Imprecise Locations

  • Query: Find the nearest gas station

Location estimation based on noisy GPS data

0%

35%

Nearest gas station

depends on the possiblecar locations

10%

55%

NN objects are definedin a probabilistic way


Example scenario 2
Example Scenario (2) with Imprecise Locations

  • Mobile Robotics

    • Location of the robot is estimatedbased on movement histories andsensor data

    • Measurements are noisy

    • Localization based on probabilisticmodeling

      • Kalman filter, particle filter, etc.

    • Estimated location is given as aprobabilistic density function (PDF)

      • PDF changes on each estimation


Probabilistic nearest neighbor query 1
Probabilistic Nearest Neighbor Query (1) with Imprecise Locations

pq(x)

  • PNNQ for short

  • Assumptions

    • Location of query object qis specified as a Gaussian distribution

    • Target data: static points

  • Gaussian Distribution

    • Σ: Covariance matrix


Probabilistic nearest neighbor query 2
Probabilistic Nearest Neighbor Query (2) with Imprecise Locations

  • Definition

  • Find objects which satisfy the condition

    • The probability that the object is thenearest neighbor of q is greater than or equal to θ

q


Outline1
Outline with Imprecise Locations

  • Background and Problem Formulation

  • Related Work

  • Query Processing Strategies

  • Experimental Results

  • Conclusions


Related work
Related Work with Imprecise Locations

  • Query processing methods for uncertain (location) data

    • Cheng, Prabhakar, et al. (SIGMOD’03, VLDB’04, …)

    • Tao et al. (VLDB’05, TODS’07)

    • Consider arbitrary PDFs or uniform PDFs

    • Most of the case, target objects are imprecise

  • Research related to Gaussian distribution

    • Gauss-tree [Böhm et al., ICDE’06]

      • Target objects are based on Gaussian distributions

  • Our former work

    • Ishikawa, Iijima, Yu (ICDE’09): Probabilistic range queries


Outline2
Outline with Imprecise Locations

  • Background and Problem Formulation

  • Related Work

  • Query Processing Strategies

  • Experimental Results

  • Conclusions


Na ve approach 1
Naïve Approach (1) with Imprecise Locations

  • Use of Voronoi Diagram

    • Well-known method for standard (non-imprecise) nearest neighbor queries

Voronoiregion Ve


Na ve approach 2
Naïve Approach (2) with Imprecise Locations

  • PrNN(q, o): Nearest neighbor probability

    • Probability that object o is the nearest neighbor of query object q

    • It can be computed by integrating the probability density function pq(x) over Voronoi region Vo

  • Problem

    • Need to consider all targetobjects

    • Numerical integration (MonteCarlo method) is quite costly!

pq(x)

pq(x)

Ve


Our approach
Our Approach with Imprecise Locations

  • Outline of processing

    • Filtering

      • Prune non-candidate objects whose PrNN are obviously smaller than the threshold 

      • Low-cost filtering conditions

    • Numerical integration for the remaining candidate objects

  • Two strategies

    • -region-based Approach

    • SES-based Approach

      • SES: Smallest Enclosing Sphere


Strategy 1 region based approach 1
Strategy 1: with Imprecise Locations-Region-Based Approach (1)

  • -region

    • Similar concepts are often used in query processing for uncertain spatial databases

    • Definition: Ellipsoidal region for which the result of the integration becomes 1 – 2 :The ellipsoidal regionis the -region

  • Example:  is specified as 1%,we consider 98%  -region

q

 -region


Strategy 1 region based approach 2
Strategy 1: with Imprecise Locations-Region-Based Approach (2)

  • Query Processing

    • -region for the query is computed at first

      • -region can be derived using r-table:

        • It is constructed for the normal Gaussian (S = I, q = 0): Given , it returns appropriate r

      • Final -region can be obtained by transformation

    • Derive the bounding box of-region

    • Objects whose Voronoiregions overlaps with the box are the candidates

      • {a, c, d, f, g},in this example

a

c

d

g

f

θ-region


Strategy 1 region based approach 3
Strategy 1: with Imprecise Locations-Region-Based Approach (3)

  • Derivation details

  • First, we consider the normal Gaussian

    • Using r-table, we can get the appropriate r for given 

  • Second, a spherical -region is derived based on transformation

    • Transformation is performed by analyzing 

  • Third, the bonding box is calculatedwhere (Σ)iiis the (i, i) entry of Σ

xj

xi

wj

q

r

wj

q

q

wi

wi


Strategy 2 ses based approach 1
Strategy 2: SES-Based Approach (1) with Imprecise Locations

  • SES: Smallest Enclosing Sphere

    • For each Voronoi region Vo, we compute its SES SESo beforehand

SESe: SES for Voronoi region of e


Strategy 2 ses based approach 2
Strategy 2: SES-Based Approach (2) with Imprecise Locations

  • Integration over SESo gives the upper-bound for PrNN(q, o)

    • Integration over a sphere region is more easier to compute

      • We use a table called U-catalog constructed beforehand

pq(x)


Strategy 2 ses based approach 3
Strategy 2: SES-Based Approach (3) with Imprecise Locations

  • What is U-catalog?

    • Given two parameters  and , it returns corresponding integral

    • U-catalog is made for different (,  ) pairs by computing the integral of normal Gaussian over sphere region R


Strategy 2 ses based approach 4
Strategy 2: SES-Based Approach (4) with Imprecise Locations

  • To use U-catalog, another approximation is required since it is only useful for normal Gaussian

    • Use of upper-bounding function pq(x)

    • pq(x) tightly bounds pq(x) and has sphericalisosurfaces

    • For pq(x), we can easily derive its integral over SESo using U-catalog

  • In summary, we use twoapproximations:

T

T

T

T

pq(x)

pq(x)


Strategy 2 ses based approach 5
Strategy 2: SES-Based Approach (5) with Imprecise Locations

  • Bounding Function

    • Original Gaussian PDF

    • Upper-Bounding Functionwhere

is satisfied


Outline3
Outline with Imprecise Locations

  • Background and Problem Formulation

  • Related Work

  • Query Processing Strategies

  • Experimental Results

  • and Conclusions


Setup of experiments 1
Setup of Experiments (1) with Imprecise Locations

  • Target Data

    • 2D point data (50K entries)

    • Based on road line segments in Long Beach

  • Computation of Voronoiregions and SESs

    • LEDA package was used

  • Comparison

    • Strategies 1, 2, and their hybrid approach

    • Evaluation metric: Response time


Setup of experiments 2
Setup of Experiments (2) with Imprecise Locations

  • Default Parameters

    • Covariance matrix

      • For this matrix the shape of the isosurface of pq(x) is an ellipse titled at 30 degrees and the major-to-minor axis ratio is 3:1

      • γ : Parameter for controlling vagueness (default: γ = 10)

    • Probability threshold value: θ = 0.01

    • No. of samples for Monte Carlo method: 1,000,000

pq(x)


Candidate objects in strategy 1
Candidate Objects in Strategy 1 with Imprecise Locations

Bounding box of the θ-region

Query center

Voronoi regions for candidate objects

Voronoi regions for answer objects


Candidate objects in strategy 2
Candidate Objects in Strategy 2 with Imprecise Locations

Query center

Voronoi regions for candidate objects

Voronoi regions for answer objects


Candidate objects in hybrid strategy
Candidate Objects in Hybrid Strategy with Imprecise Locations

Bounding box of the θ-region

Query Center

Voronoi regions for candidate objects

Voronoi regions for answer objects


Experimental results default parameters
Experimental Results with Imprecise Locations - Default Parameters

  • Most of the processing time is spent in numerical integration

    • A strategy that can prune more objects has better performance


Experimental results different values
Experimental Results with Imprecise Locations - Different γValues

γ = 1(almost exact)

γ = 50

(too vague)

γ = 10

Query is costly when impreciseness is high


Experimental results different shapes
Experimental Results with Imprecise Locations - Different Shapes

Circle

Default Ellipse

Narrow Ellipse

Narrow ellipse (correlated distribution) needs to consider additional candidates


Conclusions of experimental results
Conclusions of Experimental Results with Imprecise Locations

  • Most of the processing time is spent in numerical integration

    • A strategy that can prune more objects has better performance

  • Superiority or inferiority of two strategies depends on the given query and the specified parameters

    • No apparent winner

  • The hybrid strategy inherits the benefits of two strategies and shows the best performance


Outline4
Outline with Imprecise Locations

  • Background and Problem Formulation

  • Related Work

  • Query Processing Strategies

  • Experimental Results

  • Conclusions


Conclusions
Conclusions with Imprecise Locations

  • Nearest neighbor query processing methods for imprecise query objects

    • Location of query object is represented by Gaussian distribution

    • Two strategies and their combination

    • Reduction of numerical integration is important

    • Proposal of two pruning strategies and their evaluation

  • Future work

    • Evaluation for multi-dimensional cases (d 3)


Finding probabilistic nearest neighbors for query objects with imprecise locations1

Finding Probabilistic Nearest Neighbors for Query Objects with Imprecise Locations

Yuichi Iijima and Yoshiharu IshikawaNagoya University, Japan


ad