Reverse Spatial and Textual k Nearest Neighbor Search

Presentation Transcript
Outline

Motivation & Problem Statement

Related Work

RSTkNN Search Strategy

Experiments

Conclusion


Motivation
  • If we add a new shop at Q, which shops will be influenced?
  • Influence facts
    • Spatial Distance
      • Results: D, F
    • Textual Similarity
      • Services/Products...
      • Results: F, C

[Figure: map of shops around the query location Q, labeled by category: clothes, food, sports]

Problems of Finding Influential Sets

Traditional query: reverse k nearest neighbor query (RkNN)

Our new query: reverse spatial and textual k nearest neighbor query (RSTkNN)

Problem Statement
  • Spatial-Textual Similarity
    • describes the similarity between objects based on both spatial proximity and textual similarity.
  • Spatial-Textual Similarity Function

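The similarity function itself appears only as a figure in the slides. As one plausible instantiation (the paper's exact formula may differ), the sketch below combines a normalized Euclidean distance with cosine textual similarity, weighted by α; the names `sim_st` and `max_dist` are illustrative:

```python
import math

def sim_st(o1, o2, alpha, max_dist):
    """Spatial-textual similarity: alpha weighs the spatial part vs. the textual part."""
    # Spatial part: Euclidean distance normalized into a [0, 1] similarity.
    dist = math.hypot(o1["x"] - o2["x"], o1["y"] - o2["y"])
    spatial = 1.0 - dist / max_dist
    # Textual part: cosine similarity of term-weight vectors.
    dot = sum(w * o2["terms"].get(t, 0.0) for t, w in o1["terms"].items())
    n1 = math.sqrt(sum(w * w for w in o1["terms"].values()))
    n2 = math.sqrt(sum(w * w for w in o2["terms"].values()))
    textual = dot / (n1 * n2) if n1 > 0 and n2 > 0 else 0.0
    return alpha * spatial + (1.0 - alpha) * textual
```

With α = 1 the score reduces to pure spatial proximity; with α = 0, to pure textual similarity.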

Problem Statement (cont’d)
  • RSTkNN query
    • finds the objects that have the query object as one of their k most spatial-textually similar objects.

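The definition above can be checked directly with a brute-force sketch: an object o is an answer iff the query ranks among o's k most similar objects. `rstknn_bruteforce` is illustrative, not the paper's algorithm:

```python
def rstknn_bruteforce(query, objects, k, sim):
    """o is an answer iff the query is among o's k most similar objects."""
    answers = []
    for o in objects:
        # o competes against every other object plus the query itself.
        competitors = [p for p in objects if p is not o] + [query]
        topk = sorted(competitors, key=lambda p: sim(o, p), reverse=True)[:k]
        if any(p is query for p in topk):
            answers.append(o)
    return answers
```

For example, with 1D points and similarity defined as negative distance, a query at 3 amid {0, 1, 5, 6} is in everyone's top-2 but in nobody's top-1.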


Related Work
  • Pre-computing the kNN for each object
    • (Korn et al., SIGMOD 2000; Yang et al., ICDE 2001)
  • (Hyper) Voronoi cell/plane pruning strategy
    • (Tao et al., VLDB 2004; Wu et al., PVLDB 2008; Kriegel et al., ICDE 2009)
  • 60-degree-pruning method
    • (Stanoi et al., SIGMOD 2000)
  • Branch and bound (based on Lp-norm metric space)
    • (Achtert et al., SIGMOD 2006; Achtert et al., EDBT 2009)

Challenging features:

  • Euclidean geometric properties are lost.
  • The text space is high-dimensional.
  • k and α differ from query to query.


Baseline method

For each object o in the database:
  • Precompute o's spatial NNs and textual NNs.
  • Given query q, k and α, combine the two lists with the Threshold Algorithm to obtain o's spatial-textual kNN o'.
  • If q is more similar to o than o' is, report o as a result; if q is no more similar than o', discard o.

Inefficient, since it lacks a suitable data structure.

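The Threshold Algorithm step of the baseline can be sketched as follows. This is a toy version that assumes both ranked lists are sorted descending and cover the same object ids (the real algorithm performs random accesses into the lists); all names are illustrative:

```python
import heapq

def threshold_topk(spatial, textual, alpha, k):
    """Fagin-style Threshold Algorithm over two (id, score) lists sorted descending."""
    s_score, t_score = dict(spatial), dict(textual)
    seen = {}
    for (sid, ss), (tid, ts) in zip(spatial, textual):
        # "Random access": complete the aggregate score of each newly seen id.
        for oid in (sid, tid):
            if oid not in seen:
                seen[oid] = alpha * s_score[oid] + (1 - alpha) * t_score[oid]
        # Threshold: the best aggregate any still-unseen object could achieve.
        threshold = alpha * ss + (1 - alpha) * ts
        top = heapq.nlargest(k, seen.values())
        if len(top) == k and top[-1] >= threshold:
            break  # the current top-k can no longer be displaced
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
```

The early stop is what makes TA attractive: once the k-th best aggregate score seen so far reaches the threshold, deeper list entries cannot change the answer.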


Main Idea of Search Strategy

Prune an entry E in the IUR-tree when query q is no more similar than kNNL(E).

Report an entry E as a result when query q is more similar than kNNU(E).

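A minimal sketch of the two rules as a decision function. Whether the comparisons are strict or non-strict is a detail not visible in the slides; the bound values are assumed precomputed and passed in:

```python
def classify_entry(max_st_q, min_st_q, knn_l, knn_u):
    """Decide the fate of an IUR-tree entry E given its bounds w.r.t. query q."""
    if max_st_q < knn_l:    # q can never be among the kNN of any object in E
        return "prune"
    if min_st_q > knn_u:    # q is surely among the kNN of every object in E
        return "report"
    return "expand"         # undecided: expand E and refine the bounds
```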

How to Compute the Bounds

Similarity approximations

MinST(E, E’):

TightMinST(E, E’):

MaxST(E, E’):


Example for Computing Bounds

Entries traveled so far: N1, N2, N3. Given k=2, compute kNNL(N1) and kNNU(N1).

Compute kNNL(N1) (values in decreasing order):
  • TightMinST(N1, N3) = 0.564, TightMinST(N1, N2) = 0.179
  • MinST(N1, N3) = 0.370, MinST(N1, N2) = 0.095
  • kNNL(N1) = 0.370

Compute kNNU(N1) (values in decreasing order):
  • MaxST(N1, N3) = 0.432, MaxST(N1, N2) = 0.150
  • kNNU(N1) = 0.432


Overview of Search Algorithm
  • RSTkNN Algorithm:
    • Traverse the IUR-tree from the root
    • Progressively update lower and upper bounds
    • Apply the search strategy:
      • prune unrelated entries into Pruned;
      • report entries that must be results into Ans;
      • add candidate objects to Cnd.
    • FinalVerification
      • For objects in Cnd, decide whether they are results by updating their bounds, expanding entries from Pruned.

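The traversal loop can be sketched on a toy tree whose entries carry precomputed bounds. The real algorithm maintains these bounds incrementally and uses a priority queue plus the final verification step, which this sketch omits; the field names are illustrative:

```python
def rstknn_traverse(root):
    """Sketch of the RSTkNN traversal; each entry is a dict carrying toy
    precomputed bounds (max_sim_q, min_sim_q) w.r.t. the query and
    (knn_l, knn_u) for the k-th NN similarity of its objects."""
    answers, candidates, pruned, frontier = [], [], [], [root]
    while frontier:
        e = frontier.pop()
        if e["max_sim_q"] < e["knn_l"]:
            pruned.append(e["name"])        # q cannot be a top-k neighbor
        elif e["min_sim_q"] > e["knn_u"]:
            answers.append(e["name"])       # q surely is a top-k neighbor
        elif e.get("children"):
            frontier.extend(e["children"])  # undecided entry: expand it
        else:
            candidates.append(e["name"])    # undecided leaf: verify later
    return answers, candidates, pruned
```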

Example: Execution of the RSTkNN Algorithm on the IUR-tree, given k=2, alpha=0.6

[Figure: IUR-tree with root N4, child nodes N1, N2, N3, and objects p1–p5]

Step 1:
  • Initialize N4.CLs; EnQueue(U, N4)
  • U: N4 (0, 0)

Step 2 (mutual effect among N1, N2, N3):
  • DeQueue(U, N4); EnQueue(U, N2); EnQueue(U, N3); Pruned.add(N1)
  • Pruned: N1 (0.37, 0.432)
  • U: N3 (0.323, 0.619), N2 (0.21, 0.619)

Step 3 (mutual effect: p4 with N2, p5):
  • DeQueue(U, N3); Answer.add(p4); Candidate.add(p5)
  • Pruned: N1 (0.37, 0.432)
  • Answer: p4 (0.21, 0.619)
  • U: N2 (0.21, 0.619)
  • Candidate: p5 (0.374, 0.374)

Step 4 (mutual effect: p2, p3 with p4, p5):
  • DeQueue(U, N2); Answer.add(p2, p3); Pruned.add(p5)
  • Pruned: N1 (0.37, 0.432), p5 (0.374, 0.374)
  • Answer: p4, p2, p3
  • Since U and Candidate are now empty, the algorithm ends.
  • Results: p2, p3, p4.

Cluster IUR-tree: CIUR-tree

IUR-tree: texts in an index node could be very different.

CIUR-tree: an IUR-tree enhanced by incorporating textual clusters.


Optimizations
  • Motivation
    • To give a tighter bound during CIUR-tree traversal
    • To purify the textual description in the index node
  • Outlier Detection and Extraction (ODE-CIUR)
    • Extract subtrees with outlier clusters
    • Take the outliers into special account and calculate their bounds separately.
  • Text-entropy based optimization (TE-CIUR)
    • Define TextEntropy to describe the distribution of text clusters in an entry of the CIUR-tree
    • Visit entries with higher TextEntropy first, i.e., those with more diverse texts.

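One natural way to realize TextEntropy (the paper's exact definition may differ) is the Shannon entropy of the text-cluster distribution inside an entry; higher values indicate more diverse texts, so such entries are visited first:

```python
import math

def text_entropy(cluster_counts):
    """Shannon entropy of the text-cluster distribution inside an index entry."""
    total = sum(cluster_counts)
    probs = [c / total for c in cluster_counts if c > 0]
    # 0 when all texts fall in one cluster; maximal for a uniform spread.
    return -sum(p * math.log2(p) for p in probs)
```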

Experimental Study
  • Experimental Setup
    • OS: Windows XP; CPU: 2.0GHz; Memory: 4GB
    • Page size: 4KB; Language: C/C++.
  • Compared Methods
    • baseline, IUR-tree, ODE-CIUR, TE-CIUR, and ODE-TE.
  • Datasets
    • ShopBranches (Shop), extended from a small real dataset
    • GeographicNames (GN), real data
    • CaliforniaDBpedia (CD), generated by combining locations in California with documents from DBpedia
  • Metric
    • Total query time
    • Page access number


Scalability

(1) Log-scale version

(2) Linear-scale version


Effect of k

(a) Query time

(b) Page access


Conclusion
  • Propose a new query problem, RSTkNN.
  • Present a hybrid index, the IUR-Tree.
  • Present an efficient search algorithm to answer the queries.
  • Show the enhanced variant CIUR-Tree and two optimizations, ODE-CIUR and TE-CIUR, which further speed up search.
  • Extensive experiments confirm the efficiency and scalability of our algorithms.


A straightforward method
  • Compute RSkNN and RTkNN, respectively;
  • Combine the results of RSkNN and RTkNN to obtain the RSTkNN results.

There is no sensible way to do the combination. (Infeasible)