Preferential top-k search over local data

dissertation thesis

RNDr. Martin Šumák

supervisor: doc. RNDr. Stanislav Krajči, PhD.

consultant: RNDr. Peter Gurský, PhD.

Outline
  • Top-k search
    • motivation and example
    • restrictions and assumptions
  • R-tree-based solution
    • normalization of data
    • R++-tree
  • Grid file-based solution
  • Experiments
    • comparison with the B+-tree-based solution, table scan, etc.

Preferential top-k search over local data, dissertation thesis, RNDr. Martin Šumák

Top-k search
  • Example
    • find the top 20 apartments with 3 or 4 rooms, not on the first floor, priced at about 60,000 euro and not exceeding 70,000 euro
    • moreover, price is the most important attribute and floor is the least important one

Top-k query
  • k = 20
  • preferences over attribute values: fuzzy functions
  • importance of attributes – weights

$w_{\text{price}} = 3$, $w_{\text{rooms}} = 2$, $w_{\text{floor}} = 1$

Top-k query
  • Overall value of object O is

$3 \cdot f_{\text{price}}(O_{\text{price}}) + 2 \cdot f_{\text{rooms}}(O_{\text{rooms}}) + 1 \cdot f_{\text{floor}}(O_{\text{floor}})$

  • In general

$c(f_{\text{price}}(O_{\text{price}}), f_{\text{rooms}}(O_{\text{rooms}}), f_{\text{floor}}(O_{\text{floor}}))$

The combination function c has to be monotone: raising any single preference value must never lower the overall value.
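
As a rough illustration of the weighted sum above, the following Python sketch scores a single apartment. The weights 3, 2 and 1 come from the example query; the trapezoid shape of the fuzzy functions, their breakpoints, and the names `trapezoid` and `overall_value` are assumptions made up for this sketch, since the slides do not give the exact functions.

```python
def trapezoid(a, b, c, d):
    """Hypothetical fuzzy function: 0 outside (a, d), 1 on [b, c], linear in between."""
    def f(x):
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return f

# Assumed preference functions for the apartment example.
f_price = trapezoid(40000, 58000, 62000, 70000)      # "about 60000"
f_rooms = trapezoid(2, 3, 4, 5)                      # 3 or 4 rooms
f_floor = lambda floor: 0.0 if floor == 1 else 1.0   # not the first floor

def overall_value(apartment):
    """Weighted sum with the weights from the query: price 3, rooms 2, floor 1."""
    return (3 * f_price(apartment["price"])
            + 2 * f_rooms(apartment["rooms"])
            + 1 * f_floor(apartment["floor"]))

print(overall_value({"price": 66000, "rooms": 3, "floor": 2}))  # 4.5
```

A weighted sum with non-negative weights is monotone in the sense required of c, because increasing any one preference value can only increase the sum.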

The goal of top-k search
  • to find the top-k objects efficiently
    • by processing the minimum amount of data
  • restrictions and assumptions
    • all the data is accessible locally
    • all attributes are numerical

R-tree-based solution
  • object
    • a vector of n numbers
    • a point in n-dimensional space
  • such points can be indexed by an R-tree, R*-tree, R+-tree or R++-tree

From kNN to top-k search
  • k nearest neighbours (kNN)
    • a known incremental algorithm exists
    • the distance from the "query point Z" is the measure of "closeness"

From kNN to top-k search
  • top-k search
    • overall value (h) is the measure of “goodness”
    • replacing the distance with the overall value and reversing the order turns the kNN result into a top-k result
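
The idea of swapping distance for overall value can be sketched as a best-first search with a priority queue. This is only an illustration under simplifying assumptions, not the thesis algorithm: points are assumed to hold already-evaluated preference values, so for a monotone c the upper corner of a bounding box bounds every point inside it from above. The `build` helper is a toy median-split hierarchy standing in for an R-tree, and all names are hypothetical.

```python
import heapq
from itertools import count

def build(points, leaf_size=2, depth=0):
    """Toy hierarchy of bounding boxes (a stand-in for an R-tree), built by median splits."""
    dims = len(points[0])
    lo = tuple(min(p[d] for p in points) for d in range(dims))
    hi = tuple(max(p[d] for p in points) for d in range(dims))
    if len(points) <= leaf_size:
        return {"lo": lo, "hi": hi, "points": list(points), "children": []}
    axis = depth % dims
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    kids = [build(pts[:mid], leaf_size, depth + 1),
            build(pts[mid:], leaf_size, depth + 1)]
    return {"lo": lo, "hi": hi, "points": [], "children": kids}

def top_k(root, c, k):
    """Return up to k (point, value) pairs with the highest c(point), best first.

    c must be monotone, so c(box upper corner) is an upper bound for the box.
    """
    tie = count()  # unique tie-breaker so heap entries never compare dicts
    heap = [(-c(root["hi"]), next(tie), False, root)]
    result = []
    while heap and len(result) < k:
        neg_value, _, is_point, item = heapq.heappop(heap)
        if is_point:
            # No remaining box or point can beat this value.
            result.append((item, -neg_value))
            continue
        for p in item["points"]:
            heapq.heappush(heap, (-c(p), next(tie), True, p))
        for child in item["children"]:
            heapq.heappush(heap, (-c(child["hi"]), next(tie), False, child))
    return result

points = [(0.5, 1.0, 1.0), (1.0, 1.0, 0.0), (0.2, 0.3, 0.9),
          (0.9, 0.1, 0.4), (0.0, 0.0, 0.0), (1.0, 0.5, 1.0)]
weighted = lambda p: 3 * p[0] + 2 * p[1] + 1 * p[2]
best = top_k(build(points), weighted, 3)
```

Compared with incremental kNN, the queue is ordered by the largest possible overall value instead of the smallest possible distance; boxes whose bound is lower than the k-th result are never opened.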

Analogy of kNN and top-k search

[Diagram: kNN and top-k side by side, compared for correctness and efficiency]

Disproportion of attribute values
  • floor, area, price: very different ranges
    • solution: normalization, i.e. a linear transformation of attribute values to the interval [0, 1]
  • another disproportion comes from the weights
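
The normalization mentioned above can be sketched as a per-attribute min-max rescaling. The sample rows and the choice to map a constant attribute to 0.0 are assumptions for this sketch only.

```python
def normalize(rows):
    """Linearly map every attribute (column) of rows to the interval [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column has no range; mapping it to 0.0 is an arbitrary choice.
        tuple((x - lo) / (hi - lo) if hi > lo else 0.0
              for x, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

# Made-up apartments as (floor, area, price).
apartments = [(1, 40, 50000), (3, 70, 60000), (5, 100, 70000)]
print(normalize(apartments))
# [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)]
```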

Normalization applicability
  • Useful for
    • R*-tree
  • Meaningless for
    • R-tree (proven for the quadratic split method)
    • R+-tree, R++-tree
    • Grid file

Why the R++-tree
  • Requiring both zero overlap and minimum bounding rectangles may cause a problem when adding a new object
  • The R+-tree avoids overlaps at the price of rectangle size

The R++-tree idea
  • Requiring both zero overlap and minimum bounding rectangles may cause a problem when adding a new object
  • The R++-tree keeps two rectangles for each node: the minimum bounding one and the covering one from its parent

The R++-tree properties
  • Height-balanced
  • Zero overlaps
  • Overflow nodes at leaf level only
  • Minimum node occupancy is 1
  • For the top-k search purposes, attribute values can be strings or any other comparable values (not just numbers)

Top-k search over Grid file
  • Grid file is a spatial index for point data
  • We used a static Grid file without an extra directory
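
To make the notion of a static grid without a directory more concrete, here is a toy sketch in which fixed split points per dimension map each point straight to a bucket. The class `StaticGrid`, its methods, the split values and the domain bounds are all hypothetical; the slides do not describe the actual layout or the search procedure.

```python
from bisect import bisect_right
from collections import defaultdict

class StaticGrid:
    """Toy static grid: fixed split points per dimension, cells map straight to buckets."""

    def __init__(self, splits):
        self.splits = splits              # one sorted list of split points per dimension
        self.buckets = defaultdict(list)  # cell index tuple -> points in that cell

    def cell_of(self, point):
        """Index of the cell containing point, one interval number per dimension."""
        return tuple(bisect_right(s, x) for s, x in zip(self.splits, point))

    def insert(self, point):
        self.buckets[self.cell_of(point)].append(point)

    def cell_box(self, cell, lo=0.0, hi=1.0):
        """Lower and upper corner of a cell, assuming every attribute lies in [lo, hi]."""
        lows = tuple(([lo] + s)[i] for s, i in zip(self.splits, cell))
        highs = tuple((s + [hi])[i] for s, i in zip(self.splits, cell))
        return lows, highs

grid = StaticGrid([[0.5], [0.25, 0.75]])
grid.insert((0.6, 0.3))
print(grid.cell_of((0.6, 0.3)))  # (1, 1)
print(grid.cell_box((1, 1)))     # ((0.5, 0.25), (1.0, 0.75))
```

Each cell is an axis-aligned box, so a top-k search could bound the best overall value inside a cell in the same spirit as for tree rectangles.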

Top-k search over Grid file
  • We have proven correctness and efficiency for this solution as well
