Hierarchical Cellular Tree: An Efficient Indexing Scheme for Content-Based Retrieval on Multimedia Databases

Serkan Kiranyaz and Moncef Gabbouj

Objective
  • To present the Hierarchical Cellular Tree (HCT) as an indexing scheme for content-based retrieval on multimedia databases.
Why is this technique important?
  • Improvements in hardware and network technology
  • Daily use of the Internet
  • The technique reduces costly I/O operations
HCT Overview
  • Is a MAM (Metric Access Method) technique.
  • Based on the M-tree
  • Is a dynamic, cell-based, hierarchically structured indexing method
  • Items are partitioned based on distances and stored within cells based on their similarity proximity
  • Self-organized tree implemented via genetic programming principles
Indexing Technique Categories

SAM (spatial access method)

  • (Dis-)similarity is measured only through Euclidean distance.
    • Not suited for deep spanning trees

MAM (metric access method)

  • Support a black-box approach to the (dis-)similarity distance.
    • Allows for deep trees
  • Do not support dynamic changes*
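The "black box" idea can be sketched as follows: the search code only ever calls a distance function and never looks inside the items. This is a minimal illustration, not code from the presentation; the names `nearest`, `euclidean` and `hamming` are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def nearest(query: T, items: Sequence[T], dist: Callable[[T, T], float]) -> T:
    """Return the item closest to `query`; `dist` is used as a black box."""
    return min(items, key=lambda item: dist(query, item))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    # Vector-space distance, the only kind a SAM would accept.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def hamming(a: str, b: str) -> int:
    # Number of differing positions; assumes equal-length strings.
    return sum(ca != cb for ca, cb in zip(a, b))


# The same search routine works for vectors and for strings.
closest_point = nearest((0.0, 0.0), [(3.0, 4.0), (1.0, 1.0)], euclidean)
closest_word = nearest("karo", ["kart", "mint"], hamming)
```

Because `nearest` depends only on the callable, swapping the feature type means swapping the distance function and nothing else.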
*M-tree Similarities
  • Is a dynamic MAM
  • Has a hierarchical structure based on the mitosis of a cell
    • Tree grows one level upwards whenever a split occurs at the top level
  • Each cell is represented by a nucleus (except the topmost cell)
M-tree Problems
  • Achieves a balanced tree with low I/O cost in large datasets
    • Problem: Multimedia databases are seldom balanced.
    • HCT: Cells are unbalanced and can vary in size
  • Must know the size of the database entries/cells before building (capacity M)
    • Problem: All M-tree structures can hit upper limits (the size is not dynamic)
    • HCT: Removes the limit on cell size as long as cells keep a definite "compactness" measure
M-tree Problems
  • M-tree compactness is only measured with respect to distance of nucleus to furthest object (covering radius)
    • Problem: Determining compactness this way does not allow for dynamic sizing of cells.
    • HCT: Uses all cell items and their minimum distances to the cell (instead of a single nucleus item alone); compactness is constantly being updated.
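The contrast between the two measures can be sketched as follows. The slides do not give the exact HCT compactness formula, so `mean_min_distance` is a stand-in chosen for illustration (the average distance from each item to its closest neighbour in the cell); all names here are hypothetical.

```python
def covering_radius(nucleus, items, dist):
    """M-tree style measure: distance from the nucleus to the furthest item."""
    return max(dist(nucleus, x) for x in items)


def mean_min_distance(items, dist):
    """Stand-in for an all-items measure: average nearest-neighbour distance.

    Assumes the cell holds at least two items.
    """
    mins = [
        min(dist(a, b) for j, b in enumerate(items) if j != i)
        for i, a in enumerate(items)
    ]
    return sum(mins) / len(mins)


def abs_dist(a, b):
    return abs(a - b)


cell = [0.0, 1.0, 2.0, 10.0]          # three close items and one outlier
radius = covering_radius(1.0, cell, abs_dist)
spread = mean_min_distance(cell, abs_dist)
```

In this toy cell the covering radius is set entirely by the single outlier, while the all-items measure also reflects how close the other three items are to each other.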
Related Work in Multimedia Databases (SAM trees)
  • KD-Trees
    • Hierarchical tree structure
    • Use space-partitioning methods to divide the feature space along predefined hyperplanes
  • R-Trees
    • Feature space divided according to distribution of database items
    • Region overlapping may occur
Related Work in Multimedia Databases (SAM trees)
  • R*-trees
    • Improves the node splitting of R-tree by taking overlapping areas into consideration
  • TV-tree
    • Uses telescope vectors
    • The authors refer to them only as "so-called telescope vectors"
    • A Google search for telescope vectors does not return anything meaningful
Related Work in Multimedia Databases (SAM trees)
  • X-tree
    • Avoids overlapping of region bounding boxes by using a new organization of the directory
    • Boxes can still intersect at higher levels in the tree
    • The paper does not go into detail on what a bounding box is (assumption: bounding box = cell)
  • SS-tree
    • Uses minimum bounding spheres instead of boxes
    • Fewer intersections at higher levels
Related Work in Multimedia Databases (MAM trees)
  • vp-tree (vantage point)
    • Organizes feature vectors (data points) into two groups according to their similarity distances with respect to a single point (the vantage point)
  • mvp-tree (multiple vantage point)
    • Assigns multiple vantage points instead of one
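A single vp-tree partitioning step can be sketched as follows. Splitting at the median distance is a common choice and is assumed here, since the slides do not state the split rule; `vp_split` is a hypothetical name.

```python
import statistics


def vp_split(vantage, points, dist):
    """Split `points` into an inner and an outer group around `vantage`.

    The boundary is the median distance to the vantage point (an assumption).
    """
    distances = [dist(vantage, p) for p in points]
    boundary = statistics.median(distances)
    inner = [p for p, d in zip(points, distances) if d <= boundary]
    outer = [p for p, d in zip(points, distances) if d > boundary]
    return boundary, inner, outer


boundary, inner, outer = vp_split(0.0, [1.0, 2.0, 3.0, 4.0], lambda a, b: abs(a - b))
```

A full vp-tree would apply the same step recursively to `inner` and `outer`; an mvp-tree would use more than one vantage point per step.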
HCT Structure - Cell Structure
  • Basic container in which similar database items are stored.
  • Ground-level cells contain all of the database items
  • Cells carry an MST (Minimum Spanning Tree)
    • Holds minimum (dis-)similarity distance of each item to other items within the cell.
    • Used to determine when mitosis should occur.
      • Splits occur at the longest branch.
    • This is very similar to the mvp-tree, except that every cell is treated as a vantage point.
      • Gives a better idea of the similarity proximity of an item.
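The per-cell MST and the split at its longest branch can be sketched as follows. This is a minimal illustration using Prim's algorithm on a small in-memory cell; the function names and the edge representation `(weight, i, j)` are assumptions, not taken from the presentation.

```python
def build_mst(items, dist):
    """Prim's algorithm; returns MST edges as (weight, i, j) over item indices."""
    n = len(items)
    if n < 2:
        return []
    # best[j] = (cheapest known distance from the tree to item j, tree endpoint)
    best = {j: (dist(items[0], items[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        weight, i = best.pop(j)
        edges.append((weight, i, j))
        for k in best:
            d = dist(items[j], items[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


def split_at_longest_branch(n, edges):
    """Remove the longest MST branch and return the two index groups."""
    longest = max(edges)                      # tuples compare by weight first
    adjacency = {i: [] for i in range(n)}
    for edge in edges:
        if edge != longest:
            _, i, j = edge
            adjacency[i].append(j)
            adjacency[j].append(i)
    side, stack = {longest[1]}, [longest[1]]  # flood fill from one endpoint
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in side:
                side.add(v)
                stack.append(v)
    return sorted(side), sorted(set(range(n)) - side)


cell = [0.0, 1.0, 2.0, 10.0, 11.0]
mst = build_mst(cell, lambda a, b: abs(a - b))
left, right = split_at_longest_branch(len(cell), mst)
```

Here the longest branch joins the items 2.0 and 10.0, so cutting it separates the two natural clusters.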
HCT Structure - Cell Structure
  • Cells cannot undergo mitosis before reaching a specific level of maturity
    • This mirrors how biological cells behave
    • The reason for it, however, is not a biological one
  • Nucleus
    • Represents its owner cell on the higher level
    • Nucleus is found through MST
      • Item with maximum number of branches
    • Nucleus is updated with every operation performed
      • M-tree does not do this
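Choosing the nucleus as the item with the most MST branches can be sketched as follows. Edges are assumed to be `(weight, i, j)` tuples over item indices, and ties are broken by lowest index, which is an assumption since the slides do not say how ties are handled.

```python
from collections import Counter


def find_nucleus(edges):
    """Return the index of the item with the most MST branches."""
    degree = Counter()
    for _, i, j in edges:
        degree[i] += 1
        degree[j] += 1
    # Highest degree first; lowest index wins a tie (assumed rule).
    return min(degree, key=lambda i: (-degree[i], i))


star = [(1.0, 0, 1), (1.0, 0, 2), (2.0, 0, 3)]   # item 0 touches every branch
path = [(1.0, 0, 1), (1.0, 1, 2)]                # item 1 sits in the middle
```

Rerunning `find_nucleus` after each insertion, removal or mitosis would keep the nucleus current, which is the behaviour the slide attributes to HCT.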
HCT Structure - Cell Structure
  • Cell Compactness
    • How tightly clustered the items within the cell are
    • High variations are eliminated by using more than a single item (vantage point)
HCT Structure - Cell Structure
  • Cell Mitosis
    • Two conditions for mitosis
      • Maturity ($N_c > N_m$)
        • $N_c$ = number of items in the cell
        • $N_m$ = minimum maturity limit
      • Cell Compactness ($CF_c > CThr_L$)
        • $CF_c$ = compactness feature of the cell
        • $CThr_L$ = compactness threshold of the current level
    • Cell mitosis has no cost, as the cell is simply split by breaking the longest branch of its MST
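The two mitosis conditions can be sketched as a single predicate. The parameter names are hypothetical, and how the compactness feature itself is computed is not covered here.

```python
def mitosis_due(n_items: int, maturity_limit: int,
                compactness: float, level_threshold: float) -> bool:
    """Both slide conditions must hold: N_c > N_m and CF_c > CThr_L."""
    is_mature = n_items > maturity_limit
    exceeds_threshold = compactness > level_threshold
    return is_mature and exceeds_threshold
```

For example, a cell with 7 items, a maturity limit of 6, a compactness feature of 0.9 and a level threshold of 0.5 would be split, while the same cell with only 5 items would not.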
HCT Structure - Level Structure
  • The top level always consists of a single cell
    • If mitosis occurs on the top level, a new top level is created to preserve the single-cell top level.
  • Each level attempts to dynamically maximize compactness of cells
HCT Structure - HCT Operations
  • Three operations
    • Cell mitosis
    • Item insertion
    • Item removal
  • As stated before, all three operations cause a recalculation of compactness
HCT Structure - HCT Operations
  • Insert
    • First performs the Pre-Emptive cell search
      • Recursively descends the HCT from the top level to the target level
    • Once the target cell is located, inserts the item into it
    • Performs a post-processing check
      • Check for mitosis
      • Recalculate compactness for single or multiple cells
    • If mitosis was performed
      • Remove the old nucleus item from the higher level
      • Then call Insert for the new nucleus
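The descend-then-insert flow can be sketched as follows. This is a simplified illustration: it follows a single greedy path to the child with the closest nucleus, whereas the Pre-Emptive cell search is not detailed in the slides and may differ. The `Cell` class, its fields and the function names are all hypothetical, and mitosis itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    nucleus: float
    items: list = field(default_factory=list)      # ground-level payload
    children: list = field(default_factory=list)   # cells one level below


def find_target_cell(top, item, dist):
    """Greedy descent: at each level move to the child with the closest nucleus."""
    cell = top
    while cell.children:
        cell = min(cell.children, key=lambda c: dist(item, c.nucleus))
    return cell


def insert(top, item, dist, maturity_limit):
    """Insert `item` and report whether the target cell needs a mitosis check."""
    cell = find_target_cell(top, item, dist)
    cell.items.append(item)
    # Post-processing placeholder: only mature cells are candidates for mitosis.
    return cell, len(cell.items) > maturity_limit


low = Cell(nucleus=0.0, items=[0.0, 1.0])
high = Cell(nucleus=10.0, items=[10.0, 11.0])
top = Cell(nucleus=0.0, children=[low, high])
target, needs_check = insert(top, 9.0, lambda a, b: abs(a - b), maturity_limit=2)
```

A fuller version would, when the check leads to mitosis, remove the old nucleus from the parent cell and call `insert` again one level up for the new nucleus.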
HCT Structure - HCT Indexing
  • HCT can index using any set of available features
    • Must have fusion mechanism
    • Must have similarity measure
  • Consists of two operations
    • Incremental construction
    • Optional periodic fitness check
HCT Structure - HCT Indexing
  • HCT Incremental Construction
    • Takes a database D and appends all new items contained in an array
    • If an HCT does not already exist for database D
      • All current items of D are inserted into the Array
      • A new HCT body is constructed from D
    • Else if an HCT does exist for database D
      • HCT body is first loaded
      • HCT body is updated with contents of Array
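The control flow of incremental construction can be sketched as follows. A plain Python list stands in for the HCT body and `insert_item` for the real Insert operation; both are placeholders, as are the function and parameter names.

```python
def incremental_construction(database, new_items, hct=None, insert_item=list.append):
    """Build an index for `database` or update an existing one with `new_items`.

    `hct` is a stand-in for a previously loaded HCT body (None if none exists).
    """
    pending = list(new_items)
    if hct is None:
        # No index yet: every current database item must be indexed as well.
        pending = list(database) + pending
        hct = []
    for item in pending:
        insert_item(hct, item)
    database.extend(new_items)   # the new items become part of the database
    return hct


database = ["a", "b"]
hct = incremental_construction(database, ["c"])        # first build
hct = incremental_construction(database, ["d"], hct)   # later update
```

On the second call only the new item is inserted, which is what makes the construction incremental.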
HCT Structure - HCT Indexing
  • HCT Fitness Check
    • Aims to minimize corruption which can happen during construction of HCT body
      • Corruption happens because the order in which items are inserted is not controlled
    • Outliers Check
      • Reduces the "crowd effect" by removing redundant minority cells
        • Minority cells: cells containing only one or a few items
      • All minority cells are reintroduced into the system to see whether they fit into another cell
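The outliers check can be sketched as follows. Cells are modelled as plain lists, and an item "fits" a cell when its distance to the closest member is within `fit_radius`; that criterion, the parameter names and the function name are assumptions made for illustration, since the slides do not define the fit test.

```python
def outliers_check(cells, dist, minority_size=1, fit_radius=1.0):
    """Dissolve minority cells and try to re-home their items."""
    majority = [list(c) for c in cells if len(c) > minority_size]
    orphans = [x for c in cells if len(c) <= minority_size for x in c]
    result = list(majority)   # shares the inner lists with `majority`
    for item in orphans:
        # Closest majority cell, measured by its nearest member.
        best = min(majority,
                   key=lambda c: min(dist(item, y) for y in c),
                   default=None)
        if best is not None and min(dist(item, y) for y in best) <= fit_radius:
            best.append(item)
        else:
            result.append([item])   # a true outlier keeps its own cell
    return result


cells = [[0.0, 0.5, 1.0], [1.4], [50.0]]
cleaned = outliers_check(cells, lambda a, b: abs(a - b))
```

In this example the item 1.4 is absorbed by the large cell, while 50.0 remains alone because it fits nowhere.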
HCT Structure - HCT Indexing
  • Cell Merging
    • If a mitosis occurs that is later deemed not to meet the cell compactness requirements, the resulting cells can be merged again.