Supporting Noise-Free Queries in Large Image Databases
Image Retrieval

[Figure: image retrieval pipeline. Labels: Database Images, Query Image, Image Database, Feature Extraction, Metadatabase, Feature Vectors, Select, Compare, Query Result]

Noise-Free Queries (NFQs)

  • An NFQ is more precise than a rectangular query because it leaves out the less relevant (noisy) areas.

  • The user can specify semantic constraints:

    • Spatial constraints (relative distances)

    • Scaling constraints (relative sizes)

[Figure labels: Rectangular query; Less relevant]


  • How do we extract features if we do not know the matching areas beforehand?

  • How do we index the images?



One Solution – Local Color Histogram (LCH)

  • Each subimage has a color histogram.

  • Any combination of the histograms can be selected for comparison with the corresponding color histograms of the query image.
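
A minimal sketch of the LCH idea, assuming a grayscale image stored as a list of rows and a fixed grid of subimages. The function names, the 4-bin quantization, and the L1 histogram distance are illustrative assumptions; the slides do not specify them.

```python
def local_histograms(image, grid, bins=4, levels=256):
    """Split `image` (rows of ints in [0, levels)) into grid x grid subimages
    and return {(row, col): normalized histogram} for each subimage."""
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // grid, width // grid
    hists = {}
    for gr in range(grid):
        for gc in range(grid):
            counts = [0] * bins
            for y in range(gr * cell_h, (gr + 1) * cell_h):
                for x in range(gc * cell_w, (gc + 1) * cell_w):
                    counts[image[y][x] * bins // levels] += 1
            total = cell_h * cell_w
            hists[(gr, gc)] = [c / total for c in counts]
    return hists


def l1_distance(h1, h2):
    """Sum of absolute bin differences (an assumed distance measure)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def lch_distance(query_hists, db_hists, selected_cells):
    """Compare only the subimages the user selected for the query."""
    return sum(l1_distance(query_hists[c], db_hists[c]) for c in selected_cells)


# Two 4x4 images that differ only in the top-right subimage.
query = [[0, 0, 255, 255], [0, 0, 255, 255], [10, 10, 200, 200], [10, 10, 200, 200]]
other = [[0, 0, 0, 0], [0, 0, 0, 0], [10, 10, 200, 200], [10, 10, 200, 200]]
q, d = local_histograms(query, grid=2), local_histograms(other, grid=2)
print(lch_distance(q, d, [(0, 0)]))   # selected cell matches: 0.0
print(lch_distance(q, d, list(q)))    # all cells, including the differing one: 2.0
```

Selecting only the top-left cell ignores the differing region entirely, which is the sense in which any combination of histograms can be chosen for comparison.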

Limitations of LCH

  • Dilemma:

    • Using large partitions is not precise

    • Using small partitions is too expensive

  • Limitation:

    • Difficult to handle scaling

Sampling-Based Approach

  • Idea:

    • Sample 113 blocks of size 16x16 from each image

    • Compare only the relevant blocks

  • Low storage overhead

  • Supports NFQs

  • Robust to translation and scaling

  • Supports spatial and scaling constraints
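
A minimal sketch of the sampling step, under stated assumptions. The slides do not say how the 113 blocks are laid out; the sketch assumes an 8x8 grid interleaved with a 7x7 grid (64 + 49 = 113). It also assumes the blocks are 16x16 pixels and that each block is summarized by its average gray level, stored as one byte. All function names are hypothetical.

```python
def sample_points():
    """Lattice positions of the assumed layout: an 8x8 grid at even
    coordinates plus a 7x7 grid at odd coordinates (0..14 on each axis)."""
    return [(x, y) for y in range(15) for x in range(15) if x % 2 == y % 2]


def block_byte(image, left, top, size=16):
    """Average gray level of one size x size block, as an integer in 0..255
    (assumed one-byte summary; the slides only say one byte per block)."""
    total = sum(image[y][x]
                for y in range(top, top + size)
                for x in range(left, left + size))
    return total // (size * size)


def sample_image(image, size=16):
    """Return {(lattice_x, lattice_y): byte} for all 113 sampled blocks.
    The image must be at least `size` pixels in each dimension."""
    height, width = len(image), len(image[0])
    step_x = (width - size) / 14   # spread the lattice across the image
    step_y = (height - size) / 14
    return {(x, y): block_byte(image, round(x * step_x), round(y * step_y), size)
            for (x, y) in sample_points()}


image = [[7] * 128 for _ in range(128)]   # uniform test image
samples = sample_image(image)
print(len(samples))                        # 113 sampled blocks
```

Whatever the actual layout, the stored representation stays at one value per sampled block, so comparing two images touches at most 113 values.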


Handling Scaling

  • All database images are sampled at one fixed rate (a).

  • The query is sampled at three different rates:

    • A higher rate to find larger matching objects (b)

    • The same rate to find matching objects of the same size (c)

    • A lower rate to find smaller matching objects (d)


  • We slide square windows of sizes 25, 41, 61, 85, and 113 sampled blocks over each database image.

  • 85 indexing subimages are captured at the various sliding positions.
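
The slides do not explain where the window sizes or the count of 85 come from. One layout that reproduces every quoted number is an 8x8 grid of sampled blocks interleaved with a 7x7 grid, with square windows whose corners fall on sampled blocks. The sketch below enumerates the windows under that assumption; the layout and the names are hypothetical.

```python
LATTICE = 15  # lattice coordinates 0..14 on each axis (assumed layout)


def is_sample(x, y):
    """True where the assumed layout has a sampled block: both coordinates
    even (8x8 grid) or both odd (7x7 grid)."""
    return x % 2 == y % 2


def indexing_windows():
    """List (x0, y0, side, block_count) for every square window whose
    top-left corner is a sampled block and which fits inside the lattice."""
    windows = []
    for side in range(6, LATTICE, 2):          # sides 6, 8, 10, 12, 14
        for y0 in range(LATTICE - side):
            for x0 in range(LATTICE - side):
                if not is_sample(x0, y0):
                    continue
                blocks = sum(is_sample(x, y)
                             for y in range(y0, y0 + side + 1)
                             for x in range(x0, x0 + side + 1))
                windows.append((x0, y0, side, blocks))
    return windows


windows = indexing_windows()
print(sorted({w[3] for w in windows}))   # [25, 41, 61, 85, 113]
print(len(windows))                      # 85
```

Under this layout each window holds k*k + (k-1)*(k-1) blocks for k = 4..8, and the sliding positions on the two interleaved grids add up to 55 + 30 = 85.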

Signature Computation

  • For each indexing subimage, we compute its signature as seven average-variance pairs:

    • One from all the enclosed sampled blocks

    • Four from the sampled blocks in the four quarters

    • Two from the sampled blocks along the two diagonals

  • The first component of the signature is called the short signature.
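
A minimal sketch of the signature, assuming each sampled block carries one numeric value and sits on an integer lattice. How blocks on a window's center lines are assigned to quarters is not stated in the slides; here they go to the right and bottom quarters. The population variance and the function names are also assumptions.

```python
from statistics import fmean, pvariance


def avg_var(values):
    """One (average, variance) pair; population variance is assumed."""
    return fmean(values), pvariance(values)


def signature(blocks, x0, y0, side):
    """blocks: {(x, y): value}. The window covers [x0, x0+side] x [y0, y0+side].
    Returns 14 numbers: seven (average, variance) pairs in the slide's order."""
    inside = {(x - x0, y - y0): v for (x, y), v in blocks.items()
              if 0 <= x - x0 <= side and 0 <= y - y0 <= side}
    half = side / 2
    groups = [
        list(inside.values()),                                          # all blocks
        [v for (x, y), v in inside.items() if x < half and y < half],   # top-left
        [v for (x, y), v in inside.items() if x >= half and y < half],  # top-right
        [v for (x, y), v in inside.items() if x < half and y >= half],  # bottom-left
        [v for (x, y), v in inside.items() if x >= half and y >= half], # bottom-right
        [v for (x, y), v in inside.items() if x == y],                  # main diagonal
        [v for (x, y), v in inside.items() if x + y == side],           # anti-diagonal
    ]
    sig = []
    for group in groups:
        sig.extend(avg_var(group))
    return sig


# Uniform blocks on an interleaved lattice (assumed layout).
blocks = {(x, y): 7 for y in range(15) for x in range(15) if x % 2 == y % 2}
sig = signature(blocks, 0, 0, 6)
short_signature = sig[:2]   # assuming "first component" means the first pair
print(len(sig))             # 14
```

Seven pairs of two numbers each give the 14 values that place a subimage in the 14-dimensional signature space.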


  • For each image, we map its 85 subimages into signature points in a 14-dimensional signature space.

  • For each image, we cluster its 85 signature points into five MBRs (minimum bounding regions).

  • We insert these MBRs into an R*-tree (height-balanced, with reinsertion on overflow).
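
A minimal sketch of turning signature points into MBRs. The slides do not name the clustering method, so `chunk_clusters` is a deliberately naive stand-in (sort by the first coordinate and cut into runs), not the authors' procedure. The R*-tree itself is not shown because the Python standard library has none.

```python
def mbr(points):
    """Minimum bounding region: per-dimension lows and highs of the points."""
    lows = [min(coords) for coords in zip(*points)]
    highs = [max(coords) for coords in zip(*points)]
    return lows, highs


def chunk_clusters(points, k=5):
    """Naive stand-in for clustering: sort the points and cut them into
    runs of equal length. May yield fewer than k runs for tiny inputs."""
    ordered = sorted(points)
    size = -(-len(ordered) // k)          # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def image_mbrs(points, k=5):
    """One MBR per cluster of an image's signature points."""
    return [mbr(cluster) for cluster in chunk_clusters(points, k)]


def contains(box, point):
    lows, highs = box
    return all(lo <= c <= hi for lo, c, hi in zip(lows, point, highs))


# 85 made-up 14-dimensional signature points for one image.
points = [[float(i + d) for d in range(14)] for i in range(85)]
boxes = image_mbrs(points)
print(len(boxes))   # 5
```

Storing five boxes per image instead of 85 points keeps the index small, at the cost of a later filtering step to discard subimages that fall in a retrieved box but do not match.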

Query Processing – Preparation

  • Sample the query image at different rates. For each sampling rate, do the following steps.

  • Determine the core area that contains the maximum number of relevant sampled blocks and least noise.

  • Determine the query rectangle.

  • Compute the signature of the core area.
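
A minimal sketch of choosing the core area, assuming sampled blocks on an interleaved lattice and candidate windows whose corners fall on sampled blocks. The slides ask for the most relevant blocks and the least noise but give no formula; scoring a window as relevant blocks minus noise blocks is an assumption, as are the names.

```python
def is_sample(x, y):
    """Assumed layout: sampled blocks where both coordinates share parity."""
    return x % 2 == y % 2


def core_area(relevant, lattice=15, min_side=6):
    """relevant: set of (x, y) sampled blocks the user marked as noise-free.
    Returns (score, x0, y0, side) for the best-scoring square window."""
    best = None
    for side in range(min_side, lattice, 2):
        for y0 in range(lattice - side):
            for x0 in range(lattice - side):
                if not is_sample(x0, y0):
                    continue
                inside = [(x, y)
                          for y in range(y0, y0 + side + 1)
                          for x in range(x0, x0 + side + 1)
                          if is_sample(x, y)]
                hits = sum(p in relevant for p in inside)
                score = hits - (len(inside) - hits)   # relevant minus noise
                if best is None or score > best[0]:
                    best = (score, x0, y0, side)
    return best


# The user marks every sampled block in the square from (2, 2) to (8, 8).
relevant = {(x, y) for y in range(2, 9) for x in range(2, 9) if is_sample(x, y)}
print(core_area(relevant))   # (25, 2, 2, 6)
```

The signature of the chosen window would then be computed in the same way as for an indexing subimage.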

Query Processing – Search

  • Retrieve relevant clusters (or MBRs) from the R* tree using the query rectangle.

  • Eliminate irrelevant subimages (in the qualified MBRs) using the short signature.

  • Each subimage passing the above test is compared against the original NFQ by matching the corresponding sampled blocks.

  • Each image with a matching subimage is retrieved.
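
A minimal sketch of the three search stages. A linear scan over the MBRs stands in for the R*-tree lookup, and the tolerances `short_tol` and `block_tol`, the data layout of `index`, and the function names are all assumptions.

```python
def intersects(box, rect):
    """True if two axis-aligned boxes, each (lows, highs), overlap."""
    (lows, highs), (q_lows, q_highs) = box, rect
    return all(lo <= qh and ql <= hi
               for lo, hi, ql, qh in zip(lows, highs, q_lows, q_highs))


def search(index, query_rect, query_sig, query_blocks, short_tol, block_tol):
    """index: list of (mbr, [(image_id, signature, blocks), ...]).
    query_blocks: {(x, y): value} holding only the noise-free blocks."""
    results = set()
    for box, subimages in index:
        if not intersects(box, query_rect):          # stage 1: MBR lookup
            continue
        for image_id, sig, blocks in subimages:
            # Stage 2: cheap rejection using the short signature.
            if any(abs(a - b) > short_tol
                   for a, b in zip(sig[:2], query_sig[:2])):
                continue
            # Stage 3: match the corresponding sampled blocks.
            if all(p in blocks and abs(blocks[p] - v) <= block_tol
                   for p, v in query_blocks.items()):
                results.add(image_id)
    return results


index = [
    (([0, 0], [10, 10]), [("a", [5, 1], {(0, 0): 100, (1, 1): 50}),
                          ("b", [9, 9], {(0, 0): 100})]),
    (([50, 50], [60, 60]), [("c", [55, 55], {(0, 0): 100})]),
]
found = search(index, ([4, 0], [6, 2]), [5, 1], {(0, 0): 102},
               short_tol=1, block_tol=5)
print(found)   # {'a'}
```

Image "c" is dropped at the MBR stage, "b" at the short-signature stage, and only "a" reaches the block-by-block comparison.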

Query Processing – Summary

  • Sample the NFQ at different rates

  • Determine the core area and compute its signature

  • Determine the query rectangle

  • Retrieve relevant clusters (or MBRs) from the R* tree

  • Eliminate candidate subimages using the short signature

  • Match the sampled blocks of the remaining subimages

Performance Comparison

  • LCH

    • NFQ-capable

  • Correlogram

    • one of the best whole matching techniques

    • Can Correlogram support NFQs?

Experimental Studies

  • Database: 15,808 images of various categories

  • Workload: 100 queries

    • Type 1: Query and database images have the same size, and the NFQ covers less than half of the query image (30 queries)

    • Type 2: Query and database images have the same size, and the NFQ covers more than half of the query image (20 queries)

    • Type 3: Query and database images have different sizes (50 queries)

Type-3 Queries

  • Only SamMatch (the sampling-based approach) can handle Type-3 queries.

  • In the following example, there is no easy way to match the two identical apples using LCH.

Performance Results (Type 3)

The sizes of queries are different from those of database images













Performance Metric

  • Ai denotes a relevant image returned by the system

  • S is the scope of the query (i.e., maximum number of images returned)

  • q is the total number of relevant images in the database.

Rationale: low-ranked images do not make it to the user.


[Figures: Type-1 Results; Type-2 Results; Type-3 Results]

Time & Space

  • Assumption: No quick-and-dirty filtering, no indexing (since the LCH and Correlogram designs do not use them).

  • SamMatch requires much less storage overhead

    • LCH uses 21 color histograms: 21 × 256 × 2 bytes

    • Correlogram uses 4 color histograms: 4 × 256 × 2 bytes

    • SamMatch uses one byte per sampled block: 113 bytes

  • In terms of exhaustive search, SamMatch is

    • one time faster than Correlogram, and

    • two times faster than LCH
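
The storage figures quoted above can be worked out directly. This short calculation assumes, as the products suggest, 256 bins per histogram and 2 bytes per bin; the variable names are illustrative.

```python
lch_bytes = 21 * 256 * 2            # 21 histograms -> 10752 bytes per image
correlogram_bytes = 4 * 256 * 2     # 4 histograms  -> 2048 bytes per image
sammatch_bytes = 113                # one byte per sampled block

lch_ratio = lch_bytes / sammatch_bytes                  # roughly 95
correlogram_ratio = correlogram_bytes / sammatch_bytes  # roughly 18

print(lch_bytes, correlogram_bytes, sammatch_bytes)
print(round(lch_ratio, 1), round(correlogram_ratio, 1))
```

Both ratios exceed 16, so 113 bytes per image is below 1/16 of either alternative's storage.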

Concluding Remarks

  • Reducing noise interference is essential to achieving more reliable image retrieval

  • SamMatch supports NFQs effectively and efficiently

    • Two times faster than LCH, and one time faster than Correlogram

  • Other benefits of SamMatch include:

    • Matching objects at different scales

    • Uncovering translations of the matching areas

    • Handling spatial and scaling constraints

  • SamMatch uses less than 1/16 the storage space required by LCH and Correlogram