
Supporting Noise-Free Queries in Large Image Databases


mateo

Presentation Transcript


  1. Supporting Noise-Free Queries in Large Image Databases

  2. Image Retrieval [Diagram labels: Database Images, Query Image, Image Database, Feature Extraction, Select, Compare, Metadatabase, Feature Vectors, Query Result]

  3. Noise-Free Queries (NFQs) • An NFQ is more precise. • The user can specify semantic constraints: • Spatial constraints (relative distances) • Scaling constraints (relative sizes) [Figure labels: rectangular query, noise-free query; similar, less relevant]

  4. Challenges • How do we extract features if we do not know the matching areas beforehand? • How do we index the images? [Figure label: noise-free query]

  5. One Solution – Local Color Histogram (LCH) • Each subimage has a color histogram. • Any combination of the histograms can be selected for comparison with the corresponding color histograms of the query image.
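
As a rough illustration of the LCH idea on slide 5, the sketch below splits a grayscale image into a grid of subimages, builds one histogram per subimage, and compares only a chosen subset of cells. The grid size, the bin count, the L1 distance, and the function names `local_histograms` and `lch_distance` are assumptions made for illustration; the slides do not specify them.

```python
def local_histograms(image, grid, bins=4, levels=256):
    """Return {(row, col): histogram} for a grid x grid partition of a 2D image.

    `image` is a list of rows of integer intensities in [0, levels).
    """
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // grid, width // grid
    hists = {}
    for gr in range(grid):
        for gc in range(grid):
            hist = [0] * bins
            for y in range(gr * cell_h, (gr + 1) * cell_h):
                for x in range(gc * cell_w, (gc + 1) * cell_w):
                    # Map the intensity to one of `bins` equal-width bins.
                    hist[image[y][x] * bins // levels] += 1
            hists[(gr, gc)] = hist
    return hists


def lch_distance(query_hists, db_hists, cells):
    """Sum of L1 histogram distances over the selected cells only (assumed metric)."""
    return sum(
        sum(abs(a - b) for a, b in zip(query_hists[c], db_hists[c]))
        for c in cells
    )
```

Passing only the cells covered by the object of interest as `cells` is what lets this scheme ignore the rest of the image, at the cost of storing one histogram per cell.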

  6. Limitations of LCH • Dilemma: • Large partitions are not precise • Small partitions are too expensive • Limitation: • Scaling is difficult to handle

  7. Sampling-Based Approach • Idea: • Sample 113 16x16 blocks • Compare only the relevant blocks • Advantages: • Low storage overhead • Supports NFQs • Robust to translation and scaling • Supports spatial and scaling constraints

  8. Handling Scaling • A fixed sampling rate is used for all database images (a) • The query is sampled at 3 different rates: • A higher rate to find larger matching objects (b) • The same rate to find matching objects of the same size (c) • A lower rate to find smaller matching objects (d)

  9. Subimages • We slide square windows of sizes 25, 41, 61, 85, and 113 sampled blocks over each database image. • 85 indexing subimages are captured at various sliding positions.
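
The slides do not state how the sampled blocks are laid out. One arithmetic observation, offered only as a guess at the layout: each window size on slide 9 equals k² + (k−1)² for k = 4 to 8, which is the count a k×k grid interleaved with a (k−1)×(k−1) grid would give. The helper name `staggered_count` is hypothetical.

```python
def staggered_count(k):
    """Blocks in a k x k grid interleaved with a (k-1) x (k-1) grid.

    Assumption: this interleaved layout is a guess, not stated in the slides.
    """
    return k * k + (k - 1) * (k - 1)


# Window sizes for k = 4..8 under the assumed layout.
window_sizes = [staggered_count(k) for k in range(4, 9)]
print(window_sizes)
```

The list reproduces the five window sizes quoted on the slide, which is consistent with the guess but does not confirm it.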

  10. Signature Computation • For each indexing subimage, we compute its signature as seven average-variance pairs: • One from all the enclosed sampled blocks. • Four from the sampled blocks in the four quarters. • Two from the sampled blocks along the two diagonals. • The first component of the signature is called the short signature.
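
The signature on slide 10 can be sketched as follows. This is a minimal sketch under several assumptions not found in the slides: the subimage is treated as a plain k×k grid with one value per sampled block, the quarters are split at `k // 2`, the variance is the population variance, and the "first component" is read as the first average-variance pair. The names `mean_var`, `signature`, and `short_signature` are hypothetical.

```python
def mean_var(values):
    """Average and population variance of a list of block values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def signature(blocks):
    """Seven (average, variance) pairs flattened into a 14-element list.

    `blocks` is a k x k list of per-block values (assumed layout).
    Order: all blocks, four quarters, main diagonal, anti-diagonal.
    """
    k = len(blocks)
    half = k // 2
    groups = [[v for row in blocks for v in row]]
    for rows in (range(0, half), range(half, k)):
        for cols in (range(0, half), range(half, k)):
            groups.append([blocks[r][c] for r in rows for c in cols])
    groups.append([blocks[i][i] for i in range(k)])
    groups.append([blocks[i][k - 1 - i] for i in range(k)])
    sig = []
    for group in groups:
        sig.extend(mean_var(group))
    return sig


def short_signature(sig):
    """First average-variance pair, read here as the short signature (assumption)."""
    return tuple(sig[:2])
```

Seven pairs give 14 numbers, which matches the 14-dimensional signature space mentioned on slide 11.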

  11. Indexing • For each image, we map its 85 subimages into signature points in a 14-dimensional signature space. • For each image, we cluster its 85 signature points into five MBRs (minimum bounding regions). • We insert these MBRs into an R* tree (height-balanced, with reinsertion on overflow).
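
To make the MBR step concrete, the sketch below computes one axis-aligned bounding region per cluster of signature points. The slides do not say how the 85 points are clustered, so the cluster labels are taken as given; the function names and the `(low, high)` corner representation are assumptions.

```python
def bounding_region(points):
    """Axis-aligned minimum bounding region as (low_corner, high_corner)."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high


def cluster_mbrs(points, labels):
    """Group points by cluster label and return {label: MBR}.

    Assumption: `labels` comes from some clustering step the slides
    do not describe.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return {label: bounding_region(ps) for label, ps in clusters.items()}
```

The same code works for 14-dimensional signature points; each resulting region would then be one entry inserted into the tree.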

  12. Query Processing – Preparation • Sample the query image at different rates. For each sampling rate, do the following steps. • Determine the core area that contains the maximum number of relevant sampled blocks and the least noise. • Determine the query rectangle. • Compute the signature of the core area.

  13. Query Processing – Search • Retrieve relevant clusters (or MBRs) from the R* tree using the query rectangle. • Eliminate irrelevant subimages (in the qualified MBRs) using the short signature. • Each subimage passing the above test is compared against the original NFQ by matching the corresponding sampled blocks. • Each image with a matching subimage is retrieved.
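
The three filtering stages on slide 13 can be sketched as a filter-and-refine loop. This sketch swaps the R* tree for a linear scan over clusters, and its data layout, tolerances, and matching rule (a fraction of relevant blocks within a tolerance) are all assumptions; the slides do not give the actual criteria. All names are hypothetical.

```python
def intersects(mbr, rect):
    """True if two axis-aligned regions, each (low, high), overlap."""
    (lo_a, hi_a), (lo_b, hi_b) = mbr, rect
    return all(la <= hb and lb <= ha
               for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b))


def blocks_match(query_blocks, sub_blocks, tol, min_fraction):
    """Compare only the relevant query blocks, given as {position: value}."""
    hits = sum(abs(v - sub_blocks[i]) <= tol for i, v in query_blocks.items())
    return hits >= min_fraction * len(query_blocks)


def search(clusters, query_rect, query_short, short_tol, query_blocks,
           tol=10, min_fraction=0.9):
    """Return the ids of images that have at least one matching subimage."""
    results = set()
    for cluster in clusters:
        # Stage 1: keep clusters whose MBR overlaps the query rectangle
        # (a linear scan stands in for the R* tree lookup).
        if not intersects(cluster["mbr"], query_rect):
            continue
        for sub in cluster["subimages"]:
            # Stage 2: cheap test on the short signature.
            short = sub["signature"][:2]
            if any(abs(a - b) > short_tol for a, b in zip(short, query_short)):
                continue
            # Stage 3: match the relevant sampled blocks.
            if blocks_match(query_blocks, sub["blocks"], tol, min_fraction):
                results.add(cluster["image_id"])
    return results
```

Ordering the tests from cheapest to most expensive means the block-by-block comparison runs only on subimages that survive both earlier filters.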

  14. Query Processing – Summary • Sample the NFQ at different rates • Determine the core area and compute its signature • Determine the query rectangle • Retrieve relevant clusters (or MBRs) from the R* tree • Eliminate candidate subimages using the short signature • Match the sampled blocks of the remaining subimages

  15. Performance Comparison • LCH: NFQ-capable • Correlogram: one of the best whole-matching techniques • Can Correlogram support NFQs?

  16. Experimental Studies • Database: 15,808 images of various categories • Workload: 100 queries • Type 1: Query and database images have the same size, and the NFQ covers less than half of the query image (30 queries) • Type 2: Query and database images have the same size, and the NFQ covers more than half of the query image (20 queries) • Type 3: Query and database images have different sizes (50 queries)

  17. Type-3 Queries • Only SamMatch can handle Type-3 queries. • In the following example, there is no easy way to match the two identical apples using LCH.

  18. Performance Results (Type 1) [Figure: results for SamMatch (SM), Correlogram (Corr.), and LCH]

  19. Performance Results (Type 2)

  20. Performance Results (Type 3) • The sizes of the queries are different from those of the database images. [Figure: results for two example queries]

  21. Performance Metric • Ai denotes a relevant image returned by the system. • S is the scope of the query (i.e., the maximum number of images returned). • q is the total number of relevant images in the database. • Rationale: low-ranked images do not make it to the user.

  22. Reliability [Figure panels: Type-1 Results, Type-2 Results, Type-3 Results]

  23. Time & Space • Assumption: No quick-and-dirty filtering, no indexing (since the LCH and Correlogram designs do not use them). • SamMatch requires much less storage overhead: • LCH uses 21 color histograms: 21 × 256 × 2 bytes • Correlogram uses 4 color histograms: 4 × 256 × 2 bytes • SamMatch uses one byte per sampled block: 113 bytes • In terms of exhaustive search, SamMatch is • one time faster than Correlogram, and • two times faster than LCH
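
The storage figures on slide 23 can be checked with a few lines of arithmetic. The sketch reads "256 × 2 bytes" as 256 bins at 2 bytes each per histogram, which is an interpretation rather than something the slide states; the names are hypothetical.

```python
BINS = 256          # assumed: bins per color histogram
BYTES_PER_BIN = 2   # assumed: bytes per bin


def histogram_bytes(num_histograms):
    """Per-image storage for a method that keeps `num_histograms` histograms."""
    return num_histograms * BINS * BYTES_PER_BIN


storage = {
    "LCH": histogram_bytes(21),
    "Correlogram": histogram_bytes(4),
    "SamMatch": 113,  # one byte per sampled block, as on the slide
}

# How many times larger each method's per-image storage is than SamMatch's.
ratios = {name: size / storage["SamMatch"] for name, size in storage.items()}

for name, size in storage.items():
    print(f"{name}: {size} bytes per image ({ratios[name]:.1f}x SamMatch)")
```

Under this reading, both ratios exceed 16, which agrees with the "less than 1/16 the storage space" claim in the concluding remarks.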

  24. Concluding Remarks • Reducing noise interference is essential to achieving more reliable image retrieval • SamMatch supports NFQs effectively and efficiently • Two times faster than LCH, and one time faster than Correlogram • Other benefits of SamMatch include: • Matching objects at different scales • Uncovering translations of the matching areas • Handling spatial and scaling constraints • SamMatch uses less than 1/16 the storage space required by LCH and Correlogram
