Query by Image and Video Content: The QBIC System M. Flickner et al. IEEE Computer Special Issue on Content-Based Retrieval Vol. 28, No. 9, September 1995 Presenter: William Conner
Outline • Overview • Motivation • Design • Indexing • Representative frames • Related Work • Critique • Demo
QBIC • System that supports content-based image and video retrieval • Flexible query interface • Results ranked based on similarity • Introduced into commercial products • IBM’s Ultimedia Manager • IBM’s DB2 Image Extenders
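As a rough illustration of "results ranked based on similarity", the sketch below ranks a tiny in-memory collection by Euclidean distance between average colors. The feature (average RGB), the `database` layout, and the function names are assumptions made for this example; the slides do not specify QBIC's actual features or distance functions.

```python
import math

def average_color(pixels):
    """Mean (R, G, B) of an image given as a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def rank_by_similarity(query_pixels, database):
    """Return (name, distance) pairs sorted from most to least similar.

    `database` maps an image name to its pixel list (hypothetical layout).
    """
    q = average_color(query_pixels)
    scored = [(name, math.dist(q, average_color(pixels)))
              for name, pixels in database.items()]
    # Smaller distance means more similar, so sort ascending.
    return sorted(scored, key=lambda item: item[1])

# Toy two-pixel "images", purely illustrative.
database = {
    "sunset": [(250, 120, 30), (240, 100, 20)],
    "forest": [(20, 140, 40), (30, 160, 50)],
    "ocean": [(10, 60, 200), (20, 80, 220)],
}
query = [(245, 110, 25)]
ranking = rank_by_similarity(query, database)
```

Here the query is an example image, one of the query methods the system supports; the same ranking step would apply to any feature vector.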
Motivation • Many previous image and video retrieval approaches were limited • Only supported queries over meta-data rather than content • File identifiers • Keywords that are input manually • Other text associated with image (e.g., caption) • Yahoo.com and Google.com image search support queries by keyword, size, coloration, file type, and domain
Query Methods • Example images • Sketches and drawings • User-selected color and texture patterns • Camera and object motion
System Architecture • [Diagram; components shown: Image Objects, Video Objects, Feature Extraction, Database, Filter/Index, Query Interface, Matching Engine, Ranked Results, User]
R-Trees • Region tree is a multidimensional index • Like a B-tree for multiple dimensions • R*-tree is a variant that re-inserts some entries upon overflow before resorting to splitting nodes • Can be used to index low-dimensional features such as average color and texture • High-dimensional features can be reduced to a lower number of dimensions
R-Trees • 2-D example with only two levels (next slide) • Want query to find points P1 and P2 • Tree root is a bounding rectangle • Child nodes are also bounding rectangles • Overlap is allowed at same tree level • All regions overlapping with query region must be searched • Possible to have several levels and several dimensions
R-Trees • [Figure: 2-D example with bounding rectangle ROOT, child rectangles A, B, and C, and points P1 and P2]
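The search rule from the two-level example (descend into every bounding rectangle that overlaps the query region) can be sketched as follows. The coordinates, the extra point P3, and the `Rect`/`Node` classes are invented for illustration, since the original figure's geometry is not recoverable. This is only the search step, not a full R-tree with insertion or node splitting.

```python
from dataclasses import dataclass, field

@dataclass
class Rect:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def overlaps(self, other):
        return (self.xmin <= other.xmax and other.xmin <= self.xmax and
                self.ymin <= other.ymax and other.ymin <= self.ymax)

    def contains_point(self, point):
        x, y = point
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

@dataclass
class Node:
    name: str
    mbr: Rect                                      # bounding rectangle
    children: list = field(default_factory=list)   # child nodes (internal node)
    points: dict = field(default_factory=dict)     # name -> (x, y) (leaf node)

def search(node, query, visited):
    """Collect names of points inside `query`, recording each node visited."""
    if not node.mbr.overlaps(query):
        return []          # prune: nothing below this node can match
    visited.append(node.name)
    hits = [n for n, p in node.points.items() if query.contains_point(p)]
    for child in node.children:
        hits.extend(search(child, query, visited))
    return hits

# Hypothetical layout: A and B overlap each other, C lies apart.
a = Node("A", Rect(0, 0, 4, 4), points={"P1": (3, 3)})
b = Node("B", Rect(3, 2, 7, 6), points={"P2": (5, 4)})
c = Node("C", Rect(8, 0, 10, 3), points={"P3": (9, 1)})
root = Node("ROOT", Rect(0, 0, 10, 6), children=[a, b, c])

visited = []
found = search(root, Rect(2.5, 2.5, 5.5, 4.5), visited)
```

In this layout the query overlaps both A and B, so both must be searched, while C is pruned without inspecting its points.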
R-Frames • Representative frames • Allow image retrieval techniques to help with video retrieval • Video broken up into clips called shots • R-frame is representative of shot • Also, basic unit of video query result • Useful for browsing • Choice of R-frame • Particular frame from shot (first, last, or middle) • Synthesized by creating mosaic of all frames in a shot
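The "particular frame" option can be sketched as below, assuming shot boundaries have already been detected and each shot is given as an inclusive `(first, last)` pair of frame indices. That representation and the function names are assumptions for this example; shot detection and mosaic synthesis are not shown.

```python
def pick_r_frame(shot, choice="middle"):
    """Return the index of the representative frame for one shot.

    `shot` is an inclusive (first, last) pair of frame indices.
    """
    first, last = shot
    if choice == "first":
        return first
    if choice == "last":
        return last
    if choice == "middle":
        return (first + last) // 2
    raise ValueError(f"unknown choice: {choice!r}")

def r_frames(shots, choice="middle"):
    """One r-frame index per shot, in shot order."""
    return [pick_r_frame(shot, choice) for shot in shots]

# Three hypothetical shots covering frames 0..400.
shots = [(0, 99), (100, 149), (150, 400)]
middle_frames = r_frames(shots, "middle")
first_frames = r_frames(shots, "first")
last_frames = r_frames(shots, "last")
```

Each selected frame can then be indexed and queried like a still image, which is how image retrieval techniques carry over to video.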
Related Work • MIT Photobook • Content-based image retrieval system • Library of matching algorithms • e.g., Euclidean distance, histograms, wavelet tree distances • Interactive learning agent to help determine user’s intent • IBM’s Garlic Project • Managing large-scale multimedia systems • Fagin’s algorithms for merging ranked query results • i.e., Top-k query processing over several multimedia subsystems
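To make "merging ranked query results" concrete, here is a minimal sketch of Fagin's original two-phase top-k algorithm (often called FA). It assumes each subsystem returns a list sorted by descending score, every object can be looked up in every subsystem, and the combined score is a monotone function (`min` here). The list contents and names are made up for the example, and this is not presented as Garlic's implementation.

```python
def fagin_top_k(ranked_lists, k, aggregate=min):
    """Top-k objects by aggregated score over several ranked lists.

    Each list holds (object_id, score) pairs sorted by score, best first.
    """
    lookups = [dict(lst) for lst in ranked_lists]   # random access by object
    m = len(ranked_lists)
    seen_in = {}   # object_id -> indices of lists where sorted access saw it
    max_depth = max(len(lst) for lst in ranked_lists)

    # Phase 1: sorted access in parallel until k objects were seen in all lists.
    for depth in range(max_depth):
        for i, lst in enumerate(ranked_lists):
            if depth < len(lst):
                seen_in.setdefault(lst[depth][0], set()).add(i)
        if sum(1 for s in seen_in.values() if len(s) == m) >= k:
            break

    # Phase 2: random access to fill in missing scores for every seen object.
    scored = [(aggregate(lookup[obj] for lookup in lookups), obj)
              for obj in seen_in]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(obj, score) for score, obj in scored[:k]]

# Hypothetical similarity scores from two subsystems (higher is better).
by_color = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.1), ("e", 0.05)]
by_texture = [("c", 0.95), ("a", 0.85), ("d", 0.6), ("b", 0.2), ("e", 0.1)]
top = fagin_top_k([by_color, by_texture], k=2)
```

In this run sorted access stops after three rows per list, so object "e" is never touched; because the aggregate is monotone, an unseen object cannot outrank the k objects already seen in every list.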
Photobook • Query: find images most similar to image in the upper left
Critique • Pros • Flexible query interface for content-based retrieval • Reuses image retrieval techniques for video retrieval • Actually used in commercial products • Cons • Not enough details • e.g., More elaboration on how query plans are developed considering fast filtering and indexing • No performance evaluation • Should include measurements of accuracy and delay
Demo • Russian museum’s online digital collection uses QBIC engine • Supports color and layout search • The State Hermitage Museum