
Participation report Photo and Object Retrieval task

This participation report highlights the usage of graph-based segmentation and biclustering for image retrieval in the Photo and Object Retrieval tasks of ImageCLEF. The system features a 3-level segmentation, a 35-feature segment vector, and various post-processing techniques to enhance retrieval accuracy. The biclustering algorithm is applied to improve feature matching and classification for both photo and object retrieval.


Presentation Transcript


  1. Participation report: Photo and Object Retrieval task. Bálint Daróczy, joint work with András Benczúr, István Bíró, Mátyás Brendel, Károly Csalogány, Dávid Siklósi. Data Mining and Web Search Group, Computer and Automation Research Institute, Hungarian Academy of Sciences

  2. Overview • Common CBIR system: 3-level segmentation, 35-feature segment vector, finding similar segments • ImageCLEF Photo Task: cross-modal retrieval by text and image feature biclustering • ImageCLEF Object Retrieval Task: re-segmented pre-classified images for object retrieval

  3. CBIR Segmentation • Pre-segmentation: • Resize to 1024x1024 (OpenCV) • Smooth to eliminate noise (OpenCV) • Downsizing with Gaussian kernel (three-level Gaussian-Laplacian pyramid) • Intra- and inter-level thresholds for joining pixels • Graph-based method [Felzenszwalb, Huttenlocher]: • Undirected weighted graph over neighboring pixels • Bottom-up clustering with dynamic thresholds • Efficient heuristic solution, better than min-cut, close to normalized cut
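The bottom-up clustering with dynamic thresholds can be illustrated with a minimal sketch of the Felzenszwalb-Huttenlocher merge criterion on a small grayscale grid. This is not the authors' implementation: the function name `segment`, the 4-neighbor grid, and the scale parameter `k` are assumptions made for illustration, and the pyramid-based pre-segmentation is left out.

```python
def segment(image, k):
    """Label a 2-D grayscale grid (list of rows) with graph-based merging.

    Two components are merged when the edge joining them is no heavier than
    the smaller of (largest internal edge + k / component size).
    """
    h, w = len(image), len(image[0])
    parent = list(range(h * w))
    size = [1] * (h * w)
    internal = [0.0] * (h * w)  # largest edge weight inside each component

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Undirected weighted graph over 4-connected neighboring pixels.
    edges = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((abs(image[y][x] - image[y][x + 1]),
                              y * w + x, y * w + x + 1))
            if y + 1 < h:
                edges.append((abs(image[y][x] - image[y + 1][x]),
                              y * w + x, (y + 1) * w + x))
    edges.sort()

    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Dynamic threshold: it tightens as a component grows.
        limit = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if weight <= limit:
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], weight)

    return [[find(y * w + x) for x in range(w)] for y in range(h)]
```

A larger `k` favors larger segments, which mirrors the trade-off between low and high initial thresholds discussed on the next slide.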

  4. CBIR Post-processing • Graph-based segmentation scenario: • Low initial thresholds: small relevant segments disappear • High initial thresholds: too many segments • Solution: • Sobel gradient image for selecting important edges • Result: • Number of graph-based segments: on average more than 1000 per image • After post-processing: fewer than 100
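The slide does not say how the Sobel gradient image is used to select important edges, so the following is only a sketch of the gradient computation itself, with a hypothetical thresholding helper. The names `sobel_magnitude` and `important_edges` and the `threshold` parameter are assumptions, and border pixels are simply left at zero.

```python
import math


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2-D grayscale grid."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal derivative: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical derivative: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            mag[y][x] = math.hypot(gx, gy)
    return mag


def important_edges(image, threshold):
    """Hypothetical helper: mark pixels whose gradient reaches the threshold."""
    return [[m >= threshold for m in row] for row in sobel_magnitude(image)]
```

One plausible use, again an assumption, is to merge neighboring segments whose shared boundary does not coincide with any marked pixel.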

  5. [Figure panels: Original Picture; Sobel Image; After Graph-based Method; After Post-Processing]

  6. CBIR Feature vectors • 35-dimensional real-valued vector (1 size + 3 average color + 15 histogram + 16 shape values) • Size in pixels (every picture must be resized) • Average color in RGB space, 8 bits for every channel, 24 bits overall • RGB color histogram, five samples for every 8-bit channel, 8x3x5 bits overall • Shape representation: a grayscale image with a resolution of 4x4 • Vector layout: Size | Average RGB | Histogram R | Histogram G | Histogram B | Shape 4x4
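The layout of the 35-dimensional vector can be sketched as follows. Only the component sizes come from the slide; the function name `segment_features`, the normalized histograms, and the bounding-box occupancy grid used for the 4x4 shape are assumptions made for illustration.

```python
def segment_features(pixels, bins=5, grid=4):
    """Build a 35-value vector from one segment.

    pixels: list of (x, y, r, g, b) tuples with 8-bit color channels.
    Layout: size (1), mean RGB (3), per-channel histograms (3 * 5),
    shape grid (4 * 4).
    """
    n = len(pixels)
    size = [float(n)]
    mean_rgb = [sum(p[c] for p in pixels) / n for c in (2, 3, 4)]

    # Five-bin histogram per channel, normalized by segment size (assumption).
    hist = []
    for c in (2, 3, 4):
        counts = [0] * bins
        for p in pixels:
            counts[min(p[c] * bins // 256, bins - 1)] += 1
        hist.extend(count / n for count in counts)

    # Shape: pixel occupancy of the bounding box, downsampled to grid x grid
    # and scaled so the fullest cell is 1.0 (assumption).
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    x0, y0 = min(xs), min(ys)
    box_w = max(xs) - x0 + 1
    box_h = max(ys) - y0 + 1
    cells = [0] * (grid * grid)
    for p in pixels:
        cx = (p[0] - x0) * grid // box_w
        cy = (p[1] - y0) * grid // box_h
        cells[cy * grid + cx] += 1
    peak = max(cells)
    shape = [cell / peak for cell in cells]

    return size + mean_rgb + hist + shape
```

For example, an 8x8 square of pure red pixels yields a size of 64, a mean color of (255, 0, 0), and a uniform shape grid.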

  7. Biclustering for ImageCLEF Photo • Matrix of image segments and annotation terms [Figure labels: segments; tf.idf; sea, sky, …; water, tower, building; visual features]

  8. Biclustering for ImageCLEF Photo • Result of the biclustering procedure [Figure labels: sea, building, …; water, tower, building; visual features]

  9. Biclustering for ImageCLEF Photo • Term and segment cluster pair weights [Figure labels: term clusters; segment clusters; cluster-cluster correspondence; sea, building, …; water, tower, building; visual features]

  10. Photo Retrieval method • Settings for the biclustering algorithm: • Kullback-Leibler distance on tf.idf, an information-theoretic distance of distributions • Euclidean distance over the visual features • Row-column iterated EM, 16 iterations, 1000 segment clusters and 500 word clusters • Query term clusters selected • Corresponding image segment clusters determined • Query image segment weights with low correspondence discarded • CBIR run with remaining segments
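The two distances named in the settings can be sketched as below. This covers only the distance functions, not the row-column iterated EM procedure; the function names and the smoothing constant `eps` are assumptions, and tf.idf rows are normalized to sum to 1 so that they can be treated as distributions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) of two non-negative weight vectors.

    Both vectors are normalized to sum to 1; eps guards against log(0)
    when q has a zero where p does not.
    """
    p_total, q_total = sum(p), sum(q)
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / p_total, q_i / q_total
        if p_i > 0:
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


def euclidean(a, b):
    """Euclidean distance between two visual feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(vector, centroids, distance):
    """Index of the centroid closest to vector under the given distance."""
    return min(range(len(centroids)), key=lambda i: distance(vector, centroids[i]))
```

Note that the Kullback-Leibler divergence is not symmetric, so the argument order matters when a term vector is compared with a cluster centroid.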

  11. ImageCLEF Photo Results • Term match counts the number of query words occurring (description downweighted, location upweighted) • tf.idf breaks ties, improves if no other information • Tie breaking by image biclustering superior to tf.idf • No improvement for other combinations of text and image (no query expansion, no feedback)

  12. ImageCLEF Object Retrieval Task [Figure labels: Class of Query Image; Pre-classified Images; VOC2007; Query Images; Original Training Set]

  13. ImageCLEF Object • Basic assumptions on pre-classified images: • Sample objects can have different shape and color • Pre-segmentation made by humans • Abstract classes • Method: • Feature vector of pre-classified objects (VOC2007 dataset) • Search for the most similar object • Key idea: • Re-segment objects to improve similarity

  14. ImageCLEF Object: budapest-acad315 • Segment the query image • Classify into the class of the most similar sample segment • Drawback: • Granularity of automatic and human segmentation is different • MAP results (AP results with completely annotated database): • MAP: 0.020
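Classifying a query segment into the class of its most similar sample segment amounts to a nearest-neighbor lookup, sketched below. The function name `classify`, the `(feature_vector, label)` sample format, and the use of Euclidean distance as the similarity measure are assumptions.

```python
import math


def classify(query_vector, samples):
    """Return the label of the sample whose feature vector is closest.

    samples: list of (feature_vector, class_label) pairs taken from the
    pre-classified objects.
    """
    _, label = min(samples, key=lambda s: math.dist(query_vector, s[0]))
    return label
```

The usage is a single call per query segment, for example `classify(vec, samples)` with `vec` a 35-value segment vector.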

  15. ImageCLEF Object: budapest-acad314 • Re-segment the pre-classified images • New segments from classified objects • Class representatives formed by segments with over 80% overlap of the training objects • Higher similarities, more adequate classification • MAP results (AP results with completely annotated database): • MAP: 0.031 • For the class of bicycles: MAP: 0.283
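Selecting class representatives by overlap can be sketched as follows. The slide does not state how the overlap is measured, so this sketch assumes it is the fraction of a segment's pixels that fall inside the human-annotated object region; the function names and the pixel-set representation are likewise assumptions.

```python
def overlap_ratio(segment, object_region):
    """Fraction of the segment's pixels that lie inside the object region.

    Both arguments are sets of (x, y) pixel coordinates.
    """
    return len(segment & object_region) / len(segment)


def class_representatives(segments, object_region, min_overlap=0.8):
    """Keep the automatic segments that overlap the object by more than min_overlap."""
    return [s for s in segments if overlap_ratio(s, object_region) > min_overlap]
```

Segments kept this way would inherit the class label of the training object they overlap.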
