
Bundling Features for Large Scale Partial-Duplicate Web Image Search

Bundling Features for Large Scale Partial-Duplicate Web Image Search. Zhong Wu∗, Qifa Ke, Michael Isard, and Jian Sun. CVPR 2009. Outline: Introduction; Bundled features; Image retrieval using bundled features; Experiments and results; Conclusion.


Presentation Transcript


  1. Bundling Features for Large Scale Partial-Duplicate Web Image Search • Zhong Wu∗, Qifa Ke, Michael Isard, and Jian Sun • CVPR 2009

  2. Outline • Introduction • Bundled features • Image Retrieval using bundled feature • Experiments and results • Conclusion


  4. Target • Given a query image, locate its near- and partial-duplicate images in a large corpus of web images.

  5. Unlike object-based image retrieval

  6. State-of-the-art • Visual words (quantization) and scalable textual index retrieval schemes • Post-processing • Geometric verification • Bundled feature • Weak geometric verification • Bundled feature = SIFT + MSER

  7. Outline • Introduction • Bundled features • Image Retrieval using bundled feature • Experiments and results • Conclusion

  8. MSER • Maximally Stable Extremal Region

  9. MSER

  10. Bundled features

  11. Discriminative power • Ways to increase discriminative power • Larger feature region size • Higher feature dimensionality • Drawbacks • Less repeatable • Lower localization accuracy • Sensitive to occlusion and to photometric and geometric changes

  12. Matching bundled features

  13. Bundled features

  14. Advantage • More discriminative • Allowed to have large overlap error • Partially match • Robust • Occlusion • Geometric changes • …etc

  15. Outline • Introduction • Bundled features • Image Retrieval using bundled feature • Experiments and results • Conclusion

  16. Feature quantization • Hierarchical k-means • One million visual words from 50K training images

  17. Feature quantization • K-D tree • pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
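
The k-d tree on this slide can be illustrated with a minimal sketch that builds a tree from the slide's `pointList` and looks up the nearest stored point. The median-split construction, the `Node` type, and the function names `build_kdtree` and `nearest` are assumptions made for illustration; the slide does not say which k-d tree variant the authors used for quantization.

```python
from collections import namedtuple

# Hypothetical node type for this sketch: a point plus two subtrees.
Node = namedtuple("Node", ["point", "left", "right"])


def build_kdtree(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    median = len(ordered) // 2
    return Node(
        point=ordered[median],
        left=build_kdtree(ordered[:median], depth + 1),
        right=build_kdtree(ordered[median + 1:], depth + 1),
    )


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, depth=0, best=None):
    """Return the stored point closest to target (Euclidean distance)."""
    if node is None:
        return best
    if best is None or squared_distance(node.point, target) < squared_distance(best, target):
        best = node.point
    axis = depth % len(target)
    diff = target[axis] - node.point[axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, depth + 1, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < squared_distance(best, target):
        best = nearest(far, target, depth + 1, best)
    return best


point_list = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(point_list)
```

In a quantization setting the stored points would be visual-word centers and `target` a feature descriptor, so `nearest` would return the visual word assigned to that descriptor.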

  18. Matching bundled features

  19. Matching bundled features

  20. Inverted-file index • Documents • T0 = "it is what it is" • T1 = "what is it" • T2 = "it is a banana" • Index • "a": {2} • "banana": {2} • "is": {0, 1, 2} • "it": {0, 1, 2} • "what": {0, 1}
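
The text example above can be reproduced with a short sketch. The function names `build_inverted_index` and `search_all` are chosen for illustration, and whitespace tokenization is an assumption. In the image-search setting, the "words" would be visual words and the "documents" images.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in text.split():
            index[word].add(doc_id)
    return dict(index)


def search_all(index, words):
    """Return ids of documents that contain every query word."""
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings) if postings else set()


documents = ["it is what it is", "what is it", "it is a banana"]
index = build_inverted_index(documents)
```

For example, `search_all(index, ["what", "is", "it"])` intersects the three posting sets and returns the documents that contain all three words.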

  21. Indexing and retrieval • Supports • Up to 512 bundled features per image • Up to 32 visual words per bundled feature

  22. Indexing and retrieval • Voting

  23. Indexing and retrieval • tf • A document contains 100 words and ‘a’ appears 3 times • 0.03 (3/100) • idf • 1,000 documents contain ‘a’ out of 10,000,000 documents in total • 9.21 ( ln(10,000,000 / 1,000) ) • tf-idf = 0.28 (0.03 * 9.21)
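
The arithmetic on this slide can be checked with a few lines of code. This is a sketch of the plain tf-idf weighting exactly as the slide states it (natural logarithm, no smoothing); the function names are illustrative.

```python
import math


def term_frequency(term_count, total_terms):
    """Fraction of the document's terms that are the given term."""
    return term_count / total_terms


def inverse_document_frequency(total_docs, docs_with_term):
    """Natural log of (total documents / documents containing the term)."""
    return math.log(total_docs / docs_with_term)


tf_a = term_frequency(3, 100)
idf_a = inverse_document_frequency(10_000_000, 1_000)
tfidf_a = tf_a * idf_a

print(f"tf={tf_a:.2f} idf={idf_a:.2f} tf-idf={tfidf_a:.2f}")
```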

  24. Outline • Introduction • Bundled features • Image Retrieval using bundled feature • Experiments and results • Conclusion

  25. Dataset • Basic dataset • The one million images most frequently clicked in a popular commercial image-search engine • Subsets of 50K, 200K, and 500K images • Ground truth • 780 manually labeled partial-duplicate web images from 19 groups • Evaluation dataset = basic dataset + ground truth • Query • 150 images from the ground truth

  26. mAP • Mean average precision • Example: • Two query images, A and B • A has 4 duplicate images • B has 5 duplicate images • Retrieval ranks for A: 1, 2, 4, 7 • Retrieval ranks for B: 1, 3, 5 (two duplicates are not retrieved) • Average precision A = (1/1+2/2+3/4+4/7)/4 = 0.83 • Average precision B = (1/1+2/3+3/5+0+0)/5 = 0.45 • mAP = (0.83+0.45)/2 = 0.64
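
The worked example can be verified with a small sketch. It assumes the convention the example implies: precision is taken at the rank of each retrieved duplicate, missed duplicates contribute zero, and the sum is divided by the number of ground-truth duplicates (4 for A, 5 for B). The function name is illustrative.

```python
def average_precision(ranks, num_relevant):
    """Average precision from the ranks at which relevant items were retrieved.

    Relevant items that were never retrieved contribute a precision of zero.
    """
    ranks = sorted(ranks)
    precisions = [(hit + 1) / rank for hit, rank in enumerate(ranks)]
    return sum(precisions) / num_relevant


ap_a = average_precision([1, 2, 4, 7], num_relevant=4)
ap_b = average_precision([1, 3, 5], num_relevant=5)
mean_ap = (ap_a + ap_b) / 2

print(f"AP(A)={ap_a:.2f} AP(B)={ap_b:.2f} mAP={mean_ap:.2f}")
```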

  27. Evaluation • Baseline • Bag-of-features approach with soft assignment[13] [13] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.

  28. Evaluation • Comparison (HE) • Enhance the bundled features with Hamming embedding [3] by adding a 24-bit Hamming code to filter out target features. [3] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV, 2008.
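
The filtering idea can be sketched as follows: two features quantized to the same visual word are kept as a match only if their binary codes are close in Hamming distance. The threshold value, the example codes, and the function names below are hypothetical; the slide gives only the code length (24 bits).

```python
CODE_BITS = 24  # code length stated on the slide


def hamming_distance(code_a, code_b):
    """Number of bit positions in which two integer codes differ."""
    return bin(code_a ^ code_b).count("1")


def filter_by_hamming(query_code, candidates, max_distance):
    """Keep (feature_id, code) pairs whose code is within max_distance bits of the query."""
    return [
        (feature_id, code)
        for feature_id, code in candidates
        if hamming_distance(query_code, code) <= max_distance
    ]


# Hypothetical codes for features that fell into the same visual word.
query_code = 0b101010101010101010101010
candidates = [
    ("near", query_code ^ 0b11),                       # differs in 2 bits
    ("far", query_code ^ ((1 << CODE_BITS) - 1)),      # differs in all 24 bits
]
kept = filter_by_hamming(query_code, candidates, max_distance=6)  # assumed threshold
```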

  29. Evaluation • Baseline 0.35 to Bundled (mem) 0.40: a 14% improvement • Baseline 0.35 to Bundled 0.49: a 40% improvement • Baseline 0.35 to Bundled+HE 0.52: a 49% improvement

  30. Evaluation • Comparison (re-ranking) • Full geometric verification with RANSAC on the top 300 candidate images

  31. Evaluation • Baseline 0.35 to Bundled+re-rank 0.62: a 77% improvement • Baseline+re-rank 0.50 to Bundled+re-rank 0.62: a 24% improvement
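
The percentages quoted on the evaluation slides are relative improvements in mAP. A quick check, assuming the usual definition (new − old) / old and rounding to the nearest percent:

```python
def relative_improvement(old, new):
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100


# (label, old mAP, new mAP) taken from the evaluation slides
comparisons = [
    ("baseline -> bundled (mem)", 0.35, 0.40),
    ("baseline -> bundled", 0.35, 0.49),
    ("baseline -> bundled + HE", 0.35, 0.52),
    ("baseline -> bundled + re-rank", 0.35, 0.62),
    ("baseline + re-rank -> bundled + re-rank", 0.50, 0.62),
]

improvements = {label: round(relative_improvement(old, new)) for label, old, new in comparisons}

for label, percent in improvements.items():
    print(f"{label}: {percent}%")
```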

  32. Evaluation • Trade-off • Run time • Measured on a single CPU of a 3.0 GHz Core Duo desktop with 16 GB of memory

  33. Sample results • AP from 0.51 to 0.74: a 45% improvement

  34. Sample results

  35. Sample results

  36. Outline • Introduction • Bundled features • Image Retrieval using bundled feature • Experiments and results • Conclusion

  37. Conclusion • Bundled features for large scale partial-duplicate web image search • Properties of bundled features • More discriminative than individual SIFT features • Simple and robust geometric constraints • Allow partial matching between two groups of SIFT features • Advantage • Robustness to occlusion and to photometric and geometric changes
