
Sreekanth Vempati ( 200402044 ) Advisors: Dr. C. V. Jawahar ( IIIT Hyderabad ),

Efficient SVM based object classification and detection. Sreekanth Vempati ( 200402044 ) Advisors: Dr. C. V. Jawahar ( IIIT Hyderabad ), Dr. Andrew Zisserman ( Univ. of Oxford ). Large Visual Data. Cheap capturing, storage and internet devices. Rapid. Video sharing.


Presentation Transcript


  1. Efficient SVM based object classification and detection • Sreekanth Vempati (200402044) • Advisors: Dr. C. V. Jawahar (IIIT Hyderabad), Dr. Andrew Zisserman (Univ. of Oxford)

  2. Large Visual Data Cheap capturing, storage and internet devices

  3. Rapid growth in the amount of data available • Video sharing (in the case of YouTube) • Image sharing

  4. Problems • Object Detection • Find the location of specified categories of scenes/objects • Ex: Output the bounding box of the bus in this image • Scene/Object Classification • Find specified categories of scenes/objects • Ex: Is there a demonstration/protest in this image? Is there a bus in this image?

  5. Challenges • Intra-class variations (Ex: Boat/Ship category) • Inter-class similarity • Example images: Flowers, Cityscape, Protest

  6. Challenges • Viewpoint variation • Occlusions/Truncations

  7. Scalability • We need solutions that scale to large amounts of data • For example, suppose we have to test 140,000 images • For best performance • Feature representation (visual words based) • 6300 dimensions • takes ~50 seconds per image -> total time would be ~57 days • Classification (SVM with non-linear kernel) • 20 classes • 3 images/second, a total time of ~10 days
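
The estimates on this slide are simple throughput arithmetic. A minimal sketch follows; the function names are hypothetical, and it assumes the 3 images/second rate applies to each of the 20 classes separately.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def days_for_features(num_images, seconds_per_image):
    """Total feature-extraction time in days for a sequential run."""
    return num_images * seconds_per_image / SECONDS_PER_DAY

def days_for_classification(num_images, num_classes, images_per_second):
    """Total classification time in days, assuming one pass per class."""
    return num_images * num_classes / images_per_second / SECONDS_PER_DAY

feature_days = days_for_features(140_000, 50)
classify_days = days_for_classification(140_000, 20, 3)
print(f"features: {feature_days:.1f} days, classification: {classify_days:.1f} days")
```

With these inputs the classification estimate lands close to the slide's ~10 days. A flat 50 seconds per image gives roughly 81 days for features, so the slide's ~57 days would correspond to an average nearer 35 seconds per image.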

  8. Overview • Large scale semantic concept retrieval in videos • Modeling subcategories • Efficient detection by using GRBF feature maps • Conclusions

  9. 1. Semantic video retrieval • Given a large set of videos, retrieve the videos of a specific category • Ex: Find all the videos containing soccer

  10. Overview of the approach • Training: Example Videos -> Annotated Video Frames -> Feature Extraction (Ex: PHOW, PHOG, GIST) -> Classifier (Ex: SVM, Random Forests) • Testing: Unseen Videos -> Feature Extraction -> Classifier -> Ranked Shots

  11. Features • GIST – Torralba et al., IJCV 01 • Image divided into an m x m grid • For each cell, a set of filters (different scales, orientations) is applied • Final descriptor: Average of the filter responses over all blocks • Images from “Image Classification for large number of object categories”, Anna Bosch, 2006

  12. Features • Pyramid Histogram of Oriented Gradients (PHOG) • Images from “Image Classification for large number of object categories”, Anna Bosch, 2006

  13. Pyramid Histogram of Visual Words (PHOW) • Using dense SIFT (Scale Invariant Feature Transform) descriptors • Vector Quantization • “Beyond bag of features: Spatial pyramid matching for recognizing natural scene categories”, S. Lazebnik et al., CVPR 2006
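
The vector quantization step can be illustrated with a small sketch: each local descriptor is assigned to its nearest codebook entry (visual word), and the assignments are counted into a normalised histogram. The random arrays below only stand in for dense SIFT descriptors and a learned codebook, and the function name is hypothetical; this covers a single pyramid level only.

```python
import numpy as np

def visual_word_histogram(descriptors, codebook):
    """L1-normalised histogram of nearest-codeword assignments."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest visual word
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
codebook = rng.random((8, 128))       # stand-in for a learned vocabulary
descriptors = rng.random((200, 128))  # stand-in for dense SIFT descriptors
hist = visual_word_histogram(descriptors, codebook)
print(hist)
```

A spatial pyramid would repeat this per grid cell at each level and concatenate the resulting histograms.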

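  14. Support Vector Machines (SVM) • [Figure: soft-margin SVM. Training points $x_i$ with labels $y_i$, $i = 1, \dots, N$; weight vector $w$ and offset $b$; hyperplanes $w^{T}x + b = -1$, $w^{T}x + b = 0$, $w^{T}x + b = +1$; support vectors on the margin (slack $\xi = 0$); a point with slack $\xi < 1$; a misclassified point]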

  15. SVM formulation • Evaluation function: $f(x) = w^{T}x + b$
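
For reference, the evaluation function above belongs to the standard soft-margin SVM. In its textbook primal form (with $\xi_i$ the slack variables and $C$ an assumed regularisation constant, neither taken from the slide) it reads:

```latex
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i
\quad\text{s.t.}\quad
y_i\,(w^{T}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,N
```

A point with $\xi_i = 0$ lies on or outside the margin, and the sign of $f(x)$ gives the predicted class.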

  16. Kernel Trick • Use a function which maps input space to feature space. • And then build the classifier in feature space.

  17. Moving to a different space • Dot product in feature space • $f(x) = w^{T}\phi(x) + b = \sum_i \alpha_i y_i \langle \phi(x_i), \phi(x) \rangle + b$

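  18. Kernelizing SVMs • Replace the dot product in feature space with a kernel function $K(x_i, x) = \langle \phi(x_i), \phi(x) \rangle$, so that $f(x) = \sum_i \alpha_i y_i K(x_i, x) + b$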

  19. Kernels • Linear • Polynomial • Intersection kernel • Generalized RBF kernel • Weighted combination of multiple kernels
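
The kernels listed here have standard textbook forms, sketched below for histogram-like feature vectors. The parameter names (`degree`, `c`, `gamma`), the χ² distance convention (no factor of 1/2) and the choice of χ² as the distance inside the generalized RBF kernel are assumptions for illustration, not values from the slides.

```python
import numpy as np

def linear_kernel(x, y):
    return float(np.dot(x, y))

def polynomial_kernel(x, y, degree=2, c=1.0):
    return float((np.dot(x, y) + c) ** degree)

def intersection_kernel(x, y):
    # Sum of element-wise minima; intended for non-negative histograms.
    return float(np.minimum(x, y).sum())

def chi2_distance(x, y, eps=1e-12):
    return float((((x - y) ** 2) / (x + y + eps)).sum())

def grbf_kernel(x, y, gamma=1.0, distance=chi2_distance):
    # Generalized RBF: exponential of a negated (non-Euclidean) distance.
    return float(np.exp(-gamma * distance(x, y)))

def combined_kernel(x, y, kernels, weights):
    # Weighted combination of several base kernels.
    return sum(w * k(x, y) for k, w in zip(kernels, weights))

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.6, 0.3])
print(linear_kernel(x, y), intersection_kernel(x, y), grbf_kernel(x, y))
```

In a kernelized SVM any of these functions takes the place of the dot product between a support vector and the test point.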

  20. TRECVID competition • Objective : Rank video shots based on the presence of given concept • Participated in High level feature extraction, TRECVID • Organized by NIST, USA • 2008: around 180 submissions by 40 teams from all over the world

  21. Some of the classes • High-level Feature Extraction • Cityscape • Classroom • Driver • Two People • Emergency Vehicle • Harbor • Kitchen • Nighttime • Singing • Demonstration/Protest • Mountain • Hand • Street • Telephone • Flower • Bridge • Airplane flying • Boat/Ship • Bus • Dog

  22. Data Statistics • Evaluation Measure • Average Precision: area under the Precision-Recall curve
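
Average precision can be computed from a ranked list in several ways. The sketch below uses the common non-interpolated form (mean of the precision values at the ranks of the relevant items), which approximates the area under the precision-recall curve; the exact variant used in the evaluation is not stated on the slide, and the function name is hypothetical.

```python
def average_precision(ranked_relevance, num_relevant=None):
    """Non-interpolated AP for a ranked list of 0/1 relevance flags.

    num_relevant is the total number of relevant items; it defaults to the
    number found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    total = hits if num_relevant is None else num_relevant
    return precision_sum / total if total else 0.0

# Relevant shots returned at ranks 1 and 3 out of 4.
ap = average_precision([1, 0, 1, 0])
print(ap)
```

A perfect ranking, with every relevant shot ahead of every irrelevant one, scores 1.0.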

  23. Our Approach • Performance compared using different features and SVM parameters • Using PHOW with the intersection kernel is efficient • Testing is very fast, with little drop in performance • Testing time: ~200,000 frames in 10 seconds • “Classification using Intersection kernel SVMs is efficient”, S. Maji et al., CVPR 2008
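
The speed of intersection kernel SVMs at test time comes from the fact that the decision function separates over feature dimensions. A minimal sketch of the exact variant follows: support vector values are sorted once per dimension, and each test value is then handled with a binary search plus two precomputed sums. The class and variable names are hypothetical, and `coeffs` stands for the products $\alpha_i y_i$.

```python
import numpy as np

class FastIntersectionSVM:
    """Exact intersection-kernel SVM evaluation in O(d log m) per test point."""

    def __init__(self, support_vectors, coeffs, bias=0.0):
        sv = np.asarray(support_vectors, dtype=float)  # shape (m, d)
        c = np.asarray(coeffs, dtype=float)            # shape (m,)
        order = np.argsort(sv, axis=0)                 # per-dimension sort order
        self.sorted_sv = np.take_along_axis(sv, order, axis=0)
        c_sorted = c[order]                            # coefficients in sorted order
        zeros = np.zeros((1, sv.shape[1]))
        # prefix[r, j]: sum of c * x over the r smallest values in dimension j.
        self.prefix = np.vstack([zeros, np.cumsum(c_sorted * self.sorted_sv, axis=0)])
        # suffix[r, j]: sum of c over all but the r smallest values in dimension j.
        total = c_sorted.sum(axis=0, keepdims=True)
        self.suffix = total - np.vstack([zeros, np.cumsum(c_sorted, axis=0)])
        self.bias = bias

    def decision(self, z):
        score = self.bias
        for j, s in enumerate(np.asarray(z, dtype=float)):
            # r support values are <= s: they contribute c * x, the rest c * s.
            r = np.searchsorted(self.sorted_sv[:, j], s, side="right")
            score += self.prefix[r, j] + s * self.suffix[r, j]
        return float(score)

def naive_decision(support_vectors, coeffs, bias, z):
    """Direct O(m d) evaluation, for comparison."""
    return float(np.dot(coeffs, np.minimum(support_vectors, z).sum(axis=1)) + bias)

rng = np.random.default_rng(0)
sv = rng.random((50, 6))
coeffs = rng.normal(size=50)
z = rng.random(6)
model = FastIntersectionSVM(sv, coeffs, bias=0.1)
print(model.decision(z), naive_decision(sv, coeffs, 0.1, z))
```

Both evaluations give the same score; the precomputed version no longer scales linearly with the number of support vectors at test time.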

  24. Variation with features

  25. Variation with kernels

  26. Results More Results

  27. 1. Summary • Method of visual concept retrieval suitable for large scale data • PHOW with the fast intersection kernel gives very fast testing with little drop in performance

  28. 2. Modeling subcategories

  29. Subcategories in real world

  30. What did we achieve?

  31. Structural SVM vs SVM • Allows the output label to be a complex variable • Joint feature map between input and output • Our case: the output label is a combination of category and subcategory labels • “Support Vector Learning for Interdependent and Structured Output Spaces”, I. Tsochantaridis et al., ICML 04

  32. Use of latent variables • “Learning structural SVMs with latent variables”, C. N. Yu et al., ICML 2009

  33. Toy Datasets

  34. Real world datasets • TRECVID 2009 dataset • PASCAL VOC (Visual Object Classes) 2007 • Object Detection dataset

  35. Results on TRECVID dataset

  36. Improvement with latent SVM

  37. Effect of no. of subclasses

  38. 2. Summary • Method for modeling of subcategories using structural SVM • Application of latent structural SVM for further improvements • Improved the performance of linear kernel • Performed various experiments on toy and real data

  39. 3. Generalized RBF feature maps for Efficient Detection

  40. Object Detection • Example categories: aeroplane, bicycle, cow, car, horse, motorbike

  41. Part 3: Outline • Introduction: Kernels and Feature maps • Explicit feature maps for GRBF kernels • Experiments & Results

  42. General Framework for detection • Any Image -> Feature representations -> Classifier (Ex: SVM) -> Ex: Car • Non-linear SVM • Linear SVM • “Multiple Kernel Learning for Object Detection”, Vedaldi et al., ICCV 2009 • “Cascade Object Detection with Deformable Part Models”, Felzenszwalb et al., CVPR 2010

  43. Kernels • Fast Linear SVMs • Stochastic SVM (PEGASOS) • Primal SVM (liblinear) • One-slack SVM (SVM-perf) • From faster to more discriminative: Linear SVM • Additive kernels (Ex: intersection kernel) • Generalized RBF kernels (Ex: exp-χ² kernel)

  44. Kernels • Problem: GRBF kernels, which have high computational complexity, are required to get good performance • Our Solution: Approximate generalized RBF kernels with a linear one by using a feature map

  45. Speeding up non-linear SVMs • A kernel is a dot product in a high dimensional feature space • Define a feature map approximating the kernel

  46. Explicit feature maps • Feature maps for RBF/multiplicative kernels • [Rahimi and Recht, NIPS 07] • [F. Li et al., DAGM 2010] • Feature maps for additive kernels • [Maji and Berg, ICCV 09] • [Vedaldi and Zisserman, CVPR 2010] • [Perronnin et al., CVPR 2010] • Our Contribution • Feature maps for generalized RBF kernels • 2x to 3x speedup (with only a little drop in performance)

  47. Part 3: Outline • Introduction: Kernels and Feature maps • Explicit feature maps for GRBF kernels • Experiments & Results

  48. Additive kernels • Examples: Hellinger's, χ², intersection kernel

  49. Additive Kernel Maps • Feature maps for additive kernels [Vedaldi & Zisserman 10]: closed form function approximated by sampling • “Efficient Additive Kernels via Explicit Feature Maps”, A. Vedaldi and A. Zisserman, CVPR 2010
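
The idea of an explicit feature map is easiest to see for Hellinger's kernel, which has an exact finite-dimensional map: the element-wise square root. The sketch below shows only that special case; the sampled approximate maps for the χ² and intersection kernels are more involved and are not reproduced here. Function names are hypothetical.

```python
import numpy as np

def hellinger_kernel(x, y):
    """Additive kernel: sum over dimensions of sqrt(x_i * y_i)."""
    return float(np.sqrt(x * y).sum())

def hellinger_feature_map(x):
    """Exact explicit map: a plain dot product of mapped vectors equals the kernel."""
    return np.sqrt(x)

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.6, 0.3])
exact = hellinger_kernel(x, y)
via_map = float(hellinger_feature_map(x) @ hellinger_feature_map(y))
print(exact, via_map)
```

Once features are mapped this way, a linear SVM trained on the mapped vectors behaves like a Hellinger-kernel SVM at linear cost.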

  50. Feature maps for RBF kernels • Random Fourier features [Rahimi & Recht 07] • “Random Features for Large-Scale Kernel Machines”, A. Rahimi and B. Recht, NIPS 2007
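
Random Fourier features approximate a shift-invariant kernel by a dot product of randomised cosine features. A minimal sketch for the Gaussian RBF kernel $\exp(-\gamma \lVert x - y \rVert^2)$ follows; the function names, `gamma` and the number of features are illustrative choices, not values from the slides.

```python
import numpy as np

def make_rff_map(dim, num_features, gamma, seed=0):
    """Return a map phi such that phi(x) @ phi(y) approximates exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    # Frequencies drawn from the kernel's Fourier transform: N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(num_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)  # random phases

    def feature_map(x):
        return np.sqrt(2.0 / num_features) * np.cos(W @ x + b)

    return feature_map

def rbf_kernel(x, y, gamma):
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

x = np.array([0.1, 0.4, -0.3, 0.8, 0.0])
y = np.array([0.3, 0.1, -0.2, 0.5, 0.2])
phi = make_rff_map(dim=5, num_features=20000, gamma=0.5)
approx = float(phi(x) @ phi(y))
print(approx, rbf_kernel(x, y, 0.5))
```

The approximation error shrinks roughly as one over the square root of the number of random features, so the feature count trades accuracy against the cost of the resulting linear classifier.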
