
Robust Real-Time Object Detection


Presentation Transcript


  1. Robust Real-Time Object Detection Paul Viola & Michael Jones

  2. Introduction • Frontal face detection is achieved • Detection rates comparable to the best previous systems • An efficient reduction of the false positive rate • Extremely rapid operation • A 384×288 pixel image is processed at 15 frames per second

  3. Contribution of the Paper • Integral image • A new image representation • AdaBoost • Efficient feature selection and classifier training • Cascade of increasingly complex classifiers • Dramatic decrease in detection time

  4. Simple Rectangle Features • Why not use pixels directly? • Features encode domain knowledge that is hard to learn from a finite quantity of training data • Feature-based systems operate much faster than pixel-based systems
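As a rough illustration of the bullets above, the sketch below evaluates one two-rectangle (edge-like) feature as the difference between the pixel sums of two adjacent rectangles, directly on the pixels. The function name, the random 24×24 grayscale sub-window, and the coordinates are assumptions made for this sketch, not code from the paper.

```python
import numpy as np

def two_rect_feature(window, x, y, w, h):
    """Illustrative two-rectangle (edge-like) feature: the difference between
    the pixel sums of two horizontally adjacent w-by-h rectangles."""
    left = window[y:y + h, x:x + w].sum()
    right = window[y:y + h, x + w:x + 2 * w].sum()
    return float(left - right)

# Example on a random 24x24 grayscale sub-window (illustrative only).
rng = np.random.default_rng(0)
sub_window = rng.integers(0, 256, size=(24, 24)).astype(np.float64)
print(two_rect_feature(sub_window, x=4, y=6, w=5, h=8))
```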

  5. Integral Image • Analogous to a double integral (2D cumulative sum) of the original image • A new image representation for fast computation of rectangle features

  6. Integral Image • The sum of the pixels in rectangle D of the original image can be computed from the integral image with four corner references: P(4) - P(3) - P(2) + P(1)
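A minimal sketch of this computation, assuming a NumPy grayscale image: the integral image is built with two cumulative sums, and any rectangle sum then needs only the four corner look-ups P(4) - P(3) - P(2) + P(1). The function names and the zero padding are choices made for this sketch.

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img over all pixels above and to the left of (y, x).
    Computed in one pass with cumulative sums; a zero row and column are
    padded so that rectangle sums need no special cases at the border."""
    ii = img.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y),
    width w and height h, using the four corners P(4) - P(3) - P(2) + P(1)."""
    return ii[y + h, x + w] - ii[y + h, x] - ii[y, x + w] + ii[y, x]

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
assert rect_sum(ii, x=1, y=1, w=2, h=2) == img[1:3, 1:3].sum()
```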

  7. Advantages of Integral Image • Image pyramid approach • Requires a pyramid of scaled images • A fixed-scale detector is run on every image in the pyramid • Forming the pyramid is computationally expensive • Integral image approach • A single feature can be evaluated at any scale and location in a few operations • The integral image is computed in one pass over the original image

  8. Learning Classification Functions • 45,394 features associated with each sub-window • A very small number of these features can be combined to form an effective classifier • A variant of AdaBoost is used to • Select features • Train the classifier

  9. How does AdaBoost work? • Combines a set of weak classifiers to form a strong one • In each round, the weak learner returns the single-feature (perceptron-like) classifier with the minimum weighted classification error • The examples are then re-weighted according to the accuracy of that classifier, so misclassified examples receive more weight • The final strong classifier is a weighted combination of the weak classifiers
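A rough sketch of one boosting round under the weighting scheme described above, assuming feature values have already been computed for every training example. The brute-force stump search and all names (best_stump, adaboost_round) are illustrative choices, not the paper's optimized implementation.

```python
import numpy as np

def best_stump(features, labels, weights):
    """Select the single-feature threshold classifier (decision stump) with
    the lowest weighted error. features has shape (n_samples, n_features);
    labels are 0/1. Brute-force search, for illustration only."""
    best_err, best_params = np.inf, None
    for j in range(features.shape[1]):
        for thresh in np.unique(features[:, j]):
            for polarity in (1, -1):
                pred = (polarity * features[:, j] < polarity * thresh).astype(int)
                err = weights[pred != labels].sum()
                if err < best_err:
                    best_err, best_params = err, (j, thresh, polarity)
    return best_err, best_params

def adaboost_round(features, labels, weights):
    """One boosting round: pick the best weak classifier, then re-weight the
    examples so that misclassified ones count for more in the next round."""
    err, (j, thresh, polarity) = best_stump(features, labels, weights)
    beta = err / (1.0 - err + 1e-12)
    alpha = np.log(1.0 / (beta + 1e-12))   # vote weight of this weak classifier
    pred = (polarity * features[:, j] < polarity * thresh).astype(int)
    weights = weights * np.where(pred == labels, beta, 1.0)
    return weights / weights.sum(), (j, thresh, polarity, alpha)
```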

  10. How does AdaBoost work? • First and second features selected by AdaBoost

  11. Attentional Cascade • Increases detection performance & reduces computation time by evaluating simple classifiers before complex ones • A simple two-feature classifier example: • 100% detection rate • 40% false positive rate • About 60 microprocessor instructions (very efficient)
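The control flow of the cascade can be sketched in a few lines; the stage interface used here (a score function plus a threshold per stage) is an assumption made for illustration.

```python
def cascade_classify(stages, sub_window):
    """Attentional cascade: stages is a list of (score_fn, threshold) pairs
    ordered from cheapest to most complex. A sub-window is rejected as soon
    as any stage scores below its threshold, so most non-face windows cost
    only the first one or two very cheap stages."""
    for score_fn, threshold in stages:
        if score_fn(sub_window) < threshold:
            return False   # rejected: no later, more expensive stage is run
    return True            # survived every stage: report a detection
```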

  12. Attentional Cascade

  13. Training of Cascade of Classifiers • The deeper classifiers are trained with harder examples • Simple classifiers in the first stages, complex ones in the deeper parts of the cascade • Complex classifiers take more time to compute

  14. Training of Cascade of Classifiers • A typical detection system achieves an 85-95% detection rate at false positive rates on the order of 10^-5 to 10^-6 • The cascade system works as follows: • With a 10-stage classifier • Each stage has a 99% detection rate and a 30% false positive rate • The overall system then runs at • 0.99^10 ≈ 90% detection rate • 0.30^10 ≈ 6 × 10^-6 false positive rate
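The overall figures quoted above follow from multiplying the per-stage rates, assuming the stages behave roughly independently; a two-line check of the arithmetic:

```python
# 10 stages, each with a 99% detection rate and a 30% false positive rate.
print(0.99 ** 10)   # ~0.904   -> roughly 90% overall detection rate
print(0.30 ** 10)   # ~5.9e-6  -> roughly 6 * 10^-6 overall false positive rate
```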

  15. Requirements • What needs to be determined: • Number of stages • Number of features for each stage • Threshold for each stage

  16. Practical Implementation • The user selects acceptable per-layer rates f_i and d_i • Each layer is trained with AdaBoost • The number of features is increased until the target f_i and d_i are met for that layer • If the overall targets F and D are not met, a new layer is added to the cascade
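The layer-adding loop described above might look roughly like the sketch below. Here train_stage is a stand-in for AdaBoost training of one layer (adding features until the per-layer targets f_max and d_min are met), and every name is an assumption of this sketch rather than the paper's code.

```python
def train_cascade(pos, neg, F_target, f_max, d_min, train_stage):
    """Sketch of the cascade-building loop. train_stage(pos, neg, f_max, d_min)
    is assumed to return (stage_fn, layer_f, layer_d), where stage_fn maps a
    sub-window to accept/reject and layer_f, layer_d are that layer's measured
    false positive and detection rates. Illustrative only."""
    cascade, F, D = [], 1.0, 1.0
    while F > F_target and neg:
        stage_fn, layer_f, layer_d = train_stage(pos, neg, f_max, d_min)
        cascade.append(stage_fn)
        F *= layer_f
        D *= layer_d
        # The next layer is trained only on negatives the current cascade still
        # accepts, i.e. its false positives (the "harder" examples).
        neg = [x for x in neg if all(stage(x) for stage in cascade)]
    return cascade, F, D
```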

  17. Results – Structure of Cascade • 32 layers – 4297 features • Weeks spent to train the cascade

  18. Results – Algorithm Details • All sub-windows (training and testing) are variance normalized to reduce the effect of lighting conditions • Scaling is achieved by scaling the detector rather than the image • A step size of one pixel is used
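A minimal sketch of the variance normalization mentioned in the first bullet, done directly in NumPy. In the paper the same correction is applied at detection time using integral images of the image and of its square, so the variance of any sub-window is available in constant time; this post-hoc version does not reproduce that optimization, and its names are assumptions of the sketch.

```python
import numpy as np

def variance_normalize(sub_window, eps=1e-8):
    """Scale a sub-window so its pixel values have unit standard deviation,
    reducing the effect of different lighting conditions (illustrative)."""
    x = sub_window.astype(np.float64)
    return x / (x.std() + eps)
```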

  19. Results – Algorithm details

  20. Results • Most of the windows are rejected in the first & second cascade stages • Face detection on a 384×288 image runs in about 0.067 seconds • 15 times faster than Rowley-Baluja-Kanade • 600 times faster than Schneiderman-Kanade

  21. Results • Tested on the MIT+CMU frontal face test set: 130 images and 507 faces
