Rapid Object Detection using a Boosted Cascade of Simple Features

Rapid Object Detection using a Boosted Cascade of Simple Features Paul Viola, Michael Jones Conference on Computer Vision and Pattern Recognition 2001 (CVPR 2001)

Outline • Introduction • Features • Learning classification functions • The attentional cascade • Result • Conclusion

Introduction • New object detection framework • Motive • Face detection • Characteristics • Robust • Rapid

Contributions • New image representation • Integral image • Method for constructing a classifier • Selecting a small number of important features using AdaBoost • Method for combining classifiers • in a cascade structure

Application • Rapid face detector can be used in • User interfaces • Image databases • Teleconferencing • Especially, … • Allow for post-processing • When rapid frame-rates are not necessary • Can be implemented on small low power devices • Handhelds, embedded processors

Features • Why not pixels? • The most common reason • Features can encode ad-hoc domain knowledge • The critical reason for this system • A feature-based system operates much faster than a pixel-based one • Three kinds of features used • Two-rectangle feature • Three-rectangle feature • Four-rectangle feature

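Integral Image • [Figure: original image and its integral image; the integral image value at (x, y) is the sum of the original-image pixels in the rectangle from (0, 0) to (x, y), inclusive]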
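The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; the function name `integral_image` and the list-of-rows image layout are assumptions made for the example.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img[y'][x'] for all y' <= y, x' <= x.

    `img` is assumed to be a list of equal-length rows of numbers.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # cumulative sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = above + row_sum
    return ii


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
# The bottom-right entry holds the sum of the whole image.
```

The whole table is built in a single pass over the image, which is what makes the later per-rectangle lookups cheap.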

Rectangular sum
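Once the integral image exists, any rectangular sum needs only four array references. The sketch below assumes a padded integral image (one extra zero row and column) to avoid boundary checks; the helper names and the sign convention of the two-rectangle feature (left half minus right half) are illustrative assumptions.

```python
def padded_integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x (extra zero row and column)."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left-half sum minus right-half sum (w even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = padded_integral_image(image)
center = rect_sum(ii, 1, 1, 2, 2)           # 6 + 7 + 10 + 11
feature = two_rect_feature(ii, 0, 0, 4, 3)  # left half minus right half
```

The cost of `rect_sum` does not depend on the rectangle size, so a feature costs the same at every scale.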

Learning classification functions • Hypothesis • A very small number of features can form an effective classifier • How to find them • In each round, select the single rectangle feature that best separates the positive and negative examples • Weak classifier • $h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$, otherwise $0$, where $f_j$ is the feature, $\theta_j$ the threshold, and $p_j$ the polarity • Result • Features selected in early rounds • Error rate: 0.1~0.3 • Features selected in later rounds • Error rate: 0.4~0.5
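A weak classifier of this kind is a thresholded single feature with a polarity. The sketch below searches for the threshold and polarity with the lowest weighted error for one feature; the function names and the choice of candidate thresholds (midpoints between sorted values) are assumptions for illustration.

```python
def weak_classify(value, threshold, polarity):
    """Return 1 if polarity * value < polarity * threshold, else 0."""
    return 1 if polarity * value < polarity * threshold else 0


def best_stump(values, labels, weights):
    """Return (weighted_error, threshold, polarity) with the lowest weighted error."""
    distinct = sorted(set(values))
    # Candidate thresholds: one below the minimum, midpoints, one above the maximum.
    thresholds = (
        [distinct[0] - 1]
        + [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
        + [distinct[-1] + 1]
    )
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w
                for v, y, w in zip(values, labels, weights)
                if weak_classify(v, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


# Toy feature values: positives (label 1) have small values, negatives large ones.
values = [2, 3, 10, 12]
labels = [1, 1, 0, 0]
weights = [0.25, 0.25, 0.25, 0.25]
error, threshold, polarity = best_stump(values, labels, weights)
```

The polarity lets the same feature serve whether the positive examples fall below or above the threshold.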

AdaBoost algorithm
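A compact sketch of the boosting loop follows, using the weight update commonly given for this detector: each round picks the stump with the lowest weighted error ε, sets β = ε / (1 − ε), multiplies the weights of correctly classified examples by β, and gives the stump the vote α = log(1/β). The exhaustive stump search, the clamping of ε, and all names are assumptions for a toy example, not the authors' code.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    return 1 if polarity * x[feature] < polarity * threshold else 0


def best_stump(samples, labels, weights):
    """Exhaustively pick (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for feature in range(len(samples[0])):
        values = sorted({x[feature] for x in samples})
        thresholds = [values[0] - 1] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for polarity in (1, -1):
                error = sum(
                    w
                    for x, y, w in zip(samples, labels, weights)
                    if stump_predict(x, feature, threshold, polarity) != y
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost(samples, labels, rounds):
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Positives and negatives each start with half of the total weight.
    weights = [1 / (2 * n_pos) if y == 1 else 1 / (2 * n_neg) for y in labels]
    strong = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]
        error, feature, threshold, polarity = best_stump(samples, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # assumption: clamp to avoid log(0)
        beta = error / (1 - error)
        # Correctly classified examples are down-weighted; mistakes keep their weight.
        weights = [
            w * (beta if stump_predict(x, feature, threshold, polarity) == y else 1)
            for x, y, w in zip(samples, labels, weights)
        ]
        strong.append((math.log(1 / beta), feature, threshold, polarity))
    return strong


def strong_predict(strong, x):
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in strong)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in strong) else 0


# Toy 1-D data that no single stump separates: positives lie on both sides.
samples = [[1], [2], [3], [4], [5], [6]]
labels = [1, 1, 0, 0, 1, 1]
strong = adaboost(samples, labels, rounds=3)
predictions = [strong_predict(strong, x) for x in samples]
```

On this toy set no single stump is error-free, but the weighted vote of three stumps reproduces the labels, which is the effect boosting relies on.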

Learning result • A frontal face classifier • 200 features (among 180,000) • Detection rate: 95% • False positive rate: 1 in 14,084 • 0.7 s to scan a 384×288 pixel image • Not sufficient for many real-world tasks • First feature selected • The eye region is often darker than the nose and cheeks • Second feature selected • The eyes are darker than the bridge of the nose

The attentional cascade • Design goal • Reject many of the negative sub-windows early • Detect almost all positive instances • False negative rate → 0 • Cascade
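The cascade idea can be sketched as a loop that stops at the first rejecting stage, so most negative sub-windows cost only one or two cheap evaluations. The stages below are hypothetical toy scoring functions on a flat list of pixel values, not real face classifiers.

```python
def evaluate_cascade(stages, window):
    """Run the stages in order; return (accepted, number_of_stages_evaluated)."""
    for index, (score, threshold) in enumerate(stages, start=1):
        if score(window) < threshold:
            return False, index  # rejected early: later stages are never computed
    return True, len(stages)


# Hypothetical stages: (scoring function, minimum score needed to pass).
stages = [
    (lambda w: max(w) - min(w), 50),  # stage 1: enough contrast
    (lambda w: sum(w) / len(w), 60),  # stage 2: bright enough on average
]

flat = [100, 101, 99, 100]  # fails stage 1
dark = [0, 60, 0, 60]       # passes stage 1, fails stage 2
good = [40, 120, 90, 110]   # passes both stages

results = [evaluate_cascade(stages, w) for w in (flat, dark, good)]
```

A window is reported as a detection only if every stage accepts it, so each stage must keep its false negative rate close to zero.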

Training a cascade of classifiers • Tradeoffs • Features ↑ ↔ detection rate ↑ • Features ↑ ↔ computation time ↑ • Constructing stages • Train each stage classifier using AdaBoost • Adjust the stage threshold to minimize false negatives
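The threshold adjustment can be illustrated as follows: given the stage scores of validation positives, pick the largest threshold that still accepts a target fraction of them, then measure how many negatives slip through. The scores, the target rate of 0.9, and the function names are made-up assumptions for the sketch.

```python
import math


def lower_threshold(pos_scores, min_detection_rate):
    """Largest threshold t such that at least min_detection_rate of positives score >= t."""
    ranked = sorted(pos_scores, reverse=True)
    # Small epsilon guards against floating-point noise in the product.
    needed = max(1, math.ceil(min_detection_rate * len(ranked) - 1e-9))
    return ranked[needed - 1]


def rate_at(scores, threshold):
    """Fraction of scores accepted at the given threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Made-up stage scores for validation faces and non-faces.
pos_scores = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6, 0.75, 0.65, 0.3, 0.88]
neg_scores = [0.1, 0.2, 0.65, 0.4, 0.7, 0.3, 0.5, 0.55, 0.2, 0.1]

threshold = lower_threshold(pos_scores, min_detection_rate=0.9)
detection_rate = rate_at(pos_scores, threshold)
false_positive_rate = rate_at(neg_scores, threshold)
```

Lowering the threshold raises the detection rate and the false positive rate together; later stages are expected to remove the extra false positives.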

Result • Face training set • 4916 hand-labeled faces • Resolution: 24×24 pixels • Source: random crawl of the WWW • Non-face training set • 9544 manually inspected images containing no faces • About 350 million sub-windows • The complete face detection cascade has • 38 stages • 6061 features • About 15 times faster than the Rowley-Baluja-Kanade detector

Performance • Receiver operating characteristic (ROC) curve • What is ROC? See http://www.geocities.com/shinyuanclub/update97/lucm0115.html
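As a rough illustration of how an ROC curve is traced, the sketch below sweeps a decision threshold over classifier scores and records the false positive rate and detection rate at each setting. The scores are made up, and `roc_points` is a hypothetical helper for this example.

```python
def roc_points(pos_scores, neg_scores):
    """(false_positive_rate, detection_rate) for each distinct threshold, highest first."""
    points = []
    for threshold in sorted(set(pos_scores) | set(neg_scores), reverse=True):
        detection_rate = sum(s >= threshold for s in pos_scores) / len(pos_scores)
        false_positive_rate = sum(s >= threshold for s in neg_scores) / len(neg_scores)
        points.append((false_positive_rate, detection_rate))
    return points


# Made-up scores: higher means "more face-like".
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.1]
points = roc_points(pos_scores, neg_scores)
```

Each point is one operating setting of the detector; a curve that climbs quickly toward a high detection rate at a low false positive rate indicates a better classifier.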

Performance comparison • Detection rates for various numbers of false positives on the MIT+CMU test set, which contains 130 images and 507 faces

Conclusions • An approach for object detection • Minimizes computation time • About 15 times faster than any previous approach • Achieves high detection accuracy • Low false negative and false positive rates
