Face Detection in Crowded Images
Todd Wittman
Math 8600: Image Analysis, Prof. Jackie Shen
May 2002

Face Detection
Ultimate Goal: Detect and locate human face(s) in a crowded color image.
Short-term Goal: Determine if a “mug shot” contains a human face (YES or NO).
Neural Network Face Detector
Input: Color image.
Output: P(w|x) = probability that the image contains a face. (Only 1 output node.) Training targets are set to 1 for face, 0 for no face.
3 Possible Outputs:
• P > 0.5: FACE
• P < 0.5: NOT FACE
• P = 0.5: DON’T KNOW
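The three-way decision rule above can be sketched as a small function. This is only an illustration of the thresholding step; the name `classify` is hypothetical and the network that produces P is not shown.

```python
def classify(p):
    """Map the single network output p in [0, 1] to one of three labels."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p > 0.5:
        return "FACE"
    if p < 0.5:
        return "NOT FACE"
    # Exactly 0.5: the network is undecided.
    return "DON'T KNOW"


if __name__ == "__main__":
    for p in (0.93, 0.12, 0.5):
        print(p, classify(p))
```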
Color-Based Approach
For each color image, prepare a YES color histogram.
Y = 0.253R + 0.684G + 0.063B
E = 0.5R - 0.5G
S = 0.25R + 0.25G - 0.5B
Train the neural network by feeding it many color histograms, telling it which are faces (1) and which are not (0).
Idea: Neural network will learn which bins represent flesh tones. (Network develops an internal “chroma chart”.)
Note: Technically, this is a flesh detector, not a face detector.
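A minimal sketch of the histogram preparation, assuming 8-bit RGB pixels and a joint histogram over all three YES channels. The slide does not give the bin count or say which channels were binned, so `bins=4`, the channel ranges, and the function names are assumptions made for illustration.

```python
# Channel ranges for 8-bit RGB input (each of R, G, B in [0, 255]).
Y_RANGE = (0.0, 255.0)      # Y coefficients sum to 1.0
E_RANGE = (-127.5, 127.5)
S_RANGE = (-127.5, 127.5)


def rgb_to_yes(r, g, b):
    """Convert one RGB pixel to YES using the coefficients on the slide."""
    y = 0.253 * r + 0.684 * g + 0.063 * b
    e = 0.5 * r - 0.5 * g
    s = 0.25 * r + 0.25 * g - 0.5 * b
    return y, e, s


def _bin(value, lo, hi, bins):
    """Index of the bin containing value, clamped to [0, bins - 1]."""
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)


def yes_histogram(pixels, bins=4):
    """Normalized joint (Y, E, S) histogram, flattened to bins**3 entries."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        y, e, s = rgb_to_yes(r, g, b)
        i = _bin(y, *Y_RANGE, bins)
        j = _bin(e, *E_RANGE, bins)
        k = _bin(s, *S_RANGE, bins)
        hist[(i * bins + j) * bins + k] += 1.0
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    # Dividing by the pixel count makes the histogram independent of image size.
    return [c / count for c in hist]


if __name__ == "__main__":
    pixels = [(224, 172, 105), (198, 134, 66), (20, 90, 200)]
    h = yes_histogram(pixels)
    print(len(h), sum(h))
```

The flattened list is what would be fed to the network's input layer in this sketch, one input node per bin.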
Training Data
100 Faces: Mug shots were chosen to represent different flesh tones and gamma values.
100 Non-Faces: Objects, landscapes, animals, and computer-generated random images not containing flesh tones.
Results
Training for 100 iterations took 13 hours.
7 of 100 training faces were misclassified.
2 of 100 training non-faces were misclassified.
That is 9 errors in 200 training images, a 4.5% training error rate.
Network performed favorably on test images.
Face Detection in Crowded Image
Now that we have a face detector for mug shots, how do we detect faces in a general image that could contain many objects and multiple faces?
Popular Approach: Windowing
Create a small box. Run the face detector in that box. Move the box over one pixel. Repeat.
Drawback: Pre-assumes the size of the face in the image.
Our Approach: Segmentation
Segment the image into its connected components. Run the face detector on each component.
Advantage: The color histogram is shape and size invariant!
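The two search strategies can be contrasted with a short sketch. `sliding_windows` enumerates the box positions of the windowing approach. `connected_components` is a plain 4-connected flood fill on a binary mask, used here only as a stand-in for the segmentation step; it is not the diffusion-based method of the following slides. Both function names and the mask input are assumptions made for illustration.

```python
from collections import deque


def sliding_windows(width, height, box, step=1):
    """Yield the (left, top) corner of every box-by-box window in the image."""
    for top in range(0, height - box + 1, step):
        for left in range(0, width - box + 1, step):
            yield left, top


def connected_components(mask):
    """Return the 4-connected regions of truthy cells as lists of (x, y)."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            pixels = []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append(pixels)
    return components


if __name__ == "__main__":
    # Number of detector calls for one fixed box size on a 640x480 image.
    print(sum(1 for _ in sliding_windows(640, 480, 20)))
    mask = [[1, 1, 0, 0],
            [0, 0, 0, 1],
            [0, 0, 1, 1]]
    print([len(c) for c in connected_components(mask)])
```

With a one-pixel step, a 20x20 box on a 640x480 image already needs 286,281 detector calls for that single box size, whereas the segmentation approach calls the detector once per component regardless of its size or shape.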
Shape Recovery by Diffusion Generated Motion
Jawerth-Lin: Alternately sharpen and diffuse a region, propagating the front towards the object boundaries.
Initialize the region to have 1’s on the image and 0’s on the border.
Update: [equation not preserved in this transcript], where G is a Gaussian.
Note: Instead of convolution, we can apply a digital Gaussian: [kernel not preserved in this transcript]
Why Does This Work?
In smooth regions, [condition not preserved in this transcript], so the RHS is very close to 1.
Near the boundary of the front, G averages a 1 with the nearby zeros, so the LHS is < 1.
The front will propagate inward until we hit a jump in the image, where [condition not preserved in this transcript].
Front with 1’s on the image and 0’s on the border:
0000000000
0111111110
0111111110
0000000000
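The averaging argument can be checked numerically with a rough sketch. It covers only the diffusion half of the scheme: the update equation is not available in this transcript, so the sharpening step and the image-dependent term are not reproduced. The 3x3 binomial kernel below is an assumed stand-in for the digital Gaussian, and the function names are hypothetical.

```python
# Assumed 3x3 binomial approximation to a Gaussian; weights sum to 16.
KERNEL = ((1, 2, 1),
          (2, 4, 2),
          (1, 2, 1))


def initial_front(rows, cols):
    """1's on the image interior, 0's on the one-pixel border."""
    return [[0.0 if r in (0, rows - 1) or c in (0, cols - 1) else 1.0
             for c in range(cols)]
            for r in range(rows)]


def smooth(u):
    """Apply the kernel once to interior cells; border cells stay fixed."""
    rows, cols = len(u), len(u[0])
    out = [row[:] for row in u]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += KERNEL[dr + 1][dc + 1] * u[r + dr][c + dc]
            out[r][c] = total / 16.0
    return out


if __name__ == "__main__":
    u = smooth(initial_front(6, 10))
    for row in u:
        print(" ".join(f"{v:.2f}" for v in row))
```

After one pass, cells whose whole 3x3 neighborhood lies inside the front keep the value 1, while cells next to the border drop below 1 because the kernel averages their 1's with the neighboring 0's.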
Conclusion: It didn’t work.
Although the segmentation and general face detection worked on simple synthetic images, they failed for general photographs.
Problems:
• Segmentation fails for noisy backgrounds, overlapping objects, and objects that intersect the border of the image.
• Segmentation would not necessarily pick out just the face (mug), but also the body that goes along with it. So the neural network would receive the colors of the clothes as well. (Perhaps this method can detect naked people though.)