Face Detection and Neural Networks
Todd Wittman
Math 8600: Image Analysis
Prof. Jackie Shen
December 2001

Face Detection

Problem: Given a color image, determine if the image contains a human face.
That is, can you tell our governor from a toaster?
Answer: The picture on the right contains a human face. I think.
Applications: AI, tracking, automated security, video retrieval
Goal: Given a set of inputs X and desired outputs T, determine the weights such that the network maps X to T.
Idea: Similar inputs will give similar outputs.
Training: Set the weights W to minimize the sum-of-squares error E = (1/2) Σ_k ||T_k − Y(X_k; W)||².
Levenberg-Marquardt algorithm (multi-dimensional steepest descent).
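A minimal training sketch of the objective above, assuming a one-hidden-layer network with a sigmoid output node. Plain steepest descent stands in for Levenberg-Marquardt here, and the layer sizes, toy data, and learning rate are all illustrative, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 8 input vectors of length 6; target 1 ("face") iff first entry > 0.
X = rng.normal(size=(8, 6))
T = (X[:, 0] > 0).astype(float)

W1 = rng.normal(scale=0.5, size=(6, 4))   # input -> 4 hidden nodes
W2 = rng.normal(scale=0.5, size=(4, 1))   # hidden -> 1 output node

lr = 0.5
for _ in range(2000):
    H = sigmoid(X @ W1)                   # hidden activations
    Y = sigmoid(H @ W2)[:, 0]             # network output P(face | x)
    # Backpropagate the sum-of-squares error E = 0.5 * sum((Y - T)^2).
    dY = (Y - T) * Y * (1 - Y)
    grad_W2 = H.T @ dY[:, None]
    dH = np.outer(dY, W2[:, 0]) * H * (1 - H)
    grad_W1 = X.T @ dH
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

E = 0.5 * np.sum((Y - T) ** 2)
print(E)   # sum-of-squares error after training
```

Each steepest-descent step moves every weight opposite its error gradient; Levenberg-Marquardt accelerates this by blending in Gauss-Newton curvature information, but the objective being minimized is the same.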
Training is very expensive computationally. If there are x input nodes, t output nodes, and p hidden nodes, then # weights = (x + t)p.
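A quick check of the weight count above: every input-to-hidden and hidden-to-output connection carries one weight, giving x·p + p·t = (x + t)p. The `biases` option and the hidden-layer sizes below are my additions, not stated on the slides; notably, with bias terms included, 20 hidden nodes on the 60-input histogram network reproduces the 1241 weights quoted later ((60+1)·20 + (20+1)·1 = 1241), though that reading is only a plausible guess:

```python
def num_weights(x, p, t, biases=False):
    """Connection count for a fully connected net with one hidden layer of p nodes."""
    if biases:
        # One bias weight per hidden node and per output node.
        return (x + 1) * p + (p + 1) * t
    return (x + t) * p

print(num_weights(75, 10, 1))               # grid experiment: 3N = 75 inputs, p assumed
print(num_weights(60, 20, 1, biases=True))  # -> 1241, matching the figure quoted later
```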
Input: Color image.
Output: P(w|x) = probability that image contains a face. (Only 1 output node.)
Target output: 1 for face, 0 for no face.
Input X: The pixel values of the image at N selected grid points.
[Figure: original image vs. interpolated output]
Since each pixel has three values (RGB), our input vector X will have length 3N.
I tried a small case: N=25.
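The grid-sampling encoding above can be sketched as follows: sample the RGB image at an n × n grid of points (N = n² = 25 here, matching the slides) and stack the three channel values into one length-3N vector. The uniform grid layout and the synthetic image are my assumptions:

```python
import numpy as np

def grid_features(img, n=5):
    """Sample an RGB image at an n x n grid and return a length 3*n*n vector."""
    h, w, _ = img.shape
    rows = np.linspace(0, h - 1, n).astype(int)   # assumed: evenly spaced grid
    cols = np.linspace(0, w - 1, n).astype(int)
    samples = img[np.ix_(rows, cols)]             # shape (n, n, 3)
    return samples.reshape(-1)                    # length 3N

img = np.random.default_rng(1).integers(0, 256, size=(120, 160, 3))
x = grid_features(img)
print(x.shape)  # (75,)
```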
The network took over an hour to train on the training set shown on the next slide.
P values for 20 images in training set.
P=0.5 for all training images.
The interpolated images can’t be interpreted.
Input X: The 3 histograms of the RGB values, appended as 1 vector.
Each histogram has N=20 bins.
So size of input vector is 3N=60.
Idea: Neural network will pick out the frequency of flesh tones.
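The histogram encoding above can be sketched as one 20-bin histogram per RGB channel, appended into a single length-60 input vector. Normalizing each histogram by the pixel count is my assumption (it makes images of different sizes comparable); the slides do not state it:

```python
import numpy as np

def rgb_histogram_features(img, bins=20):
    """Append one normalized histogram per RGB channel into a length 3*bins vector."""
    feats = []
    for c in range(3):
        hist, _ = np.histogram(img[..., c], bins=bins, range=(0, 256))
        feats.append(hist / img[..., c].size)   # assumed normalization
    return np.concatenate(feats)

img = np.random.default_rng(2).integers(0, 256, size=(120, 160, 3))
x = rgb_histogram_features(img)
print(x.shape)  # (60,)
```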
After 100 iterations (1 hour, 1241 weights), the Levenberg-Marquardt algorithm was able to correctly classify all 20 training images.
But on a test set of 13 images, it got only 7 correct (53.8%).
The RGB histograms of face and non-face images were too similar.
YES color space:
Y = 0.253R + 0.684G + 0.063B
E = 0.5R - 0.5G
S = 0.25R + 0.25G - 0.5B
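The three equations above define a linear transform per pixel, separating luminance (Y) from two chrominance channels (E, S) that expose flesh tones better than raw RGB. A direct sketch, using the coefficients from the slides:

```python
import numpy as np

# Rows are the Y, E, S coefficients for (R, G, B), taken from the slides.
M = np.array([[0.253, 0.684, 0.063],
              [0.5,  -0.5,   0.0],
              [0.25,  0.25, -0.5]])

def rgb_to_yes(img):
    """Apply the YES transform to the last axis of an (..., 3) RGB array."""
    return img @ M.T

pixel = np.array([200.0, 150.0, 100.0])
print(rgb_to_yes(pixel))  # [159.5  25.   37.5]
```

Because the transform is linear, it applies equally to a single pixel or a whole image array, and the YES histograms are then built exactly as the RGB ones were.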
Input X: 3 YES histograms appended as one vector.
After training for 100 iterations, 3 of the 20 training images were misclassified.
But on the test set, it correctly identified 13 out of 13 images (100%).
You can try my Matlab code: