EMPATH: A Neural Network that Categorizes Facial Expressions

Matthew N. Dailey and Garrison W. Cottrell, University of California, San Diego • Curtis Padgett, California Institute of Technology

Facial Expression Recognition (Theory 1) • Categorical Perception • Categories are discrete entities • Sharp categorical boundaries • Discrimination of similar pairs of expressive faces is enhanced when near category boundaries

Facial Expression Recognition (Theory 2) • Continuous (graded) perception • Expressions are treated as points in a continuous, low-dimensional space • e.g. “Surprise” lies between “Happiness” and “Fear”

Historical Research (Categorical) • Ekman and Friesen (1976) 10-step photos between pairs of caricatures • Ekman (1999) essay on basic emotions • Harnad (1987) Categorical Perception • Beale and Keil (1995) morph image sequence with famous faces • Etcoff and Magee (1992) facial expression recognition tied to perceptual mechanism

Historical Research (Continuous) • Schlosberg (1952) category ratings and subjects’ “errors” predicted accurately by arranging categories around an ellipse • Russell (1980) structure theory of emotions • Russell and Bullock (1986) emotion categories best thought of as fuzzy sets • Russell et al. (1989), Katsikitis (1997), Schiano et al. (2000) continuous multidimensional perceptual space for facial expression perception

Young et al.’s (1997) “Megamix” Experiments • Experiment 1: Subjects identify the emotional category in 10%, 30%, 50%, 70%, and 90% morphs between all pairs of the 6 prototypical expressions • 6-way forced choice identification • Experiment 2: Same as experiment 1 with the addition of the “neutral” face • 7-way forced choice

Young et al.’s (1997) “Megamix” Experiments • Experiment 3: Discriminate pairs of stimuli along the six transitions • Sequential discrimination task (ABX) • Simultaneous discrimination task (same-different) • Experiment 4: Determine what expression is “mixed-in” to a faint morph • Given a morph or prototype stimulus, indicate the most apparent, second-most apparent, and third-most apparent emotion

“Megamix” Experiment Results • Results from Experiments 1-3 support the categorical view of facial expression perception • Results from Experiment 4 showed that subjects were significantly likely to detect a mixed-in emotion at 30%, which supports the continuous, dimensional accounts of facial expression perception • Rather than settling the issue of categorical vs. continuous theories, Young et al. found evidence to support BOTH theories • Until now, no computational model has been able to simultaneously explain these seemingly contradictory data

The Model • Three-layer neural network: perceptual analysis, object representation, and categorization • Feedforward network (no backpropagation at later levels) • Input is a 240 x 292 grayscale face image

Perceptual Analysis Layer • Neurons whose response properties are similar to complex cells in the visual cortex • This is modeled by “Gabor Filters” • Basically, these units do nonlinear edge detection at five different scales and eight different orientations
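The filter bank described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the wavelengths, the envelope width (`sigma = wavelength / 2`), and the 33 x 33 patch are all assumed values, and the bank is evaluated at a single image location only. Taking the magnitude of the complex response is the nonlinearity; it discards phase, much as complex cells are insensitive to the exact position of an edge.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex 2-D Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Project coordinates onto the direction in which the carrier wave varies.
    x_theta = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.exp(1j * 2.0 * np.pi * x_theta / wavelength)
    return envelope * carrier

def gabor_magnitudes(patch, wavelengths, n_orientations=8):
    """Response magnitudes at the centre of a square, odd-sized patch."""
    size = patch.shape[0]
    responses = []
    for wavelength in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            # Assumed envelope width: half the carrier wavelength.
            kernel = gabor_kernel(size, wavelength, theta, sigma=wavelength / 2.0)
            # The modulus of the complex response is the nonlinear step.
            responses.append(np.abs(np.sum(patch * kernel)))
    return np.array(responses)

# Hypothetical scales in pixels; the slide only states that there are five.
WAVELENGTHS = [4, 6, 8, 12, 16]

rng = np.random.default_rng(0)
patch = rng.random((33, 33))          # stand-in for a region of a face image
features = gabor_magnitudes(patch, WAVELENGTHS)
print(features.shape)                 # (40,) = 5 scales x 8 orientations
```

Each location therefore contributes 40 numbers; sampling many locations across the face is what makes the resulting representation high dimensional.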

Object Representation Layer • Extracts a small set of features from high-dimensional data • Equivalent to an “image compression” network that extracts global representations of the data • Principal components analysis is used to model this layer • 50 linear hidden units
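The compression step can be illustrated with a small PCA sketch. The data here are random stand-ins (120 hypothetical faces with 400 features each), and the function names `fit_pca` and `project` are illustrative; only the choice of 50 linear components comes from the slide.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the top principal axes (as rows) of X."""
    mean = X.mean(axis=0)
    # SVD of the centred data: rows of vt are principal axes,
    # ordered from largest to smallest explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Linear projection onto the principal axes (the 'hidden units')."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 400))       # stand-in for Gabor feature vectors
mean, components = fit_pca(X, n_components=50)
Z = project(X, mean, components)
print(Z.shape)                        # (120, 50)
```

Because the projection is linear, each of the 50 outputs behaves like a linear hidden unit whose weights are one principal axis.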

Categorization Layer • Simple perceptron with six outputs (one for each “basic” emotion) • The network is set up so that the outputs can be interpreted as probabilities (i.e. they are all positive and sum to 1)
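One common way to obtain outputs that are all positive and sum to 1 is a softmax over a single linear layer. The sketch below assumes that construction; the slide does not name the activation function. The weights are random and untrained, and the label order in `EMOTIONS` is arbitrary.

```python
import numpy as np

EMOTIONS = ["happiness", "sadness", "fear", "anger", "surprise", "disgust"]

def softmax(z):
    """Map arbitrary scores to positive values that sum to 1."""
    z = z - np.max(z)                 # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def categorize(features, W, b):
    """Single-layer perceptron: linear scores followed by softmax."""
    return softmax(W @ features + b)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 50))   # untrained, illustrative weights
b = np.zeros(6)
p = categorize(rng.normal(size=50), W, b)

# Ranking the outputs gives the most apparent, second-most apparent,
# and third-most apparent emotion, as in the mixed-in expression task.
ranked = [EMOTIONS[i] for i in np.argsort(p)[::-1]]
print(ranked[:3])
```

Reading the full output vector rather than only its maximum is what lets a single network yield both a forced-choice category and graded information about secondary expressions.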


Experiments & Results • Same experiments as the Young et al. “Megamix” experiments • Results • The model and humans find the same expressions difficult or easy to interpret • When presented with morphs between pairs of expressions, the model and humans place similar sharp category boundaries between prototypes • The model and humans are similarly sensitive to mixed-in expressions in morph stimuli

More Results • Network generalization to unseen faces, compared to human agreement on the same face (six-way forced choice)


Conclusion • This model was able to simulate both the categorical and continuous nature of facial expression classification, consistent with the human experiments conducted by Young et al. • Categorical or Continuous? • Conclusion leans toward the two theories being complementary rather than mutually exclusive • “tapping different computational levels of processing” • Which account applies is dictated by the task and the data
