What can computational models tell us about face processing?
Garrison W. Cottrell, UCSD -- IEEE Computational Intelligence Society, 4/12/2006

Presentation Transcript

1. What can computational models tell us about face processing?
Garrison W. Cottrell
Gary's Unbelievable Research Unit (GURU)
Computer Science and Engineering Department
Institute for Neural Computation, UCSD
Collaborators, Past, Present, and Future: Ralph Adolphs, Luke Barrington, Serge Belongie, Kristin Branson, Tom Busey, Andy Calder, Eric Christiansen, Matthew Dailey, Piotr Dollar, Michael Fleming, AfmZakaria Haque, Janet Hsiao, Carrie Joyce, Brenden Lake, Kang Lee, Joe McCleery, Janet Metcalfe, Jonathan Nelson, Nam Nguyen, Curt Padgett, Angelina Saldivar, Honghao Shan, Maki Sugimoto, Matt Tong, Brian Tran, Keiji Yamada, Lingyun Zhang

7. And now for something completely different…
• The CIS goal is to “mimic nature for problem solving.”
• My goal is to mimic nature in order to understand nature.
• In fact, as a cognitive scientist, I am glad when my models make the same mistakes people do…
• …because that means the model fits the data better -- so maybe I have a better model!
• So don’t look for a better problem solver here; look instead for some insights into how people process faces.

8. Why use models to understand thought?
• Models rush in where theories fear to tread.
• Models can be manipulated in ways people cannot.
• Models can be analyzed in ways people cannot.

9. Models rush in where theories fear to tread
• Theories are high-level descriptions of the processes underlying behavior.
• They are often not explicit about the processes involved.
• They are difficult to reason about if no mechanisms are explicit -- they may be too high-level to make explicit predictions.
• Theory formation itself is difficult.
• Using machine learning techniques, one can often build a working model of a task for which we have no theories or algorithms (e.g., expression recognition).
• A working model provides an “intuition pump” for how things might work, especially if it is “neurally plausible” (e.g., development of face processing -- Dailey and Cottrell).
• A working model may make unexpected predictions (e.g., the Interactive Activation Model and SLNT).

10. Models can be manipulated in ways people cannot
• We can see the effects of variations in cortical architecture (e.g., split (hemispheric) vs. non-split models, as in Shillcock and Monaghan’s word perception model).
• We can see the effects of variations in processing resources (e.g., variations in the number of hidden units in Plaut et al.’s models).
• We can see the effects of variations in environment (e.g., what if our parents were cans, cups, or books instead of humans? That is, is there something special about face expertise versus visual expertise in general? (Sugimoto and Cottrell; Joyce and Cottrell)).
• We can see variations in behavior due to different kinds of brain damage within a single “brain” (e.g., Juola and Plunkett; Hinton and Shallice).

11. Models can be analyzed in ways people cannot
In the following, I specifically refer to neural network models.
• We can do single-unit recordings.
• We can selectively ablate and restore parts of the network, even down to the single-unit level, to assess each part’s contribution to processing.
• We can measure the individual connections -- e.g., the receptive and projective fields of a unit.
• We can measure responses at different layers of processing (e.g., which level accounts for a particular judgment: perceptual, object, or categorization? (Dailey et al., J. Cog. Neuro. 2002)).

12. How (I like) to build Cognitive Models
• I like to be able to relate them to the brain, so “neurally plausible” models are preferred -- neural nets.
• The model should be a working model of the actual task, rather than a cartoon version of it.
• Of course, the model should nevertheless be simplifying (i.e., it should be constrained to the essential features of the problem at hand):
• Do we really need to model the (supposed) translation invariance and size invariance of biological perception?
• As far as I can tell, NO!
• Then, take the model “as is” and fit the experimental data: 0 fitting parameters is preferred over 1, 2, or 3.

13. The other way (I like) to build Cognitive Models
• Same as above, except:
• Use them as exploratory models -- in domains where there is little direct data (e.g., no single-cell recordings in infants or undergraduates) -- to suggest what we might find if we could get the data. These can then serve as “intuition pumps.”
• Examples:
  • Why we might get specialized face processors
  • Why those face processors get recruited for other tasks

14. Outline
• Review of our model of face and object processing
• Some insights from modeling:
  • What could “holistic processing” mean?
  • Does a specialized processor for faces need to be innately specified?
  • Why would a face area process BMWs?
• Some new directions:
  • How do we select where to look next?
  • How is information integrated across saccades?

16. The Face Processing System
[Diagram: Pixel (Retina) Level -> (Gabor Filtering) -> Perceptual (V1) Level -> (PCA) -> Object (IT) Level -> (Neural Net) -> Category Level; outputs: Happy, Sad, Afraid, Angry, Surprised, Disgusted, …]

17. The Face Processing System
[Diagram: the same pipeline -- Pixel (Retina) Level -> (Gabor Filtering) -> Perceptual (V1) Level -> (PCA) -> Object (IT) Level -> (Neural Net) -> Category Level; outputs: Bob, Carol, Ted, Alice, …]

18. The Face Processing System
[Diagram: Pixel (Retina) Level -> (Gabor Filtering) -> Perceptual (V1) Level -> (PCA) -> Object (IT) Level -> (Neural Net, with a hidden feature level) -> Category Level; outputs: Bob, Carol, Ted, Cup, Can, Book]

19. The Face Processing System
[Diagram: Pixel (Retina) Level -> (Gabor Filtering) -> Perceptual (V1) Level -> (LSF PCA and HSF PCA: separate principal components for low and high spatial frequencies) -> Object (IT) Level -> (Neural Net) -> Category Level; outputs: Bob, Carol, Ted, Cup, Can, Book]

20. The Gabor Filter Layer
• Basic feature: the 2-D Gabor wavelet filter (Daugman, 1985). These model the processing in early visual areas.
• The image is convolved with the filters, and the convolution magnitudes are subsampled in a 29x36 grid.
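
Since the slide’s filter equation was an image lost in transcription, here is a minimal sketch of such a layer: a standard complex Gabor (a sinusoidal carrier under a Gaussian envelope), applied at a generic bank of 5 scales x 8 orientations. The bank size and all parameter values are assumptions; the transcript does not give the model’s exact settings.

    # A minimal Gabor-filter-layer sketch (assumed parameters throughout).
    import numpy as np
    from scipy.signal import convolve2d

    def gabor_kernel(size, wavelength, theta, sigma):
        """Complex 2-D Gabor: sinusoidal carrier under a Gaussian envelope."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)          # rotated coordinate
        envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
        carrier = np.exp(1j * 2.0 * np.pi * xr / wavelength)
        return envelope * carrier

    def gabor_magnitudes(image, grid_shape=(29, 36)):
        """Convolve at several scales/orientations; subsample the
        response magnitudes on a coarse grid, as the slide describes."""
        features = []
        for wavelength in (4, 6, 8, 12, 16):                # 5 scales (assumed)
            for k in range(8):                              # 8 orientations (assumed)
                theta = k * np.pi / 8
                kern = gabor_kernel(25, wavelength, theta, sigma=0.5 * wavelength)
                resp = np.abs(convolve2d(image, kern, mode="same"))
                rows = np.linspace(0, resp.shape[0] - 1, grid_shape[0]).astype(int)
                cols = np.linspace(0, resp.shape[1] - 1, grid_shape[1]).astype(int)
                features.append(resp[np.ix_(rows, cols)].ravel())
        return np.concatenate(features)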

21. Principal Components Analysis
• The Gabor filters give us 40,600 numbers.
• We use PCA to reduce this to 50 numbers.
• PCA is like factor analysis: it finds the underlying directions of maximum variance.
• PCA can be computed in a neural network through a competitive Hebbian learning mechanism.
• Hence this is also a biologically plausible processing step.
• We suggest this leads to representations similar to those in inferior temporal cortex.
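
As a minimal sketch of this dimensionality-reduction step (plain SVD-based PCA; the model’s actual implementation details are not in the transcript):

    # Reduce Gabor feature vectors to their top-50 principal components.
    import numpy as np

    def pca_reduce(X, k=50):
        """Project rows of X (n_samples x n_features) onto the top-k
        directions of maximum variance."""
        Xc = X - X.mean(axis=0)                       # center the data
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T                          # n_samples x k scores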

22. How to do PCA with a neural network (Cottrell, Munro & Zipser, 1987; Cottrell & Fleming, 1990; Cottrell & Metcalfe, 1990; O’Toole et al., 1991)
• A self-organizing network that learns whole-object representations (features, principal components, holons, eigenfaces).
[Diagram: input from the perceptual layer feeding a layer of holons (the Gestalt layer).]

28. The “Gestalt” Layer: Holons (Cottrell, Munro & Zipser, 1987; Cottrell & Fleming, 1990; Cottrell & Metcalfe, 1990; O’Toole et al., 1991)
• A self-organizing network that learns whole-object representations (features, principal components, holons, eigenfaces).
[Diagram: input from the perceptual layer feeding the holons (Gestalt layer).]
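
The slides say PCA can emerge from competitive Hebbian learning but do not name the rule. One classic instance is Sanger’s generalized Hebbian algorithm; this sketch is therefore an assumption about the mechanism, not the model’s documented implementation:

    # Sanger's generalized Hebbian algorithm: rows of W converge to the
    # top-k principal components of the (centered) input distribution.
    import numpy as np

    def sanger_update(W, x, lr=1e-3):
        """One update. W is (k x d), one row per output unit; x is (d,)."""
        y = W @ x                                   # unit activations
        # Each unit learns the input minus what earlier units already explain.
        resid = np.tril(np.outer(y, y)) @ W         # lower-triangular decorrelation
        return W + lr * (np.outer(y, x) - resid)

    # Usage sketch: repeatedly feed centered Gabor-pattern vectors x.
    # rng = np.random.default_rng(0)
    # W = rng.normal(scale=0.01, size=(50, 40600))
    # for x in dataset: W = sanger_update(W, x)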

29. Holons
• They act like face cells (Desimone, 1991):
  • The response of single units is strong despite occlusions (e.g., of the eyes).
  • Response drops off with rotation.
  • Some fire to my dog’s face.
• A novel representation: distributed templates --
  • each unit’s optimal stimulus is a ghostly looking face (template-like),
  • but many units participate in the representation of a single face (distributed).
• For this audience: neither exemplars nor prototypes!
• They explain holistic processing:
  • Why? If stimulated with a partial match, the firing represents votes for this template: units “downstream” don’t know what caused this unit to fire. (More on this later…)

30. The Final Layer: Classification (Cottrell & Fleming, 1990; Cottrell & Metcalfe, 1990; Padgett & Cottrell, 1996; Dailey & Cottrell, 1999; Dailey et al., 2002)
• The holistic representation is then used as input to a categorization network trained by supervised learning.
[Diagram: input from the perceptual layer -> holons -> categories; outputs: Cup, Can, Book, Greeble, Face, Bob, Carol, Ted, Happy, Sad, Afraid, etc.]
• Excellent generalization performance demonstrates the sufficiency of the holistic representation for recognition.

31. The Final Layer: Classification
• Categories can be at different levels: basic, subordinate.
• Simple learning rule (~delta rule). It says (mild lie here):
  • add inputs to your weights (synaptic strengths) when you are supposed to be on,
  • subtract them when you are supposed to be off.
• This makes your weights “look like” your favorite patterns -- the ones that turn you on.
• With no hidden units, there is no back propagation of error.
• With hidden units, we get task-specific features (most interesting when we use the basic/subordinate distinction).
(A minimal sketch of the learning rule appears below.)
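
A sketch of the delta rule the slide paraphrases: weights move toward the input when the unit’s target is on and away when it is off, scaled by the error. The sigmoid output and learning rate are illustrative assumptions:

    # Single-layer classifier trained with the delta (Widrow-Hoff) rule.
    import numpy as np

    def delta_rule_step(W, x, target, lr=0.1):
        """W: (n_classes x d) weights, x: input vector, target: one-hot."""
        y = 1.0 / (1.0 + np.exp(-(W @ x)))       # sigmoid outputs
        error = target - y                       # positive when unit should be on
        return W + lr * np.outer(error, x)       # weights drift toward the
                                                 # patterns that turn each unit on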

32. Outline
• Review of our model of face and object processing
• Some insights from modeling:
  • What could “holistic processing” mean?
  • Does a specialized processor for faces need to be innately specified?
  • Why would a face area process BMWs?
• Some new directions:
  • How do we select where to look next?
  • How is information integrated across saccades?

33. Holistic Processing
• Holistic processing refers to a type of processing in which visual stimuli are treated “as a piece” -- in fact, we are unable to ignore other apparent “parts” of an image.
• Face processing, in particular, is thought to be “holistic” in nature:
  • We are better at recognizing “Bob’s nose” when it is on his face.
  • Changing the spacing between the eyes makes the nose look different.
  • We are unable to ignore conflicting information from other parts of a face.
• All of these might be summarized as “context influences perception,” but the context is obligatory.

34. Who do you see?
• Context influences perception.

35. Same/Different Task
[Slides 36-38 contained only images: the face-pair stimuli for the same/different task.]

39. These look like very different women.

40. But all that has changed is the height of the eyes, right?

41. Now, what do you see? Take the configural processing test!
• What emotion is being shown in the top half of the image below? [The composite-face image is not reproduced in the transcript.]
• Happy, Sad, Afraid, Surprised, Disgusted, or Angry?
• Answer: Sad

42. Do Holons explain these effects?
• Recall that they are templates --
  • each unit’s optimal stimulus is a ghostly looking face (template-like).
• What will happen if there is a partial match?
  • Suppose there is a holon that “likes happy faces.”
  • The mouth will match, causing this unit to fire.
  • Units downstream have learned to associate this firing with a happy face.
  • They will “think” the top of the face is happier than it is…

43. Do Holons explain these effects?
• Clinton/Gore: the outer part of the face votes for Gore.
• The nose effect: a match at the eyes votes for that template’s nose.
• Expression/identity configural effects:
  • Split faces: the bottom votes for one person, the top for another, but both vote for the WHOLE face…
  • Split expressions: the bottom votes for one expression, the top for another…

44. Attention to half an image
[Diagram: Input Pixel Image -> Gabor Filtering -> Gabor Pattern -> Attenuate -> Attenuated Pattern. Attention is modeled by attenuating the Gabor pattern for the unattended half of the image.]
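
A minimal sketch of that manipulation: compute the Gabor responses, then scale down those whose grid positions fall in the unattended half. The attenuation factor and the top/bottom split are assumptions; the slide only says the pattern is attenuated:

    # Model "attention to half an image" by attenuating Gabor responses
    # from the unattended half.
    import numpy as np

    def attend_to_top_half(gabor_grid, attenuation=0.5):
        """gabor_grid: (rows, cols, n_filters) subsampled magnitudes.
        Weakens the responses from the bottom (unattended) half."""
        out = gabor_grid.copy()
        midpoint = gabor_grid.shape[0] // 2
        out[midpoint:] *= attenuation      # scale down the unattended rows
        return out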

45. Composite vs. non-composite facial expressions (Calder et al., 2000)
[Figure: human reaction times alongside network errors; error bars indicate one standard deviation.]

46. Is Configural Processing of Identity and Expression Independent?
• Calder et al. (2000) found that adding additional inconsistent information that is not relevant to the task did not further slow reaction times.
• E.g., when the task is “who is it on the top?”, having a different person’s face on the bottom hurts your performance, but having a different expression there as well does not hurt you any more.
• Conditions: Same Identity, Different Expression; Different Identity, Same Expression; Different Identity, Different Expression.

47. (Lack of) Interaction between expression and identity (Cottrell, Branson & Calder, 2002)
[Figure: human reaction time (ms) compared with network “reaction time,” defined as 1 - the activation of the correct output unit.]
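
The network’s reaction-time proxy is simply one minus the correct output unit’s activation, so a less confident correct response maps to a slower simulated RT. A trivial sketch:

    # Network "reaction time" proxy from the slide: RT = 1 - correct activation.
    import numpy as np

    def network_rt(outputs: np.ndarray, correct_index: int) -> float:
        """A weaker (less confident) correct response -> slower simulated RT."""
        return 1.0 - float(outputs[correct_index])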

48. Why does this work?
[Diagram of the pipeline with callouts: Pixel (Retina) Level -> Gabor Filtering -> Perceptual (V1) Level -> PCA -> Object (IT) Level -> Neural Net -> Category Level (Happy, Sad, Afraid, Bob, Carol, Ted, …).]
• Attenuated inconsistent information at the pixel level, or the representation of shifted (non-configural) information at the perceptual level, leads to a weaker representation at the object level, because the wrong template is only weakly activated.
• It therefore has little impact at the category level, because the bottom half doesn’t match any template.

49. Configural/holistic processing phenomena accounted for
• Interference from incorrect information in the other half of the image.
• Lack of interference from misaligned incorrect information.
• We have shown this for identity and expression, as well as the lack of interaction between the two.
• Calder suggested from his data that we must have two representations, one for expression and one for identity; but our model has only one representation.

50. Outline
• Review of our model of face and object processing
• Some insights from modeling:
  • What could “holistic processing” mean?
  • Does a specialized processor for faces need to be innately specified?
  • Why would a face area process BMWs?
• Some new directions:
  • How do we select where to look next?
  • How is information integrated across saccades?
