
Computational approaches to vision science


Presentation Transcript


  1. Computational approaches to vision science NRS 495 – Neuroscience Seminar Christopher DiMattina, PhD

  2. Marr’s levels of description

  3. David Marr • British computational neuroscientist (1945-1980) • Contributions to cognitive science and machine vision

  4. Understanding vision • The stated goal of visual neuroscience is to understand how the visual brain works • However, it is unclear what exactly this means • Existing experimental paradigms are inadequate for understanding how it works

  5. Neurophysiology • Suppose we found the grandmother cell (WHAT) • Would not tell us HOW response properties are generated from simple neurons • Would not tell us WHY we have such neurons

  6. HOW • To get HOW, you need a biologically plausible neural network model – computational neuroscience • Still, this does not tell you WHY

  7. WHY • To understand WHY, you need to know what problem the system is trying to solve • Computational theory of visual information processing • “A wing would be a most mysterious structure if one did not know that birds flew” - Horace B. Barlow (1961)

  8. Machine Vision • To understand the visual system, one must specify the problem and what computations could solve it • If you really know what information processing is going on, you can implement it on a computer

  9. Three levels of description • Computational Theory • What is the goal of the computation? What is its logic? • Representation and Algorithm • How can you represent the inputs and outputs? What is the algorithm for the transformation? • Hardware Implementation • How do you implement the representation and algorithm?

  10. Example • Computation: Addition of numbers for a cash register • Representation: Binary or base-10 numbers • Algorithm: The one you learned in grade school • Implementation: Mechanical, computer, etc…
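
To make the cash-register example concrete, here is a minimal sketch (not from the slides) of the same computation carried out under two different representations; the function names are purely illustrative.

```python
def add_base10(a: int, b: int) -> int:
    # Representation: ordinary base-10 integers; algorithm: built-in addition.
    return a + b

def add_binary(a: str, b: str) -> str:
    # Representation: binary strings; algorithm: convert, add, convert back.
    return bin(int(a, 2) + int(b, 2))[2:]

print(add_base10(19, 23))            # 42
print(add_binary("10011", "10111"))  # "101010", i.e. 42 in binary
```

The computation (addition) is the same in both cases; only the representation, algorithm, and implementation differ.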

  11. Representational framework for vision • Image (intensity) • Primal sketch (edges) • 2 ½ D sketch (surfaces) • 3D model representation

  12. Edge detection and the primal sketch • Luminance edges correspond to changes in image intensity • Maxima of the absolute value of the first derivative • Zero-crossings of the second derivative

  13. Edges occur on multiple scales • Apply Gaussian blurring to the image to select a scale

  14. Edge detection operator • Apply Gaussian blurring to select a scale • Take the second derivative (Laplacian) • Find zero-crossings • Together these steps form the Laplacian of Gaussian operator
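
A minimal sketch of this blur-Laplacian-zero-crossing procedure, assuming NumPy and SciPy are available; sigma is the scale set by the Gaussian blurring.

```python
import numpy as np
from scipy import ndimage

def marr_hildreth_edges(image: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    # Laplacian of Gaussian: blur at the chosen scale, then take the second derivative.
    log = ndimage.gaussian_laplace(image.astype(float), sigma=sigma)
    # Mark zero-crossings: pixels where the LoG output changes sign
    # relative to the neighbor below or to the right.
    edges = np.zeros_like(log, dtype=bool)
    edges[:-1, :] |= np.sign(log[:-1, :]) != np.sign(log[1:, :])
    edges[:, :-1] |= np.sign(log[:, :-1]) != np.sign(log[:, 1:])
    return edges
```

Larger values of sigma blur away fine detail and return only the coarse-scale edges.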

  15. Linear filtering • A neuron’s receptive field is modeled as a set of numbers indicating the spatial arrangement of excitation and inhibition • The predicted response of the model neuron is obtained by multiplying these weights pointwise with the image and summing
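
A toy illustration of this linear-filter model, assuming the receptive field and the image patch have the same size; the 3x3 center-surround field below is made up for the example.

```python
import numpy as np

def linear_response(receptive_field: np.ndarray, image_patch: np.ndarray) -> float:
    # Positive weights = excitatory regions, negative weights = inhibitory regions;
    # the predicted response is the pointwise product summed over pixels.
    return float(np.sum(receptive_field * image_patch))

rf = np.array([[-1., -1., -1.],
               [-1.,  8., -1.],
               [-1., -1., -1.]])          # toy center-surround receptive field
spot = np.zeros((3, 3)); spot[1, 1] = 1.  # bright spot in the center
print(linear_response(rf, spot))          # 8.0: strong response to the spot
```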

  16. Example operators

  17. Laplacian of Gaussian • Resembles retinal ganglion cell receptive fields with center-surround structure

  18. Edge detection

  19. Edge detection

  20. Operator predictions match retinal ganglion cell responses

  21. Marr’s framework • Specify the problem to be solved (edge detection) • Develop a computational theory – derive center-surround operators similar to retinal ganglion cell receptive fields • For a pixel image, the algorithm is convolution with the operator • Can be implemented in a variety of ways (brain, computer, etc…)

  22. Barlow and efficient coding

  23. Horace Barlow • Neurophysiologist and theorist of vision • Two major ideas (related) • Single neuron doctrine • Efficient coding hypothesis

  24. Efficient coding • Barlow hypothesizes that sensory relays recode messages so that redundancy is reduced but little information is lost • For instance, linearly arranged retinal ganglion cells may often fire together when there is an edge • V1 recodes this information in a less redundant form that preserves the essential information

  25. Economy of impulses • Stimuli occurring most often should be coded with a small number of spikes • Stimuli occurring less often should be coded with a large number of spikes • Over the distribution of stimuli, this economizes the code

  26. Neural activity is metabolically expensive • Recent calculations based on cortical metabolism suggest that at most 1% of cells can be firing strongly at any time (Lennie 2003)

  27. Experimental Evidence: Laughlin 1981 • Information transmission about luminance can be maximized by ensuring that all response levels are used with equal frequency • Explains the contrast response of fly retinal neurons
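
A small simulation in the spirit of Laughlin's argument (the Gaussian contrast distribution below is an illustrative assumption, not data from the paper): using the cumulative distribution of the stimulus as the response function makes all response levels equally frequent.

```python
import numpy as np

rng = np.random.default_rng(0)
contrasts = rng.normal(0.0, 0.3, size=100_000)   # assumed stimulus distribution
sorted_c = np.sort(contrasts)

def response(c):
    # Empirical cumulative distribution used as the contrast-response function.
    return np.searchsorted(sorted_c, c) / len(sorted_c)

r = response(rng.normal(0.0, 0.3, size=10_000))  # responses to new stimuli
print(np.histogram(r, bins=5, range=(0, 1))[0])  # roughly equal counts per level
```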

  28. Sparse coding

  29. Two ways to exploit redundancy

  30. Compact coding • Assume you have two neurons, each sensitive to one dimension • If the data are correlated, there is redundancy in their responses

  31. Compact coding • One can represent these data more efficiently with one neuron instead of two
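
A minimal sketch of this idea using principal components analysis (one standard way to build a compact code), assuming only NumPy: two strongly correlated "neurons" can be summarized almost losslessly by a single component.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
# Two "neurons" whose responses are highly correlated across stimuli.
data = np.column_stack([x, 0.95 * x + 0.05 * rng.normal(size=1_000)])

centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)  # singular values of the data
print(s**2 / np.sum(s**2))                     # first component carries nearly all variance
```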

  32. Sparse coding • Suppose the data are described by a two-armed distribution (non-Gaussian) • What code would represent the data with the fewest active neurons?

  33. Sparse coding

  34. Kurtosis • One characteristic of a sparse code is that response distributions over all stimuli have high kurtosis • Cells are mostly quiet but respond strongly to a few stimuli
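
A quick illustration, assuming SciPy, of why kurtosis flags sparseness: a response distribution that is mostly zeros with occasional large values has much higher (excess) kurtosis than a Gaussian one.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
gaussian_responses = rng.normal(size=10_000)
# Sparse responses: active on only ~5% of stimuli, silent otherwise.
sparse_responses = rng.normal(size=10_000) * (rng.random(10_000) < 0.05)

print(kurtosis(gaussian_responses))  # ~0 (excess kurtosis of a Gaussian)
print(kurtosis(sparse_responses))    # large positive value
```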

  35. V1 responses to natural images

  36. Critical question • If we learn a linear basis for natural images that maximizes the statistical independence and sparseness of cell responses, what would the basis functions look like?

  37. Basis representations of images • An image patch can be represented as a weighted sum of basis patches • The Fourier transform, for example, represents an image as a sum of sine-wave gratings
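
A toy demonstration of the basis idea, using an orthonormal 2-D discrete cosine basis as one concrete, sine-wave-like choice; this is only an illustration, not the learned basis discussed on the following slides.

```python
import numpy as np
from scipy.fft import dctn, idctn

patch = np.random.default_rng(0).random((8, 8))  # an 8x8 image patch

coeffs = dctn(patch, norm="ortho")               # coefficients on the DCT basis
reconstruction = idctn(coeffs, norm="ortho")     # weighted sum of basis patches
print(np.allclose(patch, reconstruction))        # True: the basis spans the patch
```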

  38. Optimization problem • Accurately reconstruct natural images • Maximize sparseness of neuron responses
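
A sketch of a cost function expressing this trade-off, in the spirit of sparse-coding models of V1 such as Olshausen & Field's; it is a simplified stand-in for illustration, not their exact objective or learning rule.

```python
import numpy as np

def sparse_coding_cost(patches, basis, coeffs, lam=0.1):
    # patches: (n_patches, n_pixels); basis: (n_units, n_pixels);
    # coeffs: (n_patches, n_units); lam trades reconstruction accuracy
    # against sparseness of the unit activations.
    reconstruction = coeffs @ basis
    reconstruction_error = np.sum((patches - reconstruction) ** 2)
    sparseness_penalty = lam * np.sum(np.abs(coeffs))  # L1 favors few active units
    return reconstruction_error + sparseness_penalty
```

Minimizing this cost over both the basis and the coefficients yields filters that reconstruct natural images well while keeping most units silent on any given patch.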

  39. The learned filters

  40. V1 neurons form a sparse code • V1 filters → sparse code • Sparse code → V1 filters

  41. ICA • Similar results are obtained when one learns a transformation between a set of inputs and outputs that maximizes the output entropy (Bell & Sejnowski 1997) • Exactly the idea proposed by Barlow
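
A hedged sketch of learning such filters with scikit-learn's FastICA; note that FastICA is a different ICA algorithm from Bell & Sejnowski's infomax approach, and the random array below is just a placeholder for real natural-image patches.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Placeholder for a (n_patches, n_pixels) array of natural-image patches.
patches = np.random.default_rng(0).random((5_000, 64))

ica = FastICA(n_components=32, random_state=0, max_iter=500)
activations = ica.fit_transform(patches)  # independent-component activations
filters = ica.components_                 # on real patches, rows resemble
                                          # localized, oriented (V1-like) filters
```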

  42. Learning hierarchical representations

  43. Can we apply similar ideas to learn more complex representations? • Over all images, the responses of V1 neurons follow a Laplacian distribution with constant variance λ • For particular image regions, there are characteristic patterns of variance and covariance in the activity histograms

  44. Variance patterns

  45. Complicated statistical dependencies

  46. Variance patterns

  47. Hierarchical model • Assume the patterns of variance are generated by a code of sparse, independent variables v • Learn a set of weights B that describes the patterns commonly occurring in natural images
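
A rough generative sketch of this two-layer idea, with made-up sizes and parameters: sparse higher-order variables v set, through the weights B, the scale (variance) pattern of the first-layer coefficients. The exponential link used here is my assumption for illustration; this shows the structure of the model, not the actual learning procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_units = 10, 100                       # illustrative sizes
B = rng.normal(size=(n_units, n_hidden))          # weights on the variance patterns
v = rng.laplace(size=n_hidden) * (rng.random(n_hidden) < 0.2)  # sparse higher-order causes

scales = np.exp(B @ v)                            # per-unit scale (variance) pattern
u = rng.laplace(scale=scales, size=n_units)       # first-layer (V1-like) coefficients
```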

  48. Variance components • Learn sensitivity to different higher-order patterns (textures)

  49. Unit activities generalize

  50. More recent model • Learned patterns of covariance in activities of V1 cells • Replicates many response properties observed in complex cells and nonlinear neurons in V2 • Outputs segregate textures well
