
A principled way to principal components analysis

Learn how to visualize and transform large datasets, implement basic linear algebra operations, and connect them to neuronal models and brain function using Principal Components Analysis.



Presentation Transcript


  1. A principled way to principal components analysis • Daniel Zysman, Lecturer

  2. Teaching activity objectives • Visualize large data sets. • Transform the data to aid in this visualization. • Cluster the data. • Implement basic linear algebra operations. • Connect these operations to neuronal models and brain function.

  3. Context for the activity • Homework assignment in 9.40 Intro to Neural Computation (Sophomore/Junior). • In-class activity in 9.014 Quantitative Methods and Computational Models in Neuroscience (1st year PhD).

  4. Data visualization and performing PCA

  5. MNIST data set • 28 by 28 pixel, 8-bit grayscale images. • These images live in a 784-dimensional space. • http://yann.lecun.com/exdb/mnist/

  6. Can we cluster images in the pixel space?

  7. One possible visualization: pairwise pixel plots • There are more than 300,000 possible pairwise pixel plots (784 × 783 / 2 = 306,936).
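The count on this slide can be checked directly. A minimal Python sketch, assuming each plot uses an unordered pair of two distinct pixels; the variable names are illustrative and do not come from the slides:

```python
from math import comb

n_pixels = 28 * 28             # each image has 784 pixel dimensions
n_pairs = comb(n_pixels, 2)    # unordered pairs of distinct pixels

print(n_pixels, n_pairs)
```

Any single scatter plot of one pixel against another shows only one of these pairs, which is the motivation for looking for a better basis.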

  8. Is there a more principled way? • Represent the data in a new basis set. • This aids visualization and potentially clustering and dimensionality reduction. • PCA provides such a basis set by finding the directions that capture the most variance. • The directions are ranked by decreasing variance. • PCA diagonalizes the covariance matrix.
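The statement that PCA diagonalizes the covariance matrix can be written out explicitly. The notation here is an assumption, not taken from the slides: $X_c$ is the $N \times 784$ mean-subtracted data matrix (one image per row), $C$ is its covariance matrix, the columns of $V$ are the orthonormal eigenvectors of $C$, and $Y$ holds the data expressed in the new basis.

```latex
\begin{align}
C &= \frac{1}{N-1}\, X_c^{\top} X_c, \\
C &= V \Lambda V^{\top}, \qquad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_{784}), \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{784} \ge 0, \\
Y &= X_c V
\quad\Longrightarrow\quad
\frac{1}{N-1}\, Y^{\top} Y = V^{\top} C V = \Lambda .
\end{align}
```

In the new basis the coordinates are uncorrelated, and the variance along the $k$-th direction is $\lambda_k$, which is what the ranking by decreasing variance refers to.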

  9. Pedagogical approach • Guide students step by step to implement PCA. • Emphasize visualizations and geometrical intuition. • We don’t use the MATLAB canned function for PCA. • We want students to get their hands “dirty”. This helps build confidence and deep understanding.

  10. PCA Mantra • Reshape the data to the proper format for PCA. • Center the data by subtracting the mean. • Construct the data covariance matrix. • Perform SVD to obtain the eigenvalues and eigenvectors of the covariance matrix. • Compute the variance explained per component and plot it. • Reshape the eigenvectors and visualize them as images. • Project the mean-subtracted data onto the eigenvector basis.
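The steps on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the course's code: the activity itself used MATLAB, Python with NumPy is used here instead, random arrays stand in for the MNIST images, and every variable name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data (assumption): 200 random 28x28 arrays instead of MNIST digits.
images = rng.random((200, 28, 28))

# 1. Reshape: one image per row, one pixel per column.
X = images.reshape(images.shape[0], -1)        # shape (200, 784)

# 2. Center: subtract the mean image from every row.
mean_image = X.mean(axis=0)
Xc = X - mean_image

# 3. Covariance matrix of the pixels, shape (784, 784).
C = Xc.T @ Xc / (Xc.shape[0] - 1)

# 4. SVD of the symmetric, positive semi-definite covariance matrix:
#    the columns of U are its eigenvectors and S holds its eigenvalues,
#    already sorted in decreasing order.
U, S, _ = np.linalg.svd(C)

# 5. Fraction of the total variance explained by each component.
explained = S / S.sum()

# 6. Reshape each eigenvector back to 28x28 so it can be shown as an image.
eigen_images = U.T.reshape(-1, 28, 28)

# 7. Project the mean-subtracted data onto the eigenvector basis.
scores = Xc @ U
```

Plotting is left out to keep the sketch self-contained: `explained` can be passed to any plotting routine, and `scores[:, 0]` against `scores[:, 1]` gives the projection onto the first two axes.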

  11. First 9 Eigenvectors

  12. Projections onto the first 2 axes • The first two PCs capture ~37% of the variance. • The data form clear clusters that are almost linearly separable.

  13. Building models: Synapses and PCA

  14. Hebbian Learning (Donald Hebb) • 1949 book 'The Organization of Behavior': a theory about the neural bases of learning. • Learning takes place at synapses. • Synapses are modified: they get stronger when the pre- and postsynaptic cells fire together. • "Cells that fire together, wire together."

  15. Building Hebbian synapses • The plain Hebbian rule is unstable.
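The equations for this slide are not part of the transcript. A common textbook form, assumed here, is a linear neuron with output y = w · x and the update Δw = η y x. The sketch below simulates that assumed rule on illustrative two-dimensional Gaussian inputs to show the instability: the length of the weight vector keeps growing. The learning rate, covariance and variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative zero-mean inputs with correlated components (assumption).
cov = np.array([[3.0, 1.0],
                [1.0, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=2000)

eta = 0.01                      # learning rate (hypothetical value)
w = np.array([0.1, 0.1])        # initial synaptic weights
norms = [np.linalg.norm(w)]

for x in X:
    y = w @ x                   # output of the linear neuron
    w = w + eta * y * x         # plain Hebbian update: pre times post
    norms.append(np.linalg.norm(w))

print(norms[0], norms[-1])
```

Each update adds a non-negative amount to the squared length of `w`, so under this rule nothing ever pulls the weights back.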

  16. Oja’s rule (Erkki Oja) • Adds a feedback term, also described as a forgetting term or regularizer. • Stabilizes the Hebbian rule. • Leads to a covariance learning rule: the weights converge to the first eigenvector of the covariance matrix. • Similar to the power iteration method. • Reference: Oja, E. A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15:267-273 (1982).
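The slide's equation is not in the transcript. The form usually given for Oja's rule, assumed here, is Δw = η y (x − y w) for a linear neuron y = w · x, where the −η y² w part is the forgetting term. The sketch below runs that assumed update on illustrative Gaussian inputs and compares the result with the leading eigenvector computed by NumPy; the data, learning rate and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
# Illustrative zero-mean inputs with correlated components (assumption).
cov = np.array([[3.0, 1.0],
                [1.0, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=20000)

eta = 0.005                     # learning rate (hypothetical value)
w = np.array([0.1, 0.1])        # initial synaptic weights

for x in X:
    y = w @ x                           # output of the linear neuron
    w = w + eta * y * (x - y * w)       # Hebbian term minus forgetting term

# Reference: leading eigenvector of the sample covariance matrix.
# eigh returns eigenvalues in ascending order, so the last column is the first PC.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
v1 = eigvecs[:, -1]

w_norm = np.linalg.norm(w)
alignment = abs(w @ v1) / w_norm        # |cosine| of the angle between w and v1

print(w_norm, alignment)
```

In this sketch the weight vector ends up with length close to 1 and points along the first principal direction, up to sign and a small jitter set by the learning rate.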

  17. Learning outcomes • Visualize and manipulate a relatively large and complex data set. • Perform PCA by building it step by step. • Gain an intuition for the geometry involved in a change of basis and in projections. • Start thinking about basic clustering algorithms. • Discuss dimensionality reduction and other PCA applications.

  18. Learning outcomes (cont.) • Discuss the assumptions, limitations and shortcomings of applying PCA in different contexts. • Build a model of how PCA might actually take place in neural circuits. • Follow-up: eigenfaces. Is the brain doing PCA to recognize faces?
