Demystifying Dimensionality Reduction

Jeff Hansen, Senior Data Engineer. April 2013.

    Presentation Transcript
    1. Jeff Hansen Senior Data Engineer April 2013 Demystifying Dimensionality Reduction

    2. Demystifying Dimensionality Reduction A Tribute to Johnson and Lindenstrauss

    3. Who is this?

    4. What is this?

    5. How about this? • Hint: It’s for kids…

    6. Some Perspectives are Better than Others Getting a better look

    7. For kids…

    8. Great, but… • What does this have to do with Machine Learning? • How can this help me visualize my data? • How do I use this to recommend new products to new customers? • Can this help me detect fraud?

    9. Dimensions in Data Going beyond 3-D

    10. Samples and Variables Samples are things. Things have numerous: • Features • Characteristics • Attributes • Variables • aka Dimensions

    11. Examples

    12. Distance and Similarity If we • Treat each feature like a dimension • Treat each item like a point Then • Similar items are closer together • Dissimilar items are further apart

    13. User Group Posts

    14. Measures of Distance Various measures of distance with scary math names: • Euclidean Distance • Maximum Distance • Manhattan Distance • L(n) Norm
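
The names are scarier than the math. A minimal sketch of the measures listed above, assuming each item is just a list of numbers; the two items and the function names are made up for illustration:

```python
def minkowski(a, b, p):
    """L(p) norm of the difference between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    # Straight-line distance: the L(2) norm.
    return minkowski(a, b, 2)

def manhattan(a, b):
    # City-block distance: the L(1) norm.
    return minkowski(a, b, 1)

def maximum(a, b):
    # Limit of the L(p) norm as p grows: the largest per-feature gap.
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical items described by three numeric features each.
item_a = [1.0, 2.0, 3.0]
item_b = [4.0, 6.0, 3.0]

print(euclidean(item_a, item_b))  # 5.0
print(manhattan(item_a, item_b))  # 7.0
print(maximum(item_a, item_b))    # 4.0
```

All three agree that identical items are at distance zero; they differ in how much weight one large feature gap gets relative to many small ones.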

    15. Curse of Dimensionality • You think more than 3 dimensions are hard? Try a couple million… • Calculating similarity becomes increasingly difficult as a feature set grows.

    16. What to do??

    17. Reduce the Number of Dimensions Johnson-Lindenstrauss Theorem • The number of dimensions doesn’t matter; the sample size does. Approximate item similarity can be maintained with a number of dimensions on the order of log(n), where n is the number of points. English? • Every time you double the number of points, you only need to add a constant number of additional dimensions.
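
To put rough numbers on "a constant number of additional dimensions", here is a sketch that assumes one commonly cited form of the bound, k >= 4 ln(n) / (eps^2/2 - eps^3/3), where eps is the distortion you are willing to tolerate in squared distances. The slides do not state a specific formula, so treat the constants (and the function names) as an assumption:

```python
import math

def jl_min_dim(n_points, eps):
    """Dimensions sufficient under the assumed bound k >= 4 ln(n) / (eps^2/2 - eps^3/3).

    Note that the original number of dimensions is not an input at all.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must be between 0 and 1")
    return math.ceil(4.0 * math.log(n_points) / (eps ** 2 / 2.0 - eps ** 3 / 3.0))

def extra_dims_per_doubling(eps):
    """Constant added to the bound each time the number of points doubles."""
    return 4.0 * math.log(2.0) / (eps ** 2 / 2.0 - eps ** 3 / 3.0)

for n in (1_000, 2_000, 4_000, 1_000_000):
    print(n, jl_min_dim(n, eps=0.5))

print(extra_dims_per_doubling(0.5))
```

The step from 1,000 to 2,000 points and the step from 2,000 to 4,000 points add about the same number of dimensions, which is the "English" version of the theorem.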

    18. Huh?

    19. This is worth Repeating The number of dimensions doesn’t matter. If all you care about is item similarity, you can project an INFINITE number of dimensions onto a lower number of dimensions based on the number of points you want to compare.
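
One way to see this for yourself is a random projection. The sketch below is an illustration rather than anything from the slides: it assumes a Gaussian random matrix scaled by 1/sqrt(k), and the sizes (1000 dimensions down to 200, 5 points) are picked arbitrarily:

```python
import math
import random

def random_projection_matrix(k, d, rng):
    """k x d matrix of Gaussian entries, scaled so squared lengths are preserved on average."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project(matrix, point):
    # Each output coordinate is the dot product of one matrix row with the point.
    return [sum(m * x for m, x in zip(row, point)) for row in matrix]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(0)            # fixed seed so the run is repeatable
d, k, n = 1000, 200, 5            # original dims, reduced dims, number of points
points = [[rng.random() for _ in range(d)] for _ in range(n)]
R = random_projection_matrix(k, d, rng)
reduced = [project(R, p) for p in points]

# Ratio of projected distance to original distance for every pair of points.
ratios = [
    distance(reduced[i], reduced[j]) / distance(points[i], points[j])
    for i in range(n) for j in range(i + 1, n)
]
print(min(ratios), max(ratios))   # both should land near 1
```

The projection never looks at the data, yet every pairwise distance survives with only a small distortion.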

    20. A Graphical Explanation

    21. 3 points in 3 dimensions

    22. Bad Projection

    23. Good Projection

    24. Deeper Meaning

    25. Feature Extraction • What if there were unrecorded variables that explain the variables we can see? • Dimensionality Reduction techniques extract these hidden variables or features. • For example, Topics explain the appearance of words in documents, Genres explain the movies that people watch.

    26. Sounds Great! But how do I do it?

    27. Singular Value What? Unfortunately, the techniques come with tongue-twisting, unintuitive names: • SVD – Singular Value Decomposition • PCA – Principal Component Analysis • LSA – Latent Semantic Analysis • LDA – Linear Discriminant Analysis • Random Projections • MinHash

    28. A Brief Refresher of Linear Algebra Don’t Panic!

    29. Vectors and Projections *Image courtesy of Wikipedia:

    30. Vector “dot” Products • A . B = (a1 * b1) + (a2 * b2) + (a3 * b3) • A . B = || A || * || B || * cos θ If B is a unit vector (it has a length of 1) then the result is simply the length of A projected onto the line (or dimension) formed by B. Remember that a “good” projection is one where the angle is close to zero, so that cos θ is close to 1 and the dot product of A and B is approximately the length of A. This is like projecting the face of a coin onto a surface that’s parallel to the face of the coin – that would be a good projection.
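
The two formulas above can be checked against each other with a few lines of code. This is a small sketch with made-up vectors; the helper names are not from the slides:

```python
import math

def dot(a, b):
    # Sum of the element-by-element products.
    return sum(x * y for x, y in zip(a, b))

def length(a):
    return math.sqrt(dot(a, a))

def cos_angle(a, b):
    # Rearranged from A . B = ||A|| * ||B|| * cos(theta).
    return dot(a, b) / (length(a) * length(b))

A = [3.0, 4.0, 0.0]
B = [1.0, 0.0, 0.0]          # unit vector along the first dimension
print(dot(A, B))             # 3.0, the length of A projected onto B
print(length(A))             # 5.0
print(cos_angle(A, B))       # 0.6

good_B = [0.6, 0.8, 0.0]     # unit vector pointing the same way as A
print(dot(A, good_B))        # close to 5.0, the full length of A
```

Projecting onto B keeps only 3 of the 5 units of length, while projecting onto good_B (angle of zero) keeps essentially all of it.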

    31. Matrix Multiplication Cell 1,1 = Row 1 times Column 1 = (a1,1 x b1,1) + (a1,2 x b2,1) + (a1,3 x b3,1) Cell 1,2 = Row 1 times Column 2 = … …
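
The "row times column" rule is just a dot product repeated for every cell. A minimal sketch, assuming matrices are stored as plain lists of rows (the example values are arbitrary):

```python
def matmul(a, b):
    """Multiply an m x n matrix by an n x p matrix, both given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must match rows of b")
    columns_of_b = list(zip(*b))
    # Cell (i, j) is row i of a dotted with column j of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b] for row in a]

a = [[1, 2, 3],
     [4, 5, 6]]
b = [[7, 8],
     [9, 10],
     [11, 12]]
print(matmul(a, b))  # [[58, 64], [139, 154]]
```

Cell 1,1 here is (1 x 7) + (2 x 9) + (3 x 11) = 58, matching the pattern on the slide.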

    32. An Easier Way to Remember

    33. The “Cubic” View

    34. Distributing the Workload

    35. The “Layered” View

    36. Matrix Division? What if you could factor a matrix?

    37. Matrix Division? What if you could factor a matrix? You Can! Matrix Decompositions: • LU Decomposition • QR Decomposition • Eigen Decomposition • Singular Value Decomposition

    38. Why would you Want to? Full matrix: 1,000,000 x 1,000,000 = 1,000,000,000,000 cells. Two factors: 100 x 1,000,000 + 100 x 1,000,000 = 200,000,000 cells. That’s a MUCH smaller representation!
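
The arithmetic generalizes to any size and any number of kept factors. A quick sketch, where "rank" is this example's name for the 100 on the slide:

```python
def full_cells(rows, cols):
    return rows * cols

def factored_cells(rows, cols, rank):
    # One rows x rank factor plus one rank x cols factor.
    return rows * rank + rank * cols

n = 1_000_000
print(full_cells(n, n))                                 # 1000000000000
print(factored_cells(n, n, 100))                        # 200000000
print(full_cells(n, n) // factored_cells(n, n, 100))    # 5000
```

With these numbers the factored form is 5,000 times smaller than the full matrix.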

    39. Factors as Basis for new Space Suppose C is a matrix of people who have watched movies. Every row represents a person and every column represents a movie. If we can find matrices A and B where A x B approximates C: • Each row of A models a person • The distance between two rows of A models relative similarity • Each column of B models a movie • The distance between two columns of B models relative similarity
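
Here is a toy version of that idea. The factor matrices below are invented by hand (3 people, 4 movies, 2 hidden features) purely to show how rows of A and columns of B get compared; a real model would learn them from C:

```python
import math

def matmul(a, b):
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b] for row in a]

def distance(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Hypothetical factors: 3 people x 2 hidden features, 2 hidden features x 4 movies.
A = [[1.0, 0.0],    # person 0: only cares about the first hidden feature
     [0.9, 0.1],    # person 1: nearly the same taste as person 0
     [0.0, 1.0]]    # person 2: only cares about the second hidden feature
B = [[5.0, 4.0, 0.0, 1.0],
     [0.0, 1.0, 5.0, 4.0]]

C_approx = matmul(A, B)              # 3 x 4 people-by-movies approximation
print(C_approx[0])                   # [5.0, 4.0, 0.0, 1.0]

movies = list(zip(*B))               # each column of B models one movie
print(distance(A[0], A[1]), distance(A[0], A[2]))                        # people: near vs. far
print(distance(movies[0], movies[1]), distance(movies[0], movies[2]))    # movies: near vs. far
```

Person 0 sits much closer to person 1 than to person 2, and movie 0 sits much closer to movie 1 than to movie 2, without ever comparing the full rows or columns of C.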

    40. Big Data, Smaller Models Movies … People … … …

    41. Singular Value Decomposition

    42. A = U Σ V* • U and V are square orthonormal matrices – their rows and columns are all mutually perpendicular unit vectors. • Σ is a rectangular diagonal matrix with non-negative values that decrease from top left to bottom right. • U and V can be viewed as projection matrices, Σ as a scaling matrix. • Earlier columns of U and earlier rows of V* capture most of the “action” of A. • If Σ “decays” quickly enough, most of U and V* is insignificant and can be thrown away without significantly affecting the model.
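
To make "throwing away most of U and V*" concrete, the sketch below keeps only the first column of U, the first singular value, and the first row of V*. It finds them with plain power iteration, which is a stand-in chosen to keep the example dependency-free and is not how SVD libraries compute the full decomposition. It also assumes the starting guess is not perpendicular to the answer and that A is not all zeros:

```python
import math

def top_singular_triple(a, iterations=100):
    """Estimate the largest singular value of `a` and its left/right singular vectors."""
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols          # arbitrary unit-length starting guess
    for _ in range(iterations):
        # u points along A v; v then points along A-transpose u.
        u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        v = [sum(a[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return u, sigma, v

def rank_one(u, sigma, v):
    # Rebuild a matrix from a single column of U, one singular value, one row of V*.
    return [[sigma * ui * vj for vj in v] for ui in u]

# Made-up matrix whose second column is almost exactly twice its first column.
A = [[2.0, 4.0],
     [1.0, 2.1],
     [3.0, 5.9]]
u, sigma, v = top_singular_triple(A)
approx = rank_one(u, sigma, v)
max_err = max(abs(A[i][j] - approx[i][j]) for i in range(3) for j in range(2))
print(sigma, max_err)
```

Six cells were replaced by 3 + 1 + 2 numbers, and the largest error in any reconstructed cell stays small because the second singular value of this matrix is tiny.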

    43. Using “Cubic” Visualization Dark grey indicates zero or very small values. (Figure labels: A, U, Σ, V*)

    44. As columns of U get multiplied by decreasing singular values, the result is smaller column vectors. (Figure labels: Σ, A, V*, U)

    45. (Figure labels: A, U, Σ, V*)

    46. (Figure labels: A, U, V*, Σ)

    47. U Σ V* = A (Figure labels: Σ, V*, U, A)

    48. Reconstituting Cells