
Topics in Algorithms 2007





  1. Topics in Algorithms 2007 Ramesh Hariharan

  2. Random Projections

  3. Solving High Dimensional Problems • How do we make the problem smaller? • Sampling, Divide and Conquer, what else?

  4. Projections • Can n points in m dimensions be projected to d << m dimensions while maintaining geometry (pairwise distances)? • Johnson-Lindenstrauss: YES for d ~ log n / ε^2, with each distance stretching/contracting by only an O(ε) factor • So an algorithm with running time f(n, m) now takes roughly f(n, log n), and the results don't change very much (hopefully!); a quick look at how small d is follows
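
To get a feel for the numbers, here is a small hedged sketch (the constant 8 and the use of NumPy are my own illustrative choices, not from the slides): the target dimension d grows only logarithmically in n.

    import numpy as np

    eps = 0.1
    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        d = int(np.ceil(8 * np.log(n) / eps ** 2))   # d ~ log n / eps^2; the constant is illustrative
        print(n, d)                                  # d grows only logarithmically with n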

  5. Which d dimensions? • Any d coordinates? • Random d coordinates? • A random d-dimensional subspace

  6. Random Subspaces • How is this defined/chosen computationally? • How do we choose a random line (1-d subspace)? • We need to choose m coordinates • Normals to the rescue: choose independent random variables X_1, …, X_m, each N(0,1); the vector (X_1, …, X_m), scaled to unit length, points in a uniformly random direction (a short sketch follows)
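
A minimal sketch of this construction (assuming NumPy; the dimension m = 1000 is an arbitrary choice):

    import numpy as np

    m = 1000                              # ambient dimension (arbitrary choice)
    rng = np.random.default_rng(0)

    x = rng.standard_normal(m)            # X_1, ..., X_m, each N(0,1)
    direction = x / np.linalg.norm(x)     # unit vector: a uniformly random direction
    print(np.linalg.norm(direction))      # 1.0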

  7. Why do Normals work? • Take 2-d: which points on the circle are more likely? • The joint density is e^(-x^2/2) dx × e^(-y^2/2) dy = e^(-(x^2+y^2)/2) dx dy, which depends only on the radius x^2 + y^2 • So no direction is favoured: the normalized vector is uniform on the circle (a numerical check follows)
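
A quick numerical check of this rotational symmetry (my own illustration, not part of the slides): the angles of normalized 2-d Gaussian samples are essentially uniform on the circle.

    import numpy as np

    rng = np.random.default_rng(1)
    pts = rng.standard_normal((100_000, 2))        # independent (x, y) pairs, each N(0,1)
    angles = np.arctan2(pts[:, 1], pts[:, 0])      # the direction of each sample

    counts, _ = np.histogram(angles, bins=16)      # bin the directions around the circle
    print(counts / counts.mean())                  # every ratio is close to 1.0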

  8. A random d-dim subspace • How do we extend to d dimensions? • Choose d random vectors: for i = 1..d, choose independent random variables X_i1, …, X_im, each N(0,1), and take the i-th vector to be (X_i1, …, X_im) (a sketch follows)
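
A minimal sketch of the d random vectors stacked as a matrix (the sizes are arbitrary illustrative choices):

    import numpy as np

    n, m, d = 500, 1000, 200                  # points, ambient dimension, target dimension
    rng = np.random.default_rng(2)

    R = rng.standard_normal((d, m))           # row i is (X_i1, ..., X_im)
    points = rng.standard_normal((n, m))      # n arbitrary points in R^m
    projected = points @ R.T                  # each point gets d coordinates, one per random vector
    print(projected.shape)                    # (500, 200)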

  9. Distance preservation • There are C(n,2) distances • What happens to each after projection? • What happens to one after projection? Consider a single unit vector along the x-axis • Writing l_i for the length of the i-th random vector, the length of the projection is sqrt[(X_11/l_1)^2 + … + (X_d1/l_d)^2] (a numerical sketch follows)
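
A numerical sketch of this quantity (my own illustration): the projected length of the unit vector e_1 sits close to sqrt(d/m).

    import numpy as np

    m, d = 1000, 200
    rng = np.random.default_rng(3)

    R = rng.standard_normal((d, m))                        # rows are the d random vectors
    lengths = np.linalg.norm(R, axis=1)                    # l_1, ..., l_d
    proj_len = np.sqrt(np.sum((R[:, 0] / lengths) ** 2))   # sqrt[(X_11/l_1)^2 + ... + (X_d1/l_d)^2]

    print(proj_len, np.sqrt(d / m))                        # the two values are close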

  10. Orthogonality • Not exactly • The random vectors aren't orthogonal • How far away from orthogonal are they? • What is the expected dot product of two of them? 0! (each term X_ik X_jk has expectation 0 by independence, so the sum does too by linearity of expectation; an empirical check follows)
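
An empirical check (my own illustration): the normalized dot product of two independent Gaussian vectors has mean near 0 and spread of order 1/sqrt(m).

    import numpy as np

    m, trials = 1000, 5_000
    rng = np.random.default_rng(4)

    u = rng.standard_normal((trials, m))
    v = rng.standard_normal((trials, m))
    cosines = np.sum(u * v, axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))

    print(cosines.mean(), cosines.std())      # mean near 0, standard deviation about 1/sqrt(m)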

  11. Assume Orthogonality • Let's assume orthogonality for the moment • How do we bound the distribution of the projection length sqrt[(X_11/l_1)^2 + … + (X_d1/l_d)^2]? • The expected value of (X_11)^2 + … + (X_d1)^2 is d (by linearity of expectation) • The expected value of each l_i^2 = (X_i1)^2 + … + (X_im)^2 is m (by linearity of expectation) • Roughly speaking, the overall expectation is sqrt(d/m) • A distance scales by sqrt(d/m) after projection in the “expected” sense; how much does it depart from this value? (a simulation of the two expectations follows)
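
A short simulation of the two expectations above (my own illustration; the sizes are arbitrary):

    import numpy as np

    m, d, trials = 200, 50, 20_000
    rng = np.random.default_rng(5)

    A = rng.standard_normal((trials, d))       # the d coordinates X_11, ..., X_d1, many times over
    print(np.mean(np.sum(A ** 2, axis=1)))     # close to d

    B = rng.standard_normal((trials, m))       # the m coordinates of one random vector
    print(np.mean(np.sum(B ** 2, axis=1)))     # close to m, i.e. E[l_i^2]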

  12. What does Expectation give us? • But E(A/B) != E(A)/E(B) in general • And even if they were equal, the distribution need not be tight around the expectation • How do we determine how tight a distribution is? • Tail bounds for sums of independent random variables: summing gives concentration

  13. Tail Bounds • For a sum of k independent squared N(0,1) variables, P(|X_1^2 + … + X_k^2 − k| > εk) < 2e^(−Θ(ε^2 k)) • Apply this to both the numerator and the denominators of sqrt[(X_11/l_1)^2 + … + (X_d1/l_d)^2] • Each l_i^2 is within (1 ± ε)m with probability of failure inverse exponential in ε^2 m • (X_11)^2 + … + (X_d1)^2 is within (1 ± ε)d with probability of failure inverse exponential in ε^2 d • So sqrt[(X_11/l_1)^2 + … + (X_d1/l_d)^2] is within (1 ± O(ε)) sqrt(d/m) with probability of failure inverse exponential in ε^2 d (by the union bound over these d+1 events; an empirical check follows)
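
An empirical look at this concentration (my own illustration; eps and the values of k are arbitrary): the fraction of trials where the sum of k squared normals deviates from k by more than εk drops off quickly as k grows.

    import numpy as np

    eps, trials = 0.2, 20_000
    rng = np.random.default_rng(6)

    for k in (10, 50, 200, 400):
        sums = np.sum(rng.standard_normal((trials, k)) ** 2, axis=1)   # sum of k squared N(0,1)
        fail = np.mean(np.abs(sums - k) > eps * k)                     # empirical tail probability
        print(k, fail)                                                 # shrinks roughly like e^(-c eps^2 k)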

  14. One distance to many distances • So one distance D has length D (1 ± O(ε)) sqrt(d/m) after projection, with probability of failure inverse exponential in ε^2 d • How about many distances (could some of them go astray?) • There are C(n,2) distances • Each has probability inverse exponential in ε^2 d of failing, i.e., of stretching/compressing beyond D (1 ± O(ε)) sqrt(d/m) • What is the probability of no failure? Choose d appropriately: with d = Θ(log n / ε^2) the per-distance failure probability can be driven below, say, 1/n^3, so the union bound over all C(n,2) < n^2 pairs leaves total failure probability at most 1/n (an end-to-end sketch follows)
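
Putting it all together, an end-to-end sketch (my own illustration; the sizes, eps, and the constant 8 are arbitrary choices): project n points with a Gaussian matrix and check the worst-case distortion over all C(n,2) pairwise distances. As a common simplification the matrix entries are left unnormalized and the whole matrix is scaled by 1/sqrt(d), so distances keep their original scale instead of shrinking by sqrt(d/m).

    import numpy as np

    n, m, eps = 300, 3000, 0.25
    d = int(np.ceil(8 * np.log(n) / eps ** 2))           # d ~ log n / eps^2; constant is illustrative
    rng = np.random.default_rng(7)

    points = rng.standard_normal((n, m))                  # n arbitrary points in R^m
    R = rng.standard_normal((d, m)) / np.sqrt(d)          # scaled so distances keep their scale
    projected = points @ R.T

    def pairwise(X):
        # all C(n,2) pairwise Euclidean distances, via the Gram matrix
        sq = np.sum(X ** 2, axis=1)
        D2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
        D = np.sqrt(np.maximum(D2, 0))
        return D[np.triu_indices(len(X), k=1)]

    ratios = pairwise(projected) / pairwise(points)
    print(d, ratios.min(), ratios.max())                  # all ratios within roughly 1 +/- eps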

  15. Orthogonality • How do we remove the assumption that the random vectors are exactly orthogonal? • Exercise….
