
Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform

Presentation Transcript


1. Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform
   Nir Ailon, Bernard Chazelle (Princeton University)

2. Dimension Reduction
   • Algorithmic metric embedding technique: (R^d, ℓ_q) → (R^k, ℓ_p), k ≪ d
   • Useful in algorithms whose time/space is exponential in d
   • Johnson-Lindenstrauss for ℓ_2
   • What is the exact complexity?

3. Dimension Reduction Applications
   • Approximate nearest neighbor [KOR00, IM98]
   • Text analysis [PRTV98]
   • Clustering [BOR99, S00]
   • Streaming [I00]
   • Linear algebra [DKM05, DKM06]:
     – Matrix multiplication
     – SVD computation
     – ℓ_2 regression
   • VLSI layout design [V98]
   • Learning [AV99, D99, V98]
   . . .

4. Three Quick Slides on: Approximate Nearest Neighbor Searching . . .

5. Approximate Nearest Neighbor
   • P = set of n points; for a query x with exact nearest neighbor p_min,
     return any p ∈ P with dist(x, p) ≤ (1 + ε) · dist(x, p_min)
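
A minimal brute-force check of this definition (NumPy; the helper name `ann_ok` is ours, not from the talk):

```python
import numpy as np

def ann_ok(P, x, p, eps):
    """Does p satisfy the slide's definition of a (1+eps)-approximate
    nearest neighbor of x within the point set P (one point per row)?"""
    dists = np.linalg.norm(P - x, axis=1)   # distances from x to all n points
    d_min = dists.min()                     # dist(x, p_min): exact NN distance
    return np.linalg.norm(x - p) <= (1 + eps) * d_min
```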

6. Approximate Nearest Neighbor
   • d can be very large
   • ε-approximation beats the "curse of dimensionality"
   • [IM98, H01] (Euclidean), [KOR00] (cube): time O(ε^{-2} d log n), space n^{O(ε^{-2})}
   • Bottleneck: dimension reduction
   • Using FJLT: O(d log d + ε^{-3} log^2 n)

7. The d-Hypercube Case
   • [KOR00]
   • Binary search on distance ℓ ∈ [d]
   • For distance ℓ, multiply space by random matrix Φ ∈ Z_2^{k × d},
     k = O(ε^{-2} log n), Φ_ij i.i.d. ~ biased coin
   • Preprocess lookup tables for Φx (mod 2)
   • Our observation: Φ can be made sparse, using a "handle" to p ∈ P s.t. dist(x, p) ≤ ℓ
   • Time for each step: O(ε^{-2} d log n) ⇒ O(d + ε^{-2} log n)
   • How to make a similar improvement for ℓ_2?
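
A toy illustration of the mod-2 sketching step (our own simplification: the distance-dependent choice of the coin's bias and the lookup-table machinery of [KOR00] are omitted):

```python
import numpy as np

def cube_sketch(x, k, bias, rng):
    """Multiply a point x of the d-cube by a random biased matrix over Z_2
    and reduce mod 2, as in the [KOR00] step above. `bias` = Pr[Phi_ij = 1]
    would depend on the distance scale l being tested (not derived here)."""
    d = len(x)
    phi = (rng.random((k, d)) < bias).astype(np.uint8)  # i.i.d. biased coin flips
    return (phi @ x) % 2                                # k-bit sketch of x

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=64)          # a vertex of the 64-dimensional cube
print(cube_sketch(x, k=32, bias=0.1, rng=rng))
```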

8. Back to Euclidean Space and Johnson-Lindenstrauss . . .

9. History of Johnson-Lindenstrauss Dimension Reduction [JL84]
   • Φ: projection of R^d onto a random subspace of dimension k = c·ε^{-2} log n
   • w.h.p.: ∀ p_i, p_j ∈ P: ||Φp_i − Φp_j||_2 = (1 ± O(ε)) ||p_i − p_j||_2
   • ℓ_2 → ℓ_2 embedding
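
The statement is easy to check numerically. A quick sketch (random subspace realized via QR of a Gaussian matrix; the constants are illustrative, not those of [JL84]):

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n = 512, 128, 20
P = rng.standard_normal((n, d))               # n arbitrary points in R^d

# Orthonormal basis of a random k-dimensional subspace, scaled by sqrt(d/k)
# so that squared lengths are preserved in expectation.
Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
proj = lambda v: np.sqrt(d / k) * (Q.T @ v)

ratios = [np.linalg.norm(proj(P[i]) - proj(P[j])) / np.linalg.norm(P[i] - P[j])
          for i in range(n) for j in range(i + 1, n)]
print(min(ratios), max(ratios))               # all pairwise distances distorted by only 1 ± O(eps)
```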

10. History of Johnson-Lindenstrauss Dimension Reduction [FM87], [DG99]
   • Simplified proof, improved constant c
   • Φ ∈ R^{k × d}: random orthogonal matrix, rows Φ_1, ..., Φ_k with
     ||Φ_i||_2 = 1 and Φ_i · Φ_j = 0

11. History of Johnson-Lindenstrauss Dimension Reduction [IM98]
   • Φ ∈ R^{k × d}: Φ_ij i.i.d. ~ N(0, 1/d)
   • E ||Φ_i||_2^2 = 1, E Φ_i · Φ_j = 0

12. History of Johnson-Lindenstrauss Dimension Reduction [A03]
   • Need only tight concentration of |Φ_i · v|^2
   • Φ ∈ R^{k × d}: Φ_ij i.i.d. ~ { +1 w.p. 1/2, −1 w.p. 1/2 }
   • E ||Φ_i||_2^2 = 1, E Φ_i · Φ_j = 0

13. History of Johnson-Lindenstrauss Dimension Reduction [A03]
   • Φ ∈ R^{k × d}: Φ_ij i.i.d. ~ { +1 w.p. 1/6, 0 w.p. 2/3, −1 w.p. 1/6 }
   • Sparse
   • E ||Φ_i||_2^2 = 1, E Φ_i · Φ_j = 0
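
Both [A03] distributions are one-liners to sample. A sketch, normalized here so that ||Φv||_2 ≈ ||v||_2 (the same content as the slide's E ||Φ_i||_2^2 = 1 up to scaling):

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 512, 64

# Dense version: +1/-1 each with probability 1/2.
phi_dense = rng.choice([+1.0, -1.0], size=(k, d)) / np.sqrt(k)

# Sparse version: +1 w.p. 1/6, 0 w.p. 2/3, -1 w.p. 1/6; the sqrt(3) factor
# compensates for the zeros so each entry still has variance 1 before the
# 1/sqrt(k) normalization.
phi_sparse = rng.choice([+1.0, 0.0, -1.0], size=(k, d), p=[1/6, 2/3, 1/6]) \
             * np.sqrt(3) / np.sqrt(k)

v = rng.standard_normal(d); v /= np.linalg.norm(v)
print(np.linalg.norm(phi_dense @ v), np.linalg.norm(phi_sparse @ v))  # both ≈ 1
```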

14. Sparse Johnson-Lindenstrauss
   • Sparsity parameter: s = Pr[Φ_ij ≠ 0]
   • Cannot be o(1), due to the "hidden coordinate" vector
     v = (0, ..., 0, 1, 0, ..., 0) ∈ R^d
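
The obstruction is easy to see empirically. A small simulation of our own, with s deliberately set so that k·s < 1, i.e. a typical column of Φ is all zeros:

```python
import numpy as np

rng = np.random.default_rng(3)
d, k, s = 1024, 64, 0.01        # sparsity s = Pr[Phi_ij != 0]; here k*s < 1

e1 = np.zeros(d); e1[0] = 1.0   # the "hidden coordinate" vector from the slide

for _ in range(5):
    mask = rng.random((k, d)) < s
    phi = rng.choice([+1.0, -1.0], size=(k, d)) * mask / np.sqrt(k * s)
    print(np.linalg.norm(phi @ e1))   # should be ~1, but is often exactly 0:
                                      # the single live coordinate is missed
```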

15. Uncertainty Principle
   • v̂ sparse ⇒ v = Hv̂ dense
   • H: Walsh-Hadamard matrix, the Fourier transform on {0,1}^{log_2 d}
   • Computable in time O(d log d)
   • Isometry: ||v||_2 = ||v̂||_2
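
A textbook O(d log d) fast Walsh-Hadamard transform, normalized to be an isometry (pure-Python loops for readability; a serious implementation would vectorize):

```python
import numpy as np

def fwht(v):
    """Fast Walsh-Hadamard transform of a length-d vector, d a power of two.
    Runs in O(d log d); the 1/sqrt(d) factor makes H orthogonal (an isometry)."""
    v = np.asarray(v, dtype=float).copy()
    d = len(v)
    h = 1
    while h < d:
        for i in range(0, d, 2 * h):
            for j in range(i, i + h):            # size-2 butterfly on (j, j+h)
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v / np.sqrt(d)

v_hat = np.zeros(8); v_hat[3] = 1.0              # sparse input ...
print(fwht(v_hat))                               # ... dense output, same l2 norm
```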

16. Adding Randomization
   • H deterministic and invertible ⇒ we're back to square one!
   • Precondition H with a random diagonal D = Diag(±1, ..., ±1)
   • Computable in time O(d)
   • Isometry
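
Continuing the sketch above (reusing `fwht`), preconditioning with a random ±1 diagonal costs O(d) and spreads out a spiky input:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 1024
D = rng.choice([+1.0, -1.0], size=d)   # random diagonal: O(d) to sample and apply

v = np.zeros(d); v[:4] = 0.5           # a spiky unit vector (4 live coordinates)
w = fwht(D * v)                        # HDv

print(np.linalg.norm(w))               # 1.0 -- HD is an isometry
print(np.abs(w).max())                 # near the d^{-1/2} scale: mass is spread out
```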

17. The ℓ∞-Bound Lemma
   • w.h.p.: ∀ p_i, p_j ∈ P ⊆ R^d:
     ||HD(p_i − p_j)||_∞ ≤ O(d^{-1/2} log^{1/2} n) · ||p_i − p_j||_2
   • Rules out HD(p_i − p_j) = "hidden coordinate vector"!! Instead...

18. Hidden Coordinate-Set
   • Worst-case v = p_i − p_j consistent with the ℓ∞ bound (assume ||v||_2 = 1):
     ∀ j ∈ J: |v_j| = Θ(d^{-1/2} log^{1/2} n); ∀ j ∉ J: v_j = 0,
     where J ⊆ [d], |J| = Θ(d / log n)

19. Fast J-L Transform
   • FJLT: Φ = P·H·D, where
     D = Diag(±1), H = Hadamard, P = sparse JL matrix with
     P_ij i.i.d. ~ { N(0,1) w.p. s, 0 w.p. 1 − s }
   • ℓ_2 → ℓ_2: s ≈ ε^{-1} log n / d; bottleneck: variance of |P_i · v|^2
   • ℓ_2 → ℓ_1: s ≈ log^2 n / d; bottleneck: bias of |P_i · v|
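
Putting the three factors together, a minimal end-to-end sketch of the structure Φ = PHD (reusing `fwht` from above; the value of s below is arbitrary, and a real implementation would exploit the sparsity of P instead of forming it densely):

```python
import numpy as np

rng = np.random.default_rng(5)

def fjlt(v, k, s, rng):
    """Minimal sketch of Phi = P H D. s = Pr[P_ij != 0]; choosing s correctly
    (slide 19's s ~ eps^{-1} log n / d, etc.) is the whole point of the
    analysis and is not reproduced here."""
    d = len(v)
    D = rng.choice([+1.0, -1.0], size=d)           # random +-1 diagonal
    w = fwht(D * v)                                # HDv in O(d log d)
    mask = rng.random((k, d)) < s                  # sparse support of P
    P = np.where(mask, rng.standard_normal((k, d)), 0.0) / np.sqrt(k * s)
    return P @ w

v = rng.standard_normal(1024)
print(np.linalg.norm(fjlt(v, k=64, s=0.05, rng=rng)) / np.linalg.norm(v))  # ≈ 1
```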

20. Applications
   • Approximate nearest neighbor in (R^d, ℓ_2)
   • ℓ_2 regression: minimize ||Ax − b||_2 over x, where A ∈ R^{n × d} is over-constrained (d ≪ n)
     – [DMM06]: approximate by sampling (non-constructive)
     – [Sarlos06]: using FJLT ⇒ constructive
   • More applications...?
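
For the regression application, the sketch-and-solve idea in the spirit of [Sarlos06] looks as follows (a plain Gaussian sketch stands in for the FJLT here, purely to keep the example short):

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, k = 2048, 10, 256                 # over-constrained: d << n

A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Solve the sketched problem min ||S(Ax - b)||_2 instead of min ||Ax - b||_2.
S = rng.standard_normal((k, n)) / np.sqrt(k)
x_sketch = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]

# Residual of the sketched solution is within a small factor of optimal.
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
```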

21. Interesting Problem I
   • Improvement & lower bound for J-L computation

22. Interesting Problem II
   • Dimension reduction is sampling
   • Sampling by random walk:
     – Expander graphs for uniform sampling
     – Convex bodies for volume estimation
   • [Kac59]: random walk on the orthogonal group:
     for t = 1..T: pick i, j ∈_R [d], θ ∈_R [0, 2π)
       (v_i, v_j) ← (v_i cos θ + v_j sin θ, −v_i sin θ + v_j cos θ)
   • Output (v_1, ..., v_k) as the dimension reduction of v
   • How many steps T for a J-L guarantee?
   • [CCL01], [DS00], [P99]
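
The Kac walk on the slide is a few lines of code; how large T must be for a J-L guarantee is exactly the open question posed here. A direct transcription (parameter values illustrative):

```python
import numpy as np

def kac_walk_reduce(v, T, k, rng):
    """Random walk on the orthogonal group [Kac59]: T random planar
    rotations, then keep the first k coordinates as the reduction of v."""
    v = np.asarray(v, dtype=float).copy()
    d = len(v)
    for _ in range(T):
        i, j = rng.choice(d, size=2, replace=False)   # random coordinate pair
        theta = rng.uniform(0.0, 2.0 * np.pi)         # random rotation angle
        vi, vj = v[i], v[j]
        v[i] = vi * np.cos(theta) + vj * np.sin(theta)
        v[j] = -vi * np.sin(theta) + vj * np.cos(theta)
    return v[:k]

rng = np.random.default_rng(7)
d, k = 256, 32
v = rng.standard_normal(d)
u = kac_walk_reduce(v, T=5000, k=k, rng=rng)
print(np.sqrt(d / k) * np.linalg.norm(u), np.linalg.norm(v))  # close once mixed
```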

23. Thank You!
