
Sublinear Algorithms for Big Data

Slides are available at http://grigory.us/big-data.html. Sublinear Algorithms for Big Data, Lecture 4. Grigory Yaroslavtsev, http://grigory.us. Today: dimensionality reduction; AMS as dimensionality reduction; the Johnson-Lindenstrauss transform; sublinear time algorithms.


Presentation Transcript


  1. Slides are available at http://grigory.us/big-data.html Sublinear Algorithms for Big Data Lecture 4 Grigory Yaroslavtsev http://grigory.us

  2. Today • Dimensionality reduction • AMS as dimensionality reduction • Johnson-Lindenstrauss transform • Sublinear time algorithms • Definitions: approximation, property testing • Basic examples: approximating diameter, testing properties of images • Testing sortedness • Testing connectedness

  3. $\ell_p$-norm Estimation • Stream: $m$ updates $(i, \Delta)$ that define a vector $x \in \mathbb{R}^n$, where $x_i = \sum_{\text{updates } (i,\Delta)} \Delta$. • Example: the stream $(1,+3), (2,-2), (1,+1)$ defines $x = (4, -2, 0, \dots, 0)$. • $\ell_p$-norm: $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$
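As a minimal sketch of the streaming model on this slide (an illustration, not the lecture's code): updates $(i, \Delta)$ are aggregated into a vector $x$, and the $\ell_p$-norm is computed from the final coordinates.

```python
import math
from collections import defaultdict

def lp_norm_from_stream(updates, p):
    """Aggregate (i, delta) updates into a vector x, then return ||x||_p."""
    x = defaultdict(int)
    for i, delta in updates:
        x[i] += delta  # each update adds delta to coordinate x_i
    return sum(abs(v) ** p for v in x.values()) ** (1.0 / p)

# Coordinate 1 receives +3 and +1, coordinate 2 receives -2: x = (4, -2)
updates = [(1, 3), (2, -2), (1, 1)]
print(lp_norm_from_stream(updates, 2))  # ||(4, -2)||_2 = sqrt(20)
```

Of course, a streaming algorithm cannot afford to store $x$ explicitly; the point of the next slides is to approximate $\|x\|_p$ in small space.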

  4. $\ell_p$-norm Estimation • $\ell_p$-norm: $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$ • Two lectures ago: • $F_0$-moment (number of distinct elements) • $F_2$-moment $\|x\|_2^2$ (via AMS sketching) • Space: $O\!\left(\frac{\log n}{\epsilon^2}\right)$ • Technique: linear sketches • $\sum_{i \in S} x_i$ for a random set $S \subseteq [n]$ • $\sum_{i=1}^n r_i x_i$ for random signs $r_i \in \{\pm 1\}$

  5. AMS as dimensionality reduction • Maintain a “linear sketch” vector $Z = (Z_1, \dots, Z_k)$, where $Z_j = \sum_{i=1}^n r_{ij} x_i$ for random signs $r_{ij} \in \{\pm 1\}$ • Estimator for $\|x\|_2^2$: $\frac{1}{k}\sum_{j=1}^k Z_j^2$ • “Dimensionality reduction”: $x \mapsto \frac{1}{\sqrt{k}} Z$, but the distribution of the $Z_j$ has a “heavy” tail
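The AMS sketch on this slide can be sketched in a few lines. This is a toy illustration (names and parameters are mine, and it stores the sign matrix explicitly rather than using bounded-independence hash functions as a real streaming algorithm would):

```python
import random

def ams_sketch(x, k, seed=0):
    """Compute k linear sketches Z_j = sum_i r_ij * x_i with random signs r_ij in {-1, +1}."""
    rng = random.Random(seed)
    return [sum(rng.choice((-1, 1)) * xi for xi in x) for _ in range(k)]

def estimate_l2_squared(z):
    """AMS estimator for ||x||_2^2: average of Z_j^2 over the k repetitions."""
    return sum(zj * zj for zj in z) / len(z)

x = [1, -2, 3, 0, 5]          # ||x||_2^2 = 1 + 4 + 9 + 25 = 39
z = ams_sketch(x, k=2000)
est = estimate_l2_squared(z)  # concentrates around 39 as k grows
```

Linearity is the key property: each $Z_j$ can be maintained under streaming updates $(i, \Delta)$ by adding $r_{ij}\Delta$.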

  6. Normal Distribution • Normal distribution $N(0,1)$ • Range: $(-\infty, +\infty)$ • Density: $\mu(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ • Mean = 0, Variance = 1 • Basic facts: • If $X$ and $Y$ are independent r.v. with normal distribution, then $X + Y$ has normal distribution • If $X$ and $Y$ are independent, then $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$

  7. Johnson-Lindenstrauss Transform • Instead of $r_{ij} \in \{\pm 1\}$, let $g_{ij}$ be i.i.d. random variables from the normal distribution $N(0,1)$ • We still have $\mathbb{E}[Z_j^2] = \|x\|_2^2$ because: $\mathbb{E}[Z_j^2] = \mathrm{Var}\!\left[\sum_i g_{ij} x_i\right] = \sum_i x_i^2\, \mathrm{Var}[g_{ij}] = \|x\|_2^2$ • Define $Z_j = \sum_{i=1}^n g_{ij} x_i$ and define: $Z = \frac{1}{\sqrt{k}}(Z_1, \dots, Z_k)$ • JL Lemma: There exists $k = O\!\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ s.t. for small enough $\epsilon$: $\Pr\big[\,\big|\,\|Z\|_2 - \|x\|_2\,\big| \ge \epsilon \|x\|_2\big] \le \delta$
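A minimal sketch of the Gaussian JL map described here (function name and parameters are mine; a practical implementation would use a matrix library): each output coordinate is $\frac{1}{\sqrt{k}}\sum_i g_{ij} x_i$ with fresh $N(0,1)$ entries.

```python
import math
import random

def jl_transform(x, k, seed=1):
    """Map x to k dimensions: Z_j = (1/sqrt(k)) * sum_i g_ij * x_i with g_ij ~ N(0,1)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [scale * sum(rng.gauss(0, 1) * xi for xi in x) for _ in range(k)]

x = [1.0, -2.0, 3.0, 0.0, 5.0]           # ||x||_2 = sqrt(39) ~ 6.24
z = jl_transform(x, k=3000)
norm_z = math.sqrt(sum(v * v for v in z))  # concentrates around ||x||_2
```

Note that the map is linear in $x$ and oblivious to it: the same Gaussian matrix works for any fixed vector, which is what makes the simultaneous-vectors bounds on the last slide meaningful.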

  8. Proof of JL Lemma • JL Lemma: there exists $k = O\!\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ s.t. for small enough $\epsilon$: $\Pr\big[\,\big|\,\|Z\|_2 - \|x\|_2\,\big| \ge \epsilon \|x\|_2\big] \le \delta$ • Assume $\|x\|_2 = 1$. • We have $\|Z\|_2^2 = \frac{1}{k}\sum_{j=1}^k Z_j^2$ and $\mathbb{E}\big[\|Z\|_2^2\big] = 1$ • Alternative form of JL Lemma: $\Pr\big[\,\big|\,\|Z\|_2^2 - 1\,\big| \ge \epsilon\big] \le \delta$

  9. Proof of JL Lemma • Alternative form of JL Lemma: $\Pr\big[\,\big|\,\|Z\|_2^2 - 1\,\big| \ge \epsilon\big] \le \delta$ • Let $Y = \sum_{j=1}^k Z_j^2$ and note that each $Z_j \sim N(0,1)$ • For every $t > 0$ we have: $\Pr[Y \ge (1+\epsilon)k] = \Pr\big[e^{tY} \ge e^{t(1+\epsilon)k}\big]$ • By Markov and independence of the $Z_j$: $\Pr\big[e^{tY} \ge e^{t(1+\epsilon)k}\big] \le \frac{\mathbb{E}\big[e^{tY}\big]}{e^{t(1+\epsilon)k}} = \frac{\prod_{j=1}^k \mathbb{E}\big[e^{tZ_j^2}\big]}{e^{t(1+\epsilon)k}}$ • We have $\mathbb{E}\big[e^{tZ_j^2}\big] = \frac{1}{\sqrt{1-2t}}$ for $t < \frac{1}{2}$, hence: $\Pr[Y \ge (1+\epsilon)k] \le \left((1-2t)\, e^{2t(1+\epsilon)}\right)^{-k/2}$
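The Markov step above needs the moment generating function of $Z_j^2$ for a standard normal $Z_j$; it is a short Gaussian integral. For $t < \frac{1}{2}$:

```latex
\mathbb{E}\big[e^{tZ^2}\big]
  = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, e^{tz^2}\, dz
  = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(1-2t)z^2/2}\, dz
  = \frac{1}{\sqrt{1-2t}},
```

since the substitution $u = \sqrt{1-2t}\, z$ reduces the integral to the standard normal density, which integrates to $1$, scaled by $\frac{1}{\sqrt{1-2t}}$. For $t \ge \frac{1}{2}$ the integral diverges, which is the "heavy tail" obstruction mentioned earlier.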

  10. Proof of JL Lemma • Alternative form of JL Lemma: $\Pr\big[\,\big|\,\|Z\|_2^2 - 1\,\big| \ge \epsilon\big] \le \delta$ • For every $t > 0$ we have: $\Pr[Y \ge (1+\epsilon)k] \le \left((1-2t)\, e^{2t(1+\epsilon)}\right)^{-k/2}$ • Let $t = \frac{\epsilon}{2(1+\epsilon)}$ and recall that $(1+\epsilon)e^{-\epsilon} \le e^{-(\epsilon^2 - \epsilon^3)/2}$ • A calculation finishes the proof: $\Pr[Y \ge (1+\epsilon)k] \le \left((1+\epsilon)e^{-\epsilon}\right)^{k/2} \le e^{-\frac{k}{4}(\epsilon^2 - \epsilon^3)} \le \delta$ for $k = O\!\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ (the lower tail is bounded symmetrically)

  11. Johnson-Lindenstrauss Transform • Single vector: $k = O\!\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ • Tight: [Woodruff’10] • $n$ vectors simultaneously: $k = O\!\left(\frac{\log(n/\delta)}{\epsilon^2}\right)$ • Tight: [Molinaro, Woodruff, Y. ’13] • Distances between $n$ vectors = $\binom{n}{2}$ vectors: $k = O\!\left(\frac{\log(n/\delta)}{\epsilon^2}\right)$
