Sublinear Algorithms for Big Data

Slides are available at http://grigory.us/big-data.html

Sublinear Algorithms for Big Data

Lecture 4

Grigory Yaroslavtsev

http://grigory.us

Today
• Dimensionality reduction
  • AMS as dimensionality reduction
  • Johnson-Lindenstrauss transform
• Sublinear time algorithms
  • Definitions: approximation, property testing
  • Basic examples: approximating diameter, testing properties of images
  • Testing sortedness
  • Testing connectedness
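$L_p$-norm Estimation
• Stream: $m$ updates $(x_i, \Delta_i) \in [n] \times \mathbb{R}$ that define a vector $f \in \mathbb{R}^n$, where $f_j = \sum_{i : x_i = j} \Delta_i$.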
• Example: For
• $L_p$-norm of a vector $f$: $\|f\|_p = \left(\sum_j |f_j|^p\right)^{1/p}$
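As a minimal sketch of the definitions above, the following Python builds the vector defined by a stream of (index, increment) updates and evaluates its $L_p$-norm. The function names `frequency_vector` and `lp_norm` and the update list are made up for illustration and are not taken from the slides.

```python
def frequency_vector(n, updates):
    """Apply (index, delta) updates to an all-zero vector of length n.

    Indices are 1-based, matching the [n] = {1, ..., n} convention.
    """
    f = [0.0] * n
    for x, delta in updates:
        f[x - 1] += delta
    return f


def lp_norm(f, p):
    """Return (sum_j |f_j|^p)^(1/p) for p > 0."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)


# Hypothetical stream over n = 4 coordinates (illustrative data only).
updates = [(1, 3.0), (3, 0.5), (1, 2.0), (2, -2.0), (2, 1.0), (1, -1.0), (4, 1.0)]
f = frequency_vector(4, updates)
l1 = lp_norm(f, 1)
l2 = lp_norm(f, 2)
```

This version stores the whole vector, which is exactly what a streaming algorithm tries to avoid; it only pins down the quantity that the sketches are meant to approximate.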
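$L_p$-norm Estimation
• $L_p$-norm: $\|f\|_p = \left(\sum_j |f_j|^p\right)^{1/p}$
• Two lectures ago:
  • $L_0$-norm = $F_0$-moment
  • $L_2$-norm: $\|f\|_2^2$ = $F_2$-moment (via AMS sketching)
• Space: $O\left(\frac{\log n}{\epsilon^2} \log \frac{1}{\delta}\right)$
• Technique: linear sketches
  • $\sum_{j \in S} f_j$ for random sets $S$
  • $\sum_j \sigma_j f_j$ for random signs $\sigma_j$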
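AMS as dimensionality reduction
• Maintain a “linear sketch” vector $Z = (Z_1, \dots, Z_k) = Rf$, where $Z_i = \sum_{j \in [n]} \sigma_{ij} f_j$ and each $\sigma_{ij}$ is a uniformly random sign in $\{-1, +1\}$
• Estimator $Y$ for $\|f\|_2^2$: $Y = \frac{1}{k} \sum_{i=1}^{k} Z_i^2 = \frac{\|Rf\|_2^2}{k}$
• “Dimensionality reduction”: $f \mapsto Rf$ maps $\mathbb{R}^n$ to $\mathbb{R}^k$, but with a “heavy” tail: $\Pr\left[\,\left|Y - \|f\|_2^2\right| \ge c \sqrt{2/k}\; \|f\|_2^2\right] \le \frac{1}{c^2}$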
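The sign-based sketch can be illustrated with a few lines of Python. This is a simplified sketch under stated assumptions: it draws fully independent random signs from a seeded generator, whereas a streaming implementation would typically derive them from a limited-independence hash function, and the vector `f` and the sizes `n` and `k` are hypothetical.

```python
import random


def ams_sketch(f, k, rng):
    """Return Z = R f, where R is a k x n matrix of independent uniform +-1 signs."""
    return [sum(rng.choice((-1, 1)) * fj for fj in f) for _ in range(k)]


def estimate_f2(z):
    """Estimator Y = (1/k) * sum_i Z_i^2 for the squared L2 norm."""
    return sum(zi * zi for zi in z) / len(z)


rng = random.Random(0)
f = [4.0, -1.0, 0.5, 1.0] * 250        # hypothetical vector with n = 1000
true_f2 = sum(v * v for v in f)        # exact squared L2 norm
z = ams_sketch(f, k=500, rng=rng)      # k = 500 numbers instead of n = 1000
y = estimate_f2(z)
relative_error = abs(y - true_f2) / true_f2
```

Because the sketch is linear in `f`, it can be maintained under stream updates by adding `delta * sigma[i][x]` to each `Z_i`; the code above applies the matrix to the finished vector only to keep the example short.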
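Normal Distribution
• Normal distribution $N(0, 1)$
• Range: $(-\infty, +\infty)$
• Density: $\mu(x) = (2\pi)^{-1/2} e^{-x^2/2}$
• Mean = 0, Variance = 1
• Basic facts:
  • If $X$ and $Y$ are independent r.v. with normal distribution then $X + Y$ has normal distribution
  • If $X, Y$ are independent, then $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$; also $\mathrm{Var}[cX] = c^2\, \mathrm{Var}[X]$ for any constant $c$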
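The variance facts can be checked empirically. The snippet below is a small simulation, not a proof: it samples a weighted sum of two independent standard normal variables and compares the sample variance with the sum of squared weights. The weights `a` and `b`, the seed, and the sample size are arbitrary choices made for this illustration.

```python
import random

rng = random.Random(1)
num_samples = 100_000
a, b = 3.0, 4.0  # hypothetical weights

# a*X + b*Y with X, Y independent N(0, 1): the variance should be close to
# a^2 * Var[X] + b^2 * Var[Y] = a^2 + b^2 = 25.
samples = [a * rng.gauss(0.0, 1.0) + b * rng.gauss(0.0, 1.0)
           for _ in range(num_samples)]
mean = sum(samples) / num_samples
variance = sum((s - mean) ** 2 for s in samples) / num_samples
```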
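Johnson-Lindenstrauss Transform
• Instead of $\pm 1$ let $\sigma_i$ be i.i.d. random variables from normal distribution $N(0, 1)$, and let $Z = \sum_i \sigma_i f_i$
• We still have $\mathbb{E}[Z^2] = \sum_i f_i^2 = \|f\|_2^2$ because:
  • $\mathbb{E}[\sigma_i \sigma_j] = \mathbb{E}[\sigma_i]\, \mathbb{E}[\sigma_j] = 0$ for $i \ne j$, and $\mathbb{E}[\sigma_i^2]$ = “variance of $\sigma_i$” = 1
• Define $Z = (Z_1, \dots, Z_k)$, where each $Z_i$ is an independent copy of the scalar above, and define: $\|Z\|_2 = \left(\sum_{i=1}^{k} Z_i^2\right)^{1/2}$, so that $\mathbb{E}[\|Z\|_2^2] = k \|f\|_2^2$
• JL Lemma: There exists $C > 0$ s.t. for small enough $\epsilon > 0$: $\Pr\left[\,\left|\|Z\|_2 - \sqrt{k}\, \|f\|_2\right| > \epsilon \sqrt{k}\, \|f\|_2\right] \le \exp(-C \epsilon^2 k)$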
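A Gaussian version of the sketch can be tried out directly. The following is a minimal illustration, assuming i.i.d. $N(0, 1)$ coefficients and the normalization by $\sqrt{k}$; the function name `gaussian_sketch`, the vector `f`, and the values of `n` and `k` are hypothetical.

```python
import math
import random


def gaussian_sketch(f, k, rng):
    """Return Z = (Z_1, ..., Z_k), each Z_i = sum_j g_ij * f_j with g_ij ~ N(0, 1)."""
    return [sum(rng.gauss(0.0, 1.0) * fj for fj in f) for _ in range(k)]


rng = random.Random(2)
f = [1.0, -2.0, 0.0, 3.0, 0.5] * 400   # hypothetical vector with n = 2000
k = 200                                # sketch dimension, much smaller than n
z = gaussian_sketch(f, k, rng)

norm_f = math.sqrt(sum(v * v for v in f))
norm_z = math.sqrt(sum(v * v for v in z))
# The ratio should concentrate around 1, with deviations on the order of 1/sqrt(k).
ratio = norm_z / (math.sqrt(k) * norm_f)
```

Increasing `k` tightens the concentration of `ratio` around 1, which is the qualitative content of the lemma; the snippet does not attempt to estimate the constant in the exponent.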
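Proof of JL Lemma
• JL Lemma: $\exists C > 0$ s.t. for small enough $\epsilon > 0$: $\Pr\left[\,\left|\|Z\|_2 - \sqrt{k}\, \|f\|_2\right| > \epsilon \sqrt{k}\, \|f\|_2\right] \le \exp(-C \epsilon^2 k)$
• Assume $\|f\|_2^2 = 1$.
• We have $Z_i = \sum_j \sigma_{ij} f_j$ and $Z = (Z_1, \dots, Z_k)$, so $\mathbb{E}[\|Z\|_2^2] = k \|f\|_2^2 = k$
• Alternative form of JL Lemma (upper tail; the lower tail is analogous): $\Pr\left[\|Z\|_2^2 > k(1+\epsilon)^2\right] \le \exp(-\epsilon^2 k + O(k \epsilon^3))$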
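Proof of JL Lemma
• Alternative form of JL Lemma: $\Pr\left[\|Z\|_2^2 > k(1+\epsilon)^2\right] \le \exp(-\epsilon^2 k + O(k \epsilon^3))$
• Let $Y = \|Z\|_2^2$ and $\alpha = k(1+\epsilon)^2$
• For every $s > 0$ we have: $\Pr[Y > \alpha] = \Pr\left[e^{sY} > e^{s\alpha}\right]$
• By Markov and independence of $Z_i$: $\Pr\left[e^{sY} > e^{s\alpha}\right] \le e^{-s\alpha}\, \mathbb{E}\left[e^{sY}\right] = e^{-s\alpha} \prod_{i=1}^{k} \mathbb{E}\left[e^{s Z_i^2}\right]$
• We have $Z_i \sim N(0, 1)$, hence: $\mathbb{E}\left[e^{s Z_i^2}\right] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{s x^2} e^{-x^2/2}\, dx = \frac{1}{\sqrt{1 - 2s}}$ for $0 < s < 1/2$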
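Proof of JL Lemma
• Alternative form of JL Lemma: $\Pr\left[\|Z\|_2^2 > k(1+\epsilon)^2\right] \le \exp(-\epsilon^2 k + O(k \epsilon^3))$
• For every $s \in (0, 1/2)$ we have, with $Y = \|Z\|_2^2$: $\Pr[Y > \alpha] \le e^{-s\alpha} (1 - 2s)^{-k/2}$
• Let $s = \frac{1}{2} - \frac{k}{2\alpha}$ and recall that $\alpha = k(1+\epsilon)^2$
• A calculation finishes the proof: $\Pr[Y > \alpha] \le \exp(-\epsilon^2 k + O(k \epsilon^3))$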
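The final calculation can be written out as follows. This is a sketch of the standard Chernoff-style argument, assuming $Y$ is a sum of $k$ squared independent $N(0, 1)$ variables, $\alpha = k(1+\epsilon)^2$, and the choice $s = \frac{1}{2} - \frac{k}{2\alpha}$ (which minimizes the bound over $s$); it may differ in presentation from the calculation on the original slide.

```latex
\begin{align*}
\Pr[Y > \alpha]
  &\le e^{-s\alpha}\,(1 - 2s)^{-k/2}
  && \text{(Markov, independence, } \mathbb{E}[e^{sZ_i^2}] = (1-2s)^{-1/2}\text{)} \\
  &= \exp\!\Big(-\tfrac{\alpha - k}{2} + \tfrac{k}{2}\ln\tfrac{\alpha}{k}\Big)
  && \text{(since } s\alpha = \tfrac{\alpha - k}{2} \text{ and } 1 - 2s = \tfrac{k}{\alpha}\text{)} \\
  &= \exp\!\Big(-k\big(\epsilon + \tfrac{\epsilon^2}{2}\big) + k\ln(1+\epsilon)\Big)
  && \text{(substituting } \alpha = k(1+\epsilon)^2\text{)} \\
  &= \exp\!\big(-\epsilon^2 k + O(k\epsilon^3)\big)
  && \text{(using } \ln(1+\epsilon) = \epsilon - \tfrac{\epsilon^2}{2} + O(\epsilon^3)\text{)}
\end{align*}
```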
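Johnson-Lindenstrauss Transform
• Single vector, failure probability $\delta$: $k = O\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$
• Tight: $k = \Omega\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ [Woodruff’10]
• $t$ vectors simultaneously: $k = O\left(\frac{\log(t/\delta)}{\epsilon^2}\right)$
• Tight: $k = \Omega\left(\frac{\log(t/\delta)}{\epsilon^2}\right)$ [Molinaro, Woodruff, Y. ’13]
• Distances between $t$ vectors = $O(t^2)$ vectors (the pairwise differences): $k = O\left(\frac{\log(t/\delta)}{\epsilon^2}\right)$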
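To illustrate the pairwise-distance statement, the sketch below projects a handful of random points with one shared Gaussian matrix and records the worst relative distortion over all pairs. All sizes, the seed, and the helper names `project` and `dist` are hypothetical, and the snippet makes no claim about the constants hidden in the $O(\cdot)$ bounds.

```python
import itertools
import math
import random


def project(vectors, k, rng):
    """Map each vector v to (1/sqrt(k)) * G v for one shared k x n Gaussian matrix G."""
    n = len(vectors[0])
    scale = 1.0 / math.sqrt(k)
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    return [[scale * sum(r[j] * v[j] for j in range(n)) for r in rows]
            for v in vectors]


def dist(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(3)
n, t, k = 1000, 5, 300   # hypothetical: dimension, number of points, target dimension
points = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(t)]
images = project(points, k, rng)

# Largest relative change in distance over all t*(t-1)/2 pairs.
worst = max(
    abs(dist(images[i], images[j]) / dist(points[i], points[j]) - 1.0)
    for i, j in itertools.combinations(range(t), 2)
)
```

Because the map is linear, the image of a difference of two points is the difference of their images, so preserving the norms of the pairwise difference vectors is the same as preserving all pairwise distances.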