## Algorithmic Aspects of Finite Metric Spaces

Presentation Transcript

Outline


Metric Space

- A set of points X
- Distance function d : X × X → [0, ∞)
- d(x,y) = 0 iff x = y
- d(x,y) = d(y,x) (symmetry)
- d(x,z) ≤ d(x,y) + d(y,z) (triangle inequality)
- Metric space M = (X,d)
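The axioms above can be checked mechanically on a small finite point set. A minimal Python sketch (the helper name `is_metric` is ours, not from the talk):

```python
from itertools import product

def is_metric(points, d, tol=1e-9):
    """Check the metric axioms for a finite point set and distance function d."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < -tol:                       # non-negativity
            return False
        if (d(x, y) < tol) != (x == y):          # d(x,y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:         # symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:    # triangle inequality
            return False
    return True

# Absolute difference on numbers is a metric on the line.
print(is_metric([0, 1, 3, 7], lambda x, y: abs(x - y)))   # True
# Squared difference violates the triangle inequality: d(0,2)=4 > d(0,1)+d(1,2)=2.
print(is_metric([0, 1, 2], lambda x, y: (x - y) ** 2))    # False
```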

Example Metrics: Normed spaces

- x = (x1, x2, …, xd), y = (y1, y2, …, yd)
- ℓp norm: ||x − y||p = (Σi |xi − yi|^p)^(1/p)
- ℓ1, ℓ2 (Euclidean), ℓ∞
- ℓp^d : ℓp norm in R^d
- Hamming cube {0,1}^d
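As a concrete illustration of these example metrics, a short Python sketch computing ℓp and Hamming distances (function names are ours):

```python
def lp_dist(x, y, p):
    """l_p distance between vectors x and y in R^d."""
    if p == float("inf"):
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def hamming(x, y):
    """Hamming distance on the cube {0,1}^d (equals l_1 there)."""
    return sum(a != b for a, b in zip(x, y))

x, y = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(lp_dist(x, y, 1))             # 7.0
print(lp_dist(x, y, 2))             # 5.0
print(lp_dist(x, y, float("inf")))  # 4.0
print(hamming((0, 1, 1, 0), (1, 1, 0, 0)))  # 2
```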

Example Metrics: domain specific

- Shortest path distances on graph
- Symmetric difference on sets
- Edit distance on strings
- Hausdorff distance, Earth Mover Distance on sets of n points

Metric Embeddings

- General idea: Map complex metrics to simple metrics
- Why? A richer algorithmic toolkit is available for simple metrics
- Simple metrics
- normed spaces ℓp
- low dimensional normed spaces ℓpd
- tree metrics
- Mapping should not change distances much (low distortion)

Low Distortion Embeddings

- Metric spaces (X1,d1) and (X2,d2): an embedding f : X1 → X2 has distortion D if for all x,y ∈ X1 the ratio d2(f(x),f(y)) / d1(x,y) changes by at most a factor D (up to scaling)
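The distortion of an embedding of a finite metric can be computed directly from this definition. A small Python sketch (names are ours; the example embedding is hypothetical):

```python
from itertools import combinations

def distortion(points, d1, d2, f):
    """Distortion of an embedding f: (X,d1) -> (X2,d2) on a finite set:
    (max expansion) / (min expansion) over all pairs, which is invariant
    under rescaling of f."""
    ratios = [d2(f(x), f(y)) / d1(x, y) for x, y in combinations(points, 2)]
    return max(ratios) / min(ratios)

# Hypothetical embedding of three points on the line into the even integers.
points = [0, 1, 3]
f = {0: 0, 1: 2, 3: 4}.get
d = lambda a, b: abs(a - b)
# pair ratios: 2/1 = 2, 4/3, 2/2 = 1  -> distortion 2
print(distortion(points, d, d, f))  # 2.0
```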


Applications

- High dimensional → low dimensional (dimension reduction)
- Algorithmic efficiency (running time)
- Compact representation (storage space)
- Streaming algorithms
- Specific metrics → normed spaces
- Nearest neighbor search
- Optimization problems
- General metrics → tree metrics
- Optimization problems, online algorithms

Solve problems on very large data sets in one pass using a very small amount of storage

A (very) Brief History: fundamental results

- Metric spaces studied in functional analysis
- Any n-point metric embeds into ℓ∞^n with no distortion [Fréchet]
- Any n-point metric embeds into ℓp with distortion O(log n) [Bourgain ’85]
- Dimension reduction for n-point Euclidean metrics with distortion 1+ε [Johnson, Lindenstrauss ’84]

A (very) Brief History:applications in Computer Science

- Optimization problems
- Application to graph partitioning [Linial, London, Rabinovich ’95] [Arora, Rao, Vazirani ’04]
- n point metrics into tree metrics [Bartal ’96, ’98] [FRT ’03]
- Efficient algorithms
- Dimension reduction
- Nearest neighbor search, Streaming algorithms

Outline

metric as data

- Dimension reduction
- Streaming data model
- Compact representation
- Finite metrics in optimization
- graph partitioning and clustering

Embedding theorems for finite metrics

metric as model

Disclaimer

- This is not an attempt at a survey
- Biased by my own interests
- Much more relevant and related work than I can do justice to in limited time.
- Goal: Give glimpse of different applications of finite metric spaces
- Core ideas, no messy details

Disclaimer: Community Bias

- Theoretical viewpoint
- Focus on algorithmic techniques with performance guarantees
- Worst case guarantees

Outline

metric as data

- Dimension reduction
- Streaming data model
- Compact representation
- Finite metrics in optimization
- graph partitioning and clustering

Embedding theorems for finite metrics

metric as model

Metric as data

- What is the data ?
- Mathematical representation of objects (e.g. documents, images, customer profiles, queries).
- Sets, vectors, points in Euclidean space, points in a metric space, vertices of a graph.
- Metric is part of data

Johnson Lindenstrauss [JL84]

- n points in Euclidean space (ℓ2 norm) can be mapped down to O((log n)/ε²) dimensions with distortion at most 1+ε.
- Quite simple proofs [JL84, FM88, IM98, AV99, DG99, Ach01]
- Project onto random unit vectors
- Projection of (u−v) onto one random vector behaves like a Gaussian scaled by ||u−v||2
- Need ~log n dimensions for tight concentration bounds
- Even a random {−1,+1} vector works…
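A sketch of the random-projection construction in Python, using fully independent Gaussian entries (the talk notes that random {−1,+1} entries also work); function names are ours:

```python
import math
import random

def jl_project(points, k, seed=0):
    """Project d-dimensional points onto k random Gaussian directions,
    scaled by 1/sqrt(k) so Euclidean distances are roughly preserved."""
    rng = random.Random(seed)
    d = len(points[0])
    # k random directions with i.i.d. N(0,1) entries
    R = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [tuple(scale * sum(r[i] * p[i] for i in range(d)) for r in R)
            for p in points]

def l2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Distance ratios (projected / original) after mapping 1000 -> 200 dimensions;
# each ratio should land close to 1.
rng = random.Random(1)
pts = [tuple(rng.gauss(0, 1) for _ in range(1000)) for _ in range(5)]
proj = jl_project(pts, 200)
for i in range(1, 5):
    print(round(l2(proj[0], proj[i]) / l2(pts[0], pts[i]), 2))
```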

Dimension reduction for ℓ2

- Two interesting properties:
- Linear mapping
- Oblivious – choice of linear mapping does not depend on point set
- Many applications …
- Making high dimensional problems tractable
- Streaming algorithms
- Learning mixtures of Gaussians [Dasgupta ’99]
- Learning robust concepts [Arriaga,Vempala ’99] [Klivans,Servedio ’04]

Dimension reduction for ℓ1

- [C, Sahai ’02] Linear embeddings are not good for dimension reduction in ℓ1
- There exist n points in ℓ1 in n dimensions, such that any linear mapping with distortion D needs n/D² dimensions

Dimension reduction for ℓ1

- [C, Brinkman ’03] Strong lower bounds for dimension reduction in ℓ1
- There exist n points in ℓ1, such that any embedding with constant distortion needs n^(1/2) dimensions
- Alternate, simpler proof [Lee, Naor ’03]

Outline

metric as data

- Dimension reduction
- Streaming data model
- Compact representation
- Finite metrics in optimization
- graph partitioning and clustering

Embedding theorems for finite metrics

Solve problems on very large data sets in one pass using a very small amount of storage

metric as model

Frequency Moments

[Alon, Matias, Szegedy ’99]

- Data stream is a sequence of elements in [n]
- ni : frequency of element i
- Fk = Σi ni^k : kth frequency moment
- F0 = number of distinct elements
- F2 = measure of the skew of the data stream
- Goal: given a data stream, estimate Fk in one pass and sub-linear space

Estimating F2

- Consider a single counter c and randomly chosen xi ∈ {+1, −1} for each i in [n]
- On seeing each element i, update c += xi
- c = Σi ni·xi
- Claim: E[c²] = Σi ni² = F2 and Var[c²] ≤ 2(F2)² (4-wise independence suffices)
- Average O(1/ε²) copies of this estimator to get a (1+ε) approximation
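The estimator above can be sketched in a few lines of Python (we use fully random signs for simplicity, where the talk needs only 4-wise independence; function names are ours):

```python
import random

def ams_f2(stream, n, copies=100, seed=0):
    """AMS estimator for F2 = sum_i n_i^2: each copy keeps one counter
    c = sum_i n_i * x_i with random signs x_i in {-1,+1}, so E[c^2] = F2.
    Average many independent copies to reduce the variance."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(copies):
        sign = [rng.choice((-1, 1)) for _ in range(n)]
        c = 0
        for item in stream:
            c += sign[item]          # one counter update per stream element
        estimates.append(c * c)
    return sum(estimates) / copies

stream = [0] * 10 + [1] * 5 + [2] * 3   # frequencies 10, 5, 3
exact = 10**2 + 5**2 + 3**2             # F2 = 134
print(exact, round(ams_f2(stream, 3, copies=500)))
```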

Differences between data streams

- ni : frequency of element i in stream 1
- mi : frequency of element i in stream 2
- Goal: measure Σi (ni − mi)²
- F2 sketches are additive: Σi ni·xi − Σi mi·xi = Σi (ni − mi)·xi
- Basically, dimension reduction in the ℓ2 norm
- Very useful primitive, e.g. frequent items [C, Chen, Farach-Colton ’02]

Estimate ℓ1 norms ?

[Indyk ’00]

- p-stable distribution: a distribution over R such that Σi ni·xi is distributed as (Σi |ni|^p)^(1/p) · X
- Cauchy distribution c(x) ∝ 1/(1+x²) is 1-stable
- Gaussian distribution is 2-stable
- As before, c = Σi ni·xi
- The Cauchy distribution does not have finite expectation!
- Estimate the scale factor by taking a median
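A sketch of Indyk's 1-stable approach in Python, sampling standard Cauchy variables via the tangent of a uniform angle; function names are ours, and for brevity the sketch is built from explicit frequency vectors rather than by streaming updates:

```python
import math
import random

def l1_sketch(freqs, k, seed):
    """k-dimensional Cauchy (1-stable) sketch of a frequency vector: each
    coordinate is sum_i n_i * x_i with x_i standard Cauchy, hence distributed
    as (sum_i |n_i|) * X.  A shared seed fixes the same x_i across streams."""
    rng = random.Random(seed)
    cauchy = lambda: math.tan(math.pi * (rng.random() - 0.5))
    return [sum(n * cauchy() for n in freqs) for _ in range(k)]

def l1_estimate(s1, s2):
    """Estimate ||f1 - f2||_1 as the median of |sketch differences|
    (the Cauchy has no finite mean, so use a median, not an average)."""
    diffs = sorted(abs(a - b) for a, b in zip(s1, s2))
    return diffs[len(diffs) // 2]

f1 = [10, 5, 3, 0]
f2 = [7, 5, 0, 2]
# true l1 difference = 3 + 0 + 3 + 2 = 8; the estimate lands close to it
s1 = l1_sketch(f1, 1001, seed=7)
s2 = l1_sketch(f2, 1001, seed=7)
print(round(l1_estimate(s1, s2), 2))
```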

Outline

metric as data

- Dimension reduction
- Streaming data model
- Compact representation
- Finite metrics in optimization
- graph partitioning and clustering

Embedding theorems for finite metrics

metric as model

Similarity Preserving Hash Functions

- Similarity function sim(x,y)
- Family of hash functions F with a probability distribution such that Pr h∈F [h(x) = h(y)] = sim(x,y)

Applications

- Compact representation scheme for estimating similarity
- Approximate nearest neighbor search [Indyk,Motwani ’98] [Kushilevitz,Ostrovsky,Rabani ‘98]

Estimating Set Similarity

[Broder, Manasse, Glassman, Zweig ’97]

[Broder, C, Frieze, Mitzenmacher ’98]

- Collection of subsets A, B of a universe
- sim(A,B) = |A ∩ B| / |A ∪ B| (resemblance / Jaccard similarity)
- Min-wise hashing: for a random permutation h, Pr[min h(A) = min h(B)] = |A ∩ B| / |A ∪ B|
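The min-hash idea can be sketched in Python with explicit random permutations (practical implementations use cheaper hash families; function names are ours):

```python
import random

def minhash_signature(s, hashes):
    """Signature of set s: the minimum value of each random hash over s."""
    return [min(h[x] for x in s) for h in hashes]

def estimate_jaccard(sig_a, sig_b):
    """Each min-hash agrees with probability |A ∩ B| / |A ∪ B|, so the
    fraction of agreeing coordinates estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

universe = range(100)
rng = random.Random(0)
# 200 random permutations of the universe, stored as value tables
hashes = []
for _ in range(200):
    perm = list(universe)
    rng.shuffle(perm)
    hashes.append({x: perm[x] for x in universe})

A = set(range(0, 60))
B = set(range(30, 90))
true_j = len(A & B) / len(A | B)   # 30/90 = 1/3
est = estimate_jaccard(minhash_signature(A, hashes), minhash_signature(B, hashes))
print(round(true_j, 3), round(est, 3))
```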

Existence of SPH schemes [C ’02]

- sim(x,y) admits an SPH scheme if there exists a family of hash functions F such that Pr h∈F [h(x) = h(y)] = sim(x,y)

Theorem: If sim(x,y) admits an SPH scheme then 1 − sim(x,y) satisfies the triangle inequality and embeds into ℓ1

- Rounding procedures for LPs and SDPs yield similarity and distance preserving hashing schemes.

Earth Mover Distance (EMD)

LP rounding algorithms for an optimization problem (metric labelling) yield an O(log n)-approximate estimator for EMD on n points.

This implies that EMD embeds into ℓ1 with distortion O(log n).

(figure: point sets P and Q, and the matching cost EMD(P,Q))

metric as data

- Dimension reduction
- Streaming data model
- Compact representation
- Finite metrics in optimization
- graph partitioning and clustering

Embedding theorems for finite metrics

metric as model

Graph partitioning problems

- Given a graph, partition the vertices into U, V
- Maximum cut: maximize |E(U,V)|
- Sparsest cut: minimize |E(U,V)| / (|U|·|V|)


Correlation clustering [Cohen, Richman ’02] [Bansal, Blum, Chawla ’02]

- Similar (+)
- Dissimilar (−)

(figure: mentions "Mr. Rumsfeld", "his", "The secretary", "he" joined by + edges and "Saddam Hussein" by − edges; example courtesy Shuchi Chawla)


Graph partitioning as metric problem

- Partitioning is equivalent to finding an appropriate {0,1} metric (the cut metric)
- possibly with additional constraints
- Objective function is linear in the metric
- Find the best {0,1} metric; relaxation: optimize over a larger class of metrics
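To make the equivalence concrete, a Python sketch (our own illustration) that solves maximum cut on a tiny graph by brute force over all {0,1} cut metrics, with the objective linear in the metric:

```python
from itertools import product

def max_cut_via_cut_metrics(n, edges):
    """Maximum cut as a {0,1} metric problem: each cut (U, V\\U) induces the
    cut metric d(u,v) = 1 iff u and v are on opposite sides, and the objective
    sum over edges of d(u,v) is linear in the metric.  Brute force over cuts."""
    best, best_side = 0, None
    for side in product((0, 1), repeat=n):   # side[v] = which part v is in
        cut = sum(side[u] != side[v] for u, v in edges)
        if cut > best:
            best, best_side = cut, side
    return best, best_side

# 5-cycle: any cut misses at least one edge, so the maximum cut has 4 edges
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(max_cut_via_cut_metrics(5, edges)[0])  # 4
```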

Metric relaxation approaches

- Max Cut [Goemans, Williamson ’94]
- map vertices to points on the unit sphere (SDP)
- exploit geometry to get a good solution (random hyperplane cut)
- Sparsest Cut [Linial, London, Rabinovich ’95]
- LP gives the best metric; need an ℓ1 metric
- [Bourgain ’84] embeds any metric into ℓ1 with distortion O(log n)
- The existential theorem can be made algorithmic
- O(log n) approximation
- recent SDP-based O(√(log n)) approximation [Arora, Rao, Vazirani ’04]

Metric relaxation approaches

- Correlation clustering [C, Guruswami, Wirth ’03] [Emanuel, Fiat ’03] [Immorlica, Karger ’03]
- Find best [0,1] metric from similarity/dissimilarity data via LP
- Use metric to guide clustering
- close points in same cluster
- distant points in different clusters
- “Learning” best metric ?
- Note: In many cases, LP/SDP can be eliminated to yield efficient algorithms

metric as data

- Dimension reduction
- Streaming data model
- Compact representation
- Finite metrics in optimization
- graph partitioning and clustering

Embedding theorems for finite metrics

metric as model

Some connections to learning

- Dimension reduction in ℓ2:
- Learning mixtures of Gaussians [Dasgupta ’99]: random projections make skewed Gaussians more spherical, making learning easier
- Learning with large margin [Arriaga, Vempala ’99] [Klivans, Servedio ’04]: random projections preserve the margin; large margin ⇒ few dimensions
- Kernel methods for SVMs
- mappings to ℓ2

Ongoing developments

- Notion of intrinsic dimensionality of a metric space [Gupta, Krauthgamer, Lee ’03] [Krauthgamer, Lee, Mendel, Naor ’04]
- Doubling dimension: how many balls of radius R are needed to cover a ball of radius 2R?
- Complexity measure of a metric space
- natural parameter for embeddings
- Open: can every metric of constant doubling dimension in ℓ2 be embedded into ℓ2 with O(1) dimensions and O(1) distortion?
- Not true for ℓ1
- related to learning low-dimensional manifolds, PCA, MDS, LLE, Isomap
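Doubling constants of small finite metrics can be estimated directly from the definition. A Python sketch (ours; the greedy cover only upper-bounds the true minimum number of balls):

```python
def greedy_doubling_constant(points, d):
    """Upper-bound the doubling constant of a finite metric: for every center
    c and radius R, greedily cover the ball B(c, 2R) with balls of radius R
    centered at points of the set, and take the worst count."""
    worst = 1
    radii = sorted({d(x, y) for x in points for y in points if x != y})
    for c in points:
        for R in radii:
            uncovered = {x for x in points if d(c, x) <= 2 * R}
            used = 0
            while uncovered:
                # pick the candidate center covering the most uncovered points
                best = max(points, key=lambda p: sum(d(p, x) <= R for x in uncovered))
                uncovered -= {x for x in uncovered if d(best, x) <= R}
                used += 1
            worst = max(worst, used)
    return worst

# Points on a line have a small doubling constant
line = [(float(i),) for i in range(16)]
d = lambda a, b: abs(a[0] - b[0])
print(greedy_doubling_constant(line, d))
```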

Some things I didn’t mention

- Approximating general metrics via tree metrics
- modified notion of distortion
- useful for approximation, online algorithms
- Many mathematically appealing questions
- Embeddings between normed spaces
- Spectral methods for approximating matrices (SVD, LSI)
- PCA, MDS, LLE, Isomap

Conclusions

- Whirlwind tour of finite metrics
- Rich algorithmic toolkit for finite metric spaces
- Synergy between Computer Science and Mathematics
- Exciting area of active research
- ranging from practical applications to deep theoretical questions
- Many more applications to be discovered
