
Algorithmic Aspects of Finite Metric Spaces


Presentation Transcript


  1. Algorithmic Aspects of Finite Metric Spaces Moses Charikar Princeton University

  2. Metric Space • A set of points X • Distance function d(x,y), d : X × X → [0, ∞) • d(x,y) = 0 iff x = y • d(x,y) = d(y,x) (symmetry) • d(x,z) ≤ d(x,y) + d(y,z) (triangle inequality) • Metric space M(X,d)
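The axioms above can be checked mechanically on a finite point set. A minimal sketch in Python (illustrative, not from the talk; the function name and arguments are hypothetical):

    import itertools

    def is_metric(points, d):
        """Check the metric axioms for d over all distinct points of a finite set."""
        for x, y in itertools.combinations(points, 2):
            if d(x, y) <= 0 or d(x, y) != d(y, x):   # positivity, symmetry
                return False
        for x, y, z in itertools.permutations(points, 3):
            if d(x, z) > d(x, y) + d(y, z):          # triangle inequality
                return False
        return True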

  3. Example Metrics: Normed spaces • x = (x1, x2, …, xd), y = (y1, y2, …, yd) • ℓp norm: ‖x − y‖p = (Σi |xi − yi|^p)^(1/p) • ℓ1, ℓ2 (Euclidean), ℓ∞ • ℓp^d : ℓp norm in R^d • Hamming cube {0,1}^d

  4. Example Metrics: domain specific • Shortest path distances on graph • Symmetric difference on sets • Edit distance on strings • Hausdorff distance, Earth Mover Distance on sets of n points

  5. Metric Embeddings • General idea: map complex metrics to simple metrics • Why? Richer algorithmic toolkit for simple metrics • Simple metrics: • normed spaces ℓp • low dimensional normed spaces ℓp^d • tree metrics • Mapping should not change distances much (low distortion)

  6. Low Distortion Embeddings • Metric spaces (X1,d1) & (X2,d2): an embedding f : X1 → X2 has distortion D if the ratio of distances changes by at most D, i.e. there is a scale r > 0 such that for all x,y ∈ X1: r·d1(x,y) ≤ d2(f(x),f(y)) ≤ D·r·d1(x,y)
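To make the definition concrete, a small illustrative helper (not from the talk) that computes the distortion of a map f between two finite metrics as the largest expansion ratio divided by the smallest, which implicitly optimizes over the scale r:

    import itertools

    def distortion(points, d1, d2, f):
        """Distortion of f : (X1,d1) -> (X2,d2) restricted to a finite point set."""
        ratios = [d2(f(x), f(y)) / d1(x, y)
                  for x, y in itertools.combinations(points, 2)]
        return max(ratios) / min(ratios)   # the best rescaling r cancels in the ratio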

  7. Applications • High dimensional → low dimensional (dimension reduction) • Algorithmic efficiency (running time) • Compact representation (storage space) • Streaming algorithms: solve problems on very large data sets in one pass using a very small amount of storage • Specific metrics → normed spaces • Nearest neighbor search • Optimization problems • General metrics → tree metrics • Optimization problems, online algorithms

  8. A (very) Brief History: fundamental results • Metric spaces studied in functional analysis • n point metric embeds into ℓ∞^n with no distortion [Fréchet] • n point metric embeds into ℓp with distortion O(log n) [Bourgain ’85] • Dimension reduction for n point Euclidean metric with distortion 1+ε [Johnson, Lindenstrauss ’84]

  9. A (very) Brief History: applications in Computer Science • Optimization problems • application to graph partitioning [Linial, London, Rabinovich ’95] [Arora, Rao, Vazirani ’04] • n point metrics into tree metrics [Bartal ’96, ’98] [FRT ’03] • Efficient algorithms • dimension reduction • nearest neighbor search, streaming algorithms

  10. Outline • Metric as data: • dimension reduction • streaming data model • compact representation • Metric as model: finite metrics in optimization • graph partitioning and clustering • Embedding theorems for finite metrics

  11. Disclaimer • This is not an attempt at a survey • Biased by my own interests • Much more relevant and related work than I can do justice to in limited time • Goal: give a glimpse of different applications of finite metric spaces • Core ideas, no messy details

  12. Disclaimer: Community Bias • Theoretical viewpoint • Focus on algorithmic techniques with performance guarantees • Worst case guarantees

  13. Outline • Metric as data: • dimension reduction • streaming data model • compact representation • Metric as model: finite metrics in optimization • graph partitioning and clustering • Embedding theorems for finite metrics

  14. Metric as data • What is the data? • Mathematical representation of objects (e.g. documents, images, customer profiles, queries) • Sets, vectors, points in Euclidean space, points in a metric space, vertices of a graph • Metric is part of the data

  15. Johnson-Lindenstrauss [JL84] • n points in Euclidean space (ℓ2 norm) can be mapped down to O((log n)/ε²) dimensions with distortion at most 1+ε • Quite simple [JL84, FM88, IM98, AV99, DG99, Ach01] • project onto random unit vectors • projection of (u−v) onto one random vector behaves like a Gaussian scaled by ‖u−v‖2 • need ~log n dimensions for tight concentration bounds • even a random {−1,+1} vector works…
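A minimal sketch of the random projection idea (the constant is illustrative, not the tight one from the JL lemma; the function name is hypothetical):

    import numpy as np

    def jl_project(X, eps=0.5, seed=0):
        """Map n points in R^d (rows of X) to k = O(log(n)/eps^2) dimensions;
        pairwise l2 distances are preserved up to 1+eps with high probability."""
        n, d = X.shape
        k = int(np.ceil(8 * np.log(n) / eps ** 2))   # illustrative constant
        rng = np.random.default_rng(seed)
        R = rng.normal(size=(d, k)) / np.sqrt(k)     # random Gaussian directions
        return X @ R

Note that the map is linear and is chosen without looking at the point set, the two properties highlighted on the next slide.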

  16. Dimension reduction for ℓ2 • Two interesting properties: • linear mapping • oblivious: choice of linear mapping does not depend on the point set • Many applications… • making high dimensional problems tractable • streaming algorithms • learning mixtures of Gaussians [Dasgupta ’99] • learning robust concepts [Arriaga, Vempala ’99] [Klivans, Servedio ’04]

  17. Dimension reduction for ℓ1 • [C, Sahai ’02] Linear embeddings are not good for dimension reduction in ℓ1 • There exist n points in ℓ1 in n dimensions such that any linear mapping with distortion D needs n/D² dimensions

  18. Dimension reduction for ℓ1 • [C, Brinkman ’03] Strong lower bounds for dimension reduction in ℓ1 • There exist n points in ℓ1 such that any embedding with constant distortion D needs n^Ω(1/D²) dimensions • Alternate, simpler proof [Lee, Naor ’03]

  19. Outline • Metric as data: • dimension reduction • streaming data model (solve problems on very large data sets in one pass using a very small amount of storage) • compact representation • Metric as model: finite metrics in optimization • graph partitioning and clustering • Embedding theorems for finite metrics

  20. Frequency Moments [Alon, Matias, Szegedy ’99] • Data stream is a sequence of elements in [n] • ni : frequency of element i • Fk = Σi ni^k : kth frequency moment • F0 = number of distinct elements • F2 = skewness measure of data stream • Goal: given a data stream, estimate Fk in one pass and sub-linear space

  21. Estimating F2 • Consider a single counter c and randomly chosen xi ∈ {+1, −1} for each i in [n] • On seeing each element i, update c += xi • c = Σi ni·xi • Claim: E[c²] = Σi ni² = F2, Var[c²] ≤ 2(F2)² (4-wise independence suffices) • Average O(1/ε²) copies of this estimator to get a (1+ε) approximation
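A sketch of the estimator in code (for brevity it stores fully random ±1 signs; the space-efficient version generates them from 4-wise independent hash functions):

    import numpy as np

    def ams_f2(stream, universe_size, copies=100, seed=0):
        """One-pass AMS estimator for F2 = sum_i n_i^2."""
        rng = np.random.default_rng(seed)
        signs = rng.choice([-1, 1], size=(copies, universe_size))
        c = np.zeros(copies)
        for i in stream:             # on seeing element i, update each counter
            c += signs[:, i]
        return np.mean(c ** 2)       # E[c^2] = F2; averaging copies cuts variance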

  22. Differences between data streams • ni : frequency of element i in stream 1 • mi : frequency of element i in stream 2 • Goal: measure Σi (ni − mi)² • F2 sketches are additive: Σ ni·xi − Σ mi·xi = Σ (ni − mi)·xi • Basically, dimension reduction in the ℓ2 norm • Very useful primitive, e.g. frequent items [C, Chen, Farach-Colton ’02]
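Since the counters are linear in the frequencies, two streams sketched with the same signs simply subtract; continuing the illustrative ams_f2 sketch above:

    import numpy as np

    def f2_distance(c1, c2):
        """Given counter vectors built from two streams with the SAME sign
        vectors, mean((c1 - c2)^2) estimates sum_i (n_i - m_i)^2."""
        return np.mean((c1 - c2) ** 2)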

  23. Estimate ℓ1 norms? [Indyk ’00] • p-stable distribution: distribution over R such that Σ ni·xi is distributed as (Σ |ni|^p)^(1/p) · X, where X is p-stable • Cauchy distribution c(x) = 1/(π(1+x²)) is 1-stable • Gaussian distribution is 2-stable • As before, c = Σ ni·xi • Cauchy does not have finite expectation! • Estimate scale factor by taking the median
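An illustrative sketch of the ℓ1 estimator (a streaming implementation would update the counters incrementally, as in the F2 sketch):

    import numpy as np

    def l1_estimate(freqs, copies=201, seed=0):
        """Each counter is an inner product with a Cauchy (1-stable) vector,
        hence distributed as (sum_i |n_i|) * X for a standard Cauchy X.
        The median of |counters| recovers the scale factor, since the
        Cauchy distribution has no finite mean."""
        rng = np.random.default_rng(seed)
        C = rng.standard_cauchy(size=(copies, len(freqs)))
        counters = C @ np.asarray(freqs)
        return np.median(np.abs(counters))   # median(|X|) = 1 for standard Cauchy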

  24. Outline • Metric as data: • dimension reduction • streaming data model • compact representation • Metric as model: finite metrics in optimization • graph partitioning and clustering • Embedding theorems for finite metrics

  25. Similarity Preserving Hash Functions • Similarity function sim(x,y) • Family of hash functions F with a probability distribution such that Pr_{h ∈ F} [h(x) = h(y)] = sim(x,y)

  26. Applications • Compact representation scheme for estimating similarity • Approximate nearest neighbor search [Indyk, Motwani ’98] [Kushilevitz, Ostrovsky, Rabani ’98]

  27. Estimating Set Similarity [Broder, Manasse, Glassman, Zweig ’97] [Broder, C, Frieze, Mitzenmacher ’98] • Collection of subsets of a universe U • Set resemblance: sim(A,B) = |A ∩ B| / |A ∪ B|

  28. Minwise Independent Permutations • Pick a random permutation π of the universe; the sketch of a set A is min(π(A)) • Pr[min(π(A)) = min(π(B))] = |A ∩ B| / |A ∪ B|, an SPH scheme for set resemblance
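A small sketch of the resulting estimator (explicit permutations for clarity; practical schemes use approximately min-wise independent hash functions instead):

    import random

    def minhash_similarity(A, B, universe, trials=200, seed=0):
        """Fraction of random permutations pi with min(pi(A)) == min(pi(B)),
        an unbiased estimate of |A∩B| / |A∪B|."""
        rng = random.Random(seed)
        elems = list(universe)
        agree = 0
        for _ in range(trials):
            rng.shuffle(elems)                          # random permutation pi
            rank = {x: r for r, x in enumerate(elems)}
            agree += min(A, key=rank.get) == min(B, key=rank.get)
        return agree / trials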

  29. Existence of SPH schemes [C ’02] • sim(x,y) admits an SPH scheme if there exists a family of hash functions F such that Pr_{h ∈ F} [h(x) = h(y)] = sim(x,y) • Theorem: if sim(x,y) admits an SPH scheme, then 1 − sim(x,y) satisfies the triangle inequality and embeds into ℓ1 • Rounding procedures for LPs and SDPs yield similarity and distance preserving hashing schemes
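One concrete scheme from [C ’02] comes from random hyperplane rounding of SDPs: h_r(u) = sign(⟨r,u⟩) satisfies Pr[h_r(u) = h_r(v)] = 1 − θ(u,v)/π, where θ is the angle between the vectors. A minimal sketch:

    import numpy as np

    def simhash(u, n_bits=64, seed=0):
        """Random-hyperplane sketch of a vector u.  Use the same seed for all
        vectors so they share hyperplanes; the fraction of agreeing bits
        between two sketches then estimates 1 - angle(u,v)/pi."""
        rng = np.random.default_rng(seed)
        R = rng.normal(size=(n_bits, len(u)))   # hyperplane normals
        return R @ np.asarray(u) >= 0           # boolean bit vector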

  30. Earth Mover Distance (EMD) • LP rounding algorithms for an optimization problem (metric labelling) yield an O(log n) approximate estimator for EMD on n points • Implies that EMD embeds into ℓ1 with distortion O(log n) • (figure: two point sets P and Q with EMD(P,Q))

  31. Outline • Metric as data: • dimension reduction • streaming data model • compact representation • Metric as model: finite metrics in optimization • graph partitioning and clustering • Embedding theorems for finite metrics

  32. Graph partitioning problems • Given a graph, partition the vertices into U, V • Maximum cut: maximize |E(U,V)| • Sparsest cut: minimize |E(U,V)| / (|U|·|V|)

  33. Correlation clustering [Cohen, Richman ’02] [Bansal, Blum, Chawla ’02] • Pairwise judgments: similar (+) and dissimilar (−) • (figure: coreference example with the mentions “Mr. Rumsfeld”, “his”, “The secretary”, “he”, “Saddam Hussein”; example courtesy Shuchi Chawla)

  34. Graph partitioning as a metric problem • Partitioning is equivalent to finding an appropriate {0,1} metric (the cut metric) • possibly with additional constraints • Objective function is linear in the metric • Find the best {0,1} metric; relax to finding the best metric (metric relaxation)
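The equivalence is easy to state in code (a tiny illustrative snippet): a partition induces a {0,1} metric, and cut objectives are linear in its values:

    def cut_metric(side):
        """The {0,1} metric induced by a partition; side maps vertex -> 0 or 1."""
        return lambda u, v: int(side[u] != side[v])

    def cut_size(edges, d):
        """|E(U,V)| written as a linear function of the metric values d(u,v)."""
        return sum(d(u, v) for u, v in edges)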

  35. Metric relaxation approaches • Max Cut [Goemans, Williamson ’94] • map vertices to points on the unit sphere (SDP) • exploit geometry to get a good solution (random hyperplane cut) • Sparsest Cut [Linial, London, Rabinovich ’95] • LP gives the best metric; need an ℓ1 metric • [Bourgain ’85] embeds any metric into ℓ1 with distortion O(log n) • existential theorem can be made algorithmic • O(log n) approximation • recent SDP based O(√log n) approximation [Arora, Rao, Vazirani ’04]
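The Goemans-Williamson rounding step itself is short. This sketch assumes the SDP has already been solved and its unit vectors are given as rows of V:

    import numpy as np

    def random_hyperplane_cut(V, seed=0):
        """Partition vertices by which side of a random hyperplane their SDP
        vectors fall on; in expectation the cut has at least ~0.878 times
        the SDP value."""
        rng = np.random.default_rng(seed)
        r = rng.normal(size=V.shape[1])   # random direction
        return V @ r >= 0                 # boolean side for each vertex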

  36. Metric relaxation approaches • Correlation clustering [C, Guruswami, Wirth ’03] [Emanuel, Fiat ’03] [Immorlica, Karger ’03] • find best [0,1] metric from similarity/dissimilarity data via LP • use the metric to guide clustering: close points in same cluster, distant points in different clusters • “Learning” the best metric? • Note: in many cases, the LP/SDP can be eliminated to yield efficient algorithms

  37. Outline • Metric as data: • dimension reduction • streaming data model • compact representation • Metric as model: finite metrics in optimization • graph partitioning and clustering • Embedding theorems for finite metrics

  38. Some connections to learning • Dimension reduction in ℓ2: • learning mixtures of Gaussians [Dasgupta ’99]: random projections make skewed Gaussians more spherical, making learning easier • learning with large margin [Arriaga, Vempala ’99] [Klivans, Servedio ’04]: random projections preserve the margin; large margin → few dimensions • Kernel methods for SVMs • mappings to ℓ2

  39. Ongoing developments • Notion of intrinsic dimensionality of a metric space [Gupta, Krauthgamer, Lee ’03] [Krauthgamer, Lee, Mendel, Naor ’04] • Doubling dimension: how many balls of radius R are needed to cover a ball of radius 2R? • Complexity measure of a metric space • natural parameter for embeddings • Open: can every metric of constant doubling dimension in ℓ2 be embedded into ℓ2 with O(1) dimensions and O(1) distortion? • not true for ℓ1 • related to learning low dimensional manifolds, PCA, MDS, LLE, Isomap

  40. Some things I didn’t mention • Approximating general metrics via tree metrics • modified notion of distortion • useful for approximation, online algorithms • Many mathematically appealing questions • Embeddings between normed spaces • Spectral methods for approximating matrices (SVD, LSI) • PCA, MDS, LLE, Isomap

  41. Conclusions • Whirlwind tour of finite metrics • Rich algorithmic toolkit for finite metric spaces • Synergy between Computer Science and Mathematics • Exciting area of active research • range from practical applications to deep theoretical questions • Many more applications to be discovered
