
Advances in Metric Embedding Theory


Presentation Transcript


  1. Advances in Metric Embedding Theory. Ofer Neiman, Ittai Abraham, Yair Bartal (Hebrew University)

  2. Talk Outline Current results: • New method of embedding. • New partition techniques. • Constant average distortion. • Extend notions of distortion. • Optimal results for scaling embeddings. • Tradeoff between distortion and dimension. Work in progress: • Low dimension embedding for doubling metrics. • Scaling distortion into a single tree. • Nearest neighbors preserving embedding.

  3. Embedding Metric Spaces • Metric spaces (X,dX), (Y,dY) • An embedding is a function f:X→Y • For a non-contracting embedding f and u,v є X, let distf(u,v) = dY(f(u),f(v)) / dX(u,v) • The distortion is c if max{u,v є X} distf(u,v) ≤ c
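A minimal sketch of this definition (helper names are illustrative, not from the talk); it assumes the embedding is given as rows of a point array in Euclidean space and is non-contracting:

```python
import numpy as np

def dist_f(d_X, f_points, u, v):
    # dist_f(u,v) = dY(f(u), f(v)) / dX(u,v) for a non-contracting embedding f
    d_y = np.linalg.norm(f_points[u] - f_points[v])
    return d_y / d_X[u][v]

def worst_case_distortion(d_X, f_points):
    # distortion c = max over all pairs u != v of dist_f(u,v)
    n = len(f_points)
    return max(dist_f(d_X, f_points, u, v)
               for u in range(n) for v in range(u + 1, n))
```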

  4. Low-Dimension Embeddings into Lp For an arbitrary metric space on n points: • [Bourgain 85]: distortion O(log n) • [LLR 95]: distortion Θ(log n), dimension O(log^2 n) • Can the dimension be reduced? • For p=2, yes: using [JL], down to dimension O(log n) • Theorem: embedding into Lp with distortion O(log n) and dimension O(log n), for any p. • Theorem: distortion O(log^(1+θ) n), dimension Θ(log n / (θ loglog n))
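For p = 2, the [JL] step mentioned above is a random Gaussian projection; a standard textbook sketch, not code from the talk (the target dimension and constant are illustrative):

```python
import numpy as np

def jl_project(points, eps=0.5, seed=0):
    # Johnson-Lindenstrauss: project n points in R^d down to k = O(log n / eps^2)
    # dimensions via a random Gaussian matrix; all pairwise L2 distances are
    # preserved within a factor of (1 +/- eps) with high probability.
    rng = np.random.default_rng(seed)
    n, d = points.shape
    k = max(1, int(np.ceil(4 * np.log(n) / eps ** 2)))
    G = rng.normal(size=(d, k)) / np.sqrt(k)
    return points @ G
```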

  5. Average Distortion Embeddings • In many practical uses, the quality of an embedding is measured by its average distortion • Network embedding • Multi-dimensional scaling • Biology • Vision • Theorem: Every n-point metric space can be embedded into Lp with average distortion O(1), worst-case distortion O(log n), and dimension O(log n).

  6. Variation on distortion: The Lq-distortion of an embedding • Given a non-contracting embedding f from (X,dX) to (Y,dY): • Define its Lq-distortion as (E[distf(u,v)^q])^(1/q), where the expectation is over a uniformly random pair u,v. Thm: The Lq-distortion is bounded by O(q)
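An illustrative computation of the Lq-distortion of a given embedding, following the definition above (helper names hypothetical):

```python
import numpy as np

def lq_distortion(d_X, f_points, q):
    # Lq-distortion = ( E[ dist_f(u,v)^q ] )^(1/q), expectation over uniform pairs u < v
    n = len(f_points)
    ratios = [np.linalg.norm(f_points[u] - f_points[v]) / d_X[u][v]
              for u in range(n) for v in range(u + 1, n)]
    return float(np.mean(np.array(ratios) ** q) ** (1.0 / q))

# q = 1 gives the average distortion; q -> infinity recovers the worst-case distortion.
```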

  7. Partial & Scaling Distortion • Definition: A (1-ε)-partial embedding has distortion D(ε), if at least 1-ε of the pairs satisfy dist(u,v)<D(ε). • Definition: An embedding has scaling distortion D(·) if it is a 1-ε partial embedding with distortion D(ε), for all ε>0 simultaneously. • [KSW 04]: • Introduce the problem in context of network embeddings. • Initial results. • [A+ 05]: • Partial distortion and dimensionO(log(1/ε)) for all metrics. • Scaling distortion O(log(1/ε)) for doubling metrics. • Thm: Scaling distortion O(log(1/ε)) for all metrics.

  8. Lq-Distortion Vs Scaling Distortion • A lower bound of Ω(log 1/ε) on partial distortion implies: Lq-distortion = Ω(min{q, log n}). • An upper bound of O(log 1/ε) on scaling distortion implies: • Lq-distortion = O(min{q, log n}). • Average distortion = O(1). • Distortion = O(log n). • For any metric: • ½ of the pairs have distortion ≤ c·log(2) = c • +¼ of the pairs have distortion ≤ c·log(4) = 2c • +⅛ of the pairs have distortion ≤ c·log(8) = 3c • …. • +1/n^2 of the pairs have distortion ≤ 2c·log(n) • For ε < 1/n^2, no pairs are ignored.
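Summing the dyadic levels above makes the constant-average bound explicit (c is the constant from the scaling bound O(log 1/ε)):

\[ \text{average distortion} \;\le\; \sum_{i \ge 1} 2^{-i}\,(c \cdot i) \;=\; c \sum_{i \ge 1} \frac{i}{2^{i}} \;=\; 2c \;=\; O(1). \]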

  9. Probabilistic Partitions • P={S1,S2,…,St} is a partition of X if the clusters Si are pairwise disjoint and their union is X. • P(x) is the cluster containing x. • P is Δ-bounded if diam(Si) ≤ Δ for all i. • A probabilistic partition P is a distribution over a set of partitions. • P is η-padded if Pr[B(x, ηΔ) ⊆ P(x)] ≥ ½ for every x є X.
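One standard way to sample such a Δ-bounded probabilistic partition is a CKR/FRT-style construction: pick a random radius and carve ball pieces around centers in random order. This is an illustrative sketch, not the exact partition of the talk (points are assumed to be hashable identifiers and d a metric function):

```python
import random

def random_partition(points, d, delta, rng=random.Random(0)):
    """Sample a Delta-bounded partition: carve pieces of balls of a random
    radius in [delta/4, delta/2) around centers taken in random order,
    so every cluster has diameter < delta."""
    order = list(points)
    rng.shuffle(order)
    r = rng.uniform(delta / 4, delta / 2)   # one random radius, CKR-style
    cluster_of, clusters = {}, []
    for c in order:
        members = [x for x in points
                   if x not in cluster_of and d(c, x) <= r]
        if members:
            clusters.append(members)
            for x in members:
                cluster_of[x] = len(clusters) - 1
    return clusters, cluster_of
```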

  10. Partitions and Embedding • Let Δi = 4^i be the scales. • For each scale i, create a probabilistic Δi-bounded partition Pi that is η-padded. • For each cluster choose σi(S) ~ Ber(½) i.i.d., and set fi(x) = σi(Pi(x)) · d(x, X \ Pi(x)) • Repeat O(log n) times. • Distortion: O(η^(-1) · log^(1/p) Δ). • Dimension: O(log n · log Δ), where Δ is the diameter of X.
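A sketch of how a single coordinate of this embedding is assembled across scales, reusing the random_partition sketch above (the scale range and helper names are illustrative assumptions):

```python
import math, random

def embedding_coordinate(points, d, max_diam, rng=random.Random(1)):
    """One coordinate: sum over scales i of
       f_i(x) = sigma_i(P_i(x)) * d(x, X \\ P_i(x)),
    with Delta_i = 4^i and sigma_i(S) ~ Bernoulli(1/2) per cluster."""
    coord = {x: 0.0 for x in points}
    num_scales = max(1, math.ceil(math.log(max_diam, 4)))
    for i in range(num_scales + 1):
        delta = 4 ** i
        clusters, cluster_of = random_partition(points, d, delta, rng)
        sigma = {j: rng.random() < 0.5 for j in range(len(clusters))}
        for x in points:
            j = cluster_of[x]
            if sigma[j]:
                outside = [y for y in points if cluster_of[y] != j]
                if outside:
                    coord[x] += min(d(x, y) for y in outside)
    return coord

# Repeating this O(log n) times with fresh randomness gives the full embedding.
```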

  11. Upper Bound fi(x) = σi(Pi(x)) · d(x, X \ Pi(x)) • For all x,y є X: • Pi(x) ≠ Pi(y) implies d(x, X \ Pi(x)) ≤ d(x,y) • Pi(x) = Pi(y) implies d(x,A) - d(y,A) ≤ d(x,y), where A = X \ Pi(x)

  12. Lower Bound: • Take a scale i such that Δi ≈ d(x,y)/4. • It must be that Pi(x) ≠ Pi(y) • With probability ½: d(x, X \ Pi(x)) ≥ ηΔi • With probability ¼: σi(Pi(x)) = 1 and σi(Pi(y)) = 0

  13. η-padded Partitions • The parameter η determines the quality of the embedding. • [Bartal 96]: η = Ω(1/log n) for any metric space. • [Rao 99]: η = Ω(1), used to embed planar metrics into L2. • [CKR01+FRT03]: Improved partitions with η(x) = 1/log ρ(x,Δ). • [KLMN 03]: Used to embed general and doubling metrics into Lp: distortion O(η^(-(1-1/p)) · log^(1/p) n), dimension O(log^2 n). The local growth rate ρ(x,r) of x at radius r is the ratio between the sizes of the balls around x at radius r and at a constant-factor smaller radius.

  14. Uniform Probabilistic Partitions • In a uniform probabilistic partition, η:X→[0,1] • All points in a cluster have the same padding parameter. • Uniform partition lemma: There exists a uniform probabilistic Δ-bounded partition such that for every x, η(x) = 1/log ρ(v,Δ), where v is the point of P(x) with the smallest local growth rate (v = argmin{u є P(x)} ρ(u,Δ)).

  15. Embedding into one dimension • Let Δi = 4^i. • For each scale i, create uniformly padded probabilistic Δi-bounded partitions Pi. • For each cluster choose σi(S) ~ Ber(½) i.i.d., and set fi(x) = σi(Pi(x)) · (1/ηi(x)) · d(x, X \ Pi(x)) • Upper bound: |f(x)-f(y)| ≤ O(log n) · d(x,y). • Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)) • Replicate D = Θ(log n) times to get high probability.

  16. Upper Bound: |f(x)-f(y)| ≤ O(log n) · d(x,y) • For all x,y є X: - Pi(x) ≠ Pi(y) implies fi(x) ≤ (1/ηi(x)) · d(x,y) - Pi(x) = Pi(y) implies fi(x) - fi(y) ≤ (1/ηi(x)) · d(x,y) • Use the uniform padding within each cluster

  17. Lower Bound: • Take a scale i such that Δi ≈ d(x,y)/4. • It must be that Pi(x) ≠ Pi(y) • With probability ½: fi(x) = (1/ηi(x)) · d(x, X \ Pi(x)) ≥ Δi

  18. Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)) • Two cases: • R < Δi/2: • with prob. ⅛: σi(Pi(x)) = 1 and σi(Pi(y)) = 0 • then fi(x) ≥ Δi, fi(y) = 0 • |f(x)-f(y)| ≥ Δi/2 = Ω(d(x,y)). • R ≥ Δi/2: • with prob. ¼: σi(Pi(x)) = 0 and σi(Pi(y)) = 0 • fi(x) = fi(y) = 0 • |f(x)-f(y)| ≥ Δi/2 = Ω(d(x,y)).

  19. Coarse Scaling Embedding into Lp • Definition: For u є X, rε(u) is the minimal radius such that |B(u,rε(u))| ≥ εn. • Coarse scaling embedding: for each u є X, preserves distances outside B(u,rε(u)).
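The radius rε(u) is simply a distance quantile of u's row of the distance matrix; a minimal sketch (the name r_eps is hypothetical):

```python
import math

def r_eps(d_X, u, eps):
    # r_eps(u): the smallest radius r with |B(u, r)| >= eps * n,
    # i.e. the distance from u to its ceil(eps*n)-th closest point (u itself included).
    n = len(d_X)
    dists = sorted(d_X[u][v] for v in range(n))   # includes d(u,u) = 0
    k = max(1, math.ceil(eps * n))
    return dists[k - 1]
```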

  20. Scaling Distortion • Claim: If d(x,y) > rε(x) then 1 ≤ distf(x,y) ≤ O(log 1/ε) • Let l be the scale such that d(x,y) ≤ Δl < 4·d(x,y) • Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)) • Upper bound for high-diameter terms • Upper bound for low-diameter terms • Replicate D = Θ(log n) times to get high probability.

  21. Upper bound for high-diameter terms: |f(x)-f(y)| ≤ O(log 1/ε) · d(x,y) • Take the scale l such that rε(x) ≤ d(x,y) ≤ Δl < 4·d(x,y).

  22. Upper bound for low-diameter terms: |f(x)-f(y)| = O(1) · d(x,y) • Take the scale l such that d(x,y) ≤ Δl < 4·d(x,y). • All lower levels i ≤ l are bounded by Δi.

  23. Embedding into Lp • A partition P is (η,δ)-padded if Pr[B(x, η(x)Δ) ⊆ P(x)] ≥ δ. • Lemma: there exist (η,δ)-padded partitions with η(x) = log(1/δ) / log ρ(v,Δ), where v = argmin{u є P(x)} ρ(u,Δ). • Hierarchical partition: every cluster at level i refines a cluster at level i+1. • Theorem: Every n-point metric space can be embedded into Lp with dimension O(e^p · log n), and for every q its Lq-distortion is O(min{q, log n}).

  24. Embedding into Lp • Embedding into Lp with scaling distortion: • Use partitions with a small probability of padding: δ = e^(-p). • Hierarchical uniform partitions. • Combination with Matousek's sampling techniques.

  25. Low Dimension Embeddings • Embedding with distortion O(log^(1+θ) n), dimension Θ(log n / (θ loglog n)). • Optimal trade-off between distortion and dimension. • Use partitions with a high probability of padding: δ = 1 - log^(-θ) n.

  26. Additional Results: Weighted Averages • Embedding with weighted average distortion O(log Ψ) for weights with aspect ratio Ψ • Algorithmic applications: • Sparsest cut, • Uncapacitated quadratic assignment, • Multiple sequence alignment.

  27. Low Dimension Embeddings: Doubling Metrics • Definition: A metric space has doubling constant λ if any ball of radius r > 0 can be covered by λ balls of half the radius. • Doubling dimension = log λ. • [GKL03]: Embedding doubling metrics, with tight distortion. • Thm: Embedding arbitrary metrics into Lp with distortion O(log^(1+θ) n), dimension O(log λ). • Same embedding, with similar techniques. • Use nets. • Use the Lovász Local Lemma. • Thm: Embedding arbitrary metrics into Lp with distortion O(log^(1-1/p) λ · log^(1/p) n), dimension Õ(log n · log λ). • Use hierarchical partitions as well.

  28. Scaling Distortion into Trees • [A+ 05]: Probabilistic embedding into a distribution of ultrametrics with scaling distortion O(log(1/ε)). • Thm: Embedding into an ultrametric with scaling distortion . • Thm: Every graph contains a spanning tree with scaling distortion . • These imply: • Average distortion = O(1). • L2-distortion = • Can be viewed as a network design objective. • Thm: Probabilistic embedding into a distribution of spanning trees with scaling distortion Õ(log^2(1/ε)).

  29. New Results: Nearest-Neighbors Preserving Embeddings • Definition: x,y are k-nearest neighbors if |B(x,d(x,y))| ≤ k. • Thm: Embedding into Lp with distortion Õ(log k) on k-nearest neighbors, for all k simultaneously, with dimension O(log n). • Thm: For fixed k, embedding into Lp with distortion O(log k) and dimension O(log k). • Practically the same embedding. • Every level is scaled down; higher levels more aggressively. • Uses the Lovász Local Lemma.

  30. Nearest-Neighbors Preserving Embeddings • Thm: Probabilistic embedding into a distribution of ultrametrics with distortion Õ(log k) for all k-nearest neighbors. • Thm: Embedding into an ultrametric with distortion k-1 for all k-nearest neighbors. • Applications : • Sparsest-cut with “neighboring” demand pairs. • Approximate ranking / k-nearest neighbors search.

  31. Conclusions • Unified framework for embedding arbitrary metrics. • New measures of distortion. • Embeddings with improved properties: • Optimal scaling distortion. • Constant average distortion. • Tight distortion-dimension tradeoff. • Embedding metrics in their doubling dimension. • Nearest-neighbors preserving embedding. • Constant average distortion spanning trees.
