Manifold Learning Via Homology

Presenter: Ronen Talmon

Topological Methods in Electrical Engineering and Networks

January 19, 2011

- Sources

P. Niyogi, S. Smale, and S. Weinberger, “Finding the homology of submanifolds with high confidence from random samples”, Discrete & Computational Geometry, 2008.

P. Niyogi, S. Smale, and S. Weinberger, “A topological view of unsupervised learning from noisy data”, to appear, 2011.

Contents

Introduction & Preliminaries

Main Results

Practice

Conclusion

- Modern Data Analysis

Data points in high-dimensional space

Examples from image and audio processing, finance, neuroscience

The data cannot fill up the high-dimensional space uniformly

Usually, the dimensionality of the space is chosen arbitrarily by the user or the acquisition system

The data lie on a low-dimensional structure that conveys their intrinsic degrees of freedom

- Manifold Learning

Principal Component Analysis (PCA)

Linearly project the data onto a lower-dimensional subspace

Along the directions of maximal variance
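
The projection described on this slide can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `pca_project` and its return values are hypothetical and not taken from the talk. It centers the data, takes the SVD, and projects onto the leading right singular vectors, which are the directions of maximal variance.

```python
import numpy as np

def pca_project(data, n_components):
    """Project data (one sample per row) onto its top principal directions."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    explained_variance = singular_values[:n_components] ** 2 / (len(X) - 1)
    return centered @ components.T, components, explained_variance

# Points lying exactly on the line y = 2x in the plane.
line = np.array([[t, 2.0 * t] for t in range(5)])
projected, components, variance = pca_project(line, 1)
```

For data lying on a line, a single component captures all of the variance, and the recovered direction is parallel to the line (up to sign).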

- Manifold Learning

Common manifold learning techniques

ISOMAP [Tenenbaum, 00’], LLE [Roweis & Saul, 00’], Laplacian Eigenmaps [Belkin & Niyogi, 01’], Hessian Eigenmaps [Donoho & Grimes, 02’]

Define pair-wise affinity metric (kernel)

Spectral characterization of the manifold via Spectral Graph Theory
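
A rough sketch of the two steps on this slide, assuming NumPy, a Gaussian kernel as the pairwise affinity, and the unnormalized graph Laplacian. These choices, the function name `laplacian_embedding`, and the `bandwidth` parameter are illustrative assumptions; the methods cited above each use their own affinity and normalization.

```python
import numpy as np

def laplacian_embedding(points, bandwidth, n_components):
    """Embed points using eigenvectors of an unnormalized graph Laplacian."""
    pts = np.asarray(points, dtype=float)
    # Pairwise squared distances, then Gaussian affinities (assumed kernel).
    sq_dists = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order; the first is ~0 (constant vector).
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    return eigenvalues, eigenvectors[:, 1:n_components + 1]

samples = np.array([[float(i), 0.0] for i in range(6)])
eigenvalues, embedding = laplacian_embedding(samples, bandwidth=1.0, n_components=2)
```

The embedding coordinates are the eigenvectors that follow the trivial constant one.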

- Manifold Learning via Homology

Goal: Identify the homology of the submanifold from random samples

Natural topological invariants that provide a good characterization:

The dimension of the 0th homology group is the # of connected components (clustering tasks)

The largest degree with non-trivial homology gives the dimension of the submanifold

While graph-based techniques characterize the samples (the graph), this method characterizes the manifold via the samples.
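
To make these invariants concrete, here is a small sketch that computes Betti numbers of a simplicial complex from its boundary matrices, using ranks over the reals. It assumes NumPy; the function name `betti_numbers` and the hand-built example complexes are illustrative, not part of the talk.

```python
import numpy as np

def betti_numbers(n_vertices, boundaries):
    """boundaries[k-1] is the matrix of the boundary operator from k-chains to (k-1)-chains."""
    dims = [n_vertices] + [np.asarray(b).shape[1] for b in boundaries]
    ranks = [0] + [int(np.linalg.matrix_rank(np.asarray(b, dtype=float)))
                   for b in boundaries] + [0]
    # betti_k = dim(kernel of d_k) - rank(d_{k+1}) = dims[k] - ranks[k] - ranks[k+1]
    return [dims[k] - ranks[k] - ranks[k + 1] for k in range(len(dims))]

# Triangle on vertices 0, 1, 2 with edges (0,1), (0,2), (1,2), one column per edge.
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]])
# Boundary of the filled triangle: (1,2) - (0,2) + (0,1).
d2 = np.array([[1], [-1], [1]])

hollow = betti_numbers(3, [d1])       # a circle
filled = betti_numbers(3, [d1, d2])   # a disk
```

The hollow triangle is a discretized circle: one connected component and non-trivial homology in degree 1, matching its dimension. Filling in the triangle removes the degree-1 class.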

- Condition Number

Definition (Normal Bundle)

[Wikipedia]

- Condition Number

Theorem (Tubular Neighborhood)

- Condition Number

Definition (Condition Number)

Meaning: the largest $\tau$ such that the tube of radius $\tau$ around the manifold $M$ has no self-intersections.

- Computing the Homology

Acquire uniform i.i.d. samples of the manifold

Construct $U$, the union of balls of radius $\varepsilon$ centered at the samples

Exploit the Čech complex to compute the homology of $U$

- Computing the Homology

Definition (Čech Complex)

Computing the simplicial complex: for any set of points, determine whether the balls of radius $\varepsilon$ around these points intersect.

Given a set of points, find the ball with the smallest radius enclosing all these points

The balls intersect iff this smallest radius is at most $\varepsilon$
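
The smallest-enclosing-ball test can be sketched as below. This is an illustration under stated assumptions: it uses NumPy and approximates the enclosing ball with the simple Badoiu-Clarkson iteration, a technique swapped in here because the slides do not say how the ball is computed. The names `enclosing_ball_radius` and `is_cech_simplex` are hypothetical. The returned radius is never smaller than the true one, so the test is slightly conservative right at the threshold.

```python
import numpy as np

def enclosing_ball_radius(points, iterations=2000):
    """Approximate radius of the smallest ball enclosing the points."""
    pts = np.asarray(points, dtype=float)
    center = pts[0].copy()
    for i in range(1, iterations + 1):
        # Move the center a shrinking step toward the current farthest point.
        farthest = pts[np.argmax(np.linalg.norm(pts - center, axis=1))]
        center += (farthest - center) / (i + 1)
    return float(np.max(np.linalg.norm(pts - center, axis=1)))

def is_cech_simplex(points, eps, iterations=2000):
    """True if the eps-balls around the points (approximately) share a common point."""
    return enclosing_ball_radius(points, iterations) <= eps

# Equilateral triangle with side length 1; its enclosing radius is 1/sqrt(3).
triangle = [[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0]]
radius = enclosing_ball_radius(triangle)
```

At eps = 0.5 every pair of balls around the triangle's vertices touches, yet the three balls have no common point, so the triangle is not a simplex of the Čech complex at that scale.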

- The Homology

Theorem (main result)

Note: in practice, the manifold is unknown, and hence so is its condition number

- Dense Sampling

Definition

- Retract

Definition (retract)

- Deformation Retract

Definition (Deformation Retract)

Meaning: a deformation retraction is a homotopy between a retraction and the identity map on the space

A deformation retract is a special case of homotopy equivalence

It implies that the space and its retract have the same homology groups.

[Wikipedia]

- Deterministic Setting

Proposition

Intuition:

For “complex” manifolds (small $\tau$):

Requires dense sampling

$U$ is “tight” to $M$

- The Deformation Retract

Proposition

Intuition:

For “complex” manifolds (small $\tau$):

Requires dense sampling

$U$ is “tight” to $M$

- Probability Bounds

Proposition

where

- Practical Extensions

In practice:

The samples are not uniformly distributed on the manifold

Noisy data – the samples concentrate around the manifold, but do not lie exactly on it

“Clean” noisy samples and simplify the construction of the complex

The Combinatorial Laplacian

- Probability Model

Probability Model

Key question: whether the homology of the manifold can be inferred from examples drawn according to the probability model

Investigated under the strong variance condition:

- Mixture of Gaussians

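Consider the widely-used probability distribution:

$p(x) = \sum_{i=1}^{k} w_i \, \mathcal{N}(x; \mu_i, \sigma^2 I)$, with $w_i > 0$ and $\sum_{i=1}^{k} w_i = 1$

Relate to the setting:

A manifold consisting of $k$ points (the means $\mu_1, \dots, \mu_k$)

The probability distribution on the manifold is given by the weights $w_i$

The probability distribution of the noise is a single Gaussian with mean 0 and variance $\sigma^2$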

- Mixture of Gaussians

Given a collection of points sampled from a mixture of Gaussians

Learning the homology of the underlying manifold yields:

The # of connected components (0th Betti number) equals the # of Gaussians (and the # of clusters)

Through higher Betti numbers, whether the connected components retract to a point
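
As an illustration of the first point, the sketch below draws samples from a mixture of isotropic Gaussians and counts the connected components of the union of balls around the samples; two balls of radius `eps` overlap when their centers are at most `2 * eps` apart. It assumes NumPy; the function names, the means, the weights, the noise level, and the scale `eps` are all arbitrary choices made for this example, not values from the talk.

```python
import numpy as np

def sample_mixture(means, sigma, weights, n, seed=0):
    """Draw n samples from a mixture of isotropic Gaussians with common std sigma."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    labels = rng.choice(len(means), size=n, p=weights)
    return means[labels] + sigma * rng.standard_normal((n, means.shape[1]))

def count_components(points, eps):
    """Number of connected components of the union of eps-balls (union-find)."""
    pts = np.asarray(points, dtype=float)
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(pts)):
        dists = np.linalg.norm(pts[i + 1:] - pts[i], axis=1)
        for j in np.flatnonzero(dists <= 2.0 * eps) + i + 1:
            root_i, root_j = find(i), find(int(j))
            if root_i != root_j:
                parent[root_j] = root_i
    return len({find(i) for i in range(len(pts))})

samples = sample_mixture([[0, 0], [10, 0], [0, 10]], sigma=0.1,
                         weights=[0.5, 0.3, 0.2], n=300)
n_clusters = count_components(samples, eps=1.0)
```

With the noise this small relative to the spacing of the means, the component count matches the number of Gaussians. The count is sensitive to `eps`: too small and clusters fragment, too large and they merge.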

- The Algorithm

Cleaning procedure:

The # of samples is chosen to guarantee that the manifold is well covered by balls around these points

Random sampling tends to oversample certain regions on the way to full coverage

Therefore the “extra” sample points may be disregarded

Disregard the more noisy samples

Choosing a minimal covering set from the data makes the associated simplicial complex simpler, and the boundary maps in the chain complex sparse
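
One possible reading of this cleaning procedure is sketched below. It is not the algorithm from the talk, whose details are not reproduced here. The sketch assumes NumPy, treats a sample as noisy when it has fewer than `min_neighbors` other samples within `radius`, and then picks a covering subset greedily, which gives a valid cover but not necessarily a minimal one. The function name and both parameters are hypothetical.

```python
import numpy as np

def clean_samples(points, radius, min_neighbors):
    """Drop isolated samples, then greedily pick centers whose radius-balls cover the rest."""
    pts = np.asarray(points, dtype=float)
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    # Neighbor count within the radius, excluding the point itself.
    degree = (dists <= radius).sum(axis=1) - 1
    kept = np.flatnonzero(degree >= min_neighbors).tolist()

    centers = []
    uncovered = set(kept)
    for i in kept:
        if i in uncovered:
            centers.append(i)
            uncovered -= {j for j in uncovered if dists[i, j] <= radius}
    return kept, centers

data = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0]]
kept, centers = clean_samples(data, radius=0.25, min_neighbors=1)
```

In this toy input the far-away point is discarded as noise, and one center suffices to cover the three remaining samples, so the complex built on the centers is much smaller than one built on all samples.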

- The Algorithm

- The Algorithm - details

Choice of parameters:

The radius is :

The threshold is:

The nerve scale:

- Main Theorem

Theorem

Notes:

The probability distribution is supported on all of the ambient Euclidean space

However, it concentrates around a low-dimensional structure (for sufficiently small noise variance)

- Example

Mixture of Gaussians:

The manifold is a set of points

The manifold condition number is given by

The homology can be computed (e.g., the task of clustering), when the variance of the Gaussians is small relative to the distance between their means.

- The Combinatorial Laplacian

Recall:

- The Combinatorial Laplacian

Definition (Combinatorial Laplacian)

Remark: the degree-0 Laplacian corresponds to the standard graph-Laplacian

Its domain is the set of functions on the vertex set of the complex

The 1-simplices are the set of edges of the complex

- Example

[Figure: example simplicial complex, with labels 1 to 4]

The boundary operator:

- Example

[Figure: example simplicial complex, with labels 1 to 4]

The boundary operator:

- Example

[Figure: example graph on vertices 1 to 4]

The graph-Laplacian:

- The Combinatorial Laplacian

Claim [Friedman, 1998]

Remark:

The dimensionality of the null-space of the graph-Laplacian gives the # of connected components of the graph

Related to the # of connected components of the manifold

In practice, the # of connected components is interpreted as the # of clusters (in classical spectral clustering)
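
The remark about the null-space can be checked numerically. The sketch below assumes NumPy and builds the graph Laplacian as the product of the vertex-edge boundary matrix with its transpose, which equals the degree matrix minus the adjacency matrix. The example graph (a triangle plus an isolated vertex) and the function names are made up for illustration and are not the graph from the slides.

```python
import numpy as np

def boundary_matrix(n_vertices, edges):
    """Vertex-by-edge boundary matrix: column for edge (u, v) is e_v - e_u."""
    d1 = np.zeros((n_vertices, len(edges)))
    for col, (u, v) in enumerate(edges):
        d1[u, col] = -1.0
        d1[v, col] = 1.0
    return d1

def graph_laplacian(n_vertices, edges):
    d1 = boundary_matrix(n_vertices, edges)
    return d1 @ d1.T

def n_connected_components(n_vertices, edges):
    """Dimension of the null-space of the graph Laplacian."""
    laplacian = graph_laplacian(n_vertices, edges)
    return n_vertices - int(np.linalg.matrix_rank(laplacian))

edges = [(0, 1), (1, 2), (0, 2)]   # a triangle; vertex 3 is isolated
laplacian = graph_laplacian(4, edges)
components = n_connected_components(4, edges)
```

The diagonal of the Laplacian holds the vertex degrees, the off-diagonal entries are -1 for each edge, and the null-space dimension counts the triangle and the isolated vertex as two components.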

- Conclusion

Computing the homology of a manifold from samples

Sufficient sampling density w.r.t. the manifold “complexity”

Number of uniform random samples needed to obtain sufficient density

A more practical scenario

Presence of noise

Simplified algorithm adjusted to both the “complexity” of the manifold and the noise variance

The construction of the combinatorial Laplacian based on the homology,and the connection to common manifold learning techniques

Thank you
