
Entropic graphs: Applications



Presentation Transcript


  1. Entropic graphs: Applications Alfred O. Hero Dept. EECS, Dept. BME, Dept. Statistics University of Michigan - Ann Arbor hero@eecs.umich.edu http://www.eecs.umich.edu/~hero • Dimension reduction and pattern matching • Entropic graphs for manifold learning • Simulation studies • Applications to face and digit databases

  2. 1. Dimension Reduction and Pattern Matching • 128x128 images of faces • Different poses, illuminations, facial expressions • The set of all face images evolves on a lower-dimensional manifold embedded in R^(16384)

  3. Face Manifold

  4. Classification on Face Manifold

  5. Manifold Learning: What is it good for? • Interpreting high-dimensional data • Discovery and exploitation of lower-dimensional structure • Deducing non-linear dependencies between populations • Improving detection and classification performance • Improving image compression performance

  6. Background on Manifold Learning • Manifold intrinsic dimension estimation • Local KLE, Fukunaga, Olsen (1971) • Nearest neighbor algorithm, Pettis, Bailey, Jain, Dubes (1971) • Fractal measures, Camastra and Vinciarelli (2002) • Packing numbers, Kegl (2002) • Manifold reconstruction • Isomap-MDS, Tenenbaum, de Silva, Langford (2000) • Locally Linear Embedding (LLE), Roweis, Saul (2000) • Laplacian eigenmaps (LE), Belkin, Niyogi (2002) • Hessian eigenmaps (HE), Grimes, Donoho (2003) • Characterization of sampling distributions on manifolds • Statistics of directional data, Watson (1956), Mardia (1972) • Data compression on 3D surfaces, Kolarov, Lynch (1997) • Statistics of shape, Kendall (1984), Kent, Mardia (2001)

  7. Sampling on a Domain Manifold (figure: a 2-dim domain manifold with its sampling distribution; sampling in the domain yields a statistical sample, and the embedding yields the observed sample)

  8. Learning 3D Manifolds • Swiss Roll, N=400 (ref: Tenenbaum et al., 2000) • S-Curve, N=800 (ref: Roweis et al., 2000) • Sampling density f_y = uniform on manifold

  9. Sampled S-Curve What is the shortest path between points A and B along the manifold? The geodesic from A to B is the shortest path; the straight-line Euclidean path is a poor approximation.

  10. Geodesic Graph Path Approximation Dijkstra's shortest-path algorithm on the k-NNG skeleton (k=4) approximates the geodesic from A to B.

  11. ISOMAP (PCA) Reconstruction • Compute k-NN skeleton on observed sample • Run Dijkstra's shortest path algorithm between all pairs of vertices of the k-NN graph • Generate the geodesic pairwise distance matrix approximation • Perform MDS on this distance matrix • Reconstruct sample in manifold domain (see the sketch below)
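
A minimal sketch of these five steps in Python, assuming NumPy, SciPy and scikit-learn, with X an (n, D) NumPy array; `isomap` is a hypothetical helper name, not the authors' code, and it assumes the k-NN skeleton is connected:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, k=4, d=2):
    # 1. k-NN skeleton on the observed sample, edges weighted by Euclidean length
    G = kneighbors_graph(X, n_neighbors=k, mode="distance")
    # 2.-3. Dijkstra between all vertex pairs gives the geodesic distance matrix approximation
    D = shortest_path(G, method="D", directed=False)
    # 4. classical MDS: double-center the squared distances, then eigendecompose
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:d]
    # 5. reconstructed sample in the d-dimensional manifold domain
    return V[:, top] * np.sqrt(np.maximum(w[top], 0))
```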

  12. ISOMAP Convergence • When the domain mapping is an isometry, the domain is open and convex, and the true domain dimension d is known, ISOMAP recovers the domain sample up to rigid motion (de Silva et al., 2001) • Open questions: How to estimate d? How to estimate attributes of the sampling density?
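
The convergence guarantee (an equation image in the original slide) has, paraphrasing from memory of the graph-approximation result of Bernstein, de Silva, Langford and Tenenbaum, the following flavor: for suitable sampling density and neighborhood size, with high probability the graph distances uniformly bracket the manifold geodesic distances,

$$(1-\lambda_1)\, d_M(x,y) \;\le\; d_G(x,y) \;\le\; (1+\lambda_2)\, d_M(x,y) \quad \text{for all sample pairs } x, y.$$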

  13. How to Estimate d? (figure: Landmark-ISOMAP residual curve for the Abilene Netflow data set)

  14. 2. Entropic Graphs • X_n = {X_1, ..., X_n}: a sample of n points in D-dimensional Euclidean space • Euclidean MST with edge power weighting gamma: minimize the total gamma-weighted edge length over all spanning trees of X_n • Euclidean k-NNG with edge power weighting gamma: sum the gamma-weighted lengths of each point's k nearest-neighbor edges • When Euclidean edge lengths are replaced by geodesic (graph) distances, obtain the Geodesic MST (GMST)
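
The slide's length functionals (equation images in the original) are, reconstructed in the standard notation of Costa & Hero:

$$L_\gamma^{\mathrm{MST}}(X_n) = \min_{T \in \mathcal{T}(X_n)} \sum_{e \in T} |e|^\gamma, \qquad L_{\gamma,k}^{\mathrm{NN}}(X_n) = \sum_{i=1}^{n} \sum_{j \in N_k(i)} |X_i - X_j|^\gamma, \qquad 0 < \gamma < D,$$

where T(X_n) is the set of spanning trees over X_n, |e| is the Euclidean edge length, and N_k(i) indexes the k nearest neighbors of X_i. The GMST replaces |e| by the geodesic distance approximated on the k-NNG skeleton.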

  15. Example: Uniform Planar Sample

  16. Example: MST on Planar Sample

  17. Example: k-NNG on Planar Sample

  18. Convergence of Euclidean MST Beardwood-Halton-Hammersley theorem (stated below):
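
The theorem (an equation image in the original slide), in its power-weighted form due to Steele, sketched from standard sources: if X_1, ..., X_n are i.i.d. with Lebesgue density f on [0,1]^d, d >= 2, and 0 < gamma < d, then

$$\lim_{n\to\infty} \frac{L_\gamma^{\mathrm{MST}}(X_n)}{n^{(d-\gamma)/d}} \;=\; \beta_{d,\gamma} \int_{[0,1]^d} f(x)^{(d-\gamma)/d}\, dx \quad \text{a.s.},$$

where beta_{d,gamma} is a constant depending only on d and gamma, not on f.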

  19. GMST Convergence Theorem (Ref: Costa & Hero, IEEE T-SP 2003; stated below)
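
The statement (an equation image in the original) is, from memory of Costa & Hero's result, the manifold analogue of BHH: for Y_1, ..., Y_n i.i.d. with density f on a smooth compact d-dimensional manifold M embedded in R^D,

$$\lim_{n\to\infty} \frac{L_\gamma^{\mathrm{GMST}}(Y_n)}{n^{(d-\gamma)/d}} \;=\; \beta_{d,\gamma} \int_M f(y)^{(d-\gamma)/d}\, d\mu(y) \quad \text{a.s.},$$

so the growth exponent reveals the intrinsic dimension d, and the limit determines the Renyi alpha-entropy of f with alpha = (d-gamma)/d.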

  20. k-NNG Convergence Theorem
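
From memory, the k-NNG analogue (the slide's equation image): the power-weighted k-NN graph length obeys the same growth law with a different constant,

$$\lim_{n\to\infty} \frac{L_{\gamma,k}^{\mathrm{NN}}(Y_n)}{n^{(d-\gamma)/d}} \;=\; \beta_{d,\gamma,k} \int_M f(y)^{(d-\gamma)/d}\, d\mu(y) \quad \text{a.s.},$$

so both the GMST and the k-NNG lengths follow the log-linear model exploited by the joint estimation algorithm on the next slides.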

  21. Shrinkwrap Interpretation (figures: resampled graphs at n=400 and n=800) Dimension = "shrinkage rate" of the total graph length as the number of resampled points on M varies.

  22. Joint Estimation Algorithm • The convergence theorem suggests the log-linear model log L_n = a log n + b • Use bootstrap resampling to estimate the mean graph length and apply least squares (LS) to jointly estimate slope and intercept from the sequence of sample sizes • Extract d and H from the slope and intercept (a sketch follows below)
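
A minimal sketch of the fit in Python, assuming NumPy/SciPy; `mst_length`, `fit_dim_entropy`, and the placeholder `log_beta = 0.0` (the constant beta_{d,gamma} must in practice be tabulated or simulated) are hypothetical names, not the authors' code:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_length(X, gamma=1.0):
    # total power-weighted edge length of the Euclidean MST
    return minimum_spanning_tree(squareform(pdist(X) ** gamma)).sum()

def fit_dim_entropy(ns, mean_lengths, gamma=1.0, log_beta=0.0):
    # LS fit of the log-linear model  log L_n = a*log(n) + b
    a, b = np.polyfit(np.log(ns), np.log(mean_lengths), 1)
    a = min(a, 1.0 - 1e-9)                 # guard against a degenerate slope
    d = int(round(gamma / (1.0 - a)))      # slope: a = (d - gamma)/d
    H = (d / gamma) * (b - log_beta)       # intercept: b = log(beta) + (gamma/d)*H_alpha
    return d, H                            # H in nats; divide by log(2) for bits
```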

  23. 3. Simulation Studies: Swiss Roll (figures: GMST and kNN graphs, k=4) • n=400, f = uniform on manifold

  24. Estimates of GMST Length (figure: bootstrap SE bars, 83% CI)

  25. Log-log Linear Fit to GMST Length

  26. GMST Dimension and Entropy Estimates • From the LS fit find: intrinsic dimension estimate and alpha-entropy estimate • Compare against ground truth: d=2 for the Swiss Roll

  27. MST/kNN Comparisons (figures: MST and kNN length fits for n=400 and n=800)

  28. Entropic Graphs on the Sphere S2 in 3D (figures: GMST and kNN) • n=500, f = uniform on manifold

  29. k-NNG on the Sphere S4 in 5D • k=7 for all algorithms • kNN resampled 5 times • Length regressed on 10 or 20 samples at the end of the mean length sequence • 30 experiments performed • ISOMAP always estimates d=5 (figure: histogram of resampled d-estimates of the k-NNG for N=1000 points uniformly distributed on the sphere S4 in 5D; table: relative frequencies of correct d estimate)

  30. kNN/GMST Comparisons (tables: relative frequencies of correct d estimate; true vs. estimated entropy, n = 600)

  31. kNN/GMST Comparisons for Uniform Hyperplane (figures: GMST and 4-NN)

  32. Improve Performance by Bootstrap Resampling • Main idea: averaging of weak learners • Using fewer (N) samples per MST estimate, generate a large number (M) of weak estimates of d and H • Reduce bias by averaging these estimates (M>>1, N=1) • This beats optimizing a single estimate of MST length (M=1, N>>1), as sketched below (figure: illustration of the bootstrap resampling method; panels A, B: N=1 vs. panel C: M=1)
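
A minimal sketch of the averaging idea, reusing the hypothetical `mst_length` and `fit_dim_entropy` helpers above; the subsample-size schedule `ns` is an illustrative assumption, not the authors' choice:

```python
def averaged_estimate(X, n_sub, M=30, gamma=1.0, seed=0):
    # average M weak estimates of d, each fit on small random subsamples
    rng = np.random.default_rng(seed)
    ns = np.linspace(n_sub // 2, n_sub, 5, dtype=int)   # subsample sizes for the LS fit
    d_hats = []
    for _ in range(M):
        lengths = [mst_length(X[rng.choice(len(X), n, replace=False)], gamma)
                   for n in ns]
        d, _ = fit_dim_entropy(ns, lengths, gamma)
        d_hats.append(d)
    return int(round(np.mean(d_hats)))   # bias reduced by averaging, then rounded
```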

  33. kNN/GMST Comparisons for Uniform Hyperplane (table: relative frequencies of correct d estimate using the GMST, with (N = 1) and without (M = 1) bias correction)

  34. 4. Application: ISOMAP Face Database • http://isomap.stanford.edu/datasets.html • Synthesized 3D face surface; computer-generated images representing 700 different angles and illuminations • Subsampled to 64 x 64 resolution (D=4096) • Disagreement over intrinsic dimensionality: d=3 (Tenenbaum) vs. d=4 (Kegl) • Entropic-graph estimates: d=4, H=21.8 bits and d=3, H=21.1 bits (figures: mean GMST length function, resampling histogram of d-hat, mean kNNG (k=7) length)

  35. Application: Yale Face Database • Description of the Yale Face Database B • Photographic folios of many people's faces • Each face folio contains images at 585 different illumination/pose conditions • Subsampled to 64 x 64 pixels (4096 extrinsic dimensions) • Objective: determine the intrinsic dimension and entropy of a typical face folio

  36. Samples from Face database B

  37. GMST for 3 Face Folios

  38. Dimension Estimator Histograms for Face Database B (figures: real-valued intrinsic dimension estimates using the 3-NN graph for face 1 and face 2)

  39. Remarks on Yale Face Database B • GMST LS estimation parameters • Local geodesic approximation used to generate the pairwise distance matrix • Estimates based on 25 resamplings over the 18 largest folio sizes • To represent any folio we might hope to attain: • a factor > 600 reduction in degrees of freedom (dimension) • only 1/10 bit per pixel for compression • a practical parameterization/encoder?

  40. Application: MNIST Digit Database Sample: MNIST Handwritten Digits

  41. MNIST Digit Database (figure: histograms of intrinsic dimension estimates, GMST left and 5-NN right; M = 1, N = 10, Q = 15)

  42. MNIST Digit Database ISOMAP (k = 6) residual variance plot. The digits database contains nonlinear transformations, such as width distortions of each digit, that are not adequately modeled by ISOMAP!

  43. Conclusions • Entropic graphs give accurate, global, and consistent estimators of dimension and entropy • Manifold learning and model reduction • LLE, LE, HE estimate d by finding a local linear representation of the manifold • Entropic graphs estimate d from global resampling • Initialization of ISOMAP… with the entropic graph estimator • Computational considerations • GMST, kNN with pairwise distance matrix: O(E log E) • GMST with greedy neighborhood search: O(d n log n) • kNN with kd-tree partitioning: O(d n log n)

  44. References • A. O. Hero, B. Ma, O. Michel and J. D. Gorman, "Applications of entropic spanning graphs," IEEE Signal Processing Magazine, Sept. 2002. • H. Neemuchwala, A. O. Hero and P. Carson, "Entropic graphs for image registration," to appear in European Journal of Signal Processing, 2003. • J. Costa and A. O. Hero, "Manifold learning with geodesic minimal spanning trees," to appear in IEEE T-SP (Special Issue on Machine Learning), 2004. • A. O. Hero, J. Costa and B. Ma, "Convergence rates of minimal graphs with random vertices," submitted to IEEE T-IT, March 2001. • J. Costa, A. O. Hero and C. Vignat, "On solutions to multivariate maximum alpha-entropy problems," in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMM-CVPR), Eds. M. Figueiredo, A. Rangarajan, J. Zerubia, Springer-Verlag, 2003.
