
# Maximum likelihood estimation of intrinsic dimension



### Maximum likelihood estimation of intrinsic dimension

Authors: Elizaveta Levina & Peter J. Bickel

presented by: Ligen Wang

### Plan

• Problem

• Some popular methods

• MLE approach

• Statistical behaviors

• Evaluation

• Conclusions

### Problem

• Facts:

  • Many real-life high-dimensional datasets are not truly high-dimensional

  • They can be effectively summarized in a space of much lower dimension

• Why discover this low-dimensional structure?

  • It helps to improve performance in classification and other applications

• Our target:

  • Determine what this lower dimension actually is, i.e., the intrinsic dimension

• Importance of this lower dimension:

  • If our estimate is too low, distinct features are collapsed onto the same dimension

  • If it is too high, the projection becomes noisy and unstable

### Some popular methods

• PCA

  • The user chooses the dimension according to how much of the variance they want to preserve

• LLE

  • The user provides the manifold dimension

• ISOMAP

  • Provides error curves that can be ‘eyeballed’ to estimate the dimension

• Etc.
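The PCA rule above leaves the dimension to the user's choice of threshold. A minimal sketch of that rule, assuming the PCA eigenvalues (per-component variances) have already been computed; the function name `pca_dimension` and the parameter `variance_to_keep` are illustrative, not taken from the slides:

```python
def pca_dimension(eigenvalues, variance_to_keep=0.95):
    """Smallest number of components whose variance reaches the requested share.

    `eigenvalues` are assumed to be the non-negative eigenvalues of the
    sample covariance matrix; `variance_to_keep` is the user's threshold.
    """
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    running = 0.0
    for d, ev in enumerate(ordered, start=1):
        running += ev
        if running >= variance_to_keep * total:
            return d
    return len(ordered)  # guard against floating-point round-off


# Hypothetical spectrum: three dominant directions, two minor ones.
print(pca_dimension([4.0, 3.0, 2.0, 0.5, 0.5], variance_to_keep=0.85))
```

The answer moves with the threshold, which is why this is a user decision and not an estimate of the intrinsic dimension.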

### Conclusions

• The MLE produces good results on a range of simulated (both noise-free and noisy) and real datasets

• It outperforms two other methods

• It suffers from a negative bias for high dimensions

  • Reason: the approximation is based on observations falling in a small sphere, which requires a very large sample size when the dimension is high

  • Good news: in practice, the intrinsic dimension is low for most interesting applications
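For reference, a minimal sketch of a nearest-neighbour MLE of intrinsic dimension. It assumes the local estimate takes the form m_k(x) = [(1/(k-1)) Σ_{j=1}^{k-1} log(T_k(x)/T_j(x))]^{-1}, where T_j(x) is the distance from x to its j-th nearest neighbour, and that the local estimates are averaged over all points. The function names, the brute-force neighbour search, the choice of k, and the test data are all illustrative assumptions.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Average the local k-nearest-neighbour MLE over all points.

    Assumes the points are distinct (no zero distances) and len(points) > k.
    """
    if not 2 <= k < len(points):
        raise ValueError("need 2 <= k < number of points")
    local = []
    for i, x in enumerate(points):
        # Brute-force distances from x to every other point, ascending.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        t = dists[:k]  # T_1(x), ..., T_k(x)
        log_sum = sum(math.log(t[-1] / tj) for tj in t[:-1])
        local.append((k - 1) / log_sum)
    return sum(local) / len(local)


def sample_plane(n, rng):
    """Hypothetical test data: a 2-D plane linearly embedded in 5-D."""
    pts = []
    for _ in range(n):
        u, v = rng.random(), rng.random()
        pts.append((u, v, u + v, u - v, 2.0 * u))
    return pts


rng = random.Random(0)
# Expected to land near 2, the dimension of the embedded plane.
print(mle_intrinsic_dimension(sample_plane(400, rng), k=20))
```

Only the k nearest neighbours of each point enter the estimate, so the data must be dense enough for those neighbours to lie in a small sphere around x. This is the condition that becomes hard to meet as the dimension grows.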