
Estimating Dependency and Significance for High-Dimensional Data

Michael R. Siracusa*, Kinh Tieu*, Alexander T. Ihler§, John W. Fisher*§, Alan S. Willsky§. *Computer Science and Artificial Intelligence Laboratory. §Laboratory for Information and Decision Systems.


Presentation Transcript


  1. Estimating Dependency and Significance for High-Dimensional Data. Michael R. Siracusa*, Kinh Tieu*, Alexander T. Ihler§, John W. Fisher*§, Alan S. Willsky§. *Computer Science and Artificial Intelligence Laboratory, §Laboratory for Information and Decision Systems

  2. Do these depend on each other (and how)?

  3. Premise: In many high-dimensional data sources, statistical dependency can be well explained by a lower-dimensional latent variable. • Intuition: The complexity of the problem is influenced more by the hypothesis than by the data. • How do we estimate the dependency? • From a single realization? • How do we avoid strong modeling assumptions? • How do we estimate significance?

  4. Dependency Structure (Graphical Model) vs. Parameterization (Nuisance)

  5. Dependence: An example (figure comparing the dependent vs. independent cases)

  6. Factorization Test (In General)

  7. Asymptotics (figure: the statistic separates into a statistical-dependence term and a model-differences term). Independent vs. some dependency: 1. H0: the data are independent. 2. We don't have the true distributions. 3. We are only given a single realization.

  8. Factorization Test (cont.) • Questions: • How do we obtain samples under each factorization? • How do we estimate the divergence D(·‖·) when x is high-dimensional? • How do we estimate significance?

  9. Drawing Samples from a Single Realization • We only have 1 realization from which to estimate the joint. • But we can obtain up to N! sample draws under H0 via permutations (sketched below).
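A minimal sketch of the permutation idea on this slide, assuming two sources X and Y with N paired samples; the function name and toy data below are placeholders, not from the talk. Permuting the sample order of one source breaks the pairing while preserving both marginals, so each permutation is a draw consistent with the independence factorization H0.

```python
import numpy as np

def permutation_draw(X, Y, rng=None):
    """One draw consistent with H0 (independence): permute the sample
    order of Y relative to X, destroying the pairing while keeping
    both marginals intact. X: (N, dx), Y: (N, dy)."""
    rng = np.random.default_rng(rng)
    perm = rng.permutation(len(Y))
    return X, Y[perm]

# The single observed realization is one sample of the joint; repeated
# permutations give (up to N!) distinct sample draws under H0.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = X[:, :2] + 0.1 * rng.normal(size=(100, 2))   # dependent toy data
X0, Y0 = permutation_draw(X, Y, rng)             # one H0 draw
```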

  10. High-Dimensional Data. From the data processing inequality (written out below):
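The bound referred to here is the standard data processing inequality; written out with notation of my choosing, where f and g are any deterministic features (e.g. low-dimensional projections):

```latex
% Features can only lose dependency information, so the projected mutual
% information lower-bounds the true one; maximizing the left side over
% f and g tightens the (still valid) lower bound.
\[
  \max_{f,\,g}\; I\bigl(f(X);\, g(Y)\bigr) \;\le\; I(X; Y)
\]
```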

  11. High-Dimensional Data (cont.) Sufficiency: for high-dimensional data, maximize the left side of the bound over the projections. • Gaussian w/ linear projections: closed-form solution (eigenvalue problem), Kullback 68. • Nonparametric: gradient descent, Ihler and Fisher 03. (A Gaussian-case sketch follows.)
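A sketch of the Gaussian/linear-projection case mentioned above, assuming two sources and one-dimensional projections; the function name is mine and this only illustrates the eigenvalue-problem route, not necessarily the talk's exact formulation. For jointly Gaussian data, maximizing I(aᵀX; bᵀY) over projections a, b reduces to finding the top canonical correlation ρ, and the resulting lower bound is −½ log(1 − ρ²).

```python
import numpy as np

def max_gaussian_mi_projection(X, Y, eps=1e-8):
    """Lower bound on I(X;Y) from the best 1-D linear projections under a
    Gaussian model. rho^2 is the top eigenvalue of
    Cxx^{-1} Cxy Cyy^{-1} Cyx (a canonical-correlation problem)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = len(X)
    Cxx = X.T @ X / n + eps * np.eye(X.shape[1])   # regularized covariances
    Cyy = Y.T @ Y / n + eps * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    rho2 = float(np.clip(np.linalg.eigvals(M).real.max(), 0.0, 1.0 - 1e-12))
    return -0.5 * np.log(1.0 - rho2)               # in nats
```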

  12. Swiss Roll (figure: 3D data, PCA 2D projection, MaxKL 2D optimization)

  13. Measuring Significance: p-value (a permutation-based sketch follows)
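A minimal sketch of permutation-based significance, assuming a generic dependency statistic; the names are placeholders. The p-value is the fraction of permuted (H0) datasets whose statistic is at least as large as the observed one.

```python
import numpy as np

def permutation_pvalue(statistic, X, Y, n_perms=500, rng=None):
    """statistic(X, Y) is any dependency measure (e.g. a divergence or
    mutual-information estimate); larger values mean more dependence."""
    rng = np.random.default_rng(rng)
    observed = statistic(X, Y)
    null = np.array([statistic(X, Y[rng.permutation(len(Y))])
                     for _ in range(n_perms)])
    # Add-one smoothing keeps the estimate away from exactly zero.
    return (1 + np.sum(null >= observed)) / (1 + n_perms)
```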

  14. Synthetic Data (figure: a low-dimensional latent variable induces the dependency; the high-dimensional observations also contain distracter/noise dimensions). • M: controls the number of dimensions the dependency information is uniformly distributed over. • D: controls the total dimensionality of our K observations. (A generator sketch follows.)
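One plausible generator matching this slide's description, assuming K = 2 sources and a one-dimensional latent variable; the exact generative model used in the talk may differ, and all names here are mine.

```python
import numpy as np

def synthetic_pair(N=100, D=20, M=4, noise=1.0, rng=None):
    """Two D-dimensional observations whose only dependency comes from a
    shared low-dimensional latent variable spread uniformly over M of the
    D dimensions; the remaining dimensions are independent distracters."""
    rng = np.random.default_rng(rng)
    z = rng.normal(size=(N, 1))              # low-dimensional latent variable
    X = noise * rng.normal(size=(N, D))      # high-dimensional observation 1
    Y = noise * rng.normal(size=(N, D))      # high-dimensional observation 2
    X[:, :M] += z / np.sqrt(M)               # dependency spread over M dims
    Y[:, :M] += z / np.sqrt(M)
    return X, Y
```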

  15. Experiments • 100 trials w/ samples of dependent data • 100 trials w/ samples of independent data • Each trial gives a statistic and a significance p-value

  16. Gaussian Data

  17. Gaussian

  18. 3D Ball Data

  19. Significance Results

  20. Multi-camera

  21. Conclusions • We presented a method for estimating statistical dependency across high-dimensional measurements via factorization tests. • Exploited a bound based on lower-dimensional projections. • We made use of permutations for drawing from the alternate hypothesis given a single realization. • We also made use of permutations to get reliable significance estimates. • This was done using a small number of samples relative to the dimensionality of the data. • Finally, we presented some brief analysis on synthetic and real data.

  22. Thank You Questions?

  23. Problem Statement. Given N i.i.d. observations from K sources, determine whether the K sources are independent or not: • Obtain a dependency measure • Estimate the significance of this measurement

  24. Applications

  25. Hypothesis Test. Two hypotheses; assuming we know the distributions; given N i.i.d. observations (equations sketched below):
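The equations for this slide are not in the transcript; what follows is a standard reconstruction (notation mine), assuming H0 denotes independence and H1 dependence as elsewhere in the talk.

```latex
% Two hypotheses for the K sources:
\[
  H_0:\; p(x_1,\dots,x_K) = \prod_{k=1}^{K} p(x_k)
  \qquad
  H_1:\; p(x_1,\dots,x_K) \neq \prod_{k=1}^{K} p(x_k)
\]
% With known distributions and N i.i.d. observations, the normalized
% log-likelihood ratio is compared to a threshold \gamma:
\[
  \frac{1}{N}\sum_{n=1}^{N}
  \log \frac{p\bigl(x_1^{(n)},\dots,x_K^{(n)}\bigr)}
            {\prod_{k=1}^{K} p\bigl(x_k^{(n)}\bigr)}
  \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma
\]
```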

  26. Factorization Test. Two factorizations, but we don't know the distributions. Our best approximation (like a GLR), with a notation simplification (a sketch follows):
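Again the slide's equations are missing; here is one plausible form of the GLR-like statistic described, with density estimates standing in for the unknown distributions (notation mine).

```latex
% Replace the unknown densities with estimates \hat p. For large N (and
% consistent estimates) the statistic approaches the KL divergence between
% the joint and the product of marginals.
\[
  \hat{T} \;=\; \frac{1}{N}\sum_{n=1}^{N}
  \log \frac{\hat p\bigl(x_1^{(n)},\dots,x_K^{(n)}\bigr)}
            {\prod_{k=1}^{K} \hat p_k\bigl(x_k^{(n)}\bigr)}
  \;\approx\;
  D\!\left( p_{x_1\cdots x_K} \,\Big\|\, \textstyle\prod_k p_{x_k} \right)
\]
```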

  27. Factorization Test (cont.) (figure: the statistic relates the estimated joint and estimated product distributions to the true joint and true independent distributions)

  28. Significance

  29. Applications • What Vision Problems Can We Solve w/ Accurate Measures of Dependency? • Data Association, Correspondence • Feature Selection • Learning Structure • We will specifically discuss: • Correspondence (for multi-camera tracking) • Audio-visual Association

  30. Audio-Visual Association • Useful for: • Speaker localization - helps improve human-computer interaction - helps source separation • Automatic transcription of archival video - Who is speaking? - Are they seen by the camera?

  31. Multi-camera Tracking

  32. Hypotheses: Camera X vs. Camera Y (figure)

  33. Maximal Correspondence

  34. Distributions of Transition Times (figure; x-axis: transition time)

  35. Discussion and Future Work • Dependence underlies various vision-related problems. • We studied a framework for measuring dependence. • Measure significance (how confident are we?). • Make it more robust.

  36. Math (oh no!): the 2-variable case (written out below)
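The math for the two-variable case is not in the transcript; the standard identity it presumably shows is that the factorization-test divergence reduces to mutual information.

```latex
% Two-variable case: the KL divergence between the joint and the product
% of marginals is exactly the mutual information.
\[
  D\bigl(p_{xy}\,\|\,p_x\,p_y\bigr)
  \;=\; \iint p_{xy}(x,y)\,\log\frac{p_{xy}(x,y)}{p_x(x)\,p_y(y)}\,dx\,dy
  \;=\; I(X;Y)
\]
```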

  37. Outline • Applications: (for computer vision) • Problem Formulation: (Hypothesis Testing) • Computation: (Non-parametric entropy estimation) • Curse of Dimensionality: (Informative Statistics) • Correspondence: (Markov Chain Monte Carlo)

  38. Previous Talks • Greg: Model dependence between features and class • Kristen: Model dependence between features and a scene • Ariadna: Model dependency between intra-class features • Wanmei: Dependency between protocol signal and voxel response • Chris: Audio and video dependence with events • Antonio: Contextual dependence • Corey: “Inferring Dependencies”
