
Lecture 8 The Principle of Maximum Likelihood


Presentation Transcript


  1. Lecture 8 The Principle of Maximum Likelihood

  2. Syllabus
  Lecture 01 Describing Inverse Problems
  Lecture 02 Probability and Measurement Error, Part 1
  Lecture 03 Probability and Measurement Error, Part 2
  Lecture 04 The L2 Norm and Simple Least Squares
  Lecture 05 A Priori Information and Weighted Least Squares
  Lecture 06 Resolution and Generalized Inverses
  Lecture 07 Backus-Gilbert Inverse and the Trade-off of Resolution and Variance
  Lecture 08 The Principle of Maximum Likelihood
  Lecture 09 Inexact Theories
  Lecture 10 Nonuniqueness and Localized Averages
  Lecture 11 Vector Spaces and Singular Value Decomposition
  Lecture 12 Equality and Inequality Constraints
  Lecture 13 L1, L∞ Norm Problems and Linear Programming
  Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
  Lecture 15 Nonlinear Problems: Newton's Method
  Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
  Lecture 17 Factor Analysis
  Lecture 18 Varimax Factors, Empirical Orthogonal Functions
  Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon's Problem
  Lecture 20 Linear Operators and Their Adjoints
  Lecture 21 Fréchet Derivatives
  Lecture 22 Exemplary Inverse Problems, incl. Filter Design
  Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
  Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems

  3. Purpose of the Lecture: introduce the spaces of all possible data and all possible models, and the idea of likelihood; use maximization of likelihood as a guiding principle for solving inverse problems.

  4. Part 1: The spaces of all possible data, all possible models, and the idea of likelihood

  5. Viewpoint: the observed data is one point in the space of all possible observations; that is, dobs is a point in S(d).

  6. Plot of dobs [figure: the space S(d) with axes d1, d2, d3 and origin O]

  7. Plot of dobs [figure: dobs shown as a single point in the space with axes d1, d2, d3 and origin O]

  8. Now suppose … the data are independent and each is drawn from a Gaussian distribution with the same mean m1 and variance σ² (but m1 and σ unknown).

  9. Plot of p(d) [figure: the p.d.f. in the space with axes d1, d2, d3 and origin O]

  10. Plot of p(d) [figure: a cloud centered on the line d1 = d2 = d3, with radius proportional to σ]

  11. Now interpret … p(dobs) as the probability that the observed data were in fact observed; L = log p(dobs) is called the likelihood.

  12. Find the parameters of the distribution by maximizing p(dobs) with respect to m1 and σ, that is, by maximizing the probability that the observed data were in fact observed: the Principle of Maximum Likelihood.

  13. Example

  14. Solving the two equations (∂L/∂m1 = 0 and ∂L/∂σ = 0)

  15. Solving the two equations gives the usual formula for the sample mean and almost the usual formula for the sample standard deviation (see the reconstruction below).
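The equations themselves appear only as images in the original slides and are missing from this transcript. A minimal reconstruction, assuming N independent Gaussian data with common mean m1 and variance σ²:

```latex
% joint p.d.f. of the observed data
p(\mathbf{d}^{\mathrm{obs}}) = (2\pi)^{-N/2}\,\sigma^{-N}
  \exp\!\Big[-\tfrac{1}{2\sigma^{2}}\textstyle\sum_{i=1}^{N}\big(d_i^{\mathrm{obs}}-m_1\big)^{2}\Big]

% log-likelihood
L(m_1,\sigma) = -\tfrac{N}{2}\log(2\pi) - N\log\sigma
  - \tfrac{1}{2\sigma^{2}}\textstyle\sum_{i=1}^{N}\big(d_i^{\mathrm{obs}}-m_1\big)^{2}

% the two equations \partial L/\partial m_1 = 0 and \partial L/\partial\sigma = 0 give
m_1^{\mathrm{est}} = \tfrac{1}{N}\textstyle\sum_{i=1}^{N} d_i^{\mathrm{obs}},
\qquad
\big(\sigma^{\mathrm{est}}\big)^{2} = \tfrac{1}{N}\textstyle\sum_{i=1}^{N}\big(d_i^{\mathrm{obs}}-m_1^{\mathrm{est}}\big)^{2}
```

The factor 1/N, rather than the 1/(N−1) of the unbiased estimator, is why the slide calls this "almost" the usual formula for the sample standard deviation.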

  16. These two estimates are linked to the assumption that the data are Gaussian-distributed; one might get a different formula for a different p.d.f.

  17. Example of a likelihood surface L(m1, σ) [figure: surface over the (m1, σ) plane with the maximum likelihood point marked]
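A minimal numerical sketch of such a likelihood surface (not part of the original slides; the data and grid values below are illustrative):

```python
import numpy as np

# synthetic "observed" data: N independent draws from a Gaussian
rng = np.random.default_rng(0)
dobs = rng.normal(loc=5.0, scale=2.0, size=100)
N = dobs.size

# grid of trial values for the mean m1 and standard deviation sigma
m1_grid = np.linspace(3.0, 7.0, 201)
sigma_grid = np.linspace(0.5, 4.0, 201)
M1, SIG = np.meshgrid(m1_grid, sigma_grid, indexing="ij")

# log-likelihood L(m1, sigma) = -N log sigma - sum_i (d_i - m1)^2 / (2 sigma^2) + const
resid_sq = ((dobs[None, None, :] - M1[:, :, None]) ** 2).sum(axis=-1)
L = -N * np.log(SIG) - resid_sq / (2.0 * SIG ** 2)

# maximum likelihood point on the grid
i, j = np.unravel_index(np.argmax(L), L.shape)
print("m1_est    =", m1_grid[i])                # close to the sample mean
print("sigma_est =", sigma_grid[j])             # close to the 1/N standard deviation
print("compare   =", dobs.mean(), dobs.std())   # np.std uses the 1/N form by default
```

Contouring L over the grid (e.g. with matplotlib) reproduces the kind of surface sketched on the slide.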

  18. The likelihood maximization process will fail if the p.d.f. has no well-defined peak [figure: panels (A) and (B) showing p(d1, d2) over the (d1, d2) plane]

  19. Part 2: Using the maximization of likelihood as a guiding principle for solving inverse problems

  20. Linear inverse problem with Gaussian-distributed data with known covariance [cov d]; assume the theory Gm = d gives the mean of the data distribution, so the exponent of p(d) contains a quadratic form, T, in m.

  21. Principle of maximum likelihood: maximize L = log p(dobs), which amounts to minimizing T with respect to m.

  22. Principle of maximum likelihood: maximize L = log p(dobs), i.e. minimize T, which (up to a constant factor) is the weighted prediction error E = (dobs − Gm)^T [cov d]^−1 (dobs − Gm). This is just weighted least squares.

  23. Principle of maximum likelihood, when the data are Gaussian-distributed: solve Gm = d with weighted least squares, with weighting matrix [cov d]^−1 (see the sketch below).
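A minimal sketch of this weighted least squares solution, assuming a known data covariance matrix; the function and variable names are illustrative, not from the lecture:

```python
import numpy as np

def wls_solve(G, dobs, covd):
    """Maximum-likelihood estimate for Gaussian data with covariance covd:
    minimizes E = (dobs - G m)^T covd^{-1} (dobs - G m)."""
    W = np.linalg.inv(covd)            # weighting matrix [cov d]^-1
    GTW = G.T @ W
    # normal equations: (G^T W G) m = G^T W dobs
    return np.linalg.solve(GTW @ G, GTW @ dobs)

# illustration: straight-line fit d_i = m1 + m2 z_i with unequal data variances
z = np.linspace(0.0, 1.0, 6)
G = np.column_stack([np.ones_like(z), z])
sigma_d = np.array([0.1, 0.1, 0.5, 0.5, 0.1, 0.1])   # per-datum standard deviations
covd = np.diag(sigma_d ** 2)
rng = np.random.default_rng(1)
dobs = G @ np.array([1.0, 2.0]) + rng.normal(scale=sigma_d)
print(wls_solve(G, dobs, covd))        # should be close to [1.0, 2.0]
```

For large problems one would factor covd rather than invert it explicitly, but the small dense version keeps the correspondence with the formula visible.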

  24. Special case of uncorrelated data, each datum with a different variance, [cov d]ii = σdi²: minimize the sum of squared prediction errors, each divided by its variance.

  25. Special case of uncorrelated data, each datum with a different variance [cov d]ii = σdi²: the minimization weights the errors by their certainty (the reciprocal of their variance), as written out below.
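Written out for this diagonal covariance (a reconstruction that follows directly from the general expression for E above):

```latex
E \;=\; \sum_{i=1}^{N}
  \frac{\big(d_i^{\mathrm{obs}} - \sum_{j} G_{ij} m_j\big)^{2}}{\sigma_{d_i}^{2}}
```

Each squared prediction error is divided by its variance, so precise (certain) data pull harder on the solution than noisy data.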

  26. but what about a priori information?

  27. Probabilistic representation of a priori information: the probability that the model parameters are near m is given by the p.d.f. pA(m).

  28. Probabilistic representation of a priori information: the p.d.f. pA(m) is centered at the a priori value <m>.

  29. Probabilistic representation of a priori information: the variance of pA(m) reflects the uncertainty in the a priori information.

  30. [Figure: two panels, labeled "uncertain" and "certain", each showing pA(m) in the (m1, m2) plane centered on (<m1>, <m2>)]

  31. [Figure: pA(m) in the (m1, m2) plane, centered on (<m1>, <m2>)]

  32. [Figure: two panels, "linear relationship" and "approximation with Gaussian", each in the (m1, m2) plane centered on (<m1>, <m2>)]

  33. [Figure: the (m1, m2) plane with a region where p = constant and p = 0 elsewhere]

  34. Assessing the information content in pA(m): do we know a little about m or a lot about m?

  35. Information Gain, S; −S is called the Relative Entropy.

  36. Relative Entropy, S, also called Information Gain; it is measured relative to a null p.d.f., the state of no knowledge.

  37. Relative Entropy, S, also called Information Gain; a uniform p.d.f. might work for the null p.d.f. (see the formula below).
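The defining formula appears only as an image in the original slides. The standard definition of the relative entropy of pA(m) with respect to a null p.d.f. pN(m), which these slides appear to use, is:

```latex
S \;=\; \int p_A(\mathbf{m}) \,
  \log\!\frac{p_A(\mathbf{m})}{p_N(\mathbf{m})}\; d\mathbf{m}
```

With this convention S = 0 when pA equals the null p.d.f. (nothing has been learned) and grows as pA becomes more sharply concentrated than pN; sign conventions differ between authors, which is why the slides mention both S and −S.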

  38. Probabilistic representation of the data: the probability that the data are near d is given by the p.d.f. pA(d).

  39. Probabilistic representation of the data: the p.d.f. p(d) is centered at the observed data dobs.

  40. Probabilistic representation of the data: the variance of p(d) reflects the uncertainty in the measurements.

  41. Probabilistic representation of both prior information and observed data: assume the observations and the a priori information are uncorrelated.
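Under that assumption the combined p.d.f. is simply the product of the two (the slide's formula is not in the transcript; this is the standard form):

```latex
p(\mathbf{m}, \mathbf{d}) \;=\; p_A(\mathbf{m})\, p_A(\mathbf{d})
```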

  42. Example [figure: the combined space with axes model m and datum d, showing the p.d.f. and the points map and dobs]

  43. The theory d = g(m) is a surface in the combined space of data and model parameters, on which the estimated model parameters and predicted data must lie.

  44. The theory d = g(m) is a surface in the combined space of data and model parameters, on which the estimated model parameters and predicted data must lie; for a linear theory, the surface is planar.

  45. The principle of maximum likelihood says: maximize the combined p.d.f. on the surface d = g(m).

  46. [Figure: (A) the combined space with axes model m and datum d, showing dobs, dpre, the curve d = g(m), and the points map and mest; (B) p(s) versus position s along the curve, with maximum at smax]

  47. [Figure: the same construction for the case mest ≈ map, showing dobs, dpre and the curve d = g(m); p(s) versus position s along the curve, with maximum at smax]

  48. [Figure: (A) the same construction for the case dpre ≈ dobs, showing map, mest and the curve d = g(m); (B) p(s) versus position s along the curve, with maximum at smax]

  49. Principle of maximum likelihood with Gaussian-distributed data and Gaussian-distributed a priori information: minimize the combined weighted misfit to the data and to the a priori model.

  50. This is just weighted least squares, with the weighting set by the data and model covariances, so we already know the solution (a reconstruction is given below).
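The formulas on slides 49 and 50 are not reproduced in the transcript. A standard reconstruction, assuming data covariance [cov d] and a priori model covariance [cov m] with prior mean <m> (the symbol Φ is used here only as a label for the quantity being minimized):

```latex
% combined objective minimized by the maximum likelihood point
\Phi(\mathbf{m}) \;=\;
  \big(\mathbf{d}^{\mathrm{obs}} - G\mathbf{m}\big)^{T} [\mathrm{cov}\,d]^{-1}
  \big(\mathbf{d}^{\mathrm{obs}} - G\mathbf{m}\big)
  \;+\;
  \big(\mathbf{m} - \langle\mathbf{m}\rangle\big)^{T} [\mathrm{cov}\,m]^{-1}
  \big(\mathbf{m} - \langle\mathbf{m}\rangle\big)

% setting \partial\Phi/\partial\mathbf{m} = 0 gives the weighted least squares estimate
\mathbf{m}^{\mathrm{est}} \;=\;
  \big(G^{T}[\mathrm{cov}\,d]^{-1}G + [\mathrm{cov}\,m]^{-1}\big)^{-1}
  \big(G^{T}[\mathrm{cov}\,d]^{-1}\mathbf{d}^{\mathrm{obs}} + [\mathrm{cov}\,m]^{-1}\langle\mathbf{m}\rangle\big)
```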
