1 / 78

Maximum Likelihood Estimates and the EM Algorithms II

Maximum Likelihood Estimates and the EM Algorithms II. Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm. Part 1 Computation Tools. Include Functions in R. source("file path") Example In MME.R:.

odina
Download Presentation

Maximum Likelihood Estimates and the EM Algorithms II

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Maximum Likelihood Estimates and the EM Algorithms II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm

  2. Part 1Computation Tools

  3. Include Functions in R • source("file path") • Example • In MME.R: • In R: MME=function(y1, y2, y3, y4) { n=y1+y2+y3+y4; phi1=4.0*(y1/n-0.5); phi2=1-4*y2/n; phi3=1-4*y3/n; phi4=4.0*y4/n; phi=(phi1+phi2+phi3+phi4)/4.0; print("By MME method") return(phi); # print(phi); } > source(“C:/MME.R”) > MME(125, 18, 20, 24) [1] "By MME method" [1] 0.5935829

  4. Part 2Motivation Examples

  5. A a b B A b a A a A a B b B B b Example 1 in Genetics (1) • Two linked loci with alleles A and a, and B and b • A, B: dominant • a, b: recessive • A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab 1/2 1/2 1/2 1/2

  6. A a b B A b a A a A a B b B B b Example 1 in Genetics (2) • Probabilities for genotypes in gametes 1/2 1/2 1/2 1/2

  7. Example 1 in Genetics (3) • Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79–92. • More: http://en.wikipedia.org/wiki/Geneticshttp://www2.isye.gatech.edu/~brani/isyebayes/bank/handout12.pdf

  8. Example 1 in Genetics (4)

  9. Example 1 in Genetics (5) • Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*. • A*: the dominant phenotype from (Aa, AA, aA). • a*: the recessive phenotype from aa. • B*: the dominant phenotype from (Bb, BB, bB). • b*: the recessive phenotype from bb. • A*B*: 9 gametic combinations. • A*b*: 3 gametic combinations. • a*B*: 3 gametic combinations. • a*b*: 1 gametic combination. • Total: 16 combinations.

  10. Example 1 in Genetics (6) • Let , then

  11. Example 1 in Genetics (7) • Hence, the random sample of n from the offspring of selfed heterozygotes will follow a multinomial distribution: We know that and So

  12. Example 1 in Genetics (8) • Suppose that we observe the data of which is a random sample from Then the probability mass function is

  13. Maximum Likelihood Estimate (MLE) • Likelihood: • Maximize likelihood: Solve the score equations, which are setting the first derivates of likelihood to be zeros. • Under regular conditions, the MLE is consistent, asymptotic efficient and normal! • More: http://en.wikipedia.org/wiki/Maximum_likelihood

  14. MLE for Example 1 (1) • Likelihood • MLE:

  15. MLE for Example 1 (2) A B C

  16. MLE for Example 1 (3) • Checking: • 1. • 2. • 3. Compare ?

  17. Part 3Numerical Solutions for the Score Equations of MLEs

  18. A Banach Space • A Banach space is a vector space over the field such that • Every Cauchy sequence of converges in (i.e., is complete). • More: http://en.wikipedia.org/wiki/Banach_space

  19. Lipschitz Continuous • A closed subset and mapping • is Lipschitz continuous on with if • is a contraction mapping on if is Lipschitz continuous and • More: http://en.wikipedia.org/wiki/Lipschitz_continuous

  20. Fixed Point Theorem (1) • If is a contraction mapping on if is Lipschitz continuous and • has an unique fixed point such that • initial

  21. Fixed Point Theorem (2) • More: http://en.wikipedia.org/wiki/Fixed-point_theorem http://www.math-linux.com/spip.php?article60

  22. Applications for MLE (1) • Numerical solution is a contraction mapping : initial value that Then

  23. Applications for MLE (2) • How to choose s.t. is a contraction mapping? • Optimal ?

  24. Parallel Chord Method (1) • Parallel chord method is also called simple iteration.

  25. Parallel Chord Method (2)

  26. Plot Parallel Chord Method by R (1) ### Simple iteration ### y1 = 125; y2 = 18; y3 = 20; y4 = 24 # First and second derivatives of log likelihood # f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi} f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)} x = c(10:80)*0.01 y = f1(x) plot(x, y, type = 'l', main = "Parallel chord method", xlab = expression(varphi), ylab = "First derivative of log likelihood function") abline(h = 0)

  27. Plot Parallel Chord Method by R (2) phi0 = 0.25 # Given the initial value 0.25 # segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2) segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2) # Use the tangent line to find the intercept b0 # b0 = f1(phi0)-f2(phi0)*phi0 curve(f2(phi0)*x+b0, add = T, col = "red") phi1 = -b0/f2(phi0) # Find the closer phi # segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2) segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2) # Use the parallel line to find the intercept b1 # b1 = f1(phi1)-f2(phi0)*phi1 curve(f2(phi0)*x+b1, add = T, col = "red")

  28. Define Functions for Example 1 in R • We will define some functions and variables for finding the MLE in Example 1 by R # Fist, second and third derivatives of log likelihood # f1 = function(y1, y2, y3, y4, phi){y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi} f2 = function(y1, y2, y3, y4, phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)} f3 = function(y1, y2, y3, y4, phi) {2*y1*(2+phi)^(-3)-2*(y2+y3)*(1-phi)^(-3)+2*y4*(phi)^(-3)} # Fisher Information # I = function(y1, y2, y3, y4, phi) {(-1)*(y1+y2+y3+y4)*(1/(4*(2+phi))+1/(2*(1-phi))+1/(4*phi))} y1 = 125; y2 = 18; y3 = 20; y4 = 24; initial = 0.9

  29. Parallel Chord Method by R (1) > fix(SimpleIteration) function(y1, y2, y3, y4, initial){ phi = NULL; i = 0; alpha = -1.0/f2(y1, y2, y3, y4, initial); phi2 = initial; phi1 = initial+1; while(abs(phi1-phi2) >= 1.0E-5){ i = i+1; phi1 = phi2; phi2 = alpha*f1(y1, y2, y3, y4, phi1)+phi1; phi[i] = phi2; } print("By parallel chord method(simple iteration)"); return(list(phi = phi2, iteration = phi)); }

  30. Parallel Chord Method by R (2) > SimpleIteration(y1, y2, y3, y4, initial)

  31. Parallel Chord Method by C/C++

  32. Newton-Raphson Method (1) • http://math.fullerton.edu/mathews/n2003/Newton'sMethodMod.html • http://en.wikipedia.org/wiki/Newton%27s_method

  33. Newton-Raphson Method (2)

  34. Plot Newton-Raphson Method by R (1) ### Newton-Raphson Method ### y1 = 125; y2 = 18; y3 = 20; y4 = 24 # First and second derivatives of log likelihood # f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi} f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)} x = c(10:80)*0.01 y = f1(x) plot(x, y, type = 'l', main = "Newton-Raphson method", xlab = expression(varphi), ylab = "First derivative of log likelihood function") abline(h = 0)

  35. Plot Newton-Raphson Method by R (2) # Given the initial value 0.25 # phi0 = 0.25 segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2) segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2) # Use the tangent line to find the intercept b0 # b0 = f1(phi0)-f2(phi0)*phi0 curve(f2(phi0)*x+b0, add = T, col = "purple", lwd = 2) # Find the closer phi # phi1 = -b0/f2(phi0) segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2) segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)

  36. Plot Newton-Raphson Method by R (3) # Use the parallel line to find the intercept b1 # b1 = f1(phi1)-f2(phi0)*phi1 curve(f2(phi0)*x+b1, add = T, col = "red") curve(f2(phi1)*x-f2(phi1)*phi1+f1(phi1), add = T, col = "blue", lwd = 2) legend(0.45, 250, c("Newton-Raphson", "Parallel chord method"), col = c("blue", "red"), lty = c(1, 1))

  37. Newton-Raphson Method by R (1) > fix(Newton) function(y1, y2, y3, y4, initial){ i = 0; phi = NULL; phi2 = initial; phi1 = initial+1; while(abs(phi1-phi2) >= 1.0E-6){ i = i+1; phi1 = phi2; alpha = 1.0/(f2(y1, y2, y3, y4, phi1)); phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1); phi[i] = phi2; } print("By Newton-Raphson method"); return (list(phi = phi2, iteration = phi)); }

  38. Newton-Raphson Method by R (2) > Newton(125, 18, 20, 24, 0.9) [1] "By Newton-Raphson method" $phi [1] 0.577876 $iteration [1] 0.8193054 0.7068499 0.6092827 0.5792747 0.5778784 0.5778760 0.5778760

  39. Newton-Raphson Method by C/C++

  40. Halley’s Method • The Newton-Raphson iteration function is • It is possible to speed up convergence by using more expansion terms than the Newton-Raphson method does when the object function is very smooth, like the method by Edmond Halley (1656-1742): (http://math.fullerton.edu/mathews/n2003/Halley'sMethodMod.html)

  41. Halley’s Method by R (1) > fix(Halley) function( y1, y2, y3, y4, initial){ i = 0; phi = NULL; phi2 = initial; phi1 = initial+1; while(abs(phi1-phi2) >= 1.0E-6){ i = i+1; phi1 = phi2; alpha = 1.0/(f2(y1, y2, y3, y4, phi1)); phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1)*1.0/(1.0-f1(y1, y2, y3, y4, phi1)*f3(y1, y2, y3, y4, phi1)/(f2(y1, y2, y3, y4, phi1)*f2(y1, y2, y3, y4, phi1)*2.0)); phi[i] = phi2; } print("By Halley method"); return (list(phi = phi2, iteration = phi)); }

  42. Halley’s Method by R (2) > Halley(125, 18, 20, 24, 0.9) [1] "By Halley method" $phi [1] 0.577876 $iteration [1] 0.5028639 0.5793692 0.5778760 0.5778760

  43. Halley’s Method by C/C++

  44. Bisection Method (1) • Assume that and that there exists a number such that . If and have opposite signs, and represents the sequence of midpoints generated by the bisection process, thenand the sequence converges to . • That is, . • http://en.wikipedia.org/wiki/Bisection_method

  45. Bisection Method (2) 1

  46. Plot the Bisection Method by R ### Bisection method ### y1 = 125; y2 = 18; y3 = 20; y4 = 24 f <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi} x = c(1:100)*0.01 y = f(x) plot(x, y, type = 'l', main = "Bisection method", xlab = expression(varphi), ylab = "First derivative of log likelihood function") abline(h = 0) abline(v = 0.5, col = "red") abline(v = 0.75, col = "red") text(0.49, 2200, labels = "1") text(0.74, 2200, labels = "2")

  47. Bisection Method by R (1) > fix(Bisection) function(y1, y2, y3, y4, A, B) # A, B is the boundary of parameter # { Delta = 1.0E-6; # Tolerance for width of interval # Satisfied = 0; # Condition for loop termination # phi = NULL; YA = f1(y1, y2, y3, y4, A); # Compute function values # YB = f1(y1, y2, y3, y4, B); # Calculation of the maximum number of iterations # Max = as.integer(1+floor((log(B-A)-log(Delta))/log(2))); # Check to see if the bisection method applies # if(((YA >= 0) & (YB >=0)) || ((YA < 0) & (YB < 0))){ print("The values of function in boundary point do not differ in sign.");

  48. Bisection Method by R (2) print("Therefore, this method is not appropriate here."); quit(); # Exit program # } for(K in 1:Max){ if(Satisfied == 1) break; C = (A+B)/2; # Midpoint of interval YC = f1(y1, y2, y3, y4, C); # Function value at midpoint # if((K-1) < 100) phi[K-1] = C; if(YC == 0){ A = C; # Exact root is found # B = C; } else{ if((YB*YC) >= 0 ){

  49. Bisection Method by R (3) B = C; # Squeeze from the right # YB = YC; } else{ A = C; # Squeeze from the left # YA = YC; } } if((B-A) < Delta) Satisfied = 1; # Check for early convergence # } # End of 'for'-loop # print("By Bisection Method"); return(list(phi = C, iteration = phi)); }

  50. Bisection Method by R (4) > Bisection(125, 18, 20, 24, 0.25, 1) [1] "By Bisection Method" $phi [1] 0.5778754 $iteration [1] 0.4375000 0.5312500 0.5781250 0.5546875 0.5664062 0.5722656 0.5751953 [8] 0.5766602 0.5773926 0.5777588 0.5779419 0.5778503 0.5778961 0.5778732 [15] 0.5778847 0.5778790 0.5778761 0.5778747 0.5778754

More Related