
Lecture #5: MAPS WITH GAPS-- Small geographic area estimation, kriging, and kernel smoothing






Presentation Transcript


  1. Lecture #5: MAPS WITH GAPS -- Small geographic area estimation, kriging, and kernel smoothing. Spatial statistics in practice. Center for Tropical Ecology and Biodiversity, Tunghai University & Fushan Botanical Garden

  2. Topics for today’s lecture • The E-M algorithm • The spatial E-M algorithm • Kriging in ArcGIS • geographically weighted regression (GWR) • approaches to map smoothing

  3. THEOREM 1 When missing values occur only in a response variable, Y, the iterative solution of the EM algorithm produces the regression coefficients calculated from the complete data alone. PF: Let b denote the coefficient vector converged upon. At convergence each imputed value equals its fitted value, ym = xm'b, so the normal equations X'Xb = X'y reduce to the complete-data normal equations Xo'Xo b = Xo'yo; hence b is the complete-data estimate.
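Theorem 1 can be checked numerically. The following sketch (NumPy assumed; the data and variable names are illustrative) alternates the E-step (impute each missing y from the current fit) with the M-step (refit OLS on all rows), and confirms that the converged coefficients match an OLS fit using only the complete cases:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, -1.5]) + rng.normal(scale=0.5, size=n)

miss = np.zeros(n, dtype=bool)
miss[:5] = True                     # treat the first five responses as missing

# EM: E-step imputes missing y from the current fit; M-step refits OLS.
y_work = y.copy()
y_work[miss] = y[~miss].mean()      # crude starting imputation
for _ in range(100):
    b, *_ = np.linalg.lstsq(X, y_work, rcond=None)
    y_work[miss] = X[miss] @ b      # E-step: replace with fitted values

# Complete-case OLS for comparison (Theorem 1 says these agree).
b_cc, *_ = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)
print(np.allclose(b, b_cc))        # True
```

The fixed point of the iteration satisfies X'Xb = Xo'yo + Xm'Xm b, which rearranges to the complete-data normal equations.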

  4. THEOREM 2 When missing values occur only in a response variable, Y, then by replacing the missing values with zeroes and introducing a binary 0/-1 indicator variable covariate -Im for each missing value m, such that Im is 0 for every observation except missing-value observation m, where it is 1, the estimated regression coefficient bm is equivalent to the point estimate for a new observation, and hence furnishes the EM algorithm imputations. PF: Let bm denote the vector of regression coefficients for the missing values, and partition the data matrices into their observed and missing rows.

  5. The EM algorithm solution, where the missing values are replaced by 0 in Y, and Im is an indicator variable for missing value m that contains n − 1 zeroes and a single 1
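Theorem 2's indicator-variable device can be verified directly (a sketch, NumPy assumed; data and names are illustrative): zeroing one response and appending the 0/-1 column -Im leaves the other coefficients at their complete-case values and makes bm equal the point prediction at the missing row:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.3, size=n)

m = 4                                   # index of the "missing" response
y0 = y.copy(); y0[m] = 0.0              # replace the missing value with zero
Im = np.zeros(n); Im[m] = 1.0
Xa = np.column_stack([X, -Im])          # append the 0/-1 indicator -Im

b = np.linalg.lstsq(Xa, y0, rcond=None)[0]
b_cc = np.linalg.lstsq(np.delete(X, m, 0), np.delete(y, m), rcond=None)[0]

print(np.allclose(b[:2], b_cc))         # True: slopes match complete-case fit
print(np.isclose(b[2], X[m] @ b_cc))    # True: b_m equals the point prediction
```

The indicator absorbs the missing row's residual exactly, so that row cannot influence the remaining coefficients.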

  6. THEOREM 3 For imputations computed from Theorem 2, the standard error of each estimated regression coefficient bm is equivalent to the conventional standard deviation used to construct a prediction interval for a new observation, and as such furnishes the corresponding EM algorithm imputation standard error. PF: The residual for each missing-value observation is exactly zero, so the residual variance estimate equals the complete-data estimate; and the diagonal entry of (X'X)^-1 corresponding to Im equals 1 + xm'(Xo'Xo)^-1 xm, the familiar prediction-variance multiplier.
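Theorem 3 can be checked the same way (a sketch, NumPy assumed; data illustrative): the standard error of bm from the augmented regression equals the classical prediction-interval standard error computed from the complete-case fit:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.0]) + rng.normal(scale=0.4, size=n)

m = 7
y0 = y.copy(); y0[m] = 0.0
Im = np.zeros(n); Im[m] = 1.0
Xa = np.column_stack([X, -Im])          # Theorem 2 augmentation

b = np.linalg.solve(Xa.T @ Xa, Xa.T @ y0)
resid = y0 - Xa @ b
s2 = resid @ resid / (n - 3)            # n rows, 3 estimated coefficients
se_bm = np.sqrt(s2 * np.linalg.inv(Xa.T @ Xa)[2, 2])

# Classical prediction-interval SE from the complete-case fit
Xo, yo = np.delete(X, m, 0), np.delete(y, m)
bo = np.linalg.solve(Xo.T @ Xo, Xo.T @ yo)
s2o = np.sum((yo - Xo @ bo) ** 2) / (n - 1 - 2)
se_pred = np.sqrt(s2o * (1 + X[m] @ np.linalg.inv(Xo.T @ Xo) @ X[m]))

print(np.isclose(se_bm, se_pred))       # True
```

Both fits have the same residual sum of squares and the same degrees of freedom (n − 3), so the equality is exact, not approximate.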

  7. What is the set of equations for the following case?

  8. Some preliminary assessments

  9. simulations

  10. simulated imputations

  11. EM algorithm solution for aggregated georeferenced data: vandalized turnip plots

  12. MTB > regress c4 8 c7-c14

Regression Analysis: C4 versus C7, C8, C9, C10, C11, C12, C13, C14

The regression equation is
C4 = 28.9 - 6.32 C7 - 18.2 C8 - 1.10 C9 - 11.4 C10 - 10.1 C11
     + 28.9 C12 + 18.8 C13 + 27.8 C14

Predictor            Coef  SE Coef      T      P
Constant           28.900    2.404  12.02  0.000
C7  [I1-I6]        -6.317    3.254  -1.94  0.063
C8  [I2-I6]       -18.200    3.254  -5.59  0.000
C9  [I3-I6]        -1.100    3.399  -0.32  0.749
C10 [I4-I6]       -11.400    3.254  -3.50  0.002
C11 [I5-I6]       -10.100    3.399  -2.97  0.006
C12 [plot(6,5)]    28.900    5.887   4.91  0.000
C13 [plot(5,6)]    18.800    5.887   3.19  0.004
C14 [plot(6,6)]    27.800    5.887   4.72  0.000

  13. Analysis of Variance for C4

Source  DF      SS     MS     F      P
C5       5  1289.0  257.8  8.92  0.000
Error   27   779.9   28.9
Total   32  2068.9

Individual 95% CIs for the mean, based on pooled StDev (interval plot omitted):

Level   N    Mean  StDev
1       5  28.900  4.407
2       6  22.583  6.391
3       6  10.700  2.585
4       5  27.800  5.082
5       6  17.500  6.648
6       5  18.800  5.922

Pooled StDev = 5.375

  14. Residual spatial autocorrelation What does this mean?

  15. SAR-based missing data estimation, where ym is a missing value (replaced by 0 in Y), Im is an indicator variable for ym, and wm is the mth column of the geographic weights matrix W

  16. The Jacobian term NOTE: denominator becomes (n-nm)

  17. What is the set of equations for the following case?

  18. Spatial autoregressive (AR) kriging estimate, with the fitted semivariogram model

  19. The pure spatial autocorrelation CAR model. NOTE: exactly the same algebraic structure as the kriging equation. Dispersed missing values: imputation = the observed mean plus a weighted average of the surrounding residuals
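The "observed mean plus a weighted average of surrounding residuals" recipe can be sketched on a toy rook-adjacency grid (the data values and the autocorrelation parameter rho are illustrative assumptions, not from the lecture):

```python
import numpy as np

# CAR conditional-mean imputation for one dispersed missing value:
#   E[y_m | rest] = mu + rho * mean over rook neighbours of (y_j - mu)
grid = np.array([[12.0, 14.0, 13.0],
                 [15.0, np.nan, 16.0],
                 [11.0, 13.0, 12.0]])
rho = 0.5                               # assumed spatial autocorrelation
mu = np.nanmean(grid)                   # observed mean (13.25 here)
neigh = [grid[0, 1], grid[2, 1], grid[1, 0], grid[1, 2]]  # rook neighbours
imputed = mu + rho * np.mean([v - mu for v in neigh])
print(round(imputed, 3))                # 13.875
```

With row-standardised rook weights, the weighted average of the surrounding residuals is simply their mean, pulled toward the observed mean by the factor rho.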

  20. Employing rook’s adjacency and a CAR model, what is the equation for the following imputation?

  21. The spatial filter EM algorithm solution, where the missing values are replaced by 0 in Y, and Im is an indicator variable for missing value m that contains n − 1 zeroes and a single 1

  22. Imputation of turnip production in 3 vandalized field plots

  23. Cressie’s PA coal ash

  24. Missing 1992 georeferenced density of milk production in Puerto Rico: constrained predictions (total = 1918); Moran scatterplot

  25. USDA-NASS estimation of Pennsylvania crop production: covariate, total constraints, map gaps

  26. USDA-NASS estimation of Michigan crop production If this is 2% milk, how much am I paying for the other 98%?

  27. Michigan imputations: different response variable specifications

  28. USDA-NASS estimation of Tennessee crop production

  29. Tennessee imputations

  30. An EM specification when some data for both Y and the Xs are missing

  31. Concatenation results:

  32. The spatial model: covariate, spatial autocorrelation, power transformation, totals constraints

  33. Imputation of turnip production in 3 vandalized field plots

  34. Cross-validation of spatial filter for observed turnip data

  35. Kriging: the best linear unbiased spatial interpolator (i.e., predictor). The accompanying table contains a test set of sixteen random samples (#17-32) used to evaluate three maps. The “Actual” column lists the measured values at the test locations identified by “Col, Row” coordinates. The differences between these values and those predicted by the three interpolation techniques form the residuals shown in parentheses. The “Average” column compares the whole-field arithmetic mean of 23 (i.e., guessing 23 everywhere) against each test location.
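A minimal ordinary-kriging sketch (assumed exponential covariance model; the coordinates and sample values are toy illustrations) makes the "best linear unbiased" claim concrete: the Lagrange-multiplier system forces the prediction weights to sum to one, which is the unbiasedness constraint:

```python
import numpy as np

# Toy sample locations, values, and a prediction target at the centre
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = np.array([1.0, 2.0, 1.5, 2.5])
target = np.array([0.5, 0.5])

def cov(h, sill=1.0, rng_=1.0):
    """Assumed exponential covariance model."""
    return sill * np.exp(-h / rng_)

D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
n = len(z)
A = np.ones((n + 1, n + 1))
A[:n, :n] = cov(D)                  # sample-to-sample covariances
A[n, n] = 0.0                       # Lagrange-multiplier corner
b = np.ones(n + 1)
b[:n] = cov(np.linalg.norm(pts - target, axis=1))  # sample-to-target

w = np.linalg.solve(A, b)[:n]       # ordinary kriging weights
print(round(w.sum(), 6))            # 1.0 : unbiasedness constraint
print(round(w @ z, 3))              # 1.75: kriged prediction at the centre
```

By symmetry the centre point is equidistant from all four samples, so each weight is 0.25 and the prediction is the sample mean.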

  36. ArcGIS Geostatistical Wizard: density of German workers; anisotropy check

  37. Cross-validation check of kriged values. This is one use of the missing spatial data imputation methods.

  38. Unclipped kriged surface (exponential semivariogram model): kriged (mean response) surface and prediction error surface; values increase with darkness of brown; note the extrapolation
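The exponential semivariogram model used here has the standard form gamma(h) = nugget + partial sill × (1 − exp(−h/range)). A small sketch (function name and parameter values are illustrative):

```python
import numpy as np

def exp_semivariogram(h, nugget, psill, range_):
    """Exponential model: gamma(h) = nugget + psill * (1 - exp(-h / range_))."""
    return nugget + psill * (1.0 - np.exp(-np.asarray(h) / range_))

# The curve rises from the nugget toward the sill (nugget + psill)
print(exp_semivariogram(0.0, 0.1, 0.9, 2.0))    # 0.1 (nugget)
print(exp_semivariogram(1e6, 0.1, 0.9, 2.0))    # ~1.0 (sill)
```

Note that in this common parameterisation the model returns the nugget at h = 0; strictly, gamma(0) = 0 and the nugget is the jump as h leaves zero.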

  39. Clipped kriged surface: kriged (mean response) surface and prediction error surface; values increase with darkness of brown

  40. Detrended population density across China; anisotropy check

  41. Cross-validation check of kriged values. This is one use of the missing spatial data imputation methods.

  42. Unclipped kriged surface (exponential semivariogram model): kriged (mean response) surface and prediction error surface; values increase with darkness of brown; note the extrapolation

  43. Clipped kriged surface: kriged (mean response) surface and prediction error surface; values increase with darkness of brown

  44. THEOREM 4 The maximum likelihood estimate for missing georeferenced values described by a spatial autoregressive model specification is equivalent to the best linear unbiased predictor kriging equation of geostatistics.

  45. Geographically weighted regression (GWR). Spatial filtering enables easier implementation of GWR, as well as proper assessment of its degrees of freedom. • Step #1: compute the eigenvectors of a geographic connectivity matrix, say C. • Step #2: compute all of the interaction terms XjEk for the P covariates times the K candidate eigenvectors (e.g., those with MC > 0.25). • Step #3: select from the total set, including the individual eigenvectors, with stepwise regression.
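Step #1 and the MC > 0.25 screening can be sketched as follows (NumPy assumed; the function name and the toy grid are illustrative). The candidate eigenvectors come from the doubly centred matrix MCM, whose eigenvector k has Moran coefficient n·λk / (1'C1):

```python
import numpy as np

def spatial_filter_candidates(C, mc_threshold=0.25):
    """Eigenvectors of MCM whose Moran coefficient exceeds the threshold."""
    n = C.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n      # centring projector
    vals, vecs = np.linalg.eigh(M @ C @ M)
    mc = n * vals / C.sum()                  # MC_k = n * lambda_k / (1'C1)
    keep = mc > mc_threshold
    return vecs[:, keep], mc[keep]

# Toy example: rook adjacency for a 3x3 grid of areal units
side = 3
n = side * side
C = np.zeros((n, n))
for i in range(side):
    for j in range(side):
        k = i * side + j
        if i + 1 < side:
            C[k, k + side] = C[k + side, k] = 1.0
        if j + 1 < side:
            C[k, k + 1] = C[k + 1, k] = 1.0

E, mc = spatial_filter_candidates(C)
print(E.shape[0])  # one row per areal unit -> 9
```

Step #2 would then form the products X[:, j] * E[:, k] column by column, and Step #3 would hand the pooled candidate set to a stepwise selection routine.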
