
Toward a Characterization of Measurement Error

Sean Canavan, David Hann
Oregon State University


Presentation Transcript


  1. Toward a Characterization of Measurement Error. Sean Canavan, David Hann, Oregon State University

  2. Recall:
     • Measurement error enters into forestry in many different ways and forms.
     • The errors can have very negative effects on model parameters, model estimates, and the variances of model parameters and model estimates.
     • Correction techniques do exist for countering the effects of measurement errors in many situations, but they typically require knowing something about the form of the errors.
     • People have generally made the assumption that the errors are Normal in distribution.

  3. Study Data:

  4. Study Data:
     • Dbh:
       • n = 2175
       • Errors < 0: 529, = 0: 368, > 0: 1278
       • Range: 0.8” – 72.1”
       • Species: DF, TF, PP, SP, IC
     • Ht:
       • n = 1238
       • Errors < 0: 722, = 0: 30, > 0: 486
       • Range: 8.4’ – 231.7’
       • Species: DF, TF, PP, SP, IC

  5. The Normal Assumption:
     • It is often assumed that measurement errors follow a Normal distribution (Nester 1981, Garcia 1984, Smith 1986, Päivinen & Yli-Kojola 1989, Gertner 1991, McRoberts et al. 1994, Kozak 1998, Kangas 1998, Kangas & Kangas 1999, Phillips et al. 2000, Williams & Schreuder 2000).
     • Bias assumption: μ = 0
     • Variance assumption: homogeneous (σ² constant) or heterogeneous (σ² not constant)

  6. [Figure: Normal(0,1) PDF, f(x), for x from -5 to 5]

  7. The Normal Assumption:
     • It is often assumed that measurement errors follow a Normal distribution (Nester 1981, Garcia 1984, Smith 1986, Päivinen & Yli-Kojola 1989, Gertner 1991, McRoberts et al. 1994, Kozak 1998, Kangas 1998, Kangas & Kangas 1999, Phillips et al. 2000, Williams & Schreuder 2000).
     • Bias assumption: μ = 0
     • Variance assumption: homogeneous (σ² constant) or heterogeneous (σ² not constant)
     • What happens when there are many correct measurements?
       • Example: Dbh measured to a tenth of an inch

  8. [Figure: four panels of cumulative probability (0 to 1) versus measurement error value (-1.5 to 1.5), titled 25% Correct, 50% Correct, 75% Correct, and 100% Correct. The panels carry the annotations nsig = 115, nsig = 50, nsig = 12, and nsig = 6; which annotation belongs to which panel is not recoverable from the transcript.]

  9. Error Distribution Modeling:
     • First Approach: PDF modeling
     • Second Approach: CDF modeling
       • Part 1: Modeling Error Type Probabilities
       • Part 2: Modeling the Positive and Negative error portions of the curve

  10. [Figure: Normal(0,1) CDF, F(x) = P(X < x), for x from -5 to 5]

  11. [Figure: Empirical Dbh Error CDF Surface. Cumulative probability (0.00 to 1.00) over Dbh (0 to 40 inches) and error (-0.6 to 1.8 inches)]

  12. Error Distribution Modeling:
     • First Approach: PDF modeling
     • Second Approach: CDF modeling
       • Part 1: Modeling Error Type Probabilities
       • Part 2: Modeling the Positive and Negative error portions of the curve

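  13. [Figure: cumulative probability (0.00 to 1.00) versus error size (-1.5 to 1.5), with braces on the vertical axis marking the portions Pr(ε < 0), Pr(ε = 0), and Pr(ε > 0)]

      Fitted CDF Equation:

      $$
      F(\varepsilon) = \Pr(X \le \varepsilon) =
      \begin{cases}
      \Pr(\varepsilon < 0) \cdot \text{Negative Error CDF}, & \varepsilon < 0 \\
      \Pr(\varepsilon < 0) + \Pr(\varepsilon = 0), & \varepsilon = 0 \\
      \Pr(\varepsilon < 0) + \Pr(\varepsilon = 0) + \Pr(\varepsilon > 0) \cdot \text{Positive Error CDF}, & \varepsilon > 0
      \end{cases}
      $$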
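The three-part CDF on this slide can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the function name `mixed_error_cdf`, its argument names, and the two conditional CDFs are assumptions made up for the example. Only the error-type shares are taken from the Dbh counts on slide 4.

```python
import math

def mixed_error_cdf(err, p_neg, p_zero, p_pos, neg_cdf, pos_cdf):
    """Evaluate the three-part error CDF at `err`.

    neg_cdf and pos_cdf are conditional CDFs of the negative and positive
    errors; each runs from 0 to 1 over its own half of the error axis.
    """
    if abs(p_neg + p_zero + p_pos - 1.0) > 1e-9:
        raise ValueError("error-type probabilities must sum to 1")
    if err < 0:
        return p_neg * neg_cdf(err)
    if err == 0:
        return p_neg + p_zero
    return p_neg + p_zero + p_pos * pos_cdf(err)

# Hypothetical conditional CDFs, chosen only so that the example runs.
def neg_cdf(e):
    return math.exp(5.0 * e)          # e < 0: rises to 1 as e approaches 0

def pos_cdf(e):
    return 1.0 - math.exp(-5.0 * e)   # e > 0: rises from 0 at e = 0

# Error-type shares from the Dbh counts on slide 4 (529, 368, 1278 of 2175).
p_neg, p_zero, p_pos = 529 / 2175, 368 / 2175, 1278 / 2175

for e in (-0.5, 0.0, 0.5):
    value = mixed_error_cdf(e, p_neg, p_zero, p_pos, neg_cdf, pos_cdf)
    print(e, round(value, 4))
```

The jump at zero is what a single Normal curve cannot reproduce: the CDF steps up by Pr(ε = 0) at exactly ε = 0.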

  14. Part 1: Error Type Probability Modeling
      • Multinomial Regression in S-Plus
      • GLM with a Poisson family
      • Overdispersion/Quasilikelihood
      • Counts by 1-inch Dbh Classes / 5-ft. & 10-ft. Ht Classes
      • Candidate predictors: Dbh, Dbh^(1/2), Dbh², Dbh⁻¹; Ht, Ht^(1/2), Ht², Ht⁻¹
      • Probability model forms:

      $$
      \frac{e^{f_i(\mathrm{Dbh})}}{1 + e^{f_1(\mathrm{Dbh})} + e^{f_2(\mathrm{Dbh})}} \quad (i = 1, 2),
      \qquad
      \frac{1}{1 + e^{f_1(\mathrm{Dbh})} + e^{f_2(\mathrm{Dbh})}}
      $$
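The probability model forms on this slide are those of a three-category multinomial logit. A minimal Python sketch follows, assuming two linear predictors f1 and f2 and a reference category whose predictor is fixed at zero. The function names and the coefficients inside `f1` and `f2` are invented for illustration, and the slide does not say which error type serves as the reference category.

```python
import math

def error_type_probs(f1, f2):
    """Return (p1, p2, p_ref) for a three-category multinomial logit.

    f1 and f2 are the values of the linear predictors f_1(Dbh) and
    f_2(Dbh); the third (reference) category has its predictor fixed at 0.
    """
    m = max(f1, f2, 0.0)  # subtract the maximum for numerical stability
    e1 = math.exp(f1 - m)
    e2 = math.exp(f2 - m)
    e0 = math.exp(-m)
    denom = e1 + e2 + e0
    return e1 / denom, e2 / denom, e0 / denom

# Hypothetical linear predictors in Dbh (coefficients are made up).
def f1(dbh):
    return 0.5 - 0.02 * dbh

def f2(dbh):
    return -0.3 + 0.01 * dbh

for dbh in (5.0, 20.0, 40.0):
    probs = error_type_probs(f1(dbh), f2(dbh))
    print(dbh, [round(p, 3) for p in probs], round(sum(probs), 6))
```

Whatever the predictors are, the three probabilities are positive and sum to one at every Dbh, which is why this form suits the three error types (negative, zero, positive).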

  15. [Figure: Fits of Error Type Probabilities. Fitted P(e > 0), P(e = 0), and P(e < 0), shown as probability (0% to 100%) versus Dbh (0 to 46 inches)]

  16. Part 2: Modeling Positive and Negative CDFs
      • Negative Errors:
        • CDFs by Dbh class
        • Step 1: Exponential fits
          • model form: exp(β*Error Size)
          • actually fit: 1 – exp(β*Error Size)
        • Step 2: Parameter Modeling
          • βi = f(Dbh)
        • Step 3: Combined Equation Fit
          • 1 – exp(f(Dbh)*Error Size)

  17. [Figure: cumulative probability (0 to 1) versus error size (-1 to 0 inches) for negative errors in three Dbh classes: 2.5”, 21.5”, and 45.0”]

  18. Part 2: Modeling Positive and Negative CDFs
      • Negative Errors:
        • CDFs by Dbh class
        • Step 1: Exponential fits
          • model form: exp(β*Error Size)
          • actually fit: 1 – exp(β*Error Size)
        • Step 2: Parameter Modeling
          • βi = f(Dbh)
        • Step 3: Combined Equation Fit
          • 1 – exp(f(Dbh)*Error Size)

  19. [Figure: fitted exponential coefficients (vertical axis with a broken scale, 0 to 1200) versus Dbh class (0 to 50)]

      Parameter Modeling:

      $$
      10.04 \exp\left(-0.03\,\mathrm{Dbh} + 1.77\,\mathrm{Dbh}^{-1} + 0.59\,\mathrm{Dbh}^{-2}\right)
      $$

  20. Part 2: Modeling Positive and Negative CDFs
      • Negative Errors:
        • CDFs by Dbh class
        • Step 1: Exponential fits
          • model form: exp(β*Error Size)
          • actually fit: 1 – exp(β*Error Size)
        • Step 2: Parameter Modeling
          • βi = f(Dbh)
        • Step 3: Combined Equation Fit
          • 1 – exp(f(Dbh)*Error Size)

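  21. Combined equation fit:
      • Variable power on error size:

      $$
      1 - \exp\left[ b_0 \exp\left( b_1\,\mathrm{Dbh} + b_2\,\mathrm{Dbh}^{-1} + b_3\,\mathrm{Dbh}^{-2} \right) (\text{error size})^{c_1} \right]
      $$

      • Resulting CDF equation:

      $$
      \exp\left[ 10.04 \exp\left( -0.03\,\mathrm{Dbh} + 1.77\,\mathrm{Dbh}^{-1} \right) (\text{error size})^{0.59} \right]
      $$

      • adjusted R² = 0.8664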
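The resulting CDF equation on this slide can be evaluated with a few lines of Python. The sketch below uses the coefficients printed on the slide (10.04, -0.03, 1.77, 0.59), but the function names are hypothetical and the sign convention is an assumption: the power 0.59 is applied to the magnitude of the error and the exponent is negated, so that the curve rises from 0 for large negative errors to 1 at zero error. The slide itself does not spell this out.

```python
import math

# Coefficients as printed on the slide.
B0, B1, B2, C1 = 10.04, -0.03, 1.77, 0.59

def rate(dbh):
    """Dbh-dependent coefficient: b0 * exp(b1*Dbh + b2/Dbh), Dbh in inches."""
    return B0 * math.exp(B1 * dbh + B2 / dbh)

def negative_error_cdf(err, dbh):
    """Conditional CDF of a negative Dbh error (inches) at a given Dbh.

    ASSUMPTION: the power c1 acts on |err| and the exponent is negated,
    giving a curve that increases to 1 as err approaches 0 from below.
    """
    if err > 0:
        raise ValueError("defined for negative (or zero) errors only")
    return math.exp(-rate(dbh) * abs(err) ** C1)

for dbh in (2.5, 21.5, 45.0):
    row = [round(negative_error_cdf(e, dbh), 4) for e in (-1.0, -0.5, -0.1, 0.0)]
    print(dbh, row)
```

Under this reading, the fitted coefficient shrinks as Dbh grows, so larger trees place more probability on larger negative errors.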

  22. [Figure: Fitted Dbh Error CDF Surface. Cumulative probability (0.00 to 1.00) over Dbh (8 to 40 inches) and error (-1.00 to 2.00 inches)]

  23. Alternative Surfaces (Dbh):
      • Normal 1: Unbiased, homogeneous Normal:
        μ = 0.0, σ = 0.2237
      • Normal 2: Constant bias, homogeneous Normal:
        μ = 0.0901, σ = 0.2237
      • Normal 3: Non-constant bias, homogeneous Normal:
        μ = 0.003983*Dbh + 0.000121*Dbh², σ = 0.2237
      • Normal 4: Unbiased, heterogeneous Normal:
        μ = 0.0, σ = σD*exp[0.1145*Dbh]
      • Normal 5: Non-constant bias, heterogeneous Normal:
        μ = 0.003983*Dbh + 0.000121*Dbh², σ = σD*exp[0.1145*Dbh]
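For comparison with the fitted surface, the homogeneous Normal alternatives (Normal 1 to 3) can be evaluated directly from the parameters listed on this slide. The sketch below is illustrative only and its function names are hypothetical. Normal 4 and Normal 5 are left out because the value of σD is not given in the transcript.

```python
import math

SIGMA = 0.2237  # homogeneous standard deviation from the slide (inches)

def normal_cdf(x, mu, sigma):
    """CDF of a Normal(mu, sigma^2) distribution evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Mean (bias) functions for the three homogeneous alternatives.
def mu_normal1(dbh):
    return 0.0

def mu_normal2(dbh):
    return 0.0901

def mu_normal3(dbh):
    return 0.003983 * dbh + 0.000121 * dbh ** 2

alternatives = [("Normal 1", mu_normal1),
                ("Normal 2", mu_normal2),
                ("Normal 3", mu_normal3)]

dbh = 20.0  # example diameter in inches
for name, mu in alternatives:
    row = [round(normal_cdf(e, mu(dbh), SIGMA), 4) for e in (-0.5, 0.0, 0.5)]
    print(name, row)
```

Each of these surfaces is continuous in the error, so none of them can place a point mass on exactly correct measurements.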

  24. Comparison of Surface Fits:

  25. Conclusions:
      • Case of many correct measurements
      • Case of few correct measurements
      • Drawing random samples
      • Species differences
      • Changing precision levels:
        • Dbh: 0.1” → 1.0”: correct measurements go from 368 to 1087 out of 2175
        • Ht: 0.1’ → 1.0’: correct measurements go from 30 to 274 out of 1238
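The effect of changing precision levels can be illustrated with a small simulation. The sketch below is an assumption-laden toy and not the authors' procedure: the data are synthetic (uniform true diameters with Normal measurement noise), and the helper `share_correct` is a hypothetical name. It only shows why recording to a coarser unit makes more measurements count as exactly correct.

```python
import random

def share_correct(pairs, precision):
    """Fraction of (true, measured) pairs that agree once both values
    are rounded to the nearest multiple of `precision`."""
    def snap(value):
        return round(value / precision)
    return sum(snap(t) == snap(m) for t, m in pairs) / len(pairs)

# Synthetic data: every number here is invented for the illustration.
rng = random.Random(0)
pairs = []
for _ in range(5000):
    true_dbh = rng.uniform(1.0, 70.0)           # inches
    measured = true_dbh + rng.gauss(0.0, 0.2)   # noisy remeasurement
    pairs.append((true_dbh, measured))

print("share correct at 0.1 inch:", share_correct(pairs, 0.1))
print("share correct at 1.0 inch:", share_correct(pairs, 1.0))
```

With coarser recording, small discrepancies fall into the same class, so the probability mass at zero error grows and the Normal assumption becomes harder to defend.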

  26. "Sampling gets you to the final answer, if you do it often enough. Measuring everything correctly gets you to the correct answer. Don't get those mixed up."
      Olde Statistical Sayings, Inventory and Cruising Newsletter, Issue No. 32, October 1995
