WHY ARE YOU USING THAT REGRESSION?





### WHY ARE YOU USING THAT REGRESSION?

Western Mensurationist Meeting

Jim Flewelling July, 2003

FOCUS
• POPULATIONS
• VARIANCE IN RELATIONSHIPS
• OBJECTIVES
• USE OF REGRESSION
• TECHNIQUES are SECONDARY
TWO WORLDS
• SURVEY SAMPLING
• Fixed Populations
• Objective refers to Population
• REGRESSION ANALYSIS
• Relationships between variables
• Objectives refer to individuals or populations
SURVEY SAMPLING
• Fixed Population.
• Specified probability-sampling processes.
• Estimation of population parameters
• unbiased estimators.
SURVEY SAMPLING

“If we are to infer from sample to population, the selection process is an integral part of the inference.” - Stuart (1984, p. 4)

REGRESSIONS IN SURVEY SAMPLING
• AUXILIARY INFORMATION (X)
• known for population.
• Increased precision.
• MODEL-ASSISTED ESTIMATORS COMMON (Särndal et al., 1992)
• MODEL-BASED ESTIMATORS
MODEL-ASSISTED SURVEY SAMPLING

Ratio of Means Estimator:

Asymptotically unbiased, whether or not y is proportional to x.

Could be used to estimate individual y’s; no claim of unbiasedness there.
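A minimal sketch of the point above, assuming the usual ratio-of-means form (estimated total of y = known total of x times ȳ/x̄). The population, its coefficients, and the sample sizes are invented for illustration; y is deliberately not proportional to x.

```python
import random

def ratio_of_means_total(sample_x, sample_y, x_total):
    """Ratio-of-means estimate of the population total of y."""
    r_hat = sum(sample_y) / sum(sample_x)  # equals ybar / xbar
    return r_hat * x_total

rng = random.Random(1)

# Hypothetical fixed population in which y is NOT proportional to x
# (the intercept of 5.0 breaks proportionality).
N, n, reps = 2000, 50, 2000
xs = [rng.uniform(10.0, 50.0) for _ in range(N)]
ys = [5.0 + 0.8 * x + rng.gauss(0.0, 2.0) for x in xs]
x_total, true_total = sum(xs), sum(ys)

# Repeated simple random sampling without replacement.
estimates = []
for _ in range(reps):
    idx = rng.sample(range(N), n)
    estimates.append(
        ratio_of_means_total([xs[i] for i in idx], [ys[i] for i in idx], x_total)
    )

relative_bias = (sum(estimates) / reps - true_total) / true_total
print(f"relative bias of estimated total: {relative_bias:+.4%}")

# Using the same ratio to predict an individual y is a different matter:
# at x = 10 the ratio line sits well below the model mean of y.
r_hat = true_total / x_total
small_x_error = r_hat * 10.0 - (5.0 + 0.8 * 10.0)
print(f"error of ratio prediction at x = 10: {small_x_error:+.2f}")
```

In this setup the estimated total shows negligible bias over repeated samples, while the prediction for a small individual x is clearly off, which is the distinction the slide draws.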

MODEL-BASED SURVEY SAMPLING
• Assumptions from Regression Analysis.
• True model
• E(e|x) = 0
• Errors are independent.
• Random selection avoids a source of bias.
• Inference from regression theory, not the distribution of samples.
• Theory from Royall (1970).
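As an illustration of model-based inference, the sketch below assumes the working model y = βx + e with error variance proportional to x (an assumption chosen here, not stated on the slide). Under that model the best linear unbiased slope is Σy/Σx, and the predicted total coincides with the ratio estimator. The data values and function names are hypothetical.

```python
def blu_slope(sample_x, sample_y):
    """BLU slope for y = beta*x + e with Var(e) proportional to x.

    Weighted least squares through the origin with weights 1/x,
    which simplifies algebraically to sum(y) / sum(x).
    """
    num = sum((1.0 / x) * x * y for x, y in zip(sample_x, sample_y))
    den = sum((1.0 / x) * x * x for x in sample_x)
    return num / den

def model_based_total(sample_x, sample_y, unsampled_x):
    """Observed y's plus model predictions for the unsampled units."""
    beta = blu_slope(sample_x, sample_y)
    return sum(sample_y) + beta * sum(unsampled_x)

# Hypothetical data: four sampled units and three unsampled units.
sample_x = [12.0, 20.0, 31.0, 45.0]
sample_y = [11.0, 19.0, 33.0, 40.0]
unsampled_x = [15.0, 25.0, 38.0]

predicted_total = model_based_total(sample_x, sample_y, unsampled_x)
ratio_total = sum(sample_y) / sum(sample_x) * (sum(sample_x) + sum(unsampled_x))
print(predicted_total, ratio_total)
```

The two totals agree, but the justification differs: here it rests on the model assumptions, not on the distribution of possible samples.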
REGRESSION ANALYSIS
• Least Squares - Legendre (1805) and Gauss.
• Sir Francis Galton (1877, 1885):

Offspring of seeds “did not tend to resemble their parent seeds in size, but to be always more mediocre [i.e., more average] than they - to be smaller than the parents, if the parents were large … the mean filial regression towards mediocrity was directly proportional to the parental deviation from it.” (quoted from Draper & Smith)

GEOMETRIC MEAN REGRESSION

Preserves Variance

Discussion by Ricker (1984)
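A short sketch of what "preserves variance" means, using simulated data with invented parameters. The geometric mean regression slope is taken as sign(r)·s_y/s_x, so predictions along that line have the same variance as y, while ordinary least squares predictions have variance r²·s_y².

```python
import math
import random

def centered_moments(xs, ys):
    """Means, variances, and covariance (divisor n) of paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return mx, my, sxx, syy, sxy

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical data.
rng = random.Random(3)
xs = [rng.gauss(20.0, 4.0) for _ in range(500)]
ys = [10.0 + 1.2 * x + rng.gauss(0.0, 5.0) for x in xs]

mx, my, sxx, syy, sxy = centered_moments(xs, ys)
ols_slope = sxy / sxx
gmr_slope = math.copysign(math.sqrt(syy / sxx), sxy)

# Both lines pass through the point of means.
ols_pred = [my + ols_slope * (x - mx) for x in xs]
gmr_pred = [my + gmr_slope * (x - mx) for x in xs]

var_y, var_ols, var_gmr = variance(ys), variance(ols_pred), variance(gmr_pred)
print(f"var(y)={var_y:.2f}  var(OLS pred)={var_ols:.2f}  var(GMR pred)={var_gmr:.2f}")
```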

HEIGHT-AGE CURVES
• Site Curves (Curtis)
• Site Index Prediction Functions
• Geometric Mean Regression
• Stochastic Differential Equation
• Height Growth Models
• Percentile Models
Site Curves and SI Prediction Functions
• Curtis et al. (1974)
• Site Curve - Yield table construction
• H = f(A, SI).
• SI Prediction Function - Site Classification
• SI = f(A, H).
SITE CURVES, SI PREDICTION, and GMR

SI = H (index age)

HA = H (age A)

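3 Lines (SI prediction function, GMR, site curve):

All pass through the mean (HA, SI).

Slope = ΔSI/ΔHA, in standard-deviation units: $\{\rho,\ 1,\ 1/\rho\}$, where $\rho$ is the correlation between HA and SI.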

Straight-line assumption valid for bivariate normal.
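The three lines can be checked numerically. The sketch below simulates a bivariate normal (HA, SI) pair with invented means, standard deviations, and correlation, then computes the slope of SI against HA for each line. After dividing by s_SI/s_HA, the slopes come out as r, 1, and 1/r, where r is the sample correlation.

```python
import math
import random

rng = random.Random(11)
rho = 0.8  # hypothetical population correlation

# Hypothetical bivariate normal heights (metres) at age A and at index age.
ha, si = [], []
for _ in range(1000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    ha.append(20.0 + 3.0 * z1)
    si.append(30.0 + 4.0 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))

n = len(ha)
m_ha, m_si = sum(ha) / n, sum(si) / n
s_hh = sum((h - m_ha) ** 2 for h in ha) / n
s_ss = sum((s - m_si) ** 2 for s in si) / n
s_hs = sum((h - m_ha) * (s - m_si) for h, s in zip(ha, si)) / n
r = s_hs / math.sqrt(s_hh * s_ss)

# Slopes expressed as change in SI per unit change in HA.
slope_si_prediction = s_hs / s_hh       # regress SI on HA
slope_gmr = math.sqrt(s_ss / s_hh)      # geometric mean regression
slope_site_curve = s_ss / s_hs          # regress HA on SI, then invert

scale = math.sqrt(s_ss / s_hh)          # s_SI / s_HA
standardized = [s / scale for s in (slope_si_prediction, slope_gmr, slope_site_curve)]
print(f"r = {r:.3f}, standardized slopes = {[round(s, 3) for s in standardized]}")
```

The choice among the three depends on the objective: predicting SI from height, predicting height from SI, or describing the central trend.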

Stochastic Differential Equation (Garcia, 1979)
• $dH/dt = (b/c)\,H\,[(a/H)^c - 1]$
• b is plot-specific, (a, c) are global.
• Integrates to Chapman-Richards.
• Add Wiener process error to growth.
• Add measurement errors at intervals.
• Fit with Maximum Likelihood.
• It’s a growth model; also a base-age invariant site curve.
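The claim that the deterministic part integrates to a Chapman-Richards curve can be verified numerically. Substituting u = H^c gives du/dt = b(a^c - u), so H(t) = a[1 - (1 - (H0/a)^c) e^{-b(t - t0)}]^{1/c}. The sketch below compares that closed form with a Runge-Kutta integration of the differential equation. Parameter values are hypothetical, and the Wiener-process and measurement-error terms are left out.

```python
import math

# Hypothetical parameter values: asymptote a, rate b, shape c.
A, B, C = 60.0, 0.03, 0.6

def growth_rate(h, a=A, b=B, c=C):
    """Deterministic growth rate dH/dt = (b/c) * H * ((a/H)**c - 1)."""
    return (b / c) * h * ((a / h) ** c - 1.0)

def chapman_richards(t, t0, h0, a=A, b=B, c=C):
    """Closed-form solution through (t0, h0)."""
    decay = math.exp(-b * (t - t0))
    return a * (1.0 - (1.0 - (h0 / a) ** c) * decay) ** (1.0 / c)

def integrate_rk4(t0, h0, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of dH/dt."""
    h, dt = h0, (t_end - t0) / steps
    for _ in range(steps):
        k1 = growth_rate(h)
        k2 = growth_rate(h + 0.5 * dt * k1)
        k3 = growth_rate(h + 0.5 * dt * k2)
        k4 = growth_rate(h + dt * k3)
        h += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return h

# Start at height 3 m at age 5 and project to age 50.
numeric = integrate_rk4(5.0, 3.0, 50.0, 4500)
closed = chapman_richards(50.0, 5.0, 3.0)
print(f"numeric = {numeric:.6f}, closed form = {closed:.6f}")
```

Because the closed form can start from any (t0, h0) on a curve, no particular index age is privileged, which is what base-age invariance refers to.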
Height Growth Model
• Family of H-A curves.
• From any one age, predict height difference to next or previous age.
• Parameters adjusted to minimize errors in predicted growth (Bonnor et al., 1995; Flewelling et al., 2001).
• Crude: ignores measurement errors and correlations between periods. Flexible model form.
• It’s a growth model - attempts to model H-A trajectories of plots. Base-age invariant.
Percentile Models
• Concept by Pienaar and Clutter (Clutter et al, 1983).
• Example by Bi et al. (2002).
• Extends to irregular data. (Flewelling, 1982, unpublished).
• Current econometrics theory, rich history.
Percentile Models
• Pienaar and Clutter:

Percentiles as a labeling device: “useful in illustrating the fact that index age is not a fundamental or required concept in the use of site index to express site quality.”

Percentile Models, Example
• Bi et al. (2002)
• Temporary plots (age and site assumed orthogonal).
• H(t) assumed to have normal distribution.
• Q0.75 and Q0.25, fit as functions of t.
• methodology from Koenker and Bassett (1978)
• Mean H(t) fit with weighted regression.
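A rough sketch of quantile regression in the Koenker and Bassett sense, fitting Q0.25 and Q0.75 of height as functions of age by minimizing the asymmetric absolute ("check") loss. Several things here are assumptions for illustration: the data are simulated, each quantile is a straight line in t (not the curve form used by Bi et al.), and the fit is a brute-force search over lines through pairs of points. That search works for a two-parameter line because an optimal solution passes through at least two observations; real applications would use linear programming.

```python
import itertools
import random

def check_loss(residuals, tau):
    """Sum of tau*u for u >= 0 and (tau - 1)*u for u < 0."""
    return sum(u * tau if u >= 0 else u * (tau - 1.0) for u in residuals)

def fit_quantile_line(ts, hs, tau):
    """Exact tau-quantile line by enumerating lines through pairs of points."""
    best = None
    for i, j in itertools.combinations(range(len(ts)), 2):
        if ts[i] == ts[j]:
            continue  # vertical line, not a function of t
        slope = (hs[j] - hs[i]) / (ts[j] - ts[i])
        intercept = hs[i] - slope * ts[i]
        loss = check_loss([h - (intercept + slope * t) for t, h in zip(ts, hs)], tau)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

def share_below(ts, hs, intercept, slope):
    """Fraction of observations strictly below the fitted line."""
    return sum(h < intercept + slope * t for t, h in zip(ts, hs)) / len(ts)

# Hypothetical temporary-plot data: age t (years) and top height h (metres).
rng = random.Random(7)
ts = [rng.uniform(5.0, 40.0) for _ in range(80)]
hs = [2.0 + 0.7 * t + rng.gauss(0.0, 3.0) for t in ts]

q25 = fit_quantile_line(ts, hs, 0.25)
q75 = fit_quantile_line(ts, hs, 0.75)
share_q25 = share_below(ts, hs, *q25)
share_q75 = share_below(ts, hs, *q75)
print(f"share below Q0.25 line: {share_q25:.3f}, below Q0.75 line: {share_q75:.3f}")
```

With tau = 0.5 the same loss is half the summed absolute error, so the median line is the least-absolute-deviations fit.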
Percentile Model, Irregular data.
• Sectioned tree data, height every year.
• Younger ages: full data set.
• Older ages: reduced data set.
• Establish tree percentiles at young age.
• Reassign censored percentiles at older ages.
• Compute (and model) means and standard deviations from heights and percentiles.
Percentile models, econometrics
• Koenker (2000):
• wonderful discussion of least squares, alternative methods, and statistical history.
• Minimization of summed absolute errors dates from the 1760s.
Height-Age Curves. Questions
• Should height growth models be the same as constant percentile curves?
• Are regressions from one age to another wanted?
• Is there any use for an index age other than as a label?
POPULATIONS

WHICH PROJECTION IS WANTED?

TREE GROWTH MODELS
• Δ DBH
• Mortality fractions.
• What ensures that the variance of projected stand table is correct?
• Need variance models as constraints?
• Different fitting techniques?
• Good luck and occasional checking?
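One way to see why the variance question matters: if every tree is advanced by its regression mean, the tree-to-tree variation in growth is dropped, and the projected diameter distribution comes out too narrow. The sketch below uses an invented stand, an invented linear increment equation, and an invented residual standard deviation.

```python
import random

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def mean_growth(dbh):
    """Hypothetical regression for periodic DBH increment (cm)."""
    return 1.0 + 0.08 * dbh

rng = random.Random(5)
residual_sd = 2.0  # hypothetical tree-level variation around the regression

# Hypothetical current stand table, as a list of tree diameters (cm).
dbh_now = [rng.gauss(25.0, 5.0) for _ in range(5000)]

# "Actual" future stand: regression mean plus tree-level variation.
dbh_actual = [d + mean_growth(d) + rng.gauss(0.0, residual_sd) for d in dbh_now]

# Deterministic projection: every tree receives exactly its regression mean.
dbh_projected = [d + mean_growth(d) for d in dbh_now]

var_actual, var_projected = variance(dbh_actual), variance(dbh_projected)
print(f"variance of actual DBH:    {var_actual:.2f}")
print(f"variance of projected DBH: {var_projected:.2f}")
```

The projected mean is fine in this setup; only the spread is understated, and the gap accumulates over repeated projection periods.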
RIGHT INDEPENDENT VARIABLES?

Regional H-DBH curves: biased by age or position in stand.

Alternative: local curves, another variable.

Bayesian Regression
• Neglected in Forestry?
• Empirical Bayes used in volume equations (Green and Strawderman, 1985).
• Taper and volume equations by forest district (McTague, Stansfield and Lan, 1992).
• Other opportunities?
Bayesian Opportunity
• Fit $y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots$
• Often by species or other category.
• Coefficients tested and omitted if non-significant.
• Or, selected coefficients fit in common for all species.
• Bayesian regression or other methods better?
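As a hedged illustration of the middle ground between "fit each species separately" and "fit one coefficient in common", the sketch below applies a simple normal-normal empirical Bayes shrinkage to species-level estimates of a single coefficient. This is a generic textbook-style calculation with a method-of-moments between-species variance, not the procedure of Green and Strawderman (1985). The coefficient values and standard errors are invented.

```python
def empirical_bayes_shrinkage(estimates, std_errors):
    """Shrink group-level estimates toward a common mean.

    Returns (common_mean, between_group_variance, shrunken_estimates).
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(wi * b for wi, b in zip(w, estimates)) / sum(w)

    # Method-of-moments estimate of between-group variance, floored at zero.
    q = sum(wi * (b - pooled) ** 2 for wi, b in zip(w, estimates))
    denom = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)

    # Common mean re-weighted by total (sampling + between-group) variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    common = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)

    shrunk = [
        (tau2 * b + se ** 2 * common) / (tau2 + se ** 2)
        for b, se in zip(estimates, std_errors)
    ]
    return common, tau2, shrunk

# Hypothetical estimates of one coefficient for four species,
# with their standard errors (the third species has few observations).
a1_hat = [0.82, 0.95, 0.60, 1.10]
a1_se = [0.05, 0.08, 0.30, 0.25]

common, tau2, a1_shrunk = empirical_bayes_shrinkage(a1_hat, a1_se)
retained = [(s - common) / (b - common) for s, b in zip(a1_shrunk, a1_hat)]
print(f"common mean = {common:.3f}, between-species variance = {tau2:.5f}")
print("fraction of each species' deviation retained:", [round(f, 3) for f in retained])
```

Poorly estimated coefficients are pulled almost all the way to the common value, and well-estimated ones keep much of their own signal, so no significance test decides between the two extremes.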
OTHER REGRESSION TECHNIQUES
• ML with better error characterization.
• Mixed models.
• Systems: seemingly unrelated regression, 2SLS, 3SLS, ...
• Generally more efficient, with better estimates of parameter variance; may avoid some biases. Necessary?
• Imputation?
SUMMARY
• What does population look like?
• What should be described?
• What techniques allow that?
REFERENCES
• Bi, H., A.D. Kozek and I.S. Ferguson. 2002. Quantile-based site index curves: a brief introductory note. Proc of IUFRO Symposium on Statistics and Technology in Forestry, Sept 8-12, 2002 Blacksburg. [ May be a related 2003 paper in J of Agr, Biological, and Environmental Statistics.]
• Bonnor, G.M., R. J. DeJong, P. Boudewyn and J. Flewelling. 1995. A guide to the STIM growth model. Nat. Res. Canada. Info Rpt X-353.
• Clutter, J.L., J.C. Fortson, L.V. Pienaar, G.H. Brister and R.L. Bailey. 1983. Timber management: a quantitative approach. Krieger Publ., Malabar, FL. 333 p.
• Curtis, R.O., D.J. DeMars and F.R. Herman. 1974. Which dependent variable in site index-height-age regressions? For. Sci. 20: 74-87.
• Draper, N. R. and H. Smith. 1998. Applied Regression Analysis. Wiley. New York. 706 p.
• Flewelling, J. 1982. Dominant height trends for plantations of loblolly pine at the Mississippi/Alabama region of Weyerhaeuser Company. Research Rpt 050-3415/3. Weyerhaeuser Forestry Research, Hot Springs. (unpublished)
• Flewelling, J., R. Collier, B. Gonyea, D. Marshall and E. Turnblom. 2001. Height-age curves for planted stands of Douglas fir, with adjustments for density. SMC Working Paper No. 1, Univ. of WA, Seattle.
REFERENCES
• Garcia, O., 1979. A stochastic Differential Equation Model for height growth of forest stands. Biometrics 39: 1059-1072.
• Green, E. and W.E. Strawderman. 1985. The use of Bayes/Empirical Bayes Estimation in Individual Tree Volume Equation Development. For. Sci. 31: 975-990.
• Koenker, R. 2000. Galton, Edgeworth, Frisch, and prospects for quantile regression in econometrics. J of Econometrics 95: 347-374.
• Koenker, R.W. and G.W. Bassett. 1978. Regression quantiles. Econometrica 46: 33-50.
• McTague, J.P., W.F. Stansfield, Z. Lan. 1992. Southwestern ponderosa pine, Douglas fir and white fir volume and taper functions. Report to USFS. Northern Arizona University.
• Ricker, W.E. 1984. Computation and uses of central trend lines. Can. J. Zool. 62: 1897-1905.
• Royall, R.M. 1970. On finite population sampling theory under certain linear regression models. Biometrika 57: 377-387.
• Särndal, C., B. Swensson, J. Wretman . 1992. Model assisted survey sampling. Springer-Verlag, New York. 694 p.
• Stuart, A. 1984. The ideas of sampling. Macmillan, New York. 91 p.