Lecture :Apply Gauss Markov Modeling Regression with One Explanator (Chapter 3.1–3.5, 3.7Chapter 4.1–4.4)
Agenda • Finding a good estimator for a straight line through the origin: Chapter 3.1–3.5, 3.7 • Finding a good estimator for a straight line with an intercept: Chapter 4.1–4.4
Where Are We? (範例) • We wish to uncover quantitative features of an underlying process, such as the relationship between family income and financial aid. • 更精準些 How much less aid will I receive on average for each dollar of additional family income? • DATA, a sample of the process, for example observations on 10,000 students’ aid awards and family incomes.
隨機項 • Other factors (e), such as number of siblings, influence any individual student’s aid, so we cannot directly observe the relationship between income and aid. • We need a rule for making a good guess about the relationship between income and financial aid, based on the data.
Guess • A good guess is a guess which is right on average. • We also desire a guess which will have a low variance around the true value.
估計式 • Our rule is called an “estimator.” • We started by brainstorming a number of estimators and then comparing their performances in a series of computer simulations. • We found that the Ordinary Least Squares estimator dominated the other estimators. • Why is Ordinary Least Squares so good?
工具 • To make more general statements, we need to move beyond the computer and into the world of mathematics. • Last time, we reviewed a number of mathematical tools: summations, descriptive statistics, expectations, variances, and covariances.
DGP • As a starting place, we need to write down all our assumptions about the way the underlying process works, and about how that process led to our data. • These assumptions are called the “Data Generating Process.” • Then we can derive estimators that have good properties for the Data Generating Process we have assumed.
Model • The DGP is a model to approximate reality. We trade off realism to gain parsimony and tractability. • Models are to be used, not believed.
DGP assumptions • Much of this course focuses on different types of DGP assumptions that you can make, giving you many options as you trade realism for tractability.
Two Ways to Screw Up in Econometrics • Your Data Generating Process assumptions missed a fundamental aspect of reality (your DGP is not a useful approximation); or • Your estimator did a bad job for your DGP. • Today we focus on picking a good estimator for your DGP.
GMT • Today, we will focus on deriving the properties of an estimator for a simple DGP: the Gauss–Markov Assumptions. • First we will find the expectations and variances of any linear estimator under the DGP. • Then we will derive the Best Linear Unbiased Estimator (BLUE).
Our Baseline DGP: Gauss–Markov(Chapter 3) • Y =bX +e • E(ei ) = 0 • Var(ei ) =s2 • Cov(ei ,ej) = 0, for i ≠ j • X ’s fixed across samples (so we can treat them like constants). • We want to estimateb
A Strategy for Inference • The DGP tells us the assumed relationships between the data we observe and the underlying process of interest. • Using the assumptions of the DGP and the algebra of expectations, variances, and covariances, we can derive key properties of our estimators, and search for estimators with desirable properties.
Linear Estimators • bg1 is unbiased. Can we generalize? • We will focus on linear estimators. • Linear estimator: a weighted sum of the Y’s.
Linear Estimators (weighted sum) • Linear estimator: • Example: bg1 is a linear estimator.
A class of Linear Estimators • All of our “best guesses” are linear estimators!
Check others • A linear estimator is unbiased if SwiXi = 1 • Are bg2 and bg4 unbiased?
Better unbiased estimator • Similar calculations hold for bg3 • All 4 of our “best guesses” are unbiased. • But bg4 did much better than bg3. Not all unbiased estimators are created equal. • We want an unbiased estimator with a low mean squared error.
First: A Puzzle….. • Suppose n = 1 • Would you like a big X or a small X for that observation? • Why?
(Stat. significant)? • bg1 puts more weight on observations with low values of X. • bg3 puts more weight on observations with low values of X, relative to neighboring observations. • These estimators did very poorly in the simulations.
What Observations Receive More Weight? (cont.) • bg2 weights all observations equally. • bg4 puts more weight on observations with high values of X. • These observations did very well in the simulations.
Why Weight More Heavily Observations With High X’s? • Under our Gauss–Markov DGP the disturbances are drawn the same for all values of X…. • To compare a high X choice and a low X choice, ask what effect a given disturbance will have for each.
Linear Estimators and Efficiency • For our DGP, good estimators will place more weight on observations with high values of X • Inferences from these observations are less sensitive to the effects of the same e • Only one of our “best guesses” had this property. • bg4 (a.k.a OLS) dominated the other estimators. • Can we do even better?
Min. MSE • Mean Squared Error = Variance + Bias2 • To have a low Mean Squared Error, we want two things: a low bias and a low variance.
Need Variance • An unbiased estimator with a low variance will tend to give answers close to the true value of b • Using the algebra of variances and our DGP, we can calculate the variance of our estimators.
Algebra of Variances • One virtue of independent observations is that Cov( Yi ,Yj ) = 0, killing all the cross-terms in the variance of the sum.
Back again to Our Baseline DGP: Gauss–Markov • Our benchmark DGP: Gauss–Markov • Y =bX +e • E(ei ) = 0 • Var(ei ) =s2 • Cov(ei,ej ) = 0, for i ≠ j • X’s fixed across samples We will refer to this DGP (very) frequently.
Variance of OLS (cont.) • Note: the higher theSXk2, the lower the variance.
Variance of a Linear Estimator • More generally:
Variance of a Linear Estimator (cont.) • The algebras of expectations and variances allow us to get exact results where the Monte Carlos gave only approximations. • The exact results apply to ANY model meeting our Gauss–Markov assumptions.
Variance of a Linear Estimator (cont.) • We now know mathematically that bg1–bg4 are all unbiased estimators of bunder our Gauss–Markov assumptions. • We also think from our Monte Carlo models that bg4 is the best of these four estimators, in that it is more efficient than the others. • They are all unbiased (we know from the algebra), but bg4 appears to have a smaller variance than the other 3.
Variance of a Linear Estimator (cont.) • Is there an unbiased linear estimator better (i.e., more efficient) than bg4? • What is the Best, Linear, Unbiased Estimator? • How do we find the BLUE estimator?
BLUE Estimators • Mean Squared Error = Variance + Bias2 • An unbiased estimator is right “on average” • In practice, we don’t get to average. We see only one draw from the DGP.
BLUE Estimators (Trade-off ??) • Some analysts would prefer an estimator with a small bias, if it gave them a large reduction in variance • What good is being right on average if you’re likely to be very wrong in your one draw?
BLUE Estimators (cont.) • Mean Squared Error = Variance + Bias2 • In a particular application, there may be a favorable trade-off between accepting a little bias in return for a lot less variance. • We will NOT look for these trade-offs. • Only after we have made sure our estimator is unbiased will we try to make the variance small.
BLUE Estimators (cont.) A Strategy for Finding the Best Linear Unbiased Estimator: • Start with linear estimators: SwiYi • Impose the unbiasedness condition SwiXi=1 • Calculate the variance of a linear estimator: Var(SwiYi) =s2Swi2 • Use calculus to find the wi that give the smallest variance subject to the unbiasedness condition Result: the BLUE Estimator for Our DGP
BLUE Estimators (cont.) • OLS is a very good strategy for the Gauss–Markov DGP. • OLS is unbiased: our guesses are right on average. • OLS is efficient: it has a small variance (or at least the smallest possible variance for unbiased linear estimators). • Our guesses will tend to be close to right (or at least as close to right as we can get; the minimum variance could still be pretty large!)
BLUE Estimator (cont.) • According to the Gauss–Markov Theorem, OLS is the BLUE Estimator for the Gauss–Markov DGP. • We will study other DGP’s. For any DGP, we can follow this same procedure: • Look at Linear Estimators • Impose the unbiasedness conditions • Minimize the variance of the estimator