Econometrics Session 2 – Introduction: Inference Amine Ouazad, Asst. Prof. of Economics
Outline of the course • Introduction: Identification • Introduction: Inference • Linear Regression • Identification Issues in Linear Regressions • Inference Issues in Linear Regressions
Previous session: Identification • Golden Benchmark: Randomization • Δ = E(Y(1)|D=1) – E(Y(0)|D=0) • We do not in fact observe E(Y(d)|D=d)… • But we observe:
This session: Introduction: Inference • What problems appear because of the limited number of observations? • Hands-on problem #1: • At the dinner table, your brother-in-law suggests playing heads or tails using a coin. You suspect he is cheating. How do you prove that the coin is unbalanced?
This session: Introduction: Inference • Hands-on problem #2: • Using a survey of 1,248 subjects in Singapore, you determine that the average income is $29,041 per year. How close is this mean to the true average income of Singaporeans? Do we have enough data?
This session: Introduction: Inference • Convergence • The Law of Large Numbers • The Central Limit Theorem • Hypothesis Testing • Inference for the estimation of treatment effects.
Session 2 - Inference 1. convergence
Warning (you can ignore this) • Proofs of the LLN and the CLT are omitted since most of their details are irrelevant to daily econometric practice. • There are multiple flavors of the LLN and the CLT. I only introduce one flavor per theorem. I will introduce more versions as needed in the following sessions, but do not put too much emphasis on the distinctions (see Appendix D of Greene).
Notations • An estimator of a quantity is a function of the observations in the sample. • Examples: • Estimator of the fraction of women in Singapore. • Estimator of the average salary of Chinese CEOs. • Estimator of the effect of a medication on patients’ health. • An estimator is typically denoted with a hat. • An estimator sometimes has an index n for the number of observations in the sample.
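Convergence • Convergence in probability. • An estimator $\hat{\theta}_n$ of $\theta$ converges in probability to $\theta$ if, for all $\varepsilon > 0$, $P(|\hat{\theta}_n - \theta| > \varepsilon) \to 0$ as $n \to \infty$. • We write $\operatorname{plim} \hat{\theta}_n = \theta$. • An estimator of $\theta$ is consistent if it converges in probability to $\theta$.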
Session 2 - Inference 2. Law of large numbers
Law of Large Numbers • Let $X_1, \dots, X_n$ be an independent sequence of random variables, with finite expected value $\mu = E(X_j)$ and finite variance $\sigma^2 = V(X_j)$. Let $S_n = X_1 + \dots + X_n$. Then, for any $\varepsilon > 0$, • $P(|S_n/n - \mu| > \varepsilon) \to 0$ • as $n \to \infty$.
Law of large numbers • The empirical mean of a series of random variables X1, …, Xn converges in probability to the common expected value of these random variables. • Application: What is the fraction of women in Singapore? • Xi = 1 if individual i is a woman, 0 otherwise. • E(Xi) is the fraction of women in the population. • For a large enough sample, the empirical mean is arbitrarily close to the true fraction of women in Singapore with high probability. • Subtlety?
Another application • Load the micro census data. • Take 100 samples, each a 5% draw of the dataset. Calculate the fraction of women in each sample. • Consider the approximation that the fraction of women is 51% exactly. • Illustrate that for epsilon = 0.5%, the number of samples with a mean outside 51% ± 0.5% shrinks as the size of the sample increases.
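The exercise above can be sketched without the census file. The micro census data is not available here, so this example is a simulation that assumes a population in which each individual is a woman with probability exactly 51%. The function name `share_outside_band` and the sample sizes are illustrative choices, not part of the course material.

```python
import random

def share_outside_band(sample_size, n_samples=100, p=0.51, eps=0.005, seed=0):
    """Share of simulated samples whose mean lies outside p +/- eps.

    ASSUMPTION: draws are independent Bernoulli(p) variables standing in
    for the census records, which are not available in this sketch.
    """
    rng = random.Random(seed)
    outside = 0
    for _ in range(n_samples):
        # Count simulated individuals who are women in this sample.
        women = sum(1 for _ in range(sample_size) if rng.random() < p)
        if abs(women / sample_size - p) > eps:
            outside += 1
    return outside / n_samples

if __name__ == "__main__":
    # The share of samples outside the band should shrink as n grows.
    for n in (100, 1_000, 10_000, 50_000):
        print(n, share_outside_band(n))
```

With the real data, the Bernoulli draws would be replaced by random subsamples of the census records; the logic of counting samples outside the band stays the same.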
Session 2 - Inference 2. Central limit theorem
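Central Limit Theorem • Lindeberg-Levy Central Limit Theorem: • If $x_1, \dots, x_n$ are an independent random sample from a probability distribution with finite mean $\mu$ and finite variance $\sigma^2$, and $\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i$, then $\sqrt{n}\,(\bar{x}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$. • Proof: Rao (1973, p.127) using characteristic functions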
Applications: Central Limit Theorem • Exercise #1: • You observe heads,tails,tails,heads,tails,heads. • Give an estimate of the probability of heads, with a 95% confidence interval. • Exercise #2: • Solve the hands-on problem #2 at the beginning of these slides. • Discuss the assumptions of the CLT.
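One possible way to approach Exercise #1 is sketched below. It relies on the normal approximation from the CLT, which is rough with only six tosses, so the interval is an illustration of the mechanics and not a definitive answer. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Observed sequence from Exercise #1.
tosses = ["heads", "tails", "tails", "heads", "tails", "heads"]

n = len(tosses)
p_hat = tosses.count("heads") / n        # sample share of heads
se = sqrt(p_hat * (1 - p_hat) / n)       # estimated standard error of p_hat
z = NormalDist().inv_cdf(0.975)          # two-sided 95% normal quantile

ci_low, ci_high = p_hat - z * se, p_hat + z * se
print(f"estimate = {p_hat:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval is very wide, which is a useful starting point for the discussion of the CLT assumptions: with so few observations, both the precision and the quality of the normal approximation are poor.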
Session 2 - Inference 3. Hypothesis testing
Hypothesis testing • Null hypothesis H0. • Alternative hypothesis Ha. • Unknown parameter θ. • Typical null hypotheses: • Is θ = 0 ? • Is θ > 3 ? • Is θ = φ ? (if φ is another unknown parameter). • Is θ = 4 φ ?
Hypothesis Testing: Applications • Application #1 (Coin toss): is the coin balanced? • Write the null hypothesis. • Given the information presented before, can we reject the null hypothesis at the 95% confidence level? • Application #2 (Average income): is the average income greater than $29,000 ? • Write the null hypothesis. • Given the information presented before, can we reject the null hypothesis at the 90% confidence level?
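t-test • From the Central Limit Theorem, if the standard deviation $\sigma$ were known, under the null hypothesis $\mu = \mu_0$: $\sqrt{n}\,(\bar{x}_n - \mu_0)/\sigma \xrightarrow{d} N(0, 1)$ • But the s.d. is estimated by the sample standard deviation $s_n$, and, under the null hypothesis, $t = \sqrt{n}\,(\bar{x}_n - \mu_0)/s_n$ follows a Student distribution with $n-1$ degrees of freedom (exactly if the observations are normal, approximately otherwise).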
Critical region • Region for which the null hypothesis is rejected. • If the null hypothesis is true, then the null is rejected in 5% of cases if the critical region is: $|t| > c_{0.975}$, with $t$ the t-statistic, • where $c_q$ is the $q$th quantile of the Student distribution with $n-1$ degrees of freedom.
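The one-sample, two-sided t-test can be sketched on the coin data from the hands-on problem. The Python standard library has no Student quantile function, so this sketch computes one numerically (Simpson's rule plus bisection). The helper names `t_pdf`, `t_cdf` and `t_quantile` are illustrative and not from any library. With binary data and n = 6 the Student distribution is only an approximation.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the Student distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF at x >= 0: 0.5 plus Simpson's rule on [0, x] (symmetry around 0)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(q, df):
    """Quantile for 0.5 < q < 1 by bisection; assumes the answer is below 50."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Coin data: 1 = heads, 0 = tails. Null hypothesis: P(heads) = 0.5.
x = [1, 0, 0, 1, 0, 1]
n, mu0 = len(x), 0.5
t_stat = sqrt(n) * (mean(x) - mu0) / stdev(x)   # stdev uses n - 1
critical = t_quantile(0.975, n - 1)             # c_{0.975}, n - 1 d.o.f.
reject = abs(t_stat) > critical
print(f"t = {t_stat:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

In practice a statistics package would supply the quantile directly; the numerical version is used here only to keep the example self-contained.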
Flavors of t-tests • One-sample, two-sided. • See previous slides. • One-sample, one-sided. • Two-sample, two-sided. • Equal and unequal variances. • Two-sample, one-sided. • Equal and unequal variances.
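Errors • E.g. in judicial trials, medical tests, security checks. • Type I error: rejecting the null when the null is true. Type II error: not rejecting the null when the null is false. • Power of a test 1-β: probability of rejecting the null when the null is false (β is the probability of a type II error). • Size of a test α: probability of a type I error.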
Quirky • Many papers run a large number of tests on the same data. • Many papers report only significant tests… • What is wrong with this approach? • Many papers run “robustness checks”, i.e. tests where the null hypothesis should not be rejected. • What is wrong with this approach? • Conclusion: • This is wrong, but common practice. For more, see the January 2012 issue of Strategic Management Journal.
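A small calculation illustrates one problem with running many tests on the same data. It assumes, for simplicity, that all null hypotheses are true and that the tests are independent, which real tests on one dataset usually are not. The function name is illustrative.

```python
def prob_at_least_one_false_rejection(n_tests, alpha=0.05):
    """P(at least one type I error) across n_tests independent tests of size alpha.

    ASSUMPTION: every null hypothesis is true and the tests are independent.
    """
    return 1 - (1 - alpha) ** n_tests

if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        print(k, round(prob_at_least_one_false_rejection(k), 3))
```

Under these assumptions, 20 tests at the 5% level already produce at least one spurious "significant" result roughly 64% of the time, so reporting only the significant tests misrepresents the evidence.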
Session 2 - Inference 4. Inference for treatment effects
Treatment effects: Inference (inspired by Lazear) • There are two groups, a treatment and a control group. • 128 employees are randomly allocated to the treatment and to the control. • Treatment employees: piece rate payoff. • Control employees: fixed pay. • Average output is 38.3 pieces per hour in the treatment group, and 23.1 in the control group.
Questions • Why do we perform a randomized experiment? • Do we have enough information to get an estimator of the treatment effect? • Is the estimator consistent? • Is the estimator asymptotically normal? • Do we have enough information to get a 95% confidence interval around the estimator of the treatment effect? • Test the hypothesis that the piece rate is effective at raising output per hour.
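The questions above can be explored with a short sketch. The two group means come from the slide. The slide gives neither the group sizes nor any measure of dispersion, so the equal 64/64 split and both standard deviations below are hypothetical placeholders, chosen only to show which extra information a confidence interval requires.

```python
from math import sqrt
from statistics import NormalDist

# From the slide: average pieces per hour in each group.
mean_treat, mean_control = 38.3, 23.1

# ASSUMPTION: equal split of the 128 employees (not stated in the slide).
n_treat, n_control = 64, 64

# HYPOTHETICAL standard deviations: the slide reports none.
sd_treat, sd_control = 12.0, 10.0

# Difference in means: the estimator of the treatment effect.
effect = mean_treat - mean_control

# Standard error of a difference of two independent sample means.
se = sqrt(sd_treat ** 2 / n_treat + sd_control ** 2 / n_control)

# Normal approximation justified by the CLT.
z = NormalDist().inv_cdf(0.975)
ci = (effect - z * se, effect + z * se)
print(f"effect = {effect:.1f}, se = {se:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

The point estimate needs only the two means, whereas the standard error and the interval depend entirely on the placeholder values, which is the distinction the questions are probing.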