ELEC 303 – Random Signals, Lecture 18 – Classical Statistical Inference. Dr. Farinaz Koushanfar, ECE Dept., Rice University, Nov 4, 2010
Lecture outline • Reading: 9.1-9.2 • Confidence Intervals • Central limit theorem • Student t-distribution • Linear regression
Confidence interval • Consider an estimator $\hat\Theta_n$ of an unknown parameter $\theta$ • We fix a confidence level, $1-\alpha$ • For every $\theta$, replace the single point estimator with a lower estimate $\hat\Theta_n^-$ and an upper one $\hat\Theta_n^+$, s.t. $P(\hat\Theta_n^- \le \theta \le \hat\Theta_n^+) \ge 1-\alpha$ • We call $[\hat\Theta_n^-, \hat\Theta_n^+]$ a $1-\alpha$ confidence interval
Confidence interval - example • Observations $X_i$ are i.i.d. normal with unknown mean $\theta$ and known variance $\sigma^2$, so the sample mean estimator has variance $\sigma^2/n$ • Let $\alpha = 0.05$ • Find the 95% confidence interval
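A minimal Python sketch of this construction (the true mean 10, $\sigma = 2$, and $n = 50$ are made-up values used only for illustration); it builds the interval $\hat\Theta_n \pm 1.96\,\sigma/\sqrt{n}$, since $\Phi(1.96) \approx 0.975$:

```python
import numpy as np

# Hypothetical data: n i.i.d. normal observations with unknown mean
# and (assumed) known standard deviation sigma -- illustrative values only.
rng = np.random.default_rng(0)
sigma, n = 2.0, 50
x = rng.normal(loc=10.0, scale=sigma, size=n)

alpha = 0.05
z = 1.96                      # Phi(1.96) ~ 0.975, so P(|Z| <= 1.96) ~ 0.95
theta_hat = x.mean()          # sample-mean estimator, variance sigma^2 / n
half_width = z * sigma / np.sqrt(n)
print(f"95% CI: [{theta_hat - half_width:.3f}, {theta_hat + half_width:.3f}]")
```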
Confidence interval (CI) • Wrong: the true parameter lies in the CI with 95% probability • Correct: suppose that $\theta$ is fixed • We construct the CI many times, using the same statistical procedure • Each time, obtain a collection of n observations and construct the corresponding CI • About 95% of these CIs will include $\theta$
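To see the correct interpretation in action, the sketch below repeats the CI construction many times on fresh simulated data (all parameter values are hypothetical) and counts how often the interval contains the fixed $\theta$:

```python
import numpy as np

# Coverage check: repeat the CI construction many times and count how often
# the interval contains the true (fixed) parameter theta. Values are illustrative.
rng = np.random.default_rng(1)
theta, sigma, n, trials = 10.0, 2.0, 50, 10_000
covered = 0
for _ in range(trials):
    x = rng.normal(theta, sigma, size=n)
    half = 1.96 * sigma / np.sqrt(n)
    if x.mean() - half <= theta <= x.mean() + half:
        covered += 1
print(f"empirical coverage: {covered / trials:.3f}")   # should be close to 0.95
```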
A note on the Central Limit Theorem (CLT) • Let $X_1, X_2, \ldots, X_n$ be a sequence of n independent and identically distributed RVs with finite expectation $\mu$ and variance $\sigma^2 > 0$ • CLT: as the sample size n increases, the PDF of the sample average of the RVs approaches $N(\mu, \sigma^2/n)$, irrespective of the shape of the original distribution
CLT [Figure: the density of a single variable and the densities of sums of two, three, and four such variables, illustrating the progression toward a normal shape]
CLT • Let the sum of the n random variables be $S_n = X_1 + \cdots + X_n$, and define a new RV $Z_n = \dfrac{S_n - n\mu}{\sigma\sqrt{n}}$ • The distribution of $Z_n$ converges to the standard normal $N(0,1)$ as $n \to \infty$ (this is convergence in distribution), written as $Z_n \xrightarrow{d} N(0,1)$ • In terms of the CDFs: $\lim_{n\to\infty} P(Z_n \le z) = \Phi(z)$ for every $z$
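A small simulation sketch of this convergence, using uniform RVs as a deliberately non-normal starting distribution (the sample sizes and the evaluation point $z = 1$ are arbitrary choices):

```python
import numpy as np

# CLT demo: standardize the sum of n i.i.d. Uniform(0,1) RVs and compare an
# empirical tail probability with the standard normal CDF value.
rng = np.random.default_rng(2)
mu, var = 0.5, 1.0 / 12.0          # mean and variance of Uniform(0, 1)
for n in (2, 4, 32):
    s_n = rng.uniform(0.0, 1.0, size=(100_000, n)).sum(axis=1)
    z_n = (s_n - n * mu) / np.sqrt(n * var)
    print(n, np.mean(z_n <= 1.0))  # approaches Phi(1.0) ~ 0.841 as n grows
```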
Confidence interval approximation • Suppose that the observations $X_i$ are i.i.d. with mean $\theta$ and variance $\sigma^2$, both unknown • Estimate the mean and the (unbiased) variance: $\hat\Theta_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, $\hat S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \hat\Theta_n)^2$ • We may estimate the variance $\sigma^2/n$ of the sample mean by $\hat S_n^2/n$ • For any given $\alpha$, we may use the CLT to approximate the confidence interval as $\bigl[\hat\Theta_n - \tfrac{z\,\hat S_n}{\sqrt{n}},\ \hat\Theta_n + \tfrac{z\,\hat S_n}{\sqrt{n}}\bigr]$, where $z$ is chosen from the normal table so that $\Phi(z) = 1-\alpha/2$ (e.g., $z = 1.96$ for $\alpha = 0.05$)
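A sketch of this approximate CI with estimated variance, again on hypothetical simulated data:

```python
import numpy as np

# Approximate CI when the variance is unknown: replace sigma^2/n by the
# unbiased sample variance divided by n. Data are hypothetical.
rng = np.random.default_rng(3)
x = rng.normal(10.0, 2.0, size=40)
n = len(x)

theta_hat = x.mean()
s2_hat = x.var(ddof=1)             # unbiased variance estimate (divide by n-1)
half = 1.96 * np.sqrt(s2_hat / n)  # z = 1.96 for alpha = 0.05 from the normal table
print(f"approximate 95% CI: [{theta_hat - half:.3f}, {theta_hat + half:.3f}]")
```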
Confidence interval approximation • Two different approximations are in effect: • Treating the sum as if it were a normal RV • Replacing the true variance by the variance estimated from the sample • Even in the special case where the $X_i$'s are i.i.d. normal, the variance is an estimate and the RV $T_n$ below is not normally distributed: $T_n = \dfrac{\sqrt{n}\,(\hat\Theta_n - \theta)}{\hat S_n}$
t-distribution • For normal $X_i$, it can be shown that the PDF of $T_n$ does not depend on $\theta$ or $\sigma$ • This is called the t-distribution with $n-1$ degrees of freedom
t-distribution • It is also symmetric and bell-shaped (like the normal) • The probabilities of various intervals are available in tables • When the $X_i$'s are normal and $n$ is relatively small, a more accurate CI is $\bigl[\hat\Theta_n - \tfrac{z\,\hat S_n}{\sqrt{n}},\ \hat\Theta_n + \tfrac{z\,\hat S_n}{\sqrt{n}}\bigr]$, where $z$ is chosen so that $\Psi_{n-1}(z) = 1-\alpha/2$ and $\Psi_{n-1}$ is the CDF of the t-distribution with $n-1$ degrees of freedom
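The sketch below compares the t quantile with the normal quantile for a few sample sizes (using scipy's inverse CDFs in place of a table lookup), showing why the t-based interval is wider for small n:

```python
import numpy as np
from scipy import stats

# For small n, the t quantile is noticeably larger than the normal quantile,
# so the t-based CI is wider (and more accurate when the X_i are normal).
alpha = 0.05
for n in (5, 10, 30, 100):
    t_q = stats.t.ppf(1 - alpha / 2, df=n - 1)   # t-distribution, n-1 dof
    z_q = stats.norm.ppf(1 - alpha / 2)          # standard normal, ~1.96
    print(f"n={n:4d}  t={t_q:.3f}  z={z_q:.3f}")
```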
Example • The weight of an object is measured 8 times using an electric scale • The scale reports the true weight plus a random error $\sim N(0,\sigma^2)$: .5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066 • Compute the 95% confidence interval using the t-distribution
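A sketch of the computation for this example, using the eight measurements listed above (scipy's t quantile stands in for the table lookup):

```python
import numpy as np
from scipy import stats

# The eight weight measurements from the slide.
x = np.array([0.5547, 0.5404, 0.6364, 0.6438, 0.4917, 0.5674, 0.5564, 0.6066])
n = len(x)

theta_hat = x.mean()
s_hat = x.std(ddof=1)                       # unbiased sample standard deviation
t_q = stats.t.ppf(0.975, df=n - 1)          # 95% two-sided, 7 degrees of freedom
half = t_q * s_hat / np.sqrt(n)
print(f"95% t-based CI: [{theta_hat - half:.4f}, {theta_hat + half:.4f}]")
```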
Linear regression • Building a model of the relation between two or more variables of interest • Consider two variables $x$ and $y$, based on a collection of data points $(x_i, y_i)$, $i=1,\ldots,n$ • Assume that the scatter plot of these two variables shows a systematic, approximately linear relationship between $x_i$ and $y_i$ • It is natural to build a model: $y \approx \theta_0 + \theta_1 x$
Linear regression • The model will generally not fit the data exactly; instead we estimate the parameters from the data, yielding estimates $\hat\theta_0$ and $\hat\theta_1$ • The i-th residual is: $\tilde y_i = y_i - (\hat\theta_0 + \hat\theta_1 x_i)$
Linear regression • The parameters are chosen to minimize the sum of squared residuals $\sum_{i=1}^{n}(y_i - \theta_0 - \theta_1 x_i)^2$ • Always keep in mind that the postulated model may not be true • To perform the minimization, we set the partial derivatives w.r.t. $\theta_0$ and $\theta_1$ to zero
Linear regression • Given n data pairs $(x_i, y_i)$, the estimates that minimize the sum of the squared residuals are $\hat\theta_1 = \dfrac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2}$ and $\hat\theta_0 = \bar y - \hat\theta_1 \bar x$, where $\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i$
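A direct Python transcription of these formulas; the data pairs at the bottom are hypothetical and serve only to exercise the function:

```python
import numpy as np

def least_squares_line(x, y):
    """Return (theta0_hat, theta1_hat) minimizing the sum of squared residuals."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    x_bar, y_bar = x.mean(), y.mean()
    theta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    theta0 = y_bar - theta1 * x_bar
    return theta0, theta1

# Hypothetical data points, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
print(least_squares_line(x, y))
```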
Example • The leaning tower of Pisa tilts continuously • Measurements were taken between 1975 and 1987 • Find the linear regression
Justification of least squares • Maximum likelihood • Approximation of Bayesian linear LMS estimation (under a possibly nonlinear model) • Approximation of Bayesian LMS estimation (under a linear model)
Maximum likelihood justification • Assume that the $x_i$'s are given numbers • Assume the $y_i$'s are realizations of RVs $Y_i$ as below, where the $W_i$'s are i.i.d. $\sim N(0,\sigma^2)$: $Y_i = \theta_0 + \theta_1 x_i + W_i$ • The likelihood function has the form $f_Y(y;\theta) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\Bigl(-\frac{(y_i-\theta_0-\theta_1 x_i)^2}{2\sigma^2}\Bigr)$ • ML is equivalent to minimizing the sum of squared residuals (see the derivation below)
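To spell out the equivalence: taking the logarithm of the likelihood above gives
$$\log f_Y(y;\theta) = -n\log\bigl(\sqrt{2\pi}\,\sigma\bigr) \;-\; \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i-\theta_0-\theta_1 x_i\bigr)^2,$$
and since the first term does not involve $\theta_0$ or $\theta_1$, maximizing over them is the same as minimizing $\sum_i (y_i-\theta_0-\theta_1 x_i)^2$.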
Approximate Bayesian linear LMS • Assume the $x_i$ and $y_i$ are realizations of RVs $X_i$ and $Y_i$ • The $(X_i, Y_i)$ pairs are i.i.d. with unknown joint PDF • Assume an additional independent pair $(X_0, Y_0)$ • We observe $X_0$ and want to estimate $Y_0$ by a linear estimator • The linear estimator is of the form $\hat Y_0 = \theta_0 + \theta_1 X_0$, with $\theta_0, \theta_1$ chosen to minimize the mean squared error $E[(Y_0 - \theta_0 - \theta_1 X_0)^2]$
Approximate Bayesian LMS • For the previous scenario, make the additional assumption of a linear model $Y_i = \theta_0 + \theta_1 X_i + W_i$, where the $W_i$'s are i.i.d. $\sim N(0,\sigma^2)$ and independent of the $X_i$ • We know that $E[Y_0|X_0]$ minimizes the mean squared estimation error; under this model, $E[Y_0|X_0] = \theta_0 + \theta_1 X_0$ • As $n \to \infty$, the least-squares parameter estimates converge to $\theta_0$ and $\theta_1$
Multiple linear regression • Many phenomena involve multiple underlying variables, also called explanatory variables • Such models are called multiple regression • E.g., for a triplet of data points $(x_i, y_i, z_i)$ we may wish to estimate the model: $y \approx \theta_0 + \theta_1 x + \theta_2 z$ • Minimize: $\sum_i (y_i - \theta_0 - \theta_1 x_i - \theta_2 z_i)^2$ • In general, we can consider the model $y \approx \theta_0 + \sum_j \theta_j h_j(x)$
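One way to compute such a fit is to stack the explanatory variables into a design matrix and solve the least-squares problem numerically; the sketch below does this with numpy, on hypothetical $(x_i, y_i, z_i)$ triples:

```python
import numpy as np

# Multiple linear regression y ~ theta0 + theta1*x + theta2*z, solved with
# numpy's least-squares routine. The data triples below are hypothetical.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
z = np.array([0.5, 1.5, 1.0, 2.0, 2.5, 3.0])
y = np.array([3.1, 6.0, 6.8, 9.9, 11.2, 13.1])

A = np.column_stack([np.ones_like(x), x, z])       # columns: 1, x, z
theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimizes sum of squared residuals
print(theta_hat)                                   # [theta0, theta1, theta2]
```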
Nonlinear regression • Sometimes the expression is nonlinear in the unknown parameter $\theta$ • Variables $x$ and $y$ obey the form $y \approx h(x;\theta)$ • Minimize $\sum_i (y_i - h(x_i;\theta))^2$ over $\theta$ • The minimization typically has no closed-form solution • Assuming the $W_i$'s are i.i.d. $\sim N(0,\sigma^2)$ and $Y_i = h(x_i;\theta) + W_i$, the likelihood function has the form $f_Y(y;\theta) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\Bigl(-\frac{(y_i - h(x_i;\theta))^2}{2\sigma^2}\Bigr)$, so ML again leads to minimizing the sum of squared residuals
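A sketch of a numerical fit for a hypothetical nonlinear model $h(x;\theta) = \theta_0 e^{\theta_1 x}$ (both the model form and the data are invented for illustration), using scipy's iterative least-squares solver:

```python
import numpy as np
from scipy.optimize import least_squares

# Nonlinear least squares for the hypothetical model h(x; theta) = theta0 * exp(theta1 * x).
# Data values are made up purely for illustration.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([1.1, 1.6, 2.6, 4.3, 7.0, 11.5])

def residuals(theta):
    return y - theta[0] * np.exp(theta[1] * x)     # y_i - h(x_i; theta)

fit = least_squares(residuals, x0=[1.0, 1.0])      # iterative numerical minimization
print(fit.x)                                       # estimated [theta0, theta1]
```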
Practical considerations • Heteroskedasticity • Nonlinearity • Multicollinearity • Overfitting • Causality