
Contact Information






Presentation Transcript


  1. Contact Information Dr. Daniel Simons Vancouver Island University Faculty of Management Building 250 - Room 416 Office Hours: MWF: 12:00 – 13:00 simonsd@viu.ca
  2. Suggestions for Best Individual Performance Attend all classes. Take notes: the course covers a lot of material and your notes are essential. Complete all assignments (not for grade). Read the book. Participate, enrich class discussion, provide feedback, and ask questions. Revise materials between classes, integrate concepts, and make sure you understand the tools and their application. Don't hesitate to contact me if necessary
  3. Evaluation Method Tests have a mix of problems that evaluate concepts, problem sets (assignments), class applications, readings, and new applications. Closed-book, time-constrained tests reward knowledge and speed. Each test covers slides, assignments, and required readings. The evaluation system may not be perfect, but it works
  4. Chapter 1 An Introduction to Econometrics
  5. Chapter Contents 1.1 Why Study Econometrics 1.2 What is Econometrics About 1.3 The Econometric Model 1.4 How Are Data Generated 1.5 Economic Data Types 1.6 The Research Process 1.7 Writing An Empirical Research Paper 1.8 Sources of Economic Data
  6. 1.1 Why Study Econometrics
  7. 1.1 Why Study Econometrics Econometrics fills a gap between being a “student of economics” and being a “practicing economist” It lets you tell your employer: “I can predict the sales of your product” “I can estimate the effect on your sales if your competition lowers its price by $1 per unit” “I can test whether your new ad campaign is actually increasing your sales” Helps you develop “intuition” about how things work and is invaluable if you go to graduate school
  8. Brief Overview of the Course
  9. 1.2 What is Econometrics About
  10. 1.2 What is Econometrics About Econometrics is about how we can use theory and data from economics, business, and the social sciences, along with tools from statistics, to answer ‘‘how much’’ questions.
  11. 1.2 What is Econometrics About In economics we express our ideas about relationships between economic variables using the mathematical concept of a function
  12. 1.2 What is Econometrics About Every day, decision-makers face ‘‘how much’’ questions: A city council ponders the question of how much violent crime will be reduced if an additional million dollars is spent putting uniformed police on the street The owner of a local Pizza Hut must decide how much advertising space to purchase in the local newspaper, and thus must estimate the relationship between advertising and sales Louisiana State University must estimate how much enrollment will fall if tuition is raised by $300 per semester, and thus whether its revenue from tuition will rise or fall The CEO of Procter & Gamble must estimate how much demand there will be in ten years for the detergent Tide, and how much to invest in new plant and equipment 1.2.1 Some Examples
  13. 1.2 What is Econometrics About Every day, decision-makers face ‘‘how much’’ questions (Continued): A real estate developer must predict by how much population and income will increase to the south of Baton Rouge, Louisiana, over the next few years, and whether it will be profitable to begin construction of a gambling casino and golf course You must decide how much of your savings will go into a stock fund, and how much into the money market. This requires you to make predictions of the level of economic activity, the rate of inflation, and interest rates over your planning horizon A public transportation council in Melbourne, Australia, must decide how an increase in fares for public transportation (trams, trains, and buses) will affect the number of travelers who switch to car or bike, and the effect of this switch on revenue going to public transportation 1.2.1 Some Examples
  14. 1.3 The Econometric Model
  15. 1.3 The Econometric Model An econometric model consists of a systematic part and a random and unpredictable component e that we will call a random error
  16. 1.3 The Econometric Model The coefficients β1, β2, …, β5 are unknown parameters of the model that we estimate using economic data and an econometric technique The functional form represents a hypothesis about the relationship between the variables In any particular problem, one challenge is to determine a functional form that is compatible with economic theory and the data
  17. 1.3 The Econometric Model The systematic portion is the part we obtain from economic theory, and includes an assumption about the functional form The random component represents a ‘‘noise’’ component, which obscures our understanding of the relationship among variables, and which we represent using the random variable e
  18. 1.3 The Econometric Model We use the econometric model as a basis for statistical inference The ways in which statistical inference is carried out include: Estimating economic parameters, such as elasticities, using econometric methods Predicting economic outcomes, such as the enrollment in two-year colleges in the United States for the next ten years Testing economic hypotheses, such as the question of whether newspaper advertising is better than store displays for increasing sales
  19. 1.4 How Are Data Generated?
  20. 1.4 How Are Data Generated? We must have data. Where do data come from? What type of real processes generate data? Economists and other social scientists work in a complex world in which data on variables are ‘‘observed’’ and rarely obtained from a controlled experiment This makes the task of learning about economic parameters all the more difficult
  21. 1.4 How Are Data Generated? One way to acquire information about the unknown parameters of economic relationships is to conduct or observe the outcome of an experiment Such controlled experiments are rare in business and the social sciences There are some examples of planned experiments in the social sciences A notable example of a planned experiment is Tennessee’s Project Star 1.4.1 Experimental Data
  22. 1.4 How Are Data Generated? An example of nonexperimental data is survey data: data on all variables are collected simultaneously, and the values are neither fixed nor repeatable 1.4.2 Non-experimental Data
  23. 1.5 Economic Data Types
  24. 1.5 Economic Data Types Economic data comes in a variety of ‘‘flavors.” Data may be collected at various levels of aggregation: Micro or Macro Data may also represent a flow or a stock: Flow: measured over a period of time Stock: measured at a particular point in time Data may be quantitative or qualitative: Quantitative: expressed as numbers Qualitative: expressed as an ‘‘either-or’’ situation
  25. 1.5 Economic Data Types A time-series is data collected over discrete intervals of time The key feature of time-series data is that the same economic quantity is recorded at a regular time interval 1.5.1 Time-Series Data
  26. 1.5 Economic Data Types Table 1.1 Annual GDP in Real 2005 Dollars 1.5.1 Time-Series Data
  27. 1.5 Economic Data Types Figure 1.1 Real U.S. GDP, 1980–2008 1.5.1 Time-Series Data
  28. 1.5 Economic Data Types A cross-section of data is collected across sample units in a particular time period The ‘‘sample units’’ are individual entities and may be firms, persons, households, states, or countries 1.5.2 Cross-Section Data
  29. 1.5 Economic Data Types Table 1.2 Cross Section Data: CPS August 2009 1.5.2 Cross-Section Data
  30. 1.5 Economic Data Types 1.5.3 Panel or Longitudinal Data A ‘‘panel’’ of data, also known as ‘‘longitudinal’’ data, has observations on individual micro-units that are followed over time The key aspect of panel data is that we observe each micro-unit for a number of time periods If we have the same number of time period observations for each micro-unit, we have a balanced panel Usually the number of time series observations is small relative to the number of micro-units, but not always
  31. 1.5 Economic Data Types Table 1.3 Panel Data from Two Rice Farms 1.5.3 Panel or Longitudinal Data
  32. STATISTICAL PRINCIPLES A review of the basic principles of statistics used in business settings. A BRIEF REVIEW OF QUME 232
  33. Basic Statistical Concepts Important that students are comfortable with the following: Concept of random variables (whether discrete or continuous) and their associated probability functions Cumulative, marginal, conditional and joint probability functions Mathematical expectations, concept of independence Probability Distributions, Binomial, Poisson, Uniform, Normal Standardized Variables z, t, F and χ2 distributions Sampling and Sampling Distributions Estimation, hypothesis testing, & confidence intervals
  34. Summations The Σ symbol is a shorthand notation for discussing sums of numbers. It works just like the + sign you learned about in elementary school.
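Since Σ is just repeated addition, a minimal Python sketch (the numbers are made up for illustration) makes the point concrete:

```python
# A minimal sketch: the summation symbol is shorthand for repeated
# addition over the elements x_1, ..., x_n.
x = [2, 5, 7, 10]

total = 0
for xi in x:      # explicit form of sum_{i=1}^{n} x_i
    total += xi

print(total)      # identical to Python's built-in sum(x)
```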
  35. Algebra of Summations
  36. Summations: A Useful Trick
  37. Double Summations The “Secret” to Double Summations: keep a close eye on the subscripts.
  38. Descriptive Statistics How can we summarize a collection of numbers? Mean: the arithmetic average. The mean is highly sensitive to a few large values (outliers). Median: the midpoint of the data. The median is the number above which lie half the observed numbers and below which lie the other half. The median is not sensitive to outliers.
  39. Descriptive Statistics (cont.) Mode: the most frequently occurring value. Variance: the mean squared deviation of a number from its own mean. The variance is a measure of the “spread” of the data. Standard deviation: the square root of the variance. The standard deviation provides a measure of a typical deviation from the mean.
  40. Descriptive Statistics (cont.) Covariance: the covariance of two sets of numbers, X and Y, measures how much the two sets tend to “move together.” If Cov(X,Y) > 0, then if X is above its mean, we would expect that Y would also be above its mean.
  41. Descriptive Statistics (cont.) Correlation Coefficient: the correlation coefficient between X and Y “norms” the covariance by the standard deviations of X and Y. You can think of this adjustment as a unit correction. The correlation coefficient will always fall between -1 and 1.
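The statistics defined on the last few slides can be computed directly with the standard library; a hedged sketch (the two data series are made up for illustration):

```python
# Illustrative sketch of mean, median, variance, standard deviation,
# covariance, and correlation for two made-up series X and Y.
import statistics as st

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 10]

mean_x, mean_y = st.mean(X), st.mean(Y)
median_x = st.median(X)              # midpoint; robust to outliers
var_x = st.pvariance(X)              # mean squared deviation from the mean
sd_x, sd_y = st.pstdev(X), st.pstdev(Y)

# Covariance: average product of joint deviations from the means
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / len(X)

# Correlation: covariance "normed" by the two standard deviations
corr_xy = cov_xy / (sd_x * sd_y)

print(mean_x, median_x, var_x, round(corr_xy, 3))
```

The correlation always lands between -1 and 1, as the slide states, because the norming by both standard deviations removes the units of X and Y.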
  42. A Quick Example
  43. A Quick Example (cont.)
  44. A Quick Example (cont.)
  45. Populations and Samples Two uses for statistics: Describe a set of numbers Draw inferences from a set of numbers we observe to a larger population The population is the underlying structure which we wish to study. Surveyors might want to relate 6000 randomly selected voters to all the voters in the United States. Macroeconomists might want to relate data about unemployment and inflation from 1958–2004 to the underlying process linking unemployment and inflation, to predict future realizations.
  46. Populations and Samples (cont.) We cannot observe the entire population. Instead, we observe a sample drawn from the population of interest. In the Monte Carlo demonstration from last time, an individual dataset was the sample and the Data Generating Process described the population.
  47. Populations and Samples (cont.) The descriptive statistics we use to describe data can also describe populations. What is the mean income in the United States? What is the variance of mortality rates across countries? What is the covariance between gender and income?
  48. Populations and Samples (cont.) In a sample, we know exactly the mean, variance, covariance, etc. We can calculate the sample statistics directly. We must infer the statistics for the underlying population. Means in populations are also called expectations.
  49. Populations and Samples (cont.) If the true mean income in the United States is b, then we expect a simple random sample to have sample mean b. In practice, any given sample will also include some “sampling noise.” We will observe not b, but b + e. If we have drawn our sample correctly, then on average the sampling error over many samples will be 0. We write this as E(e) = 0
  50. Probability A random variable X is a variable whose numerical value is determined by chance, the outcome of a random phenomenon A discrete random variable has a countable number of possible values, such as 0, 1, and 2 A continuous random variable, such as time and distance, can take on any value in an interval A probability distribution P[Xi] for a discrete random variable X assigns probabilities to the possible values X1, X2, and so on For example, when a fair six-sided die is rolled, there are six equally likely outcomes, each with a 1/6 probability of occurring
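The fair-die example above can be written down as a tiny distribution table; a sketch using exact fractions:

```python
# A minimal sketch of a discrete probability distribution: a fair die
# assigns P[X = x] = 1/6 to each of the six outcomes.
from fractions import Fraction

P = {x: Fraction(1, 6) for x in range(1, 7)}

print(sum(P.values()))   # a probability distribution must sum to 1
```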
  51. Mean, Variance, and Standard Deviation The expected value (or mean) of a discrete random variable X is a weighted average of all possible values of X, using the probability of each X value as weights: μ = E(X) = Σ xi P[xi] (17.1) The variance of a discrete random variable X is a weighted average, for all possible values of X, of the squared difference between X and its expected value, using the probability of each X value as weights: σ² = Σ (xi − μ)² P[xi] (17.2) The standard deviation σ is the square root of the variance
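Applying these weighted-average definitions to the fair six-sided die gives the classic values; a sketch with exact fractions:

```python
# Sketch of the expected-value and variance definitions (equations
# 17.1 and 17.2 on the slide) for a fair six-sided die.
import math
from fractions import Fraction

P = {x: Fraction(1, 6) for x in range(1, 7)}          # fair die

mu = sum(x * p for x, p in P.items())                 # E(X): weighted average
var = sum((x - mu) ** 2 * p for x, p in P.items())    # weighted squared deviations
sigma = math.sqrt(var)                                # standard deviation

print(mu, var)   # 7/2 and 35/12
```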
  52. Continuous Random Variables Our examples to this point have involved discrete random variables, for which we can count the number of possible outcomes: The coin can be heads or tails; the die can be 1, 2, 3, 4, 5, or 6 For continuous random variables, however, the outcome can be any value in a given interval A continuous probability density curve shows the probability that the outcome is in a specified interval as the corresponding area under the curve
  53. Expectations Expectations are means over all possible samples (think “super” Monte Carlo). Means are sums. Therefore, expectations follow the same algebraic rules as sums. See the Statistics Appendix for a formal definition of Expectations.
  54. Algebra of Expectations k is a constant. E(k) = k E(kY) = kE(Y) E(k+Y) = k + E(Y) E(Y+X) = E(Y) + E(X) E(ΣYi) = ΣE(Yi), where each Yi is a random variable.
  55. Variances Population variances are also expectations.
  56. Algebra of Variances One value of independent observations is that Cov(Yi ,Yj ) = 0, killing all the cross-terms in the variance of the sum.
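That variance rule can be checked by simulation; a hedged sketch with a seeded generator (the distributions chosen here are arbitrary):

```python
# Sketch: with independent draws, Cov(X, Y) = 0 kills the cross-terms,
# so Var(X + Y) = Var(X) + Var(Y).  Seeded simulation as a sanity check.
import random
import statistics as st

rng = random.Random(0)
N = 100_000
X = [rng.gauss(0, 1) for _ in range(N)]   # Var(X) = 1
Y = [rng.gauss(0, 2) for _ in range(N)]   # Var(Y) = 4, independent of X

var_sum = st.pvariance([x + y for x, y in zip(X, Y)])
print(round(var_sum, 1))   # should land close to 1 + 4 = 5
```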
  57. 2 random variables: joint and marginal distributions The joint probability distribution of X and Y, two discrete random variables, is the probability that the two variables simultaneously take on certain values, say x and y. Example: Weather conditions and commuting times Let Y = 1 if the commute is short (less than 20 minutes) and = 0 otherwise. Let X = 0 if it is raining and 1 if it is not The joint probability is the frequency with which each of the four possible outcomes (X=0,Y=0) (X=1,Y=0) (X=0,Y=1) (X=1,Y=1) occurs over many repeated commutes
  58. Joint probability Distribution Over many commutes, 15% of the days have rain and a long commute; that is, P(X=0, Y=0) = 0.15. This is a joint probability distribution
  59. Marginal Probability Distribution The marginal probability distribution of a random variable X is just another name for its probability distribution. The marginal distribution of X from above is P(X=0) = 0.30, P(X=1) = 0.70. Find E(X) and Var(X)
  60. Conditional Probability Distribution The probability distribution of a random variable X conditional on another random variable Y taking on a specific value. The probability of X given Y. P(X=x|Y=y) P(X=x|Y=y) = P(X=x, Y=y)/ P(Y=y) Conditional probability of X given Y = joint probability of x and y divided by marginal probability of Y (the condition)
  61. Conditional Distribution P(Y=0|X=0) = P(X=0,Y=0)/ P(X=0) = 0.15/0.30 =0.5 Conditional Expectation: The conditional expectation of Y given X, that is the conditional mean of Y given X, is the mean of the conditional distribution of Y given X
  62. The expected value of Y (a short commute) given that it is raining is E(Y|X=0) = (0)(0.15/0.30) + (1)(0.15/0.30) = 0.5
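The commuting example can be worked through in code. Only P(X=0,Y=0) = 0.15, P(X=0) = 0.30, and P(Y=0) = 0.22 appear on the slides; the remaining joint-table cells below are implied by those marginals, so treat them as a reconstruction:

```python
# Sketch of the commuting example: X = 0 rain / 1 no rain,
# Y = 0 long commute / 1 short commute.  Cells (1,0) and (1,1) are
# implied by the marginals quoted on the slides, not stated directly.
joint = {(0, 0): 0.15, (0, 1): 0.15, (1, 0): 0.07, (1, 1): 0.63}

p_x0 = joint[(0, 0)] + joint[(0, 1)]           # marginal P(X=0)
p_y0_given_x0 = joint[(0, 0)] / p_x0           # conditional P(Y=0 | X=0)
e_y_given_x0 = (0 * joint[(0, 0)] + 1 * joint[(0, 1)]) / p_x0

print(round(p_y0_given_x0, 2), round(e_y_given_x0, 2))   # 0.5 0.5
```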
  63. Law of Iterated Expectations The expected value of the expected value of Y conditional on X is the expected value of Y. If we take expectations separately for each subpopulation (each value of X), and then take the expectation of this expectation, we get back the expectation for the whole population.
  64. Independence Two random variables X and Y are independently distributed, or independent, if knowing the value of one of the variables provides no information about the other. Specifically, when E(Y|X) = E(Y) Or alternatively, P(Y=y|X=x) = P(Y=y) for all values of x and y Or P(Y=y,X=x) = P(Y=y)*P(X=x) That is, the joint distribution of two independent variables is the product of their marginal distributions
  65. Independence Are commuting times and weather conditions independent? P(Y=0,X=0) = 0.15, but P(Y=0) * P(X=0) = 0.22 * 0.30 = 0.066. Since 0.15 ≠ 0.066, X and Y are NOT independent
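The slide's independence check is a one-line comparison of the joint probability with the product of the marginals:

```python
# Direct check of the slide's arithmetic: a joint probability of 0.15
# versus a product of marginals of 0.066 rules out independence.
p_joint = 0.15                # P(Y=0, X=0), from the slides
p_product = 0.22 * 0.30       # P(Y=0) * P(X=0)

independent = abs(p_joint - p_product) < 1e-12
print(round(p_product, 3), independent)
```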
  66. Covariance and Correlation Covariance is a measure of the extent to which two random variables move together If X and Y are independent then the covariance is zero, but a covariance of zero does not imply independence. A zero covariance implies only the absence of a linear relationship
  67. Covariance and Correlation There is a positive relationship between commuting times and weather conditions
  68. Correlation Correlation solves the units problem of covariance. It is also a measure of dependence. It is unitless and has values between -1 and 1. A value of zero implies that X and Y are uncorrelated.
  69. Standardized Variables To standardize a random variable X, we subtract its mean μ and then divide by its standard deviation σ: Z = (X − μ)/σ (17.3) No matter what the initial units of X, the standardized random variable Z has a mean of 0 and a standard deviation of 1 The standardized variable Z measures how many standard deviations X is above or below its mean: If X is equal to its mean, Z is equal to 0 If X is one standard deviation above its mean, Z is equal to 1 If X is two standard deviations below its mean, Z is equal to –2 Figures 17.4 and 17.5 illustrate this for the case of dice and fair coin flips, respectively
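Standardization is easy to verify on a small data set; a sketch (the numbers are made up, chosen so the mean and standard deviation come out whole):

```python
# Sketch of standardization: Z = (X - mu) / sigma has mean 0 and
# standard deviation 1, whatever the original units of X.
import statistics as st

X = [2, 4, 4, 4, 5, 5, 7, 9]
mu, sigma = st.mean(X), st.pstdev(X)     # here mu = 5, sigma = 2
Z = [(x - mu) / sigma for x in X]

print(st.mean(Z), st.pstdev(Z))          # 0 and 1, up to rounding
```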
  70. Figure 17.4a Probability Distribution for Six-Sided Dice, Using Standardized Z
  71. Figure 17.4b Probability Distribution for Six-Sided Dice, Using Standardized Z
  72. Figure 17.4c Probability Distribution for Six-Sided Dice, Using Standardized Z
  73. Figure 17.5a Probability Distribution for Fair Coin Flips, Using Standardized Z
  74. Figure 17.5b Probability Distribution for Fair Coin Flips, Using Standardized Z
  75. Figure 17.5c Probability Distribution for Fair Coin Flips, Using Standardized Z
  76. The Normal Distribution The density curve for the normal distribution is graphed in Figure 17.6 The probability that the value of Z will be in a specified interval is given by the corresponding area under this curve These areas can be determined by consulting statistical software or a table, such as Table B-7 in Appendix B Many things follow the normal distribution (at least approximately): the weights of humans, dogs, and tomatoes The lengths of thumbs, widths of shoulders, and breadths of skulls Scores on IQ, SAT, and GRE tests The number of kernels on ears of corn, ridges on scallop shells, hairs on cats, and leaves on trees
  77. Figure 17.6 The Normal Distribution
  78. The Normal Distribution (cont.) The central limit theorem is a very strong result for empirical analysis that builds on the normal distribution The central limit theorem states that: if Z is a standardized sum of N independent, identically distributed (discrete or continuous) random variables with a finite, nonzero standard deviation, then the probability distribution of Z approaches the normal distribution as N increases
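The central limit theorem can be illustrated by simulation; a hedged sketch using standardized sums of uniform draws (only the first two moments are checked here, with a seeded generator for reproducibility):

```python
# Sketch of the central limit theorem: a standardized sum of N
# independent Uniform(0,1) draws.  A histogram of Z would look
# increasingly normal as N grows; here we just check its moments.
import math
import random
import statistics as st

rng = random.Random(42)
N = 30          # terms per sum
reps = 20_000   # number of standardized sums

def standardized_sum():
    s = sum(rng.random() for _ in range(N))
    mu, sigma = N * 0.5, math.sqrt(N / 12)   # mean and sd of the sum
    return (s - mu) / sigma

Z = [standardized_sum() for _ in range(reps)]
print(round(st.mean(Z), 2), round(st.pstdev(Z), 2))   # near 0 and 1
```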
  79. Sampling Recall that: Population: the entire group of items that interests us Sample: the part of this population that we actually observe Statistical inference involves using the sample to draw conclusions about the characteristics of the population from which the sample came
  80. Selection Bias Any sample that differs systematically from the population that it is intended to represent is called a biased sample One of the most common causes of biased samples is selection bias, which occurs when the selection of the sample systematically excludes or underrepresents certain groups Selection bias often happens when we use a convenience sample consisting of data that are readily available Self-selection bias can occur when we examine data for a group of people who have chosen to be in that group
  81. Survivor and Nonresponse Bias A retrospective study looks at past data for a contemporaneously selected sample, for example an examination of the lifetime medical records of 65-year-olds A prospective study, in contrast, selects a sample and then tracks the members over time By its very design, retrospective studies suffer from survivor bias: we necessarily exclude members of the past population who are no longer around! Nonresponse bias: the systematic refusal of some groups to participate in an experiment or to respond to a poll
  82. The Power of Random Selection In a simple random sample of size N from a given population: each member of the population is equally likely to be included in the sample every possible sample of size N from this population has an equal chance of being selected How do we actually make random selections? We would like a procedure that is equivalent to the following: put the name of each member of the population on its own slip of paper drop these slips into a box mix thoroughly pick members out randomly In practice, random sampling is usually done through some sort of numerical identification combined with a computerized random selection of numbers
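The numerical-identification procedure described above is exactly what `random.sample` does; a sketch with made-up IDs:

```python
# Sketch of the slip-of-paper procedure done numerically: each member
# gets an ID, and random.sample picks N of them without replacement,
# each member equally likely to be included.
import random

rng = random.Random(7)
population = list(range(1, 1001))     # IDs for a population of 1000
sample = rng.sample(population, 10)   # simple random sample, N = 10

print(sorted(sample))
```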
  83. Estimation First, some terminology: Parameter: a characteristic of the population whose value is unknown, but can be estimated Estimator: a sample statistic that will be used to estimate the value of the population parameter Estimate: the specific value of the estimator that is obtained in one particular sample Sampling variation: the notion that because samples are chosen randomly, the sample average will vary from sample to sample, sometimes being larger than the population mean and sometimes lower
  84. Sampling Distributions The sampling distribution of a statistic is the probability distribution or density curve that describes the population of all possible values of this statistic For example, it can be shown mathematically that if the individual observations are drawn from a normal distribution, then the sampling distribution for the sample mean is also normal Even if the population does not have a normal distribution, the sampling distribution of the sample mean will approach a normal distribution as the sample size increases It can be shown mathematically that the sampling distribution for the sample mean has mean μ and standard deviation σ/√N (17.5)
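Both claims, the mean and spread of the sampling distribution and the approach to normality from a non-normal population, can be checked by simulation; a hedged, seeded sketch using an exponential population (μ = 1, σ = 1):

```python
# Sketch: sample means from an exponential population (mu = 1,
# sigma = 1) should center on mu with spread sigma / sqrt(N), even
# though the population itself is far from normal.
import random
import statistics as st

rng = random.Random(1)
N = 25
means = [st.mean(rng.expovariate(1.0) for _ in range(N))
         for _ in range(20_000)]

print(round(st.mean(means), 2))    # near mu = 1
print(round(st.pstdev(means), 2))  # near sigma / sqrt(N) = 0.2
```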
  85. The Mean of the Sampling Distribution A sample statistic is an unbiased estimator of a population parameter if the mean of the sampling distribution of this statistic is equal to the value of the population parameter Because the mean of the sampling distribution of X̄ is μ, X̄ is an unbiased estimator of μ
  86. The Standard Deviation of the Sampling Distribution One way of gauging the accuracy of an estimator is with its standard deviation: If an estimator has a large standard deviation, there is a substantial probability that an estimate will be far from its mean If an estimator has a small standard deviation, there is a high probability that an estimate will be close to its mean
  87. The t-Distribution When the mean of a sample from a normal distribution is standardized by subtracting the mean of its sampling distribution and dividing by the standard deviation of its sampling distribution, the resulting Z variable has a normal distribution W.S. Gosset determined (in 1908) the sampling distribution of the variable that is created when the mean of a sample from a normal distribution is standardized by subtracting and dividing by its standard error (≡ the standard deviation of an estimator): t = (X̄ − μ)/(s/√N)
  88. The t-Distribution (cont.) The exact distribution of t depends on the sample size; as the sample size increases, we are increasingly confident of the accuracy of the estimated standard deviation Table B-1 at the end of the textbook shows some probabilities for various t-distributions that are identified by the number of degrees of freedom: degrees of freedom = # observations - # estimated parameters
  89. Confidence Intervals A confidence interval measures the reliability of a given statistic such as X̄ The general procedure for determining a confidence interval for a population mean can be summarized as: 1. Calculate the sample average X̄ 2. Calculate the standard error of X̄ by dividing the sample standard deviation s by the square root of the sample size N 3. Select a confidence level (such as 95 percent) and look in Table B-1 with N-1 degrees of freedom to determine the t-value that corresponds to this probability 4. A confidence interval for the population mean is then given by: X̄ ± t (s/√N)
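The four-step recipe translates directly into code; a hedged sketch (the sample is made up, and the critical value 2.262 is the 95 percent t-value with N-1 = 9 degrees of freedom, the kind of number read from a table like B-1):

```python
# Sketch of the four-step confidence-interval procedure on a
# made-up sample of 10 observations.
import math
import statistics as st

sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 9.7, 10.4, 10.1]
N = len(sample)

xbar = st.mean(sample)                   # step 1: sample average
se = st.stdev(sample) / math.sqrt(N)     # step 2: standard error s / sqrt(N)
t_crit = 2.262                           # step 3: 95%, 9 degrees of freedom
lo, hi = xbar - t_crit * se, xbar + t_crit * se   # step 4

print(round(lo, 2), round(hi, 2))        # the 95% confidence interval
```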
  90. Sampling from Finite Populations Notably, a confidence interval does not depend on the size of the population This may at first seem surprising: if we are trying to estimate a characteristic of a large population, then wouldn't we also need a large sample? The reason the size of the population doesn't matter is that the chance that the luck of the draw will yield a sample whose mean differs substantially from the population mean depends on the size of the sample and the chances of selecting items that are far from the population mean, not on how many items there are in the population
  91. Key Terms Selection, survivor, and nonresponse bias; sampling distribution; population mean; sample mean; population standard deviation; sample standard deviation; degrees of freedom; confidence interval; random variable; probability distribution; expected value; mean; variance; standard deviation; standardized random variable; population; sample