## Chapter 3 Numerical Descriptive Measures

**Contents**
• Central Tendency: the extent to which all of the data values group around a central value.
• Variation: the amount of dispersion or scattering of values away from a central point.
• Shape: the pattern of the distribution of values from the lowest value to the highest value.

**Blackstone (BX)**
• Nov 28, 2007 to Jan 17, 2008

**Central Tendency**
• Mean (arithmetic mean)
  • The sum of the values divided by the number of values.
  • Drawback of the mean: it is affected by extreme values.
• Median
  • The value that splits a ranked set of data into two equal parts.
  • Median = the (n+1)/2 ranked value

**The mode**
• The value in a set of data that appears most frequently.
• Quartiles
  • Split a set of data into four equal parts.
  • Q1 = the (n+1)/4 ranked value
  • e.g., n = 9 or n = 10

**The Geometric Mean**
• Measures the rate of change of a variable over time.
• Is the nth root of the product of n values.
• e.g., Rate of return (Jan 7-Jan 8) = 7.2%; rate of return (Jan 8-Jan 9) = 7.9%. What's the rate of return from Jan 7-Jan 9?

**Range**
• Range
  • Difference between the largest and the smallest values.
• Interquartile Range
  • Difference between the third and first quartiles in a set of data (the middle fifty percent).

**Variance and standard deviation**
• Sample variance is the sum of the squared differences around the mean divided by the sample size minus one.
• Standard deviation is the square root of the variance.
• Why use standard deviation?

**Coefficient of variation**
• Why is it important?
• Z scores
  • Detect outliers

**Shape**
• Symmetrical
• Skewed
  • Left skewed
  • Right skewed

**3.2 Numerical Descriptive Measures For A Population**
• Population Mean
• Population Standard Deviation and Variance

**Empirical Rule**
• If the distribution is symmetrical (bell-shaped), the population mean and standard deviation can tell us a lot more about the distribution of the data.
• Example: Assume Blackstone stock follows a symmetrical distribution. What percentage of the stock prices fall into the range between the mean and the first standard deviation?

**The Chebyshev Rule**
• For any distribution, including a skewed one, the percentage of values found within a distance of kσ from µ must be at least (1 – 1/k²) × 100%, for k > 1.

**3.3 Computing Numerical Descriptive Measures From A Frequency Distribution**
• Reading Assignment PP99

**3.4 Exploratory Data Analysis**
• The Five Number Summary
  • Min, Quartile 1, Median, Quartile 3 and Max
  • Helps us determine the shape of the distribution
  • e.g., The following data represent the total fat for burgers from a sample of fast-food chains: 19, 31, 34, 35, 39, 39 and 43

**The Box-and-Whisker Plot**
• A graphical representation of the data based on the five number summary.

**3.5 The Covariance and The Coefficient of Correlation**
• The Covariance
  • Measures the strength of the linear relationship between two numerical variables.
  • Sample covariance

**The coefficient of correlation**
• Measures the relative strength of a linear relationship between two numerical variables.
• It ranges from -1 for a perfect negative correlation to +1 for a perfect positive correlation. Zero means no linear correlation.

**The coefficient of correlation (Cont’d)**
• Sample coefficient of correlation
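
The central tendency measures and the geometric mean question from the slides can be sketched in Python. This is only an illustration under stated assumptions: the sample `values` is made up (it is not from the slides), the two quoted rates of return are treated as consecutive per-period rates that compound, and `geometric_mean_return` is a hypothetical helper name.

```python
import math
import statistics

# Hypothetical sample (not from the slides); 100 is a deliberate extreme value.
values = [3, 5, 5, 7, 9, 11, 100]

mean = statistics.mean(values)      # sum of values / number of values
median = statistics.median(values)  # the (n+1)/2 = 4th ranked value
mode = statistics.mode(values)      # the most frequent value

def geometric_mean_return(rates):
    """Average per-period rate of return for rates given as decimals.

    Takes the nth root of the product of the n growth factors (1 + rate)
    and subtracts 1 to turn the average growth factor back into a rate.
    """
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

# The two rates quoted on the geometric mean slide.
rates = [0.072, 0.079]
total_return = math.prod(1 + r for r in rates) - 1   # compounded over both periods
average_return = geometric_mean_return(rates)        # per-period average

print(f"mean={mean} median={median} mode={mode}")
print(f"total={total_return:.4%} per-period={average_return:.4%}")
```

In this sample the single extreme value pulls the mean far above the median, which is the drawback the slide points at. The compounded two-period return is larger than either single rate, while the geometric mean lies between them.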
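
The variation measures and the Chebyshev rule can be sketched the same way. The block below reuses the burger total-fat sample quoted in Section 3.4 purely as convenient input. The cutoff of |z| > 3 for flagging outliers is a common convention assumed here, not something the slides state, and `chebyshev_lower_bound` is a hypothetical helper name.

```python
import statistics

# Total fat for burgers, the sample quoted in Section 3.4.
fat = [19, 31, 34, 35, 39, 39, 43]

mean = statistics.mean(fat)
variance = statistics.variance(fat)  # squared deviations summed, divided by n - 1
std_dev = statistics.stdev(fat)      # square root of the sample variance

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = std_dev / mean * 100

# Z score: how many standard deviations each value lies from the mean.
z_scores = [(x - mean) / std_dev for x in fat]

# Assumed convention: flag values with |z| > 3 as potential outliers.
outliers = [x for x, z in zip(fat, z_scores) if abs(z) > 3]

def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100

print(f"variance={variance:.2f} std_dev={std_dev:.2f} cv={cv_percent:.1f}%")
print("potential outliers:", outliers)
print("Chebyshev bound for k=2:", chebyshev_lower_bound(2))
```

Because the coefficient of variation is unit-free, it allows the spread of data sets measured in different units to be compared, which is one answer to the slide's "why is it important?".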
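
For the five number summary, a minimal sketch using the burger data from Section 3.4 follows. It assumes Python's `statistics.quantiles` with `method="exclusive"`, which places quartiles at the (n+1)p ranked position and interpolates when that position is fractional. A textbook may round fractional positions differently, so results can differ for other sample sizes. The skewness check at the end is one common heuristic, not a rule taken from the slides.

```python
import statistics

# Total fat for burgers from a sample of fast-food chains (Section 3.4).
fat = [19, 31, 34, 35, 39, 39, 43]

# With n = 7 the quartile positions (n+1)/4 = 2, (n+1)/2 = 4 and 3(n+1)/4 = 6
# are whole numbers, so no interpolation is needed for this sample.
q1, median, q3 = statistics.quantiles(fat, n=4, method="exclusive")

five_number_summary = (min(fat), q1, median, q3, max(fat))
iqr = q3 - q1  # interquartile range: spread of the middle fifty percent

# Heuristic: a longer stretch from the minimum to the median than from the
# median to the maximum suggests a left-skewed distribution.
lower_span = median - min(fat)
upper_span = max(fat) - median
suggests_left_skew = lower_span > upper_span

print(five_number_summary)  # (19, 31.0, 35.0, 39.0, 43)
print("IQR:", iqr, "left-skew hint:", suggests_left_skew)
```

These five values are exactly what a box-and-whisker plot draws: the box spans Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.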
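
The sample covariance and the sample coefficient of correlation are named on the slides without their formulas. The sketch below uses the standard definitions: covariance as the sum of cross-products of deviations divided by n - 1, and correlation as the covariance divided by the product of the two sample standard deviations. The paired data and the function names are hypothetical and only serve the illustration.

```python
import statistics

def sample_covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)

def sample_correlation(xs, ys):
    """Covariance rescaled by both sample standard deviations; lies in [-1, 1]."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical paired observations, not from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)

print(f"covariance={cov_xy} correlation={r_xy:.4f}")
```

The covariance depends on the units of both variables, so its size is hard to interpret on its own. Dividing by the standard deviations removes the units, which is why the coefficient of correlation is described as a relative measure.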