## 2.4 Describing Distributions Numerically – cont.


**Describing Symmetric Data**

**Recall: 2 characteristics of a data set to measure**
• center: measures where the “middle” of the data is located
• variability: measures how “spread out” the data is

**Measure of Center When Data Approx. Symmetric**
• mean (arithmetic mean)
• notation: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

**Connection Between Mean and Histogram**
• A histogram balances when supported at the mean. Mean $\bar{x} = 140.6$

**Mean: balance point. Median: 50% of the area in each half**
• right histogram: mean 55.26 yrs, median 57.7 yrs

**Properties of Mean, Median**
1. The mean and median are unique; that is, a data set has only 1 mean and 1 median (the mean and median are not necessarily equal).
2. The mean uses the value of every number in the data set; the median does not.

**Example: class pulse rates**
• 53 64 67 67 70 76 77 77 78 83 84 85 85 89 90 90 90 90 91 96 98 103 140

**2010, 2014 baseball salaries**

| Year | n | Mean | Median | Max |
|------|-----|-------------|-------------|--------------|
| 2010 | 845 | \$3,297,828 | \$1,330,000 | \$33,000,000 |
| 2014 | 848 | \$3,932,912 | \$1,456,250 | \$28,000,000 |

**Disadvantage of the mean**
• Can be greatly influenced by just a few observations that are much greater or much smaller than the rest of the data

**Skewness: comparing the mean and median**
• Skewed to the right (positively skewed)
• mean > median

**Skewed to the left; negatively skewed**
• mean < median
• mean = 78; median = 87

**Symmetric data**
• mean, median approx. equal

**Describing Symmetric Data (cont.)**
• Measure of center for symmetric data: the mean $\bar{x}$
• Measure of variability for symmetric data?

**Example**
• 2 data sets:
  $x_1 = 49$, $x_2 = 51$, $\bar{x} = 50$
  $y_1 = 0$, $y_2 = 100$, $\bar{y} = 50$
• 49 and 51 versus 0 and 100: on average, they’re both comfortable

**Ways to measure variability**
1. range = largest − smallest: OK sometimes; in general, too crude; sensitive to one large or small observation

**The Sample Standard Deviation, a measure of spread around the mean**
• Square the deviation of each observation from the mean; find the square root of the “average” of these squared deviations:

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

**Calculations …**
Women height (inches)
• Mean = 63.4
• Sum of squared deviations from mean = 85.2
• $n - 1 = 13$; $(n - 1)$ is called the degrees of freedom (df)
• $s^2$ = variance = 85.2/13 = 6.55 inches squared
• $s$ = standard deviation = $\sqrt{6.55}$ = 2.56 inches

1. First calculate the variance $s^2$.
2. Then take the square root to get the standard deviation $s$.

We’ll never calculate these by hand, so make sure you know how to get the standard deviation using your calculator, Excel, or other software. (Figure label: mean ± 1 s.d.)

**Remarks**
1. The standard deviation of a set of measurements is an estimate of the likely size of the chance error in a single measurement.

**Remarks (cont.)**
2. Note that $s$ and $\sigma$ are always greater than or equal to zero.
3. The larger the value of $s$ (or $\sigma$), the greater the spread of the data. When does $s = 0$? When does $\sigma = 0$?

**Remarks (cont.)**
4. The standard deviation is the most commonly used measure of risk in finance and business
• Stocks, Mutual Funds, etc.
5. Variance
• $s^2$: sample variance
• $\sigma^2$: population variance
• Units are squared units of the original data
• square dollars, square gallons ??

**Remarks 6): Why divide by n − 1 instead of n?**
• degrees of freedom
• each observation has 1 degree of freedom
• however, when you estimate an unknown population parameter like $\mu$, you lose 1 degree of freedom

**Remarks 6) (cont.): Why divide by n − 1 instead of n? Example**
• Suppose we have 3 numbers whose average is 9
• $x_1 =$ ___, $x_2 =$ ___
• then $x_3$ must be ___
• once we selected $x_1$ and $x_2$, $x_3$ was determined since the average was 9
• 3 numbers but only 2 “degrees of freedom”

**Example**

| | #1 | #2 | #3 | #4 |
|---|------|------|------|------|
| | 32 | 33 | 38 | 37 |
| | 41 | 35 | 39 | 42 |
| | 44 | 45 | 39 | 45 |
| | 47 | 50 | 40 | 46 |
| | 50 | 52 | 56 | 47 |
| | 53 | 54 | 57 | 48 |
| | 56 | 58 | 58 | 50 |
| | 59 | 59 | 61 | 67 |
| | 68 | 64 | 62 | 68 |
| $\bar{x}$ | 50 | 50 | 50 | 50 |
| $s$ | 10.6 | 10.6 | 10.6 | 10.6 |
| $m$ (median) | 50 | 52 | 56 | 47 |

**Review: Properties of s and σ**
• $s$ and $\sigma$ are always greater than or equal to 0. When does $s = 0$? When does $\sigma = 0$?
• The larger the value of $s$ (or $\sigma$), the greater the spread of the data
• The standard deviation of a set of measurements is an estimate of the likely size of the chance error in a single measurement
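
The slides recommend getting these summaries from software rather than by hand. As one possible way to do that, here is a minimal Python sketch using only the standard library's `statistics` module. It rechecks three of the examples above: how the single large pulse rate (140) pulls the mean more than the median, the four data sets that share a mean and standard deviation but not a median, and the women's height calculation. The variable names (`pulse_without_140`, `mean_shift`, `summary`, and so on) are illustrative choices, not part of the original slides.

```python
from math import sqrt
from statistics import mean, median, pstdev, stdev

# Class pulse rates from the example slide.
pulse = [53, 64, 67, 67, 70, 76, 77, 77, 78, 83, 84, 85,
         85, 89, 90, 90, 90, 90, 91, 96, 98, 103, 140]

# Drop the one unusually large observation and see how far each
# measure of center moves.
pulse_without_140 = [x for x in pulse if x != 140]
mean_shift = mean(pulse) - mean(pulse_without_140)
median_shift = median(pulse) - median(pulse_without_140)

# The four data sets from the final example (columns #1 to #4).
datasets = {
    "#1": [32, 41, 44, 47, 50, 53, 56, 59, 68],
    "#2": [33, 35, 45, 50, 52, 54, 58, 59, 64],
    "#3": [38, 39, 39, 40, 56, 57, 58, 61, 62],
    "#4": [37, 42, 45, 46, 47, 48, 50, 67, 68],
}

# stdev() divides by n - 1 (sample standard deviation s).
summary = {
    name: (mean(values), round(stdev(values), 1), median(values))
    for name, values in datasets.items()
}

# pstdev() divides by n instead, so it comes out smaller than s.
divide_by_n = pstdev(datasets["#1"])

# Women's heights: variance and standard deviation from the sum of
# squared deviations (85.2) and the degrees of freedom (13).
s2 = 85.2 / 13
s = sqrt(s2)

print(f"shift in mean: {mean_shift:.2f}, shift in median: {median_shift:.2f}")
for name, (xbar, sd, med) in summary.items():
    print(f"{name}: mean={xbar}, s={sd}, median={med}")
print(f"heights: s^2={s2:.2f}, s={s:.2f}")
```

Comparing `stdev` with `pstdev` on the same column is a quick way to see the effect of dividing by n − 1 rather than n.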