RMTD 404

Presentation Transcript


  1. RMTD 404 Lecture 2

  2. Summation Notation • We need a succinct way to talk about the processes that occur in a statistical analysis • We use summation notation • Σ stands for “sum” • X stands for the variable we sum • i, the subscripting index, identifies the individual values of X • N stands for the highest index we sum to (usually the number of cases). N could be replaced by a number, but we usually use a letter like N to indicate that we’re summing across all values of X (i.e., there are N values of the X variable).

  3. Summation Notation • Examples • $\sum_{i=1}^{N} X_i$ is read as the sum of the values of X ranging from 1 (the first unit/person) to N (the last unit/person) • Say X is the vector {1,2,3,4,5} • Using the above summation notation we get 1+2+3+4+5 = 15

  4. Summation Notation • We can be more specific: $\sum_{i=1}^{4} X_i$ • In this case we are only interested in summing the first 4 integers: 1+2+3+4 = 10 • What do you think about these ones?
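The two sums above can be checked with a few lines of code. This is a minimal sketch in Python (the lecture itself uses R and SPSS); the helper name `summation` is hypothetical and simply mirrors the 1-based limits written under and over the Σ.

```python
# X is the example vector from the slides.
X = [1, 2, 3, 4, 5]

def summation(values, start, stop):
    """Sum values[i] for i = start..stop, using 1-based inclusive indices."""
    return sum(values[start - 1:stop])

N = len(X)                        # number of cases
total = summation(X, 1, N)        # 1+2+3+4+5
first_four = summation(X, 1, 4)   # 1+2+3+4
print(total, first_four)
```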

  5. Summation Notation • X = {11,9,8,15,3} • If i = 2, $X_i = 9$ • If i = N, $X_i = 3$ (the Nth case value; N = 5) • What do we think about this?

  6. Summation Notation • X = {11,9,8,15,3} • If i = 2, $X_i = 9$ • If i = N, $X_i = 3$ (the Nth case value; N = 5) • What do we think about these? • Pay attention to the parentheses – solve those first, then exponentiate
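The expressions this slide asks about are not preserved in the transcript. A common pair that illustrates the point about parentheses is the sum of squares versus the square of the sum; the Python sketch below assumes that is the intended contrast.

```python
X = [11, 9, 8, 15, 3]

x_2 = X[1]     # i = 2 in 1-based notation is index 1 in Python
x_N = X[-1]    # i = N is the last case

# Sum of squares: square each value first, then add.
sum_of_squares = sum(x ** 2 for x in X)

# Square of the sum: add inside the parentheses first, then square.
square_of_sum = sum(X) ** 2

print(x_2, x_N, sum_of_squares, square_of_sum)
```

The two results differ, which is why the order of operations matters.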

  7. Summation Notation • Some rules • Adding a constant • Multiplying by a constant • Multiplying matched pairs (two vectors) • Difference between two vectors
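The formulas for these rules are not preserved in the transcript. The standard identities they most likely refer to are written out below, assuming c is a constant and X and Y are vectors of the same length N.

```latex
% Adding a constant
\sum_{i=1}^{N} (X_i + c) = \sum_{i=1}^{N} X_i + Nc

% Multiplying by a constant
\sum_{i=1}^{N} c X_i = c \sum_{i=1}^{N} X_i

% Multiplying matched pairs: multiply within each pair, then sum
\sum_{i=1}^{N} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \dots + X_N Y_N

% Difference between two vectors
\sum_{i=1}^{N} (X_i - Y_i) = \sum_{i=1}^{N} X_i - \sum_{i=1}^{N} Y_i
```

Note that the sum of matched products is generally not equal to the product of the two sums.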

  8. Summation Notation • Don’t let summation notation scare you • All we’re doing here is summing across a vector of rows (I) and a vector of columns (J)
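A double summation over rows and columns, $\sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}$, can be sketched as a nested loop. The 2 × 3 table below is made-up illustration data, not taken from the lecture.

```python
# Hypothetical table with I = 2 rows and J = 3 columns.
table = [
    [1, 2, 3],
    [4, 5, 6],
]
I = len(table)
J = len(table[0])

# Sum across columns within each row, then across rows.
double_sum = sum(table[i][j] for i in range(I) for j in range(J))

# The same total, built from the row totals.
row_totals = [sum(row) for row in table]
print(double_sum, row_totals)
```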

  9. Measures of Central Tendency • To get at the “location” of the distributions we use measures of central tendency • We look at location shifts

  10. Measures of Central Tendency • Mean • Median • Mode • X = {5,3,2,9,3,4,9,8,2} • Using R…
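The R output for this slide is not preserved. As a rough equivalent, the three measures can be computed for the same X with Python's standard `statistics` module; this is a sketch, not the lecture's own code.

```python
import statistics

X = [5, 3, 2, 9, 3, 4, 9, 8, 2]

mean_x = statistics.mean(X)        # sum of the values divided by N
median_x = statistics.median(X)    # middle value of the sorted data
modes_x = statistics.multimode(X)  # every value tied for most frequent

print(mean_x, median_x, sorted(modes_x))
```

Here 2, 3, and 9 each occur twice, so this X has more than one mode.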

  11. Distributions: Modality • Compare the following two graphics • The left graph shows evidence of a bimodal distribution (two distinct peaks) • [Figure labels: mean, median, mode]

  12. Distributions: Shape • When talking about shape, we are talking about kurtosis – the concentration of the data in the center, shoulders, and tails • [Figure: leptokurtic, mesokurtic, and platykurtic curves, with the center, shoulders, and tails labeled]

  13. Distribution: Skewness • The left is negatively skewed while the right is positively skewed • When skewness is present, our measures of central tendency aren’t as obvious • [Figure labels: mode, median, mean]

  14. Measures of Variability • Range – the difference between the two most extreme points (maximum minus minimum) • Interquartile Range – the difference between the 75th and 25th percentiles • Variance – the average squared deviation from the mean • Standard deviation – the square root of the variance, expressed in the original units of the variable
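These measures can be sketched with Python's `statistics` module, reusing the X from the central tendency slide. Two assumptions to flag: the quartiles use the module's default method, and other software may give slightly different percentiles; the slide does not say whether the variance divides by N or N − 1, so both are shown.

```python
import statistics

X = [5, 3, 2, 9, 3, 4, 9, 8, 2]

value_range = max(X) - min(X)

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(X, n=4)
iqr = q3 - q1

pop_var = statistics.pvariance(X)   # divides the squared deviations by N
samp_var = statistics.variance(X)   # divides by N - 1
samp_sd = statistics.stdev(X)       # square root of samp_var

print(value_range, iqr, pop_var, samp_var, samp_sd)
```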

  15. Measures of Variability • Coefficient of Variation – an index that rescales the standard deviation by the mean (standard deviation divided by the mean); useful for comparing the variability of groups that are measured on the same scale but have very different means.
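As a small illustration, the coefficient of variation is the standard deviation divided by the mean. The two groups below are made-up data chosen to have the same standard deviation but very different means.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical groups: same spread, very different means.
group_a = [10, 12, 14]
group_b = [100, 102, 104]

cv_a = coefficient_of_variation(group_a)
cv_b = coefficient_of_variation(group_b)
print(cv_a, cv_b)
```

Relative to its mean, group A is far more variable than group B even though the raw standard deviations match.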

  16. SPSS & R • Using the NELS student data we can get the following output for the base-year math scores • Using SPSS • Using R: summary(bytxmstd)
      Min.   1st Qu.  Median  Mean   3rd Qu.  Max.   NA's
      30.28  43.17    51.45   51.71  59.66    71.22  30.00

  17. Transformation • There are some solutions to skewed distributions • Linear transformations • Adding a constant to each case in the dataset shifts the mean of the distribution by that value • We can similarly multiply or divide each case by some constant
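The effect of a linear transformation on the mean and standard deviation can be checked directly. This Python sketch reuses the X from the central tendency slide; the constants 10 and 2 are arbitrary choices for illustration.

```python
import statistics

X = [5, 3, 2, 9, 3, 4, 9, 8, 2]

shifted = [x + 10 for x in X]   # add a constant to every case
scaled = [x * 2 for x in X]     # multiply every case by a constant

# Adding 10 moves the mean by 10 and leaves the spread unchanged.
shift_mean = statistics.mean(shifted)
shift_sd = statistics.stdev(shifted)

# Multiplying by 2 doubles both the mean and the standard deviation.
scale_mean = statistics.mean(scaled)
scale_sd = statistics.stdev(scaled)

print(shift_mean, shift_sd, scale_mean, scale_sd)
```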

  18. Transformation • Standardization is a very common method • Z-scores turn raw scores into standard deviation units (the resulting scores have a mean of 0 and an sd of 1) • For example, if someone has a GRE score of 620, and the mean is 500, and sd is 100 then…
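The GRE example can be worked through with the usual z-score formula, $z = (X - \bar{X}) / s_X$. A minimal Python sketch; the function name `z_score` is just an illustrative choice.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, expressed in standard deviation units."""
    return (x - mean) / sd

gre_z = z_score(620, mean=500, sd=100)
print(gre_z)
```

A score of 620 sits 1.2 standard deviations above the mean.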

  19. Transformation • You can use the following formula to transform scores to have a mean and standard deviation of your choosing: $X' = s_{X'} \frac{X - \bar{X}}{s_X} + \bar{X}'$ • X’ is the transformed score, sx’ is the desired sd, and Xbar’ is the desired mean
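The rescaling amounts to two steps: standardize the raw score, then stretch and shift it to the desired scale. The Python sketch below assumes that standard form; the target scale (mean 50, sd 10) is a hypothetical choice, and the raw score reuses the GRE numbers from the previous slide.

```python
def rescale(x, mean, sd, new_mean, new_sd):
    """Convert x to a z-score, then place it on the desired scale."""
    z = (x - mean) / sd          # standardize
    return new_sd * z + new_mean  # rescale to the desired sd and mean

# GRE score of 620 (mean 500, sd 100) moved to a scale with mean 50, sd 10.
new_score = rescale(620, mean=500, sd=100, new_mean=50, new_sd=10)
print(new_score)
```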

  20. Parameters and Statistics (Quick Notation)

  21. Some Important Properties • Sufficiency • The statistic uses all of the information in the sample – think of the mean, median, and mode… • Unbiasedness • The average of the statistic across all possible samples equals the parameter of interest – the expected value is equal to the parameter • Efficiency • Across a large number of samples, the statistic varies less than another (related) statistic • Resistance • The statistic is not heavily influenced by outliers
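Unbiasedness can be illustrated by listing every possible sample from a tiny population and averaging the statistic across them. The Python sketch below assumes a made-up population {1, 2, 3} and samples of size 2 drawn with replacement; exact fractions avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical population
n = 2                    # sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

def sample_mean(s):
    return Fraction(sum(s), len(s))

def sample_var(s, ddof):
    """Sum of squared deviations divided by (n - ddof)."""
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / (len(s) - ddof)

avg_mean = sum(sample_mean(s) for s in samples) / len(samples)
avg_var_unbiased = sum(sample_var(s, 1) for s in samples) / len(samples)
avg_var_biased = sum(sample_var(s, 0) for s in samples) / len(samples)

print(mu, avg_mean, sigma2, avg_var_unbiased, avg_var_biased)
```

The sample mean and the N − 1 variance average out to the population values, while the variance that divides by N comes out too small on average.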

  22. Introduction to R • Basic commands • Creating variables • Graphics • Importing data

  23. Introduction to SPSS • Descriptives • Transformations • Graphics
