
Descriptive Statistics

14. Descriptive Statistics. Introduction. Descriptive statistics: the type of statistical analysis focused on describing, summarizing, or explaining a set of data. Inferential statistics: the type of statistical analysis focused on making inferences about populations based on sample data.

aeldridge




Presentation Transcript


  1. 14 Descriptive Statistics

  2. Introduction • Descriptive statistics • the type of statistical analysis focused on describing, summarizing, or explaining a set of data • Inferential statistics • the type of statistical analysis focused on making inferences about populations based on sample data

  3. Descriptive Statistics • Start with a data set • Descriptive statistics to understand and summarize the key numerical characteristics of the data set • e.g., means, frequencies, graphs • Key question in descriptive statistics • how can I communicate the important characteristics of my data?

  4. College Graduate Data Set

  5. Frequency Distributions • A systematic arrangement of data values in which the unique data values are rank ordered and the frequencies are provided for each of these values

  6. Frequency Distributions • Column 1 • lowest salary = $24,000 • highest salary = $41,000 • Column 2 (frequencies) • most frequently occurring salary = $32,500 • three of the 25 recent graduates had this starting salary • Column 3 (percentages) • 4% of the cases had a salary of $24,000 • 8% of the cases had a salary of $32,000
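
A frequency distribution like the one described above can be sketched in a few lines of Python. The salary values below are hypothetical (the original data set table is not reproduced in this transcript), and the helper name `frequency_distribution` is an illustrative choice, not something from the slides.

```python
from collections import Counter

# Hypothetical starting salaries; NOT the college graduate data set from the slides.
salaries = [24000, 28000, 28000, 32000, 32500, 32500, 32500, 36000, 41000, 41000]

def frequency_distribution(values):
    """Return (value, frequency, percentage) rows, rank ordered by value."""
    counts = Counter(values)
    total = len(values)
    return [(value, counts[value], 100 * counts[value] / total)
            for value in sorted(counts)]

rows = frequency_distribution(salaries)

# Column 1 = data value, column 2 = frequency, column 3 = percentage.
for value, freq, pct in rows:
    print(f"${value:>6,}  {freq:>2}  {pct:5.1f}%")
```

The three tuple fields mirror the three columns discussed on the slide: the rank-ordered unique values, their frequencies, and their percentages.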

  7. Graphic Representations of Data • Bar graph • graph that uses vertical bars to represent the data values of a categorical variable • height of bar represents frequency of category • bars should not touch

  8. Bar Graph Example • Frequencies • 8 psychology majors • 10 philosophy majors • 7 business majors • Percentages • 32% psychology majors • 40% philosophy majors • 28% business majors

  9. Graphic Representations of Data • Histogram • graph depicting the frequencies and distribution of a quantitative variable • no space between bars • advantage over a frequency distribution • it more clearly shows the shape of the distribution

  10. Histogram Example

  11. Graphic Representations of Data • Line graphs • a graph relying on the drawing of one or more lines connecting data points • also used with quantitative variables • particularly useful for interpreting interactions

  12. Line Graph Example

  13. Experimental Line Graph Example

  14. Graphic Representations of Data • Scatterplots • a graphical depiction of the relationship between two quantitative variables

  15. Measures of Central Tendency • Provide a single value that is typical of the distribution of scores • mode • most frequently occurring number in a data set • least useful measure of central tendency • example • 0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 9, 10 mode = 5 • 0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 8, 9, 10 mode = 5 and 8, bimodal data set • median • the center point in an ordered set of numbers • examples • odd number of data points 1, 2, 3, 4, 5, median = 3 • even number of data points 1, 2, 3, 4, median = 2.5
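
The mode and median examples above can be checked with Python's standard `statistics` module. This is a minimal sketch using the numbers from the slide; the variable names are illustrative.

```python
from statistics import median, multimode

# Data sets taken from the slide's mode examples.
unimodal = [0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 9, 10]
bimodal = [0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 8, 9, 10]

# multimode returns every value tied for the highest frequency.
modes_unimodal = multimode(unimodal)
modes_bimodal = multimode(bimodal)

# For an even number of data points, median averages the two middle values.
median_odd = median([1, 2, 3, 4, 5])
median_even = median([1, 2, 3, 4])

print(modes_unimodal, modes_bimodal, median_odd, median_even)
```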

  16. Measures of Central Tendency • Mean • arithmetic average • most commonly used measure of central tendency • formula • mean = sum of all scores / number of scores (X̄ = ΣX / n) • sample statistical symbol = X̄ • example

  17. Measures of Variability • Numerical value expressing how spread out or how much variation is present in the values of a quantitative variable • if all data points are the same, then there is no variability • e.g., 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 • which data set is more variable? • Data for group one: 44, 45, 45, 45, 46, 46, 47, 47, 48, 49 • Data for group two: 34, 37, 45, 51, 58, 60, 77, 88, 90, 98

  18. Measures of Variability • range • highest score minus lowest score • rarely used as a measure of variability • variance and standard deviation • both account for every score in the data set • as each gets larger, the numbers in the data set are more spread out

  19. Measures of Variability • variance • average deviation of the data values from their mean in squared units • standard deviation • square root of variance • roughly the average amount that individual scores deviate from the mean
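
The range, variance, and standard deviation can be computed for the two groups listed on slide 17. This sketch assumes the population formulas (dividing by N), which match the "average deviation" wording; the slides do not say whether the sample version (dividing by n - 1, available as `statistics.variance` and `statistics.stdev`) is intended.

```python
from statistics import pstdev, pvariance

# Data from the "which data set is more variable?" slide.
group_one = [44, 45, 45, 45, 46, 46, 47, 47, 48, 49]
group_two = [34, 37, 45, 51, 58, 60, 77, 88, 90, 98]

def describe_spread(values):
    """Return range, population variance, and population standard deviation."""
    return {
        "range": max(values) - min(values),   # highest score minus lowest score
        "variance": pvariance(values),        # mean squared deviation from the mean
        "sd": pstdev(values),                 # square root of the variance
    }

spread_one = describe_spread(group_one)
spread_two = describe_spread(group_two)

print(spread_one)
print(spread_two)
```

All three measures come out larger for group two, which answers the slide's question: group two is the more variable data set.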

  20. Variance and Standard Deviation Examples

  21. Measures of Variability • Standard deviation and the normal curve • standard deviation has the greatest meaning when the data are normally distributed • normal distribution • a distribution that follows the 68, 95, 99.7 percent rule • rule stating the percentage of cases falling within 1, 2, and 3 standard deviations of the mean on a normal distribution
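
The 68, 95, 99.7 percent rule can be verified numerically from the normal curve itself. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of cases within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} SD: {100 * share:.1f}%")
```

The rule quoted on the slide is a rounding of these proportions (roughly 68.3%, 95.4%, and 99.7%).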

  22. The Normal Distribution

  23. Measures of Variability • Z scores • a score that has been transformed into standard deviation units • mean of the z distribution is always zero; standard deviation is always 1 • indicates how far a raw score is above or below its mean in standard deviation units • e.g., a z score of +1.00 indicates a raw score that is one standard deviation unit above the mean • in a normal distribution, the proportion of scores occurring between any two points can be determined • scores on different distributions can be compared

  24. Z Score Formula and Examples • Formula • z = (raw score - mean) / standard deviation • Examples
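
Since the formula and examples on this slide were not captured in the transcript, here is a small sketch of the standard z score transformation, z = (raw score - mean) / standard deviation. All scores below are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def z_score(raw, mu, sigma):
    """Express a raw score in standard deviation units."""
    return (raw - mu) / sigma

def z_scores(values):
    """Transform every score in a data set into a z score."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [z_score(v, mu, sigma) for v in values]

# Hypothetical exam scores.
exam_scores = [70, 80, 90]
zs = z_scores(exam_scores)

# Comparing scores from different distributions (hypothetical means and SDs).
z_test_a = z_score(85, 80, 5)    # 85 on a test with mean 80, SD 5
z_test_b = z_score(90, 80, 20)   # 90 on a test with mean 80, SD 20

print(zs, z_test_a, z_test_b)
```

In this hypothetical comparison, the 85 is the stronger result in relative terms (one standard deviation above its mean) even though 90 is the higher raw score.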

  25. Examining Relationships Among Variables • Unstandardized difference between means • a comparison of mean differences (DV) between levels of a categorical independent variable • example • college graduate data set • mean starting salary for males = $34,791.67 • mean starting salary for females = $31,269.23 • the unstandardized difference between these two means • $34,791.67 - $31,269.23 = $3,522.44 • Standardized difference between means • effect size indicator • index of magnitude or strength of a relationship or difference between means • Cohen’s d • the difference between two means in standard deviation units • a common measure of effect size • small, medium, and large effect sizes are indicated by values of at least .2, .5, and .8 respectively

  26. Cohen’s d Example • College graduate data set • gender is the categorical independent or predictor variable • starting salary is the quantitative dependent variable • mean starting salary for males = $34,791.67 • mean starting salary for females = $31,269.23 • standard deviation for females = $4,008.40 • d = ($34,791.67 - $31,269.23) / $4,008.40 = $3,522.44 / $4,008.40 = .88 • this says that the mean starting salary for males is .88 standard deviations above the mean for females • using Cohen’s criteria for interpretation, one would consider this a “large” difference between the means
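
The Cohen's d value above can be reproduced from the figures on the slide. This sketch follows the slide in dividing by the standard deviation for females; other formulations of Cohen's d divide by a pooled standard deviation instead. The function names and the "below small" label are illustrative assumptions.

```python
def cohens_d(mean_a, mean_b, sd):
    """Difference between two means in standard deviation units."""
    return (mean_a - mean_b) / sd

def interpret(d):
    """Label an effect size using the cutoffs from the slides (.2, .5, .8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # assumption: the slides give no label for d < .2

# Values from the college graduate data set on the slide.
mean_males = 34791.67
mean_females = 31269.23
sd_females = 4008.40

d = cohens_d(mean_males, mean_females, sd_females)
label = interpret(d)

print(f"d = {d:.2f} ({label})")
```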

  27. Examining Relationships Among Variables • Correlation coefficient • index indicating the strength and direction of linear relationship between two quantitative variables • value ranges from +1.0 to -1.0 • absolute value indicates strength of relationship • sign indicates direction

  28. Correlation Strength and Direction

  29. Examining Relationships Among Variables • Correlation coefficient • positive correlation • correlation in which values of two variables tend to move in the same direction • e.g., the more hours students spend studying for a test, the higher their test grades tend to be • negative correlation • correlation in which values of two variables tend to move in opposite directions • e.g., the more hours students spend partying the night before an exam, the lower their test grades tend to be

  30. Examining Relationships Among Variables • Correlation coefficient • Pearson correlation (r) • used with two quantitative variables • only appropriate if the two variables are related in a linear fashion • partial correlation • a technique that involves examining a correlation after controlling for one or more other variables • a scatterplot can be used to judge the strength and direction of a correlation
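
The Pearson correlation can be sketched directly from its usual definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The study, party, and grade numbers below are hypothetical, chosen only to echo the positive and negative examples from the earlier slide.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length quantitative variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical data for five students.
hours_studying = [1, 2, 3, 4, 5]
hours_partying = [5, 4, 3, 2, 1]
test_grades = [55, 60, 70, 72, 85]

r_study = pearson_r(hours_studying, test_grades)   # positive: move together
r_party = pearson_r(hours_partying, test_grades)   # negative: move oppositely

print(f"studying vs. grade: r = {r_study:+.2f}")
print(f"partying vs. grade: r = {r_party:+.2f}")
```

The sign of each result gives the direction of the relationship and the absolute value gives its strength, as described on slide 27.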

  31. Positive and Negative Scatterplots

  32. Regression Analysis • Use of one or more quantitative independent variables to explain or predict the values of a single quantitative dependent variable • Two main types • simple regression • involves the use of one independent or predictor variable • multiple regression • involves two or more independent or predictor variables

  33. Regression Analysis • Prediction is made using the regression equation • this equation defines the regression line that best fits the pattern of observations in your data • slope – how steep the line is • y-intercept – point where the regression line crosses the y-axis

  34. Regression Line Example

  35. Regression Analysis • regression coefficient • predicted change in the dependent variable (Y) given a one unit change in the independent variable (X) • partial regression coefficient • the regression coefficient in a multiple regression equation
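
For simple regression, the slope (regression coefficient) and y-intercept can be estimated by ordinary least squares, a common way to find the best-fitting line; the slides do not name a fitting method, so that choice is an assumption. The data and the name `fit_line` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and y-intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lie exactly on the line y = 1 + 2x.
x_values = [1, 2, 3, 4, 5]
y_values = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x_values, y_values)

# Prediction from the regression equation: predicted Y = intercept + slope * X.
predicted = intercept + slope * 6

print(slope, intercept, predicted)
```

Here the slope of 2 means the predicted value of Y rises by 2 for each one unit change in X, which is the interpretation of the regression coefficient given on the slide.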

  36. Contingency Tables • Table used to examine the relationship between two categorical variables • Cells may contain frequencies or percentages
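
A contingency table can be built by counting how often each pair of categories occurs together. The observations below are hypothetical (gender crossed with college major) and only illustrate the cell-counting idea.

```python
from collections import Counter

# Hypothetical (gender, major) observations, one tuple per person.
observations = [
    ("female", "psychology"), ("female", "psychology"), ("female", "business"),
    ("male", "psychology"), ("male", "business"), ("male", "business"),
    ("male", "philosophy"),
]

counts = Counter(observations)
row_labels = sorted({gender for gender, _ in observations})
col_labels = sorted({major for _, major in observations})

# Each cell holds the frequency of one gender/major combination.
table = {row: {col: counts[(row, col)] for col in col_labels}
         for row in row_labels}

# Cells can also be expressed as percentages of all cases.
total = len(observations)
percent_table = {row: {col: 100 * table[row][col] / total for col in col_labels}
                 for row in row_labels}

for row in row_labels:
    print(row, table[row])
```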
