1 / 17

Uses of Statistics: Descriptive and Inferential Analysis

Learn about the two main uses of statistics - descriptive analysis to summarize data and inferential analysis to make decisions and draw conclusions. Understand sampling methods and the importance of representativeness in making valid statistical inferences.

kurtcole
Download Presentation

Uses of Statistics: Descriptive and Inferential Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Review: Two Main Uses of Statistics Descriptive: To describe or summarize a collection of data points The data set in hand = all the data points of interest No “going beyond the data” Inferential: To make decisions, draw inferences about a population, or draw conclusions under conditions of uncertainty The data set in hand = merely a sample from some larger population (that is our real interest) Use these data to make a calculated guess about what we would find if we had full information Use probability theory to make calculated guesses (with some estimated margin of error)

  2. Statistical inferences involve two possible tasks: Estimation: Use sample data to infer population parameter  e.g., lifetime risk of being a victim of a violent crime according to NCVS data Hypothesis Testing: Use sample data to make a decision about the correctness of some hypothesis or prediction  e.g., whether civil orders of protection will lower the risk recurrent violence against spouses

  3. Both tasks rely on using Samples to make statements about populations: Sample: A limited number of cases selected to represent the larger population of data points Key Terms/Ideas in Sampling: Representativeness  degree to which sample is an exact replica in miniature of the population Sampling Error  degree to which sample statistic deviates from population value Sampling Method  procedure used to draw cases from the population of data points

  4. Two main types of sampling methods: Probability Sampling Selection where each data point has a known probability for being selected into the sample Simple Random sample every data point has an equal likelihood of being selected Other types of probability samples? Systematic Stratified Weighted Cluster Doesn’t guarantee representativeness each time

  5. Two main types of sampling methods: Non-probability Sampling: Selection procedure in which probability of selection is unknown Specific types of Non-probability samples? Accidental Convenience Purposive Snowball Volunteer No guarantee of representativeness Inferences become more “iffy”

  6. Why use one sample method versus another? Maximize representativeness of data Minimize sampling error and bias in data Maximize the validity of our statistical inferences Note: Inferential statistics always assume simple random sampling as a basic premise

  7. What exactly is “randomness”? What does it look like? How can we tell if we have it? Randomness is a property of our data selection procedure not the data points Compromises to randomness: Any deliberate departure from sampling Refusals, dropouts, nonresponses are (almost) never random Our data are always a more-or-less imperfect approximation to real random sample

  8. Making inferences from sample statistics involves 3 distributions: Population distribution: unobserved in population from which cases drawn Sample distribution: observed in cases from which data were collected Sampling distribution:unobserved but calculable distribution of statistics for samples of same size/type as ours (drawn from the same population)  This distribution is the key to making inferences

  9. “Sampling Distribution”: what is it? A hypothetical population of samples (and sample statistics) from drawn from the same population Has a describable theoretical distribution (based on repeatedly drawing a sample an infinite number of times) Has certain parameters determined by the population from which the sample is drawn and the size of the sample

  10. e.g.: If we draw a sample of 25 cases and compute the sample mean The sample mean has a theoretical sampling distribution whose characteristics are exactly determined by the distribution of the population (μ & σ) and by the sample size (n=25) The mean of the sampling distribution = the mean of the population (μs = μpop) In this case: the σ of the sampling distribution = σ/5 (i.e., one-fifth the σ of the population)

  11. Important features of Sampling distributions: If the variable is normally distributed in the population, then the sampling distribution of sample means will also be normal The mean of the sampling distribution = the mean of the population The σ of the sampling distribution = σ/√n Use this information to compute the likelihood of any sample mean being drawn from the population (using the standard normal [z] table)

  12. Important features of Sampling distributions to remember: The σ of the sampling distribution will always be smaller than the σ of the population The large the sample size, the smaller the standard error of the sampling distribution The mean of the sampling distribution will always be the population mean The sampling distribution will become more Normal as the sample size gets larger – no matter the distribution of the population! [this is called the Central Limit Theorem]

  13. Using Sample statistics to make inferences about population parameters: The best estimate of the population mean is the sample mean The usual sample estimate of σ is slightly too low; it needs to be adjusted to be unbiased Thus there are two different formulas for the sample variance/standard deviation: (descriptive) (inferential/estimated)

  14. Basic Steps in Estimating Population Parameters: Select valid estimator (unbiased, consistent) Select valid data sample Corresponds to population of interest Random sample Complete (no censoring or omissions) Compute value of statistical estimate Compute standard error for the estimate Compute confidence interval (i.e., plausible margin of sampling error)

  15. Two Approaches to estimation: Point Estimation: Use sample data to infer exact value of population parameter Highly likely to be wrong or off-mark to some degree e.g., infer that 30% of adults will be victims of violent crime in their lifetimes (could actually be 35% or 25%) Interval Estimation: Instead use sample data to compute a range of values (“confidence intervals”) within which the actual parameter is located (with some calculated margin of certainty or confidence) Yields more approximate but more plausible (or confident) estimates.

  16. Confidence Interval Estimation: Compute the sample mean Compute the sample standard error From the population (σ) From the sample (s or ) Compute the confidence interval or

  17. Note: For population value: For sample value:

More Related