Understanding Randomness and Sampling Methods in Statistics

Part III Gathering Data

Chapter 11 Understanding Randomness • Random • An event is random if we know what outcomes could happen but not which particular outcome did or will happen • Random Numbers • Truly random numbers are hard to get • Pseudorandom numbers • Table of random digits • Pick a number from the next slide
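A small Python sketch of the pseudorandom idea: a seeded generator produces digits that look random but are fully reproducible, much like a printed table of random digits. The function name `random_digit_table` and the seed values are illustrative assumptions, not part of the original slides.

```python
import random

def random_digit_table(n_rows, n_cols, seed):
    """Return n_rows strings, each holding n_cols pseudorandom digits (0-9)."""
    rng = random.Random(seed)  # seeded generator: same seed, same digits
    return ["".join(str(rng.randrange(10)) for _ in range(n_cols))
            for _ in range(n_rows)]

if __name__ == "__main__":
    for row in random_digit_table(3, 20, seed=11):
        print(row)
```

Calling the function twice with the same seed returns the same table, which is what makes the digits pseudorandom rather than truly random.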

1 2 3 4

Simulation • A simulation consists of a collection of things that happen at random. It is used to model real-world relative frequencies using random numbers. • Component • A situation that is repeated in the simulation. Each component has a set of possible outcomes • Outcome • An individual result of a simulated component • Trial • The sequence of events that we are pretending will take place • Step-by-step page 295
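The vocabulary above can be sketched in Python. This is a minimal illustration under an assumed scenario (rolling a fair die until a 6 appears), which is not taken from the slides: each roll is a component, the face shown is its outcome, and one full run of rolls is a trial. The names `run_trial` and `simulate` are hypothetical.

```python
import random

def run_trial(rng):
    """One trial: roll a die (the component) until a 6 appears.

    Returns how many rolls the trial took.
    """
    rolls = 0
    while True:
        rolls += 1
        outcome = rng.randint(1, 6)  # outcome of one component
        if outcome == 6:
            return rolls

def simulate(n_trials, seed=0):
    """Run many trials and return the average number of rolls per trial."""
    rng = random.Random(seed)
    results = [run_trial(rng) for _ in range(n_trials)]
    return sum(results) / n_trials

if __name__ == "__main__":
    print(simulate(10_000))
```

With many trials the estimate should land near 6, the theoretical average for this setup, which shows how simulated relative frequencies approximate real-world ones.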

Chapter 12 Sample Surveys • Idea 1: Examine a part of the whole • Carefully select a smaller group (a sample) from the population • A sample that does not represent the population in some important way is said to be biased

Sample Surveys (cont.) • Idea 2: Randomize • Randomizing protects us from the influences of all the features of our population, even the ones that we may not have thought about. • Randomization, in which each individual is given a fair, random chance of selection, is the best defense against bias

Sample Surveys (cont.) • Idea 3: It’s the sample size • The fraction of the population that you have sampled doesn’t matter. It’s the sample size itself that’s important. • Census • A sample that consists of the entire population. • Difficult to complete. Not practical, too expensive • Populations are not static • Can be more complex
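Idea 3 can be checked with a small Python simulation. This is a sketch under stated assumptions: the population is modeled as index numbers, 30% of which have some trait, and the function name `sample_proportion_sd` and all parameter values are made up for illustration.

```python
import random
import statistics

def sample_proportion_sd(pop_size, n, p=0.3, reps=1000, seed=0):
    """Standard deviation of the sample proportion across repeated samples.

    The population is modeled as indices 0..pop_size-1, where the first
    p * pop_size individuals have the trait (an illustrative assumption).
    """
    rng = random.Random(seed)
    n_with_trait = round(p * pop_size)
    props = []
    for _ in range(reps):
        sample = rng.sample(range(pop_size), n)  # random sample, no replacement
        props.append(sum(i < n_with_trait for i in sample) / n)
    return statistics.stdev(props)

if __name__ == "__main__":
    print(sample_proportion_sd(10_000, 100))     # 1% of the population
    print(sample_proportion_sd(1_000_000, 100))  # 0.01% of the population
    print(sample_proportion_sd(1_000_000, 400))  # larger sample
```

Under these assumptions the first two results should both be close to sqrt(0.3 × 0.7 / 100) ≈ 0.046 even though the sampled fractions differ by a factor of 100, while quadrupling the sample size roughly halves the spread.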

Populations and Parameters • Population parameter • A numerical value that is part of a model for a population. We want to estimate these parameters from sampled data.

Sampling • When selecting a sample we want it to be representative, that is, the statistics we compute from the sample should reflect the corresponding parameters accurately • Simple Random Sample (SRS) • A sample in which each combination of elements of the chosen sample size has an equal chance of being selected • Sampling Frame • A list of individuals from which the sample is drawn
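A minimal Python sketch of drawing an SRS from a sampling frame. The helper name `simple_random_sample` and the example frame of student labels are hypothetical; `random.sample` draws without replacement, so every set of n individuals is equally likely.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw an SRS of size n, without replacement, from a sampling frame."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)

if __name__ == "__main__":
    frame = [f"student_{i}" for i in range(50)]  # hypothetical sampling frame
    print(simple_random_sample(frame, 5, seed=3))
```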

Other Sampling Designs • Stratified random sampling • A sampling design in which the population is divided into homogeneous subsets called strata, and random samples are drawn from each stratum. • Cluster Sampling • Entire groups, or clusters, are chosen at random, rather than individuals being drawn directly from the population. (Convenience, practicality, cost)
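The contrast between the two designs can be sketched in Python. The function names, the `stratum_of` callback, and the example data are all assumptions made for illustration: stratified sampling draws a random sample inside every stratum, while cluster sampling picks whole clusters at random and keeps everyone in them.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a random sample of the same fraction within every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

if __name__ == "__main__":
    units = [("fresh", i) for i in range(60)] + [("senior", i) for i in range(40)]
    print(stratified_sample(units, lambda u: u[0], 0.1, seed=1))
    dorms = {"dorm_a": [1, 2, 3], "dorm_b": [4, 5], "dorm_c": [6, 7, 8, 9]}
    print(cluster_sample(dorms, 2, seed=1))
```

In the stratified call every class year is guaranteed representation in proportion to its size, whereas the cluster call may miss some dorms entirely in exchange for lower cost.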

Other Sampling Designs (cont.) • Systematic Sample • Sample drawn by selecting individuals systematically from a sampling frame. • (e.g., every 10th person) • Multistage Sample • A sampling scheme that combines several sampling methods
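A systematic sample can be sketched in Python as a random starting point followed by a fixed step through the frame. The name `systematic_sample` and the numeric frame are illustrative assumptions; the random start is what keeps the selection from being fixed in advance.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Pick a random start among the first `step` positions,
    then take every step-th individual from the frame."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return list(frame)[start::step]

if __name__ == "__main__":
    print(systematic_sample(range(100), 10, seed=2))
```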

How to Sample Badly • Sample badly with volunteers • Voluntary response bias invalidates a survey • Sample badly because of convenience • Convenience sampling: Simply include the individuals who are at hand • Sample from a bad sampling frame • Undercoverage • Some portion of the population is not sampled at all or has a smaller representation in the sample than it has in the population.

How to Sample Badly (cont.) • Nonresponse bias • Response bias • Influence on responses arising from the design of the survey, such as the wording of its questions. • Look for biases before the survey. There is no way to recover from a biased sample or a survey that asks biased questions • Sampling Variability • The natural differences from sample to sample when the samples are drawn at random

Exercises • Page 325 • #8 • #14 • #15

Chapter 13 Experiments • Investigative Study • Observational Studies • Researchers don’t assign choices • No manipulation of the factors • Retrospective Study • Observational study in which the researcher identifies the subjects and then collects data on their previous conditions or behaviors • Prospective Study • Identifies or selects the subjects and follows their future outcomes

Experiment • Random assignment of subjects to treatments. • Explanatory variable: • Factor (manipulated by the experimenter) • Response variable: • Measurement • Experimental units • Subjects • Participants • Factor • A variable whose levels are controlled by the experimenter • Levels of the factor • Treatments • All the combinations of the factors with their respective levels
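Since treatments are all the combinations of factor levels, they can be enumerated with a Cartesian product. The following Python sketch uses made-up factors (fertilizer and watering schedule) and a hypothetical helper name `treatments`.

```python
from itertools import product

def treatments(factors):
    """List every combination of factor levels, one dict per treatment."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[name] for name in names))]

if __name__ == "__main__":
    factors = {"fertilizer": ["none", "half", "full"],
               "water": ["daily", "weekly"]}
    for t in treatments(factors):
        print(t)
```

A factor with 3 levels crossed with a factor with 2 levels gives 3 × 2 = 6 treatments.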

The Four Principles of Experimental Design • 1 - Control • We need to control sources of variation other than the factors being studied. (make the conditions similar for all treatment groups) • 2 - Randomize • Assign the subjects randomly to the treatments to equalize the effects of unknown variation
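The randomization principle can be sketched in Python as a shuffle followed by dealing subjects out to treatments. The function name `randomize` and the treatment labels are illustrative assumptions; dealing in rotation keeps the group sizes as equal as possible.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Randomly assign subjects to treatments in near-equal groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # random order removes any pattern in the list
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

if __name__ == "__main__":
    print(randomize(range(12), ["control", "drug"], seed=5))
```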

The Four Principles of Experimental Design (cont.) • 3 - Replicate • Apply the treatments to several subjects. • 4 - Block • Group similar subjects into blocks based on identifiable attributes that can affect the outcome of the experiment
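Blocking and randomizing work together: subjects are first grouped into blocks, then randomized to treatments within each block. A minimal Python sketch, assuming a hypothetical `block_of` callback that returns each subject's block and made-up treatment labels:

```python
import random
from collections import defaultdict

def randomized_block_assignment(subjects, block_of, treatments, seed=None):
    """Randomize subjects to treatments separately within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomize only inside the block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

if __name__ == "__main__":
    subjects = [("F", i) for i in range(6)] + [("M", i) for i in range(4)]
    plan = randomized_block_assignment(subjects, lambda s: s[0],
                                       ["placebo", "drug"], seed=7)
    for subject, treatment in sorted(plan.items()):
        print(subject, treatment)
```

Because every block receives each treatment equally often, differences between blocks cannot be mistaken for differences between treatments.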

Designing an Experiment • Step-by-Step Page 335

Experiments • Control Treatment • Baseline treatment level that provides a basis for comparison. • Blinding • There are two main classes of individuals who can affect the outcome of the experiment • Those who could influence the results (subjects, treatment administrators) • Those who evaluate the results • Single blinding: everyone in one of the two classes is blinded • Double blinding: everyone in both classes is blinded

Experiments (cont.) • Placebos • A null treatment, given so that the effect of the real treatment can be distinguished from the placebo effect. • Blocking • By blocking we isolate the variability due to the differences between the blocks so that we can see the differences due to the treatment more clearly • Confounding • When the levels of one factor are associated with the levels of another factor, we say that these two factors are confounded

Exercises • Page 351 • #10 • #12
