
Chapter 2



Presentation Transcript


  1. Chapter 2 Producing Data.

  2. Homework 4 • Read From Chapter 2: pages 83-96, example 2.14, page 103 (skip the material on the random number table) • LDI: 2.1, 2.2, 2.3, 2.4 • Read From Chapter 3: pages 145-148, 152-154, 185-193, 195 (summary) • LDI: 3.1, 3.2, 3.3

  3. Why Sample? • How many times do you see the letter f in the following sentence? • FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCE OF MANY YEARS

  4. The population is the entire group of objects or individuals under study, about which information is wanted. • A unit is an individual object or person in the population. The units are often called subjects if the population consists of people. • A sample is a part of the population that is actually used to get information. • A variable is a characteristic of interest to be measured for each unit in the sample. • The size of the population is denoted by the capital letter N. • The size of the sample is denoted by the small letter n.

  5. Population size N = 16 Sample size n=4

  6. More Definitions • A parameter is a numerical summary that would be calculated from all of the units in the population. It is a fixed value. • A statistic is a numerical summary that is calculated from all of the units in a sample. It changes from sample to sample.

  7. Key Concept • The job of a statistic is to estimate a parameter.

  8. Parameter or Statistic? • Nine percent of the U.S. population has Type B blood. In a sample of 400 individuals from the U.S. population, 12.5% were found to have Type B blood. Circle your answer. (a) In this particular situation, the value of 9% is a ( parameter , statistic ). (b) In this particular situation, the value of 12.5% is a ( parameter , statistic ).

  9. Statistics Vary from Sample to Sample • Let’s assume we want to know how many keys, on average, a student at CR carries. Everyone get out your keys and count them, and we’ll find an average. • Our population is CR students (the subjects). The variable we measured was “number of keys.” The value we calculated is the “average number of keys.” Is the value we found a parameter or a statistic? • Would the sample statistics be the same for each section of statistics? • Would the population parameter change each time we took a sample?
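The questions on this slide can be explored with a small simulation. The sketch below is only an illustration: the population of key counts is invented (simulated with a fixed seed), and the names `population`, `parameter`, and `sample_means` are assumptions made for this example. It computes one fixed population mean and then the mean of several random samples of 30 students.

```python
import random
import statistics

# Hypothetical population: key counts for N = 1000 students (simulated, not real data).
rng = random.Random(1)
population = [rng.randint(0, 10) for _ in range(1000)]

# Parameter: the mean over every unit in the population. It is a fixed value.
parameter = statistics.mean(population)

# Statistics: the mean of each of five random samples of n = 30 students.
sample_means = []
for _ in range(5):
    sample = rng.sample(population, 30)
    sample_means.append(statistics.mean(sample))

print("population mean (parameter):", parameter)
print("sample means (statistics):", sample_means)
```

The population mean is computed once and does not depend on which sample is drawn, while each sample produces its own mean.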

  10. Good Data? • We need to sample because a census is generally not possible. So how do we collect data so that they are representative of the population? • If our data are collected in a way that makes them differ systematically from the truth, the data are said to be biased.

  11. Bad Data Collection Schemes that Will Result in Biased Data • Convenience Sample: You want to know how many lovers a student at CR has had in the last year. You’re too shy to ask strangers, so you ask just your friends. What’s wrong with this scheme? • Convenience samples consist of units of the population that are easily accessible. They are almost always biased.

  12. Bad Data Collection Schemes that Will Result in Biased Data • Volunteer Response: You hang a flyer on the bulletin boards asking people whether they support or oppose a woman’s right to choose (Roe v. Wade) and to email you their response. What’s wrong with this scheme? • Volunteer samples consist of units that chose themselves to respond. Again, this type of sample is almost always biased.

  13. Definition of Types of Bias (page 90): • Selection Bias is the systematic tendency on the part of the sampling procedure to exclude or include a certain type of unit. • Nonresponse Bias is the distortion that can arise because a large number of units selected for the sample do not respond or refuse to respond, and these nonresponders have a tendency to be different from the responders. • Response Bias is the distortion that can arise because the wording of a question and the behavior of the interviewer can affect the responses received.
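Selection bias can be made concrete with a simulation. This is a minimal sketch under invented assumptions: the population, the split into drivers and non-drivers, and the "parking lot" survey are all hypothetical and chosen only to show a sampling procedure that systematically excludes one type of unit.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population of N = 1000 students (simulated, not real data):
# 500 drivers who carry 4 to 8 keys and 500 non-drivers who carry 0 to 3.
drivers = [rng.randint(4, 8) for _ in range(500)]
non_drivers = [rng.randint(0, 3) for _ in range(500)]
population = drivers + non_drivers

# The parameter we want to estimate: the population mean number of keys.
parameter = statistics.mean(population)

# Biased procedure: survey only in the parking lot, so non-drivers are excluded.
biased_mean = statistics.mean(rng.sample(drivers, 50))

# Simple random sample of the same size from the whole population.
srs_mean = statistics.mean(rng.sample(population, 50))

print("parameter:", parameter)
print("parking-lot sample mean:", biased_mean)
print("simple random sample mean:", srs_mean)
```

Because the parking-lot procedure can only ever reach drivers, its estimate sits above the population mean no matter how many times it is repeated. A larger sample would not fix this.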

  14. Good Sample Methods • A sampling method that gives each unit in the population a known non-zero chance of being selected is called a probability sampling method.

  15. Types of Sampling • Simple Random Sampling; • Stratified Random Sampling; • Systematic Sampling; • Cluster Sampling; • and Multistage Sampling.

  16. Simple Random Sampling (SRS) • A simple random sample of size n is a sample of n units selected in such a way that every possible sample of the given size n has the same chance of being selected as any other sample of size n. Samples of different sizes may have different chances of being selected.
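To see what "every possible sample of size n" means in practice, the sketch below counts the possible samples for the sizes used on the earlier slide (N = 16, n = 4). The variable names are assumptions made for this illustration.

```python
import math
from itertools import combinations

N, n = 16, 4  # population and sample sizes from the earlier slide

# Number of distinct samples of size n that can be drawn from N units.
num_samples = math.comb(N, n)

# Under simple random sampling every one of them is equally likely.
prob_each = 1 / num_samples

# Enumerating the samples explicitly gives the same count.
all_samples = list(combinations(range(1, N + 1), n))

print(num_samples, len(all_samples), prob_each)
```

There are 1820 possible samples of 4 units from a population of 16, so under simple random sampling each has a 1 in 1820 chance of being the one selected.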

  17. Selecting an SRS Using the TI-83 • Example: N = 50 units in the population; take a simple random sample of size n = 5. • Step 1: Assign labels. Give each unit in the population a numerical label. • Step 2: Use the calculator or a computer to produce a sequence of random labels between 1 and N. (Note: a starting seed may be given so that everyone gets the same outcome.) See page 87.
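The same two steps can be sketched on a computer. The example below uses Python's standard `random` module instead of the TI-83, so it is an illustration of the idea and not the calculator procedure from the textbook. The seed value is arbitrary and chosen here only to make the result reproducible.

```python
import random

N, n = 50, 5  # population size and sample size from the slide

random.seed(2024)  # arbitrary seed (an assumption) so everyone gets the same outcome

# Step 1: give each unit in the population a numerical label, 1 through N.
labels = range(1, N + 1)

# Step 2: draw n distinct labels at random; each set of n labels is equally likely.
sample = random.sample(labels, n)

print(sample)
```

`random.sample` draws without replacement, so no label can appear twice. A generator of random integers that allows repeats would need duplicates discarded and redrawn.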

  18. Chapter 3 Observational Studies and Experiments

  19. The Language of Studies • The units are the objects upon which measurements are made or observed. If they are people, we refer to them as subjects. • In a designed experiment, the researcher actively imposes a treatment on the units in order to observe a response. • In an observational study, the researcher simply observes the subjects and records the variable of interest with no attempt to manipulate the response.

  20. Think About It • Decide which should be used, an experiment or an observational study: • New treatment for prostate cancer • Golf increases chance of lightning strike • Type of insurance plan used at a local hospital • New metal to make gears is longer lasting.

  21. Relating Two Variables • A response variable (Y) measures an outcome of the study. It is a variable that is thought to depend in some way on the explanatory variable. • An explanatory variable or factor (X) is a variable that is thought to explain or cause the observed outcomes, that is, the changes in the response variable.

  22. Levels and Treatments • The possible values of the explanatory variable are called the levels of that variable. A treatment is a specific combination of levels of the explanatory variables.

  23. Do High Fiber and Exercise Reduce Heart Attacks? • What are the explanatory variables? • What is the response variable? • Factor 1: Type of Diet • Factor 2: Amount of Exercise
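Since a treatment is a specific combination of levels of the explanatory variables, the treatments for a two-factor study can be listed by crossing the levels. In the sketch below the level names are assumptions made for illustration: the slides name "standard fiber" and "low exercise" but do not list the other levels.

```python
from itertools import product

# Assumed levels for each factor (hypothetical; only "standard fiber" and
# "low exercise" are named in the slides).
diet_levels = ["standard fiber", "high fiber"]
exercise_levels = ["low exercise", "high exercise"]

# Each treatment is one combination of a diet level and an exercise level.
treatments = list(product(diet_levels, exercise_levels))

for number, (diet, exercise) in enumerate(treatments, start=1):
    print(f"Treatment {number}: {diet} + {exercise}")
```

With two levels per factor, this gives 2 × 2 = 4 treatments.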

  24. Confound It All! • A confounding variable is a variable whose effect on the response variable cannot be separated from the effect of the explanatory variable on the response variable.

  25. Example • What if the majority of the men in our fiber/exercise study who have standard fiber and low exercise also smoke, while the men in the other treatments do not? • We didn’t measure this variable, yet it may be a large factor in the response variable.
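A simulation can show how an unmeasured variable such as smoking could produce a difference between treatment groups even if the treatments themselves did nothing. Everything in this sketch is invented for illustration: the group sizes, the share of smokers in each group, and the risk values are assumptions, not data from any study.

```python
import random

rng = random.Random(7)

def simulate_group(size, smoker_share):
    """Return the heart attack rate in one simulated treatment group."""
    attacks = 0
    for _ in range(size):
        smokes = rng.random() < smoker_share
        # Assumed risks: the treatment has NO effect; only smoking matters.
        risk = 0.30 if smokes else 0.10
        attacks += rng.random() < risk
    return attacks / size

# Hypothetical groups: most men in the standard fiber / low exercise group smoke.
rate_standard_low = simulate_group(2000, smoker_share=0.80)
rate_other = simulate_group(2000, smoker_share=0.20)

print("standard fiber, low exercise:", rate_standard_low)
print("another treatment group:", rate_other)
```

The two groups show different heart attack rates even though diet and exercise play no role in the simulated risk. An analyst who did not record smoking could not tell this difference apart from a treatment effect.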

  26. Being Critical: How to Judge Information • The Source: Who funded the research? • The Design: What was the population under study? Was there control for bias? • The Results: Was the study carried out by a reputable institution? Was it peer reviewed prior to publication?
