Keep this in mind • Statistics has very logical answers. • Statistics can be linked with psychology and sociology. • Keep an open mind; there are always two sides to a coin (positive and negative).
Quick Talk • “1 out of 3 people cheat in a relationship” • Discuss this statement • What does it mean? • Do you believe it? • Why or why not?
Based on your discussion, why do you think it’s important to know about statistics?
Potential answers • Know how authentic the statement is • Don’t get “cheated” or “tricked” • Know where the data comes from • Know how effective something is
What is statistics? • Statistics: the study of how to collect, organize, analyze, and interpret numerical information from data
So then, what is needed? • Individuals: the people or objects included in the study • Variable: a characteristic of the individual to be measured or observed
Quick Talk • Think about dating. What “variables” do people look for when finding a boyfriend or girlfriend? List them.
Variables come in two types • Quantitative variable: has a value or numerical measurement for which operations such as addition or averaging make sense (usually has numbers) • Qualitative variable: describes an individual by placing the individual into a category or group, such as male or female
Based on your list, identify each variable as quantitative or qualitative.
Sample Answers • Age (quantitative) • Weight (quantitative) • Height (quantitative) • Race (qualitative) • Income (quantitative) • Looks (qualitative) • Body type (qualitative) • Personality (qualitative)
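A minimal sketch of the distinction in Python: averaging makes sense for a quantitative variable, while a qualitative variable can only be sorted into categories and counted. The three people and their values are made up for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical individuals (made-up data, not from the lesson).
people = [
    {"age": 24, "body_type": "athletic"},
    {"age": 30, "body_type": "slim"},
    {"age": 27, "body_type": "athletic"},
]

# Quantitative variable: addition and averaging make sense.
average_age = mean(person["age"] for person in people)

# Qualitative variable: individuals are placed into categories and counted.
body_type_counts = Counter(person["body_type"] for person in people)

print("Average age:", average_age)
print("Body type counts:", dict(body_type_counts))
```

Asking for the "average body type" has no meaning, which is a quick test for a qualitative variable.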
Data • Population data: data from “every” individual of interest • Sample data: data from “only some” of the individuals of interest
Quick Talk • Compare the definitions. Which type of data are you more likely to get in practice? Why?
Parameter: a numerical measure that describes an aspect of a population • Statistic: a numerical measure that describes an aspect of a sample
Easy way to remember • Population goes with parameter (both start with P) • Sample goes with statistic (both start with S)
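The pairing can be sketched in Python. The population below is made up (500 ages generated at random), and the sample size of 50 is an arbitrary choice for illustration; the point is only that the same measure (a mean) is called a parameter when it describes the population and a statistic when it describes a sample.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: the ages of all 500 individuals of interest.
population_ages = [random.randint(18, 80) for _ in range(500)]

# Sample: only some of the individuals (50 chosen at random, no repeats).
sample_ages = random.sample(population_ages, 50)

parameter = mean(population_ages)  # describes the population
statistic = mean(sample_ages)      # describes the sample

print(f"Parameter (population mean age): {parameter:.2f}")
print(f"Statistic (sample mean age):     {statistic:.2f}")
```

The two values are usually close but not identical, because the sample contains only some of the individuals.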
Group work: Example #1 • A car dealer wants to know what type of car people drive in the desert. He sent out 5000 surveys to random people living in the desert. • A) Identify the individuals of the study and the variable. • B) Do the data comprise a sample? If so, what is the underlying population? • C) Is the variable qualitative or quantitative? • D) Identify a quantitative variable that might be of interest. • E) Is a numerical measure computed from this random sample a statistic or a parameter?
Answer • A) Individuals: people living in the desert; variable: type of car driven • B) The data comprise a sample of the population of all people living in the desert • C) Qualitative • D) Income, age • E) Statistic (it is computed from sample data)
Example #2 • Television station QUE wants to know the proportion of TV owners in Virginia who watch the station’s new program at least once a week. The station asked a group of 1000 TV owners in Virginia if they watch the program at least once a week. • A) Identify the individuals of the study and the variable. • B) Do the data comprise a sample? If so, what is the underlying population? • C) Is the variable qualitative or quantitative? • D) Identify a quantitative variable that might be of interest. • E) Is a numerical measure computed from this random sample a statistic or a parameter?
Levels of Measurement • Nominal level of measurement: applies to data that consist of names, labels, or categories. There are no implied criteria by which the data can be ordered from smallest to largest. • Ordinal level of measurement: applies to data that can be arranged in order. However, differences between data values either cannot be determined or are meaningless. • Interval level of measurement: applies to data that can be arranged in order. In addition, differences between data values are meaningful. • Ratio level of measurement: applies to data that can be arranged in order. In addition, both differences between data values and ratios of data values are meaningful. Data at the ratio level have a true zero (a value of zero means none of the quantity is present).
Example: Identify the level of measurement • A) Taos, Acoma, Zuni, and Cochiti are the names of four Native American pueblos from the population of names of all Native American pueblos in Arizona and New Mexico • B) In a high school graduating class of 600 students, Jeff ranked 1st, Melissa ranked 38th, Patrick ranked 150th, and Ashley ranked 3rd, where 1 is the highest rank • C) Body temperatures of trout in the Yellowstone River • D) Lengths of sharks swimming in the Pacific Ocean
Answer • A) Nominal • B) Ordinal • C) Interval • D) Ratio
Example #2 • Name the levels of measurement • A) My name is Mr. Liu • B) I am 28 years old • C) High school 1999-2003, college 2003-2007, master’s 2007-2008 • D) I make $35,000 after tax • E) I ranked 100th in high school, 58th in college, and 27th in my master’s program • F) Some of my friends’ names are Michael, Katherine, Patrick, Ashley, Sarah, Mya, and Chris • G) I am 5’8” tall
Answers • A) Nominal • B) Ratio • C) Interval • D) Ratio • E) Ordinal • F) Nominal • G) Ratio
Homework Practice Pg 10-11 #1-13 odd
Quick Talk • Mr. Liu looked at the grades of the first 15 male students (which average to a C) and concluded that all the students in the school should have a C average. Discuss why this conclusion might not be correct. What is wrong with this study?
Things to remember • A study or data collection cannot be BIASED in any way. • You need a decent sample size and fair randomness. • Fair = equal chance
1st type of data collection • Simple random sample: a simple random sample of n measurements from a population is a subset of the population selected in such a manner that every sample of size n from the population has an equal chance of being selected. • Basically, every member of the population has the same chance of being selected.
Simple random sample example • Suppose I assign a number to each of the 40 students here and randomly choose 5 numbers. • Would the number 7 be as likely to be selected as the number 37? • Could all 5 numbers be odd? • Could the sample ever be 27, 28, 29, 30, 31?
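The answer to all three questions is yes, and the chances can be worked out by counting samples. The sketch below assumes the 5 numbers are drawn without repeats from 1 to 40, so every set of 5 numbers is equally likely; the variable names are chosen here for illustration.

```python
from math import comb

population_size = 40
sample_size = 5

# Number of different samples of 5 students that can be drawn from 40.
total_samples = comb(population_size, sample_size)

# Any one student (7 or 37 alike) appears in the same number of samples.
p_specific_student = comb(population_size - 1, sample_size - 1) / total_samples

# 20 of the numbers 1 to 40 are odd, so count the samples made only of odd numbers.
p_all_odd = comb(20, sample_size) / total_samples

# {27, 28, 29, 30, 31} is exactly one of the equally likely samples.
p_consecutive_run = 1 / total_samples

print("Possible samples:", total_samples)
print("Chance a given student is chosen:", p_specific_student)
print("Chance all 5 numbers are odd:", round(p_all_odd, 4))
print("Chance of exactly 27 to 31:", p_consecutive_run)
```

A sample such as 27, 28, 29, 30, 31 looks "non-random", but it is exactly as likely as any other specific set of 5 numbers.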
How to Draw a Random Sample • 1) Number all members of the population sequentially • 2) Use a table, calculator, or computer to select random numbers from the numbers assigned to the population members • 3) Create the sample by using population members with numbers corresponding to those randomly selected
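The three steps can be sketched with a computer as the source of random numbers. The population here is an assumed class of 40 numbered students and the sample size of 5 is arbitrary; `random.sample` is Python's standard-library function for drawing without repeats.

```python
import random

# Step 1: number all members of the population sequentially (1 to 40).
# A class of 40 students is assumed for illustration.
student_numbers = list(range(1, 41))

# Step 2: let the computer select 5 of the assigned numbers at random.
# random.sample draws without replacement, so no student is picked twice.
selected_numbers = random.sample(student_numbers, 5)

# Step 3: the sample is made of the members whose numbers were selected.
print("Students in the sample:", sorted(selected_numbers))
```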
Read Example 3 on pg 13 • Random-Number Table • It is one way to create “randomness” with numbers • It is called a simulation
Another way • Random Integer (randInt) on a TI-83 or TI-84 calculator • Go to MATH • Slide over to PRB • Choose #5 • It should show randInt( • If you want ONE random number out of a total of 500, type randInt(1,500) • This will give you a random integer between 1 and 500 • If you want 30 random numbers out of a total of 500, type randInt(1,500,30)
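For readers without a calculator, a rough Python counterpart of the two randInt calls is sketched below, using the standard `random` module. One point worth noticing: drawing 30 numbers one at a time can produce repeats, so when the numbers identify population members, drawing distinct numbers is usually what is wanted. The variable names are illustrative.

```python
import random

# Rough counterpart of randInt(1,500): one integer from 1 to 500, both ends included.
one_number = random.randint(1, 500)

# Rough counterpart of randInt(1,500,30): 30 independent draws, so repeats can occur.
thirty_numbers = [random.randint(1, 500) for _ in range(30)]

# When each number stands for a population member, repeats are usually unwanted.
# random.sample draws 30 distinct numbers from 1 to 500 instead.
thirty_distinct = random.sample(range(1, 501), 30)

print(one_number)
print(thirty_numbers)
print(sorted(thirty_distinct))
```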
2nd type of data collection • Simulation (usually with numbers): a numerical facsimile or representation of a real-world phenomenon • Note: Simulation is productive in studying nuclear reactors, cloud formation, cardiology, highway design, production control, shipbuilding, airplane design, war games, economics, and electronics.
Quick Talk: Why do you think it is important to use simulation as a data collection method? (Think about the application fields we just discussed.)
Group Activity • In your group, simulate tossing a coin 10 times with each of three methods • One person will record • One person will toss a coin (head or tail) • One person will use a calculator (1 = head, 2 = tail) • One person will use the table in the back of the book (odd = head, even = tail) • You should have a total of 30 trials. • Answer these questions: • What is the theoretical probability of getting a head? • What is the experimental probability of getting a head?
Answer • Theoretical probability: 50% • Experimental probability: depends on your group’s results
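The calculator version of the activity can be sketched in Python with the same coding (1 = head, 2 = tail). The function names are made up for this sketch, and the larger trial counts are an addition to the activity, included only to show that the experimental probability typically settles closer to 50% as the number of tosses grows.

```python
import random


def toss_coin(trials):
    """Simulate coin tosses coded as 1 = head and 2 = tail."""
    return [random.randint(1, 2) for _ in range(trials)]


def experimental_probability_of_head(tosses):
    """Fraction of the tosses that came up head (coded as 1)."""
    return tosses.count(1) / len(tosses)


theoretical_probability = 0.5

# 30 trials matches the group activity; the larger runs are for comparison.
for trials in (30, 1000, 100_000):
    tosses = toss_coin(trials)
    experimental = experimental_probability_of_head(tosses)
    print(f"{trials} tosses: experimental = {experimental:.3f}, "
          f"theoretical = {theoretical_probability}")
```

Each run gives different experimental values, which is why the answer for 30 trials depends on the group.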
Sampling: Different ways to create “randomness” • Stratified sampling: Divide the entire population into distinct subgroups called strata. The strata are based on a specific characteristic such as age, income, education level, and so on. All members of a stratum share the specific characteristic. Draw random samples from each stratum. • Systematic sampling: Number all members of the population sequentially. Then, from a starting point selected at random, include every kth member of the population in the sample. • Cluster sampling: Divide the entire population into pre-existing segments or clusters. The clusters are often geographic. Make a random selection of clusters. Include every member of each selected cluster in the sample. • Multistage sampling: Use a variety of sampling methods to create successively smaller groups at each stage. The final sample consists of clusters. • Convenience sampling: Create a sample by using data from population members that are readily available (this has the potential for a lot of bias).
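Three of these methods can be sketched in a few lines of Python. Everything about the population below is hypothetical: 40 numbered students, each assigned a grade (used as the stratum) and a homeroom (used as the cluster). The function names are illustrative, not a standard API.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 40 numbered students, each with a grade (stratum)
# and a homeroom (cluster). 4 grades of 10 students, 5 homerooms of 8 students.
population = [
    {"id": i, "grade": 9 + (i % 4), "homeroom": "room" + str(i % 5)}
    for i in range(1, 41)
]


def stratified_sample(members, key, per_stratum):
    """Split members into strata by `key`, then draw randomly from each stratum."""
    strata = {}
    for member in members:
        strata.setdefault(member[key], []).append(member)
    sample = []
    for group in strata.values():
        sample.extend(random.sample(group, per_stratum))
    return sample


def systematic_sample(members, k):
    """Pick a random starting point among the first k members, then every kth."""
    start = random.randrange(k)
    return members[start::k]


def cluster_sample(members, key, clusters_to_pick):
    """Randomly pick whole clusters and include every member of each one."""
    cluster_names = sorted({member[key] for member in members})
    chosen = set(random.sample(cluster_names, clusters_to_pick))
    return [member for member in members if member[key] in chosen]


by_grade = stratified_sample(population, "grade", 2)    # 2 students from each grade
every_8th = systematic_sample(population, 8)            # every 8th student
by_homeroom = cluster_sample(population, "homeroom", 2)  # 2 whole homerooms

print(len(by_grade), len(every_8th), len(by_homeroom))
```

Stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.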
Vocabulary dealing with sampling • Sampling frame: a list of individuals from which a sample is actually selected • Undercoverage: results from omitting population members from the sampling frame • Sampling error: the difference between measurements from a sample and the corresponding measurements from the respective population. It is caused by the fact that the sample does not perfectly represent the population. • Nonsampling error: the result of poor sample design, sloppy data collection, faulty measuring instruments, bias in questionnaires, and so on
Note • Remember: is it possible to get a “population” sample? We have to use a sample to predict the population. • A sample is not a perfect representation of the population! • Sampling errors do not represent mistakes! They are just the consequence of using samples instead of populations. • Nonsampling errors do occur; be aware of them! • Avoid bias and sloppy data collection, which lead to false conclusions (such as false positives).
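Sampling error can be seen directly when the whole population is known, which is rarely true in practice. In the sketch below the population of 600 grades is made up, the sample size of 15 is arbitrary, and the helper name `sampling_error` is chosen here for illustration. Each sample is drawn fairly, yet each gives a slightly different error.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is repeatable


def sampling_error(sample, population):
    """Sample mean minus population mean (statistic minus parameter)."""
    return mean(sample) - mean(population)


# Hypothetical population: grades of all 600 students (made-up data).
population_grades = [random.randint(50, 100) for _ in range(600)]

# Draw several simple random samples of 15 students and compare each to the population.
for trial in range(1, 6):
    sample = random.sample(population_grades, 15)
    error = sampling_error(sample, population_grades)
    print(f"Sample {trial}: mean = {mean(sample):.1f}, sampling error = {error:+.1f}")
```

None of these errors is a mistake in the procedure; they come only from using 15 students instead of all 600.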
Homework Practice Pg 17-19 #1-5, 7, 13, 15
Quick Talk Why is planning a good experimental design important? Think about what we learned.
More types of data collection techniques • 1) Experiment (most stringent and restrictive) • 2) Observational study (somewhat convenient) • Census • 3) Survey (most convenient way to collect data)
Basic guidelines for planning a statistical study • 1) Identify the individuals or objects of interest. • 2) Specify the variables as well as the protocols for taking measurements or making observations. • 3) Determine if you will use an entire population or a representative sample. Decide on a viable sampling method. • 4) In your data collection plan, address issues of ethics, subject confidentiality, and privacy. If you are collecting data at a business, store, college, or other institution, be sure to be courteous and to obtain permission as necessary. • 5) Collect the data. • 6) Use appropriate descriptive statistics methods and make decisions using appropriate inferential statistics methods. • 7) Finally, note any concerns you might have about your data collection methods and list any recommendations for future studies.
Quick talk: You are a researcher at a biotech company. You are trying to determine the efficacy and effectiveness of a vaccine. How would you conduct the experiment? • Note: every medical company needs statistics to test the effectiveness of its technology or medicine.