## Chapter 1 Section 1

**Chapter 1 Section 1: Introduction to the Practice of Statistics**

**Chapter 1 – Section 1**
• The science of statistics is collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions

• Organizing and summarizing the information is descriptive statistics (chapters 2 through 4)
• Drawing conclusions or generalizations from the information is inferential statistics (chapters 9 through 11)

• A population
  • Is the group to be studied
  • Includes all of the individuals in the group
• A sample
  • Is a subset of the population
  • Is often used in analyses because getting access to the entire population is impractical

• Characteristics of the individuals under study are called variables
• Some variables have values that are attributes or characteristics; those are called qualitative or categorical variables
• Some variables have values that are numeric measurements; those are called quantitative variables
• The suggested approaches to analyzing problems vary by the type of variable

• Examples of qualitative variables
  • Gender
  • Zip code
  • Blood type
  • States in the United States
  • Brands of televisions
• Qualitative variables have category values; those values cannot be meaningfully added, subtracted, etc.

• Examples of quantitative variables
  • Temperature
  • Height and weight
  • Sales of a product
  • Number of children in a family
  • Points achieved playing a video game
• Quantitative variables have numeric values; those values can be added, subtracted, etc.

• Quantitative variables can be either discrete or continuous
• Discrete variables
  • Variables that have a finite or a countable number of possibilities
  • Frequently variables that are counts
• Continuous variables
  • Variables that have an infinite but not countable number of possibilities
  • Frequently variables that are measurements

• Examples of discrete variables
  • The number of heads obtained in 5 coin flips
  • The number of cars arriving at a McDonald’s between 12:00 and 1:00
  • The number of students in class
  • The number of points scored in a football game
• The possible values of discrete variables can be listed

• Examples of continuous variables
  • The distance that a particular model car can drive on a full tank of gas
  • Heights of college students

**Summary: Chapter 1 – Section 1**
• The process of statistics is designed to collect and analyze data to reach conclusions
• Variables can be classified by their type of data
  • Qualitative or categorical variables
  • Discrete quantitative variables
  • Continuous quantitative variables

**Chapter 2: Organizing and Summarizing Data**

**Chapter 2 Sections**
• Sections in Chapter 2
  • Organizing Qualitative Data
  • Organizing Quantitative Data
  • Graphical Misrepresentations of Data

**Chapter 2 Section 1: Organizing Qualitative Data**

**Chapter 2 – Section 1**
• Qualitative data values can be organized by a frequency distribution
• A frequency distribution lists
  • Each of the categories
  • The frequency for each category

• A simple data set is blue, blue, green, red, red, blue, red, blue
• A frequency table for this qualitative data is: blue 4, green 1, red 3
• The most commonly occurring color is blue

• The relative frequencies are the proportions (or percents) of the observations out of the total
• A relative frequency distribution lists
  • Each of the categories
  • The relative frequency for each category

• A relative frequency table for this qualitative data is: blue 0.500, green 0.125, red 0.375
• A relative frequency table can also be constructed with percents (50%, 12.5%, and 37.5% for the above table)

• Bar graphs for our simple data (using Excel)
  • Frequency bar graph
  • Relative frequency bar graph

• A Pareto chart is a particular type of bar graph
• A Pareto chart differs from a bar graph only in that the categories are arranged in order of decreasing frequency
  • The category with the highest frequency is placed first (on the extreme left)
  • The category with the second highest frequency is placed second
  • Etc.
• Pareto charts are often used when there are many categories but only the top few are of interest

• A Pareto chart for our simple data (using Excel)

• An example side-by-side bar graph comparing educational attainment in 1990 versus 2003

• An example of a pie chart

**Chapter 2 Section 2: Organizing Quantitative Data**

**Chapter 2 – Section 2**
• Consider the following data
• We would like to compute the frequencies and the relative frequencies

• The resulting frequencies and the relative frequencies

• Example of histograms for discrete data
  • Frequencies
  • Relative frequencies

• Continuous data cannot be put directly into frequency tables since they do not have any obvious categories
• Categories are created using classes, or intervals of numbers
• The continuous data are then put into the classes

• For ages of adults, a possible set of classes is
  • 20 – 29
  • 30 – 39
  • 40 – 49
  • 50 – 59
  • 60 and older
• For the class 30 – 39
  • 30 is the lower class limit
  • 39 is the upper class limit
• The class width is the difference between consecutive lower class limits
• For the class 30 – 39, the next class starts at 40, so the class width is 40 – 30 = 10

• All the classes have the same width, except for the last class
• The class “60 and older” is an open-ended class because it has no upper limit
• Classes with no lower limits are also called open-ended classes

• The classes and the number of values in each can be put into a frequency table
• In this table, there are 1147 subjects between 30 and 39 years old

• Good practices for constructing tables for continuous variables
  • The classes should not overlap
  • The classes should not have any gaps between them
  • The classes should have the same width (except for possible open-ended classes at the extreme low or extreme high ends)
  • The class boundaries should be “reasonable” numbers
  • The class width should be a “reasonable” number

• Just as for discrete data, a histogram can be created from the frequency table
• Instead of individual data values, the categories are the classes (the intervals of data)

• A stem-and-leaf plot is a different way to represent data that is similar to a histogram
• To draw a stem-and-leaf plot, each data value must be broken up into two components
  • The stem consists of all the digits except for the rightmost one
  • The leaf consists of the rightmost digit
• For the number 173, for example, the stem would be “17” and the leaf would be “3”

• In the stem-and-leaf plot below
  • The smallest value is 56
  • The largest value is 180
  • The second largest value is 178

• To draw a stem-and-leaf plot
  • Write all the values in ascending order
  • Find the stems and write them vertically in ascending order
  • For each data value, write its leaf in the row next to its stem
  • The resulting leaves will also be in ascending order
• The list of stems with their corresponding leaves is the stem-and-leaf plot

• Modifications to stem-and-leaf plots
  • Sometimes there are too many values with the same stem; we would then split the stems (such as having 10-14 in one stem and 15-19 in another)
  • If we wanted to compare two sets of data, we could draw two stem-and-leaf plots using the same stems, with leaves going left (for one set of data) and right (for the other set)

• A dot plot is a graph where a dot is placed over the observation each time it is observed
• The following is an example of a dot plot

• A useful way to describe a variable is by the shape of its distribution
• Some common distribution shapes are
  • Uniform
  • Bell-shaped (or normal)
  • Skewed right
  • Skewed left

• A variable has a uniform distribution when
  • Each of the values tends to occur with the same frequency
  • The histogram looks flat

• A variable has a bell-shaped distribution when
  • Most of the values fall in the middle
  • The frequencies tail off to the left and to the right
  • It is symmetric

• A variable has a skewed right distribution when
  • The distribution is not symmetric
  • The tail to the right is longer than the tail to the left
  • The arrow from the middle to the long tail points right

• A variable has a skewed left distribution when
  • The distribution is not symmetric
  • The tail to the left is longer than the tail to the right
  • The arrow from the middle to the long tail points left

**Summary: Chapter 2 – Section 2**
• Quantitative data can be organized in several ways
  • Histograms based on data values are good for discrete data
  • Histograms based on classes (intervals) are good for continuous data
• The shape of a distribution describes a variable; histograms are useful for identifying the shapes

**Chapter 2 Section 3: Graphical Misrepresentations of Data**

**Chapter 2 – Section 3**
• The two graphs show the same data, but the difference seems larger in the graph on the left
• The vertical scale of the graph on the left is truncated

• The gazebo on the right is twice as large in each dimension as the one on the left
• However, it appears much more than twice as large as the one on the left, because doubling both the width and the height makes the picture four times as large in area

**Summary: Chapter 2 – Section 1**
• Qualitative data can be organized in several ways
  • Tables are useful for listing the data, its frequencies, and its relative frequencies
  • Charts such as bar graphs, Pareto charts, and pie charts are useful visual methods for organizing data
  • Side-by-side bar graphs are useful for comparing two sets of qualitative data
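
As a minimal sketch of the frequency and relative frequency tables described above, the following Python snippet tabulates the simple color data set from Chapter 2, Section 1 and lists the categories in Pareto order. The slides themselves used Excel; the function names here (`frequency_table`, `relative_frequency_table`, `pareto_order`) are illustrative assumptions and do not come from the original presentation.

```python
from collections import Counter

# The simple qualitative data set from the slides.
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]


def frequency_table(values):
    """Return {category: count} for a list of qualitative values."""
    return dict(Counter(values))


def relative_frequency_table(values):
    """Return {category: proportion of all observations}."""
    total = len(values)
    return {category: count / total for category, count in Counter(values).items()}


def pareto_order(values):
    """Return (category, count) pairs sorted from highest to lowest frequency."""
    return Counter(values).most_common()


freq = frequency_table(colors)
rel = relative_frequency_table(colors)

# Print the categories in Pareto order with counts and percents.
for category, count in pareto_order(colors):
    print(f"{category:<6} {count:>2}  {rel[category]:.1%}")
```

Run as written, this lists blue first (4 observations, 50.0%), then red (3, 37.5%), then green (1, 12.5%), which matches the percents quoted in the slides. Sorting by descending count is all that separates the Pareto ordering from the plain frequency table.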