STA 291 Summer 2010

# STA 291 Summer 2010

### STA 291 Summer 2010

Lecture 1

Dustin Lueker

Topics
• Statistical terminology
• Descriptive statistics
• Probability and distribution functions
• Inferential statistics
• Estimation (confidence intervals)
• Hypothesis testing
• Simple linear regression and correlation

STA 291 Summer 2010 Lecture 1

Why study Statistics?
• Research in all fields is becoming more quantitative
• Research journals
• Most graduates will need to be familiar with basic statistical methodology and terminology
• Newspapers, advertising, surveys, etc.
• Many statements contain statistical arguments
• Computers make complex statistical methods easier to use

Lies, Damn Lies, and Statistics
• Many times statistics are used in an incorrect and misleading manner
• Purposely misused
  • Companies/people wanting to further their agenda
  • Cooking the data
    • Completely making up data
  • Massaging the numbers
    • Altering values to get desired result
• Accidentally misused
  • Using inappropriate methods
  • Vital to understand a method before using it

What is Statistics?
• Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data
• Applicable to a wide variety of academic disciplines
• Physical sciences
• Social sciences
• Humanities
• Statistics are used for making informed decisions
• Government

General Statistical Methodology

Basic Terminology
• Population
  • Total set of all subjects of interest
  • Entire group of people, animals, products, etc. about which we want information
• Elementary Unit
  • Any individual member of the population
• Sample
  • Subset of the population from which the study actually collects information
  • Used to draw conclusions about the whole population

Basic Terminology
• Variable
  • A characteristic of a unit that can vary among subjects in the population/sample
  • Ex: gender, nationality, age, income, hair color, height, disease status, state of residence, grade in STA 291
• Parameter
  • Numerical characteristic of the population
  • Calculated using the whole population
• Statistic
  • Numerical characteristic of the sample
  • Calculated using the sample
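
The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is made-up data (hypothetical ages of 1,000 subjects), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 subjects of interest (made-up data).
rng = random.Random(291)
population = [rng.randint(18, 30) for _ in range(1000)]

# Parameter: numerical characteristic calculated using the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population (here a simple random sample of 50 units).
sample = rng.sample(population, 50)

# Statistic: numerical characteristic calculated using only the sample.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values are usually close but not equal, which is why a statistic is used to estimate a parameter rather than to replace it.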

Data Collection and Sampling Theory
• Why take a sample? Why not take a census? Why not measure all of the units in the population?
  • Accuracy
    • May not be able to find every unit in the population
  • Time
    • Speed of response from units
  • Money
  • Infinite Population
  • Destructive Sampling or Testing

Example
• University Health Services at UK conducts a survey about alcohol abuse among students
• 200 of the students are sampled and asked to complete a questionnaire
• One question is “Have you regretted something you did while drinking?”
• What is the population?
• What is the sample?

‘Flavors’ of Statistics
• Descriptive Statistics
  • Summarizing the information in a collection of data
• Inferential Statistics
  • Using information from a sample to make conclusions/predictions about the population
  • Ex: using a sample statistic to estimate a population parameter

Example
• The Current Population Survey of about 60,000 households in the United States in 2002 distinguishes three types of families: Married-couple (MC), Female householder and no husband (FH), Male householder and no wife (MH)
• It indicated that 5.3% of “MC”, 26.5% of “FH”, and 12.1% of “MH” families have annual income below the poverty level
• Are these numbers statistics or parameters?
• The report says that the percentage of all “FH” families in the USA with income below the poverty level is at least 25.5% but no greater than 27.5%
• Is this an example of descriptive or inferential statistics?
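
A range such as "at least 25.5% but no greater than 27.5%" is a statement about the whole population built from sample data. One common way to produce such a range is an approximate 95% confidence interval for a proportion. The sketch below uses the normal approximation with hypothetical counts; the actual sample sizes and the method used in the survey report are not given here, so the numbers will not reproduce the report's interval.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Approximate confidence interval for a population proportion
    (normal approximation; z = 1.96 corresponds to 95% confidence)."""
    p_hat = successes / n  # sample statistic
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical counts, NOT taken from the Current Population Survey:
# 265 of 1,000 sampled families fall below the poverty level.
low, high = proportion_interval(265, 1000)
print(f"estimate: 26.5%, interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

A larger sample gives a narrower interval, since the margin shrinks with the square root of the sample size.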

Scales of Measurement
• Quantitative or Numerical
  • Variables with numerical values associated with them
• Qualitative or Categorical
  • Variables without numerical values associated with them

Qualitative Variables
• Ordinal
  • Disease status, company rating, grade in STA 291
  • Ordinal variables have a scale of ordered categories; they are often treated in a quantitative manner (A = 4.0, B = 3.0, etc.)
  • One unit can have more of a certain property than another unit
• Nominal
  • Gender, nationality, hair color, state of residence
  • Nominal variables have a scale of unordered categories
  • It does not make sense to say, for example, that green hair is greater/higher/better than orange hair

Quantitative Variables
• Quantitative
  • Age, income, height
  • Quantitative variables are measured numerically; that is, for each subject a number is observed
  • The scale for quantitative variables is called the interval scale

Example
• A study about oral hygiene and periodontal conditions among institutionalized elderly measured the following
  • Nominal (Qualitative): Requires assistance from staff?
    • Yes
    • No
  • Ordinal (Qualitative): Plaque score
    • No visible plaque
    • Small amounts of plaque
    • Moderate amounts of plaque
    • Abundant plaque
  • Interval (Quantitative): Number of teeth

Example
• A birth registry database collects the following information on newborns
  • Birth weight: in grams
  • Infant’s Condition:
    • Excellent
    • Good
    • Fair
    • Poor
  • Number of prenatal visits
  • Ethnic background:
    • African-American
    • Caucasian
    • Hispanic
    • Native American
    • Other
• What are the appropriate scales: quantitative (interval) or qualitative (ordinal, nominal)?

Importance of Different Types of Data
• Statistical methods vary for quantitative and qualitative variables
• Methods for quantitative data cannot be used to analyze qualitative data
• Quantitative variables can be treated in a less quantitative manner
  • Height: measured in cm/in
    • Interval (Quantitative)
  • Can be treated as qualitative
    • Ordinal:
      • Short
      • Average
      • Tall
    • Nominal:
      • <60in or >72in
      • 60in-72in
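
Treating height in a less quantitative manner amounts to binning the measured value. A minimal Python sketch follows; the 60in and 72in cut points come from the nominal grouping above, and reusing them for the Short/Average/Tall categories is an assumption made for illustration.

```python
def height_ordinal(height_in):
    """Ordered categories: Short < Average < Tall (cut points assumed)."""
    if height_in < 60:
        return "Short"
    if height_in <= 72:
        return "Average"
    return "Tall"

def height_nominal(height_in):
    """Two unordered groups: inside or outside the 60in-72in range."""
    return "60in-72in" if 60 <= height_in <= 72 else "<60in or >72in"

heights = [58.5, 64.0, 70.2, 75.1]  # hypothetical measurements in inches
print([height_ordinal(h) for h in heights])
print([height_nominal(h) for h in heights])
```

Binning discards information: the original heights cannot be recovered from the categories, so the conversion only works in one direction.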

Other Notes on Variable Types
• Try to measure variables in as much detail as possible
• Quantitative
• More detailed data can be analyzed in further depth
• Caution: Sometimes ordinal variables are treated as quantitative (ex: GPA)
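
The GPA caution can be made concrete with a short Python sketch. The point values for A and B follow the A = 4.0, B = 3.0 convention mentioned earlier; the values for C, D, and F, and the list of grades, are assumptions for illustration.

```python
from statistics import mean

# Ordinal categories, ordered from lowest to highest.
GRADE_ORDER = ["F", "D", "C", "B", "A"]

# Grade points: A and B follow the slides; C, D, and F are assumed.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

grades = ["A", "B", "B", "C"]  # hypothetical transcript

# Ordinal use: only the ordering matters (which grade is highest?).
best = max(grades, key=GRADE_ORDER.index)

# Quantitative use: averaging assumes equal spacing between grades.
gpa = mean(GRADE_POINTS[g] for g in grades)

print(best, gpa)
```

The average is only meaningful if the gap between an A and a B is taken to be the same as the gap between a B and a C, which the ordinal scale itself does not guarantee.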

Discrete Variables
• A variable is discrete if it can take on a finite number of values
  • Gender
  • Nationality
  • Hair color
  • Disease status
  • Grade in STA 291
  • Favorite MLB team
• Qualitative variables are discrete

Continuous Variables
• Continuous variables can take an infinite continuum of possible real number values
  • Time spent studying for STA 291 per day
    • 43 minutes
    • 2 minutes
    • 27.487 minutes
    • 27.48682 minutes
  • Can be subdivided into more precise values
    • Therefore continuous

Examples
• Number of children in a family
• Distance a car travels on a tank of gas
• % grade on an exam

Discrete or Continuous
• Quantitative variables can be discrete or continuous
• Age, income, height?
  • Depends on the scale
  • Age is potentially continuous, but usually measured in years (discrete)
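
The age example can be sketched in Python: the same underlying quantity yields a continuous value or a discrete one depending on how it is recorded. The dates are hypothetical, and the 365.25-day year is an approximation.

```python
from datetime import date

def age_continuous(birth, today):
    """Age in years as a real number (approximate: 365.25 days per year)."""
    return (today - birth).days / 365.25

def age_in_years(birth, today):
    """Age in completed years, the usual discrete measurement."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)

birth = date(1990, 3, 15)  # hypothetical date of birth
today = date(2010, 6, 10)  # hypothetical measurement date
print(age_continuous(birth, today))
print(age_in_years(birth, today))
```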
