
# MGMT 276: Statistical Inference in Management Spring, 2014 - PowerPoint PPT Presentation

MGMT 276: Statistical Inference in Management, Spring 2014. Welcome. Green sheets. Please click in. My last name starts with a letter somewhere between A. A – D, B. E – L, C. M – R, D. S – Z. For our class: due Tuesday, April 29th.


## PowerPoint Slideshow about ' MGMT 276: Statistical Inference in Management Spring, 2014' - kin

Presentation Transcript
MGMT 276: Statistical Inference in Management, Spring 2014

Welcome

Green sheets

My last name starts with a letter somewhere between

A. A – D

B. E – L

C. M – R

D. S – Z

Due Tuesday

April 29th


In a negatively skewed distribution:

mean < median < mode

97 = mode = tallest point

87 = median = middle score

83 = mean = balance point

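[Figure: frequency distribution of exam scores (x-axis: Score on Exam; y-axis: Frequency), with the mean, median, and mode marked.]

Note: the vertical axis is always “frequency”

Note: axes need both a label and numbers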
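The ordering mean < median < mode can be checked on a small sample. The sketch below uses a hypothetical list of seven exam scores (not the class data), chosen so that its mean, median, and mode match the values on the slide.

```python
import statistics

# Hypothetical exam scores, assumed for illustration only.
# A few low scores form the long left tail of a negatively skewed distribution.
scores = [57, 70, 83, 87, 90, 97, 97]

mean_score = statistics.mean(scores)      # balance point
median_score = statistics.median(scores)  # middle score
mode_score = statistics.mode(scores)      # most frequent score (tallest point)

# In a negatively skewed distribution the low tail drags the mean below the median.
negatively_skewed_pattern = mean_score < median_score < mode_score
print(mean_score, median_score, mode_score, negatively_skewed_pattern)
```

The low scores pull the mean down the most, move the median less, and leave the mode untouched.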

(Exam 4: May 1st)

Lind

Chapter 13: Linear Regression and Correlation

Chapter 14: Multiple Regression

Chapter 15: Chi-Square

Plous

Chapter 17: Social Influences

Chapter 18: Group Judgments and Decisions

• Exam 4 – Optional Times for Final

• Two options for completing Exam 4

• Thursday (5/1/14) – The regularly scheduled time

• Tuesday (5/6/14) – The optional later time

• Only need to take one exam – these are two optional times

• On class website:

• Please print and complete homework worksheet #18

• Hypothesis Testing with Correlations

study guide

Next couple of lectures 4/22/14

Logic of hypothesis testing with Correlations

Interpreting the Correlations and scatterplots

Simple and Multiple Regression

Using correlation for predictions

r versus r²

• Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)

• Coefficient of correlation is the name for “r”

• Coefficient of determination is the name for “r²” (remember it is always positive – no direction info)

• Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line – like standard deviation)

• Coefficient of regression will be “b” for each variable (like slope)

Rory is the owner of a small software company and employs 10 sales staff. Rory sends his staff all over the world consulting, selling and setting up his system. He wants to evaluate his staff in terms of who are the most (and least) productive salespeople and also whether more sales calls actually result in more systems being sold. So, he simply measures the number of sales calls made by each salesperson and how many systems they successfully sold.

Regression Example

Do more sales calls result in more sales made?

Step 1: Draw scatterplot

Step 2: Estimate r

[Figure: scatterplot of number of systems sold (dependent variable, y-axis, 0 to 70) against number of sales calls (independent variable, x-axis, 0 to 4), with points labeled by salesperson, including Ava, Emily, Isabella, Emma, Ethan, Joshua, and Jacob.]

Step 3: Calculate r

Step 4: Is it a significant correlation?

Do more sales calls result in more sales made?

• Step 4: Is it a significant correlation?

• n = 10, df = 8

• alpha = .05

• Observed r is larger than critical r

• (0.71 > 0.632)

• therefore we reject the null hypothesis.

• Yes it is a significant correlation

• r (8) = 0.71; p < 0.05
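The critical r of 0.632 can be recovered from the t distribution. The sketch below is a hedged illustration rather than the course's method: it assumes a two-tailed test and uses the standard table value t = 2.306 for df = 8 at alpha = .05. The function names are made up for this example.

```python
import math

def r_to_t(r: float, df: int) -> float:
    """Convert a correlation coefficient into a t statistic with df = n - 2."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

def critical_r(t_crit: float, df: int) -> float:
    """Convert a critical t value into the equivalent critical r."""
    return t_crit / math.sqrt(t_crit * t_crit + df)

n = 10
df = n - 2
r_observed = 0.71
T_CRIT = 2.306  # assumed table value: two-tailed, alpha = .05, df = 8

r_crit = critical_r(T_CRIT, df)
t_obs = r_to_t(r_observed, df)
significant = abs(r_observed) > r_crit

print(f"critical r = {r_crit:.3f}, observed t = {t_obs:.3f}, significant = {significant}")
```

Comparing the observed r to the critical r is equivalent to comparing the observed t to the critical t, so either comparison leads to the same decision about the null hypothesis.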


Step 1: Draw prediction line

r = 0.71

b= 11.579 (slope)

a = 20.526 (intercept)

Draw a regression line

and regression equation

What are we predicting?


Interpret slope & intercept

Regression: Predicting sales

Step 1: Predict sales for a certain number of sales calls

Step 2: State the regression equation

Y’ = a + bx

Y’ = 20.526 + 11.579x

Step 3: Solve for some value of Y’ (if they make one sales call)

Y’ = 20.526 + 11.579(1)

Y’ = 32.105

What should you expect from a salesperson who makes 1 call?

They should sell 32.105 systems

If they sell more → overperforming

If they sell fewer → underperforming
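The prediction step is a direct substitution into the regression equation. As a minimal sketch, the function below (a hypothetical name) evaluates Y’ = a + bx with the slope and intercept given on the slides, for one through four sales calls.

```python
A_INTERCEPT = 20.526  # a, intercept from the slides
B_SLOPE = 11.579      # b, slope from the slides

def predict_systems_sold(sales_calls: float) -> float:
    """Return the predicted number of systems sold, Y' = a + b * x."""
    return A_INTERCEPT + B_SLOPE * sales_calls

# Predicted sales for 1 through 4 sales calls, rounded to three decimals.
predictions = {calls: round(predict_systems_sold(calls), 3) for calls in (1, 2, 3, 4)}

for calls, y_hat in predictions.items():
    print(f"{calls} sales call(s): expect about {y_hat} systems sold")
```

Each additional sales call raises the prediction by the slope, 11.579 systems.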

Regression: Predicting sales

Step 1: Predict sales for a certain number of sales calls

Step 2: State the regression equation

Y’ = a + bx

Y’ = 20.526 + 11.579x

Step 3: Solve for some value of Y’ (if they make two sales calls)

Y’ = 20.526 + 11.579(2)

Y’ = 43.684

What should you expect from a salesperson who makes 2 calls?

They should sell 43.684 systems

If they sell more → overperforming

If they sell fewer → underperforming

Regression: Predicting sales

Step 1: Predict sales for a certain number of sales calls

Step 2: State the regression equation

Y’ = a + bx

Y’ = 20.526 + 11.579x

Step 3: Solve for some value of Y’ (if they make three sales calls)

Y’ = 20.526 + 11.579(3)

Y’ = 55.263

What should you expect from a salesperson who makes 3 calls?

They should sell 55.263 systems

If they sell more → overperforming

If they sell fewer → underperforming

Regression: Predicting sales

Step 1: Predict sales for a certain number of sales calls

Step 2: State the regression equation

Y’ = a + bx

Y’ = 20.526 + 11.579x

Step 3: Solve for some value of Y’ (if they make four sales calls)

Y’ = 20.526 + 11.579(4)

Y’ = 66.842

What should you expect from a salesperson who makes 4 calls?

They should sell 66.842 systems

If they sell more → overperforming

If they sell fewer → underperforming

Step 1: Compare expected sales levels to actual sales levels

What should you expect from each salesperson?

They should sell x systems depending on sales calls

If they sell more → overperforming

If they sell fewer → underperforming

Step 1: Compare expected sales levels to actual sales levels

The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score)

How did Ava do?

70 - 55.3 = 14.7

Ava sold 14.7 more than expected, taking into account how many sales calls she made → overperforming

Step 1: Compare expected sales levels to actual sales levels

The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score)

How did Jacob do?

20 - 43.7 = -23.7

Jacob sold 23.684 fewer than expected, taking into account how many sales calls he made → underperforming
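The two residuals above can be reproduced with a few lines of code. In this sketch the number of sales calls for each person (3 for Ava, 2 for Jacob) is inferred from the predicted values 55.3 and 43.7 on the slides. The slides do not state it directly, so treat it as an assumption.

```python
A_INTERCEPT = 20.526  # a, intercept from the slides
B_SLOPE = 11.579      # b, slope from the slides

def predict(sales_calls: float) -> float:
    """Predicted systems sold, Y' = a + b * x."""
    return A_INTERCEPT + B_SLOPE * sales_calls

def residual(actual_sold: float, sales_calls: float) -> float:
    """Residual = actual Y minus predicted Y'. Positive means more than expected."""
    return actual_sold - predict(sales_calls)

# Sales-call counts are assumed (inferred from the predicted values on the slides).
ava_residual = residual(actual_sold=70, sales_calls=3)
jacob_residual = residual(actual_sold=20, sales_calls=2)

print(f"Ava:   {ava_residual:+.1f}")
print(f"Jacob: {jacob_residual:+.1f}")
```

A positive residual marks someone who sold more than the line predicts for their number of calls, and a negative residual marks someone who sold fewer.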

Step 1: Compare expected sales levels to actual sales levels

The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score)

[Figure: scatterplot with the regression line and each salesperson's residual drawn as a vertical line. Residuals shown: 14.7 (Ava), -23.7 (Jacob), -6.8 (Emily), and 7.9.]

Does the prediction line perfectly predict the predicted variable when using the predictor variable?

No, we are wrong sometimes…

How can we estimate how much “error” we have?

The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score)

The green lines show how much “error” there is in our prediction line…how much we are wrong in our predictions

Any residuals?

Perfect correlation = +1.00 or -1.00

Each variable perfectly predicts the other

No variability in the scatterplot

The dots fall exactly on a straight line

Residual scores

How do we find the average amount of error in our prediction?

Ava is 14.7

Jacob is -23.7

Emily is -6.8

The average amount by which actual scores deviate on either side of the predicted score

Step 1: Find error for each value (just the residuals): Y – Y’

The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score)

Step 2: Add up the residuals

Big problem: Σ(Y – Y’) = 0

Square the deviations: Σ(Y – Y’)²

Divide by df: n – 2

Take the square root

How would we find our “average residual”? Compare with the formula for a mean, Σx / N:

$$\sqrt{\frac{\sum (Y - Y')^2}{n - 2}}$$

Deviation scores

Diallo is 0”

Preston is 2”

Mike is -4”

Hunter is -2”

Step 1: Find error for each value (just the residuals): Y – Y’

Sound familiar??

The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score)

Step 2: Find average

How would we find our “average residual”? Compare with the formula for a mean, Σx / N:

$$\text{Standard error of the estimate (line)} = \sqrt{\frac{\sum (Y - Y')^2}{n - 2}}$$

Regression Analysis – Least Squares Principle

When we calculate the regression line we try to:

• minimize distance between predicted Ys and actual (data) Y points (length of green lines)

• remember, because the negative and positive values cancel each other out, we have to square those distances (deviations)

• so we are trying to minimize the “sum of squares of the vertical distances between the actual Y values and the predicted Y values”
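The least squares principle can be sketched in code. The example below uses a small hypothetical data set (not Rory's sales data) and the standard closed-form formulas for slope and intercept, then checks that nudging either coefficient makes the sum of squared vertical distances larger. The function names are made up for this illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_error(xs, ys, a, b):
    """Sum of squared vertical distances between actual Y and predicted Y'."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_error(xs, ys, a, b)
steeper = sum_squared_error(xs, ys, a, b + 0.1)  # nudge the slope
higher = sum_squared_error(xs, ys, a + 0.1, b)   # nudge the intercept

print(f"a = {a:.3f}, b = {b:.3f}, SSE = {best:.3f}")
print(f"SSE with steeper slope = {steeper:.3f}, with higher intercept = {higher:.3f}")
```

Any other line drawn through the same points gives a larger sum of squares, which is what "least squares" means.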

How well does the prediction line predict the predicted variable when using the predictor variable?

What if we want to know the “average deviation score”?

Finding the standard error of the estimate (line)

Standard error of the estimate:

• a measure of the average amount of predictive error

• the average amount that Y’ scores differ from Y scores

• a mean of the lengths of the green lines

• Slope doesn’t give “variability” info

• Intercept doesn’t give “variability” info

• Correlation “r” does give “variability” info

• Residuals do give “variability” info
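The standard error of the estimate can be computed directly from the residuals. The sketch below uses hypothetical data (not Rory's sales data) together with that data's least squares line, Y’ = 2.2 + 0.6x. It also shows why the residuals must be squared before averaging, since their plain sum is zero.

```python
import math

def standard_error_of_estimate(actual, predicted):
    """Square root of the sum of squared residuals divided by df = n - 2."""
    n = len(actual)
    if n <= 2:
        raise ValueError("need more than two data points (df = n - 2)")
    sum_sq = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sum_sq / (n - 2))

# Hypothetical data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least squares line for this hypothetical data

predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

residual_sum = sum(residuals)  # positives and negatives cancel out
se_estimate = standard_error_of_estimate(ys, predicted)

print(f"sum of residuals = {residual_sum:.6f}")
print(f"standard error of the estimate = {se_estimate:.4f}")
```

The result is in the same units as Y, so it reads as a typical length of the green lines.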

A note about curvilinear relationships and patterns of the residuals

How well does the prediction line predict the Ys from the Xs?

Residuals

• Shorter green lines suggest better prediction – smaller error

• Longer green lines suggest worse prediction – larger error

• Why are green lines vertical?

• Remember, we are predicting the variable on the Y axis

• So, error would be how we are wrong about Y (vertical)

Assumptions Underlying Linear Regression

• For each value of X, there is a group of Y values

• These Y values are normally distributed.

• The means of these normal distributions of Y values all lie on the straight line of regression.

• The standard deviations of these normal distributions are equal.

Which minimizes error better?

Is the regression line better than just guessing the mean of the Y variable?

How much does the information about the relationship actually help?

How much better does the regression line predict the observed results?

r²

Wow!
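One way to answer these questions is to compare the squared error from always guessing the mean of Y with the squared error from the regression line. The sketch below does this on a small hypothetical data set (not the sales data). For a least squares line, the proportional reduction in squared error equals r².

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mean_y = sum(ys) / len(ys)
a, b = fit_line(xs, ys)

# Error if we ignore X and always guess the mean of Y.
sse_mean = sum((y - mean_y) ** 2 for y in ys)
# Error if we use the regression line instead.
sse_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Proportion of the total squared error removed by using the line.
r_squared = 1 - sse_line / sse_mean

print(f"SSE guessing the mean = {sse_mean:.2f}")
print(f"SSE using the line    = {sse_line:.2f}")
print(f"r squared             = {r_squared:.2f}")
```

In this toy example the line removes 60% of the squared error left by guessing the mean, so r² = .60.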

What is r²?

r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable

Examples

If mother’s and daughter’s heights are correlated with an r = .8, then what amount (proportion or percentage) of variance of mother’s height is accounted for by daughter’s height?

.64 because (.8)² = .64

What is r²?

r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable

Examples

If mother’s and daughter’s heights are correlated with an r = .8, then what proportion of variance of mother’s height is not accounted for by daughter’s height?

.36 because (1.0 - .64) = .36

or

36% because 100% - 64% = 36%

What is r²?

r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable

Examples

If ice cream sales and temperature are correlated with an r = .5, then what amount (proportion or percentage) of variance of ice cream sales is accounted for by temperature?

.25 because (.5)² = .25

What is r²?

r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable

Examples

If ice cream sales and temperature are correlated with an r = .5, then what amount (proportion or percentage) of variance of ice cream sales is not accounted for by temperature?

.75 because (1.0 - .25) = .75

or

75% because 100% - 25% = 75%
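The four examples above all follow the same arithmetic, which can be captured in a short sketch. The function names are hypothetical and chosen for this illustration.

```python
def variance_accounted_for(r: float) -> float:
    """Coefficient of determination: proportion of variance explained, r squared."""
    return r * r

def variance_not_accounted_for(r: float) -> float:
    """Proportion of variance left unexplained, 1 - r squared."""
    return 1.0 - r * r

# r values taken from the examples on the slides.
heights_explained = round(variance_accounted_for(0.8), 2)
heights_unexplained = round(variance_not_accounted_for(0.8), 2)
ice_cream_explained = round(variance_accounted_for(0.5), 2)
ice_cream_unexplained = round(variance_not_accounted_for(0.5), 2)

# Squaring removes the sign, so r squared carries no direction information.
same_for_negative_r = variance_accounted_for(-0.5) == variance_accounted_for(0.5)

print(heights_explained, heights_unexplained)
print(ice_cream_explained, ice_cream_unexplained)
print(same_for_negative_r)
```

Because r is squared, a correlation of -.5 explains the same proportion of variance as a correlation of +.5.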

Some useful terms

• Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)

• Coefficient of correlation is the name for “r”

• Coefficient of determination is the name for “r²” (remember it is always positive – no direction info)

• Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line – like standard deviation)

Pop Quiz - 5 Questions

1. What is regression used for?

• Include an example

2. What is a residual? How would you find it?

3. What is the Standard Error of the Estimate? (How is it related to residuals?)

4. Give one fact about r²

5. How is a regression line like a mean?

Writing Assignment - 5 Questions

1. What is regression used for?

• Include an example

Regressions are used to take advantage of relationships between variables described in correlations. We choose a value on the independent variable (on the x axis) to predict values for the dependent variable (on the y axis).

Writing Assignment - 5 Questions

2. What is a residual? How would you find it?

Residuals are the difference between our predicted y (y’) and the actual y data points. Once we choose a value on our independent variable and predict a value for our dependent variable, we look to see how close our prediction was. We are measuring how “wrong” we were, or the amount of “error” for that guess.

Y – Y’

Writing Assignment - 5 Questions

3. What is the Standard Error of the Estimate? (How is it related to residuals?)

The average length of the residuals

The average error of our guess

The average length of the green lines

The standard deviation of the regression line

Writing Assignment - 5 Questions

4. Give one fact about r²

5. How is a regression line like a mean?

Thank you!

See you next time!!