

Welcome to BUAD 310

Instructor: Kam Hamidieh

Lecture 21, Wednesday April 9, 2014

Agenda & Announcement
  • Today:
    • Finish up the problem from last time & finish off Simple Linear Regression
    • Start Multiple Regression, Chapter 23
  • Homework 6 is due today at 5 PM.

BUAD 310 - Kam Hamidieh

About Exam II
  • NO CELL PHONES ARE ALLOWED.
  • Two cheat sheets allowed, both sides, hand written.
  • In class this Wednesday April 16.
  • The coversheet will be posted by Monday. The exam has 33 questions.
  • Print the z and t tables and bring them with you.
  • Coverage: Lecture 12 (March 3) through the end of Lecture 21 (April 9), minus multiple regression, plus HW 4, 5, & 6.
  • All Exam II relevant material will be posted by tomorrow morning.
  • Scantrons will be passed out Monday. Fill yours out before the exam, and do not bend it!
  • We will spend all of Monday reviewing.
  • Extended office hours:
    • Monday April 14: 4-6 PM
    • Tuesday April 15: 2-6 PM

CI and Tests for B1

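To test H0: B1 = 0 vs. Ha: B1 ≠ 0, either:

(1) Compute the 100(1-α)% confidence interval for B1:

$$b_1 \pm t_{\alpha/2}\, se(b_1)$$

where $t_{\alpha/2}$ comes from a t distribution with df = n-2, and check whether the interval contains 0.

Or (2) Compute the test statistic:

$$t = \frac{b_1 - 0}{se(b_1)}$$

then get the p-value from a t distribution with df = n-2.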
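
As a rough illustration of the slope test, the sketch below computes b1, se(b1), and the t statistic for simple linear regression in plain Python. The function names and the five data points are hypothetical (they are not from the lecture), and the critical value 3.182 is the two-sided 95% t value for df = 3 read from a t table.

```python
import math

def slope_inference(x, y):
    """Return (b1, se_b1, t_stat, df) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least squares slope
    b0 = y_bar - b1 * x_bar             # least squares intercept
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))      # residual standard deviation
    se_b1 = s_e / math.sqrt(sxx)        # standard error of the slope
    t_stat = (b1 - 0) / se_b1           # test statistic for H0: B1 = 0
    return b1, se_b1, t_stat, n - 2

def slope_ci(b1, se_b1, t_crit):
    """Confidence interval b1 ± t_crit * se(b1); t_crit comes from a t table."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, t_stat, df = slope_inference(x, y)
low, high = slope_ci(b1, se_b1, t_crit=3.182)  # t table value for df = 3, 95%
print(b1, se_b1, t_stat, df, (low, high))
```

The p-value still has to come from a t table or statistical software, since the Python standard library has no t distribution.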

CI for Mean Response

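The 100(1-α)% confidence interval for the mean response at $x_{new}$ is:

$$\hat{y}_{new} \pm t_{\alpha/2}\, se(\hat{y}_{new})$$

where $t_{\alpha/2}$ comes from a t distribution with df = n-2,

and

$$se(\hat{y}_{new}) = s_e \sqrt{\frac{1}{n} + \frac{(x_{new} - \bar{x})^2}{(n-1)\, s_x^2}}$$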

We will generally use software.
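
For readers who want to see what the software is doing, here is a minimal sketch of the mean-response interval using the standard simple-regression standard error, s_e · sqrt(1/n + (x_new − x̄)² / Σ(x_i − x̄)²). The function name, the data, and the t critical value are illustrative assumptions, not part of the lecture.

```python
import math

def mean_response_ci(x, y, x_new, t_crit):
    """CI for the mean of Y at x_new in simple linear regression.

    t_crit is the t critical value for df = n - 2, taken from a t table.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)   # equals (n - 1) * s_x^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    y_hat = b0 + b1 * x_new                    # fitted mean at x_new
    se_fit = s_e * math.sqrt(1 / n + (x_new - x_bar) ** 2 / sxx)
    return y_hat - t_crit * se_fit, y_hat + t_crit * se_fit

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(mean_response_ci(x, y, x_new=3, t_crit=3.182))
print(mean_response_ci(x, y, x_new=5, t_crit=3.182))
```

The interval is narrowest at x_new = x̄ and widens as x_new moves away from the center of the data.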


Outliers

  • “Outliers are observations that stand away from the rest of the data and appear distinct in a plot.” Imprecise!
  • They can have a very strong influence on your final results.


Outliers

[Figure: scatterplot with X = 1, 2, …, 20; r² = 0.80, Se = 3.28]


[Figure: five panels, each labeled with its fit summary]

  • r² = 0.80, Se = 3.28
  • r² = 0.25, Se = 10
  • r² = 0.29, Se = 9.7
  • r² = 0.92, Se = 3.2
  • r² = 0.26, Se = 6.1


How to Deal with Outliers

  • There are NO hard and fast rules on how to deal with outliers except: you should not just throw them out without SOLID justification.
  • Check for data entry errors. (Not always possible!)
  • Examine the physical context.
  • Report your results with and without outliers.
  • Standardized residuals can help identify outliers too.
  • Transformations can help. (This will be discussed when we cover multiple regression.)

Multiple Regression
  • Simple Linear Regression:
    • One Y and one X, fit a line that gives the mean of Y’s for a given X
  • Multiple regression:
    • One Y and multiple X’s, you have multiple predictors

Multiple Regression Model

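The observed response Y is linearly related to k explanatory variables X1, X2, …, Xk by the equation:

$$y = B_0 + B_1 x_1 + B_2 x_2 + \dots + B_k x_k + \varepsilon$$

  • A single value of the response (y) comes from a linear combination of the k variables plus an error (ε).
  • The errors are iid normal: $\varepsilon \sim N(0, \sigma_\varepsilon)$.
  • Given fixed values of the X’s, the mean of the Y’s is equal to a linear combination of the X’s at those fixed values:

$$E(Y \mid x_1, \dots, x_k) = B_0 + B_1 x_1 + B_2 x_2 + \dots + B_k x_k$$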

Assumption (Redundant Slide?)
  • Constant Variance Assumption: The variance of the error terms, σε², is the same for every combination of values of x1, x2, …, xk.
  • Normality Assumption: The error terms follow a normal distribution for every combination of values of x1, x2, …, xk.
  • Independence Assumption: The values of the error terms are statistically independent of each other.

Simple versus Multiple

Simple regression

Data: (x1, y1), (x2, y2), …, (xn, yn)

Assumed Model:

yi = B0 + B1 xi + εi

εi ~ iid N(0, σε)

Parameters: B0, B1, σε

Multiple regression

Data: (y1, x11, x12, …, x1k), (y2, x21, x22, …, x2k), …, (yn, xn1, xn2, …, xnk)

Assumed Model:

yi = B0 + B1 xi1 + B2 xi2 + … + Bk xik + εi

εi ~ iid N(0, σε)

Parameters: B0, B1, B2, …, Bk, σε

Example (Page 615)
  • Defaults in the subprime housing market brought down several financial institutions in 2008 (Lehman, Bear Stearns, and AIG) and led to a massive bailout of the financial system.
  • Goal: A bank regulator wants to know how lenders are using credit scores to determine the rate of interest paid by subprime borrowers.
  • The variables of interest are:

Y = APR, annual % rate on the loan

X1 = LTV, loan to value ratio, how much of the loan covers the value of the property. Values near 0 are “good”, near 1 are “bad”.

X2= Credit Score. The higher the better.

X3 = Income in 1000’s of dollars

X4 = Home value in 1000’s of dollars

  • The data are n = 372 mortgages obtained from a credit bureau.
  • There are 4 predictors: k = 4.

Example

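[Figure: screenshot of the data table. The column headers are the variable names, and each row is one observation. The annotated entries in row 7 are y7, x71, x72, x73, and x74.]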

“Pairs Plot”

“Pairs Plot”

APR seems linearly dependent on LTV and Credit Score and not so much on the other two.

Looking at the relationship between predictors is a good idea too.

Pairwise Correlations

Pairwise Correlations

Highest correlations are APR with LTV and Credit score.

Why are some of the boxes empty?

Least Squares

The values for B0, B1, …, Bk are estimated via the least squares method:

$$\sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_{i1} - b_2 x_{i2} - \dots - b_k x_{ik} \right)^2$$

Pick b0, b1, …, bk so this is as small as possible.

But where is the line?
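
To make the minimization concrete, the sketch below finds b0, b1, …, bk by solving the normal equations (X'X) b = X'y with Gaussian elimination in plain Python. This is one standard way to compute the least squares solution, not necessarily what the course software does. The function names and the tiny data set are hypothetical; the data lie exactly on the plane y = 1 + 2·x1 + 3·x2, so the fit should recover those coefficients.

```python
def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals.

    rows: list of predictor lists [x_i1, ..., x_ik]; y: list of responses.
    """
    X = [[1.0] + list(r) for r in rows]        # leading 1 for the intercept
    p = len(X[0])
    # Build the augmented normal-equation system [X'X | X'y].
    M = [[sum(xi[r] * xi[c] for xi in X) for c in range(p)]
         + [sum(xi[r] * yi for xi, yi in zip(X, y))]
         for r in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    b = [0.0] * p
    for r in range(p - 1, -1, -1):
        known = sum(M[r][c] * b[c] for c in range(r + 1, p))
        b[r] = (M[r][p] - known) / M[r][r]
    return b

def sum_squared_residuals(rows, y, b):
    """The quantity least squares minimizes, evaluated at coefficients b."""
    return sum((yi - b[0] - sum(bj * xj for bj, xj in zip(b[1:], r))) ** 2
               for r, yi in zip(rows, y))

# Hypothetical data on the plane y = 1 + 2*x1 + 3*x2.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
b = fit_least_squares(rows, y)
print(b, sum_squared_residuals(rows, y, b))
```

With two predictors the fitted surface is a plane rather than a line; with k predictors it is a k-dimensional hyperplane.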

Least Squares Method

One Response Y, two predictors X1 & X2.

Method of least squares minimizes the vertical distances between the points and a plane.

(Picture from An Introduction to Statistical Learning with Applications in R by James, Witten, Hastie, Tibshirani)

Higher Dimensions?

Ask him!

He may know!!!


b0 ≈ 23.73

b1≈ -1.59

b2≈ -0.018

b3≈ 0.0004

b4≈ -0.00075

Example Continued

The estimated regression model now is:

APR = 23.73 - 1.59(LTV) - 0.018(CreditScore) + 0.0004(StatedIncome) - 0.00075(HomeValue)

Note: y-hat gives the mean APR for a given set of predictor values.

Interpretation

APR = 23.73 - 1.59(LTV) - 0.018(CreditScore) + 0.0004(StatedIncome) - 0.00075(HomeValue)

b0 = 23.73:

When LTV = Credit Score = Stated Income = Home Value = 0, then the mean APR = 23.73%

b1 = -1.59:

Holding all other x variables fixed, when LTV goes up by 0.1, then on average APR goes down by 0.159 percentage points (1.59 × 0.1)

b2 = -0.018:

Holding all other x variables fixed, when Credit Score goes up by 1 unit, then on average APR goes down by 0.018 percentage points

etc…….
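
The "holding all other x variables fixed" reading can be checked numerically: change one predictor, leave the rest alone, and compare the two fitted values. The sketch below does this with the coefficients from the slides. The function names and the baseline borrower are hypothetical choices made for illustration.

```python
# Coefficients copied from the estimated model on the slides.
INTERCEPT = 23.73
SLOPES = {
    "LTV": -1.59,
    "CreditScore": -0.018,
    "StatedIncome": 0.0004,   # income in 1000's of dollars
    "HomeValue": -0.00075,    # home value in 1000's of dollars
}

def estimated_mean_apr(borrower):
    """Fitted mean APR (in %) for a dict of predictor values."""
    return INTERCEPT + sum(SLOPES[name] * borrower[name] for name in SLOPES)

def apr_change(variable, delta):
    """Change in fitted mean APR when one predictor moves by delta,
    holding the other predictors fixed."""
    return SLOPES[variable] * delta

# Hypothetical baseline borrower, then the same borrower with LTV 0.1 higher.
base = {"LTV": 0.80, "CreditScore": 650, "StatedIncome": 45, "HomeValue": 400}
bumped = dict(base, LTV=base["LTV"] + 0.1)

print(apr_change("LTV", 0.1))                                  # slope times delta
print(estimated_mean_apr(bumped) - estimated_mean_apr(base))   # same change
```

Both numbers agree (about -0.159) no matter which baseline borrower is used, because the model is linear in each predictor.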

Example

Suppose we observe a subprime borrower with the following characteristics:

LTV = 0.90

Credit Score = 650

Stated Income = $45,000

Home Value = $400,000

Our estimated model says that on average such a customer gets:

APR = 23.73 - 1.59(0.90) - 0.018(650) + 0.0004(45) -0.00075(400)

APR ≈ 10.32%
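
The same arithmetic can be reproduced in a few lines of Python. This is only a sketch: the names `INTERCEPT`, `SLOPES`, and `estimated_mean_apr` are made up for illustration, while the coefficients and borrower values come from the slides. Note that income and home value are entered in 1000's of dollars, matching the variable definitions.

```python
# Coefficients copied from the estimated model on the slides.
INTERCEPT = 23.73
SLOPES = {
    "LTV": -1.59,
    "CreditScore": -0.018,
    "StatedIncome": 0.0004,   # income in 1000's of dollars
    "HomeValue": -0.00075,    # home value in 1000's of dollars
}

def estimated_mean_apr(borrower):
    """Fitted mean APR (in %) for a dict of predictor values."""
    return INTERCEPT + sum(SLOPES[name] * borrower[name] for name in SLOPES)

# The borrower from the slide: $45,000 income -> 45, $400,000 home -> 400.
borrower = {"LTV": 0.90, "CreditScore": 650, "StatedIncome": 45, "HomeValue": 400}
apr = estimated_mean_apr(borrower)
print(f"Estimated mean APR: {apr:.2f}%")  # prints: Estimated mean APR: 10.32%
```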

In Class Exercise 1

Part (1): Refer to slide 15.

  • What are the predictor and response values for the 9th observation?
  • What are the values of y10, x24, x11,3?

Part (2) Refer to slide 25.

  • Interpret the slope term for stated income variable.
  • What is the estimated mean APR for a customer with LTV = 0.50, Credit Score = 600, Stated Income = $10,000, Home Value = $200,000?

Model Residuals
  • Residuals are defined just like the simple linear regression case: residual = observed – fitted.
  • The official formula:

$$e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_k x_{ik})$$

  • What is the “picture” for residuals?

Standard Deviation of Residuals
  • Compute the standard deviation of the residuals:

$$s_e = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - k - 1}} = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{MSE}$$

  • It has the same interpretation as before: it tells you how far away your observed points are from the “plane” on average.
  • Se estimates σε.
  • The value n – k – 1 is called the residual degrees of freedom.
  • SSE = Sum of Squares (due to) Error
  • MSE = Mean Square (due to) Error

Summarizing Results in a Table

[Table: software regression summary, annotated with the values below]

n – k – 1 = 372 – 4 – 1 = 367

SSE = 567.80

MSE = 1.55

Se = 1.24
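
These three numbers are tied together by MSE = SSE / (n – k – 1) and Se = √MSE. A short check in Python, using the values from the slide (the function name `residual_summary` is an illustrative choice):

```python
import math

def residual_summary(sse, n, k):
    """Return (residual df, MSE, Se) from SSE, sample size n, and k predictors."""
    df = n - k - 1            # residual degrees of freedom
    mse = sse / df            # mean squared error
    s_e = math.sqrt(mse)      # standard deviation of the residuals
    return df, mse, s_e

# Values from the subprime example: SSE = 567.80, n = 372, k = 4.
df, mse, s_e = residual_summary(567.80, n=372, k=4)
print(df, round(mse, 2), round(s_e, 2))  # prints: 367 1.55 1.24
```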

In Class Exercise 2

Again, refer to the subprime example.

  • What is the residual for the 9th observation?
  • What are the units of Se?
  • Referring to question 1, how many standard deviations does this observed value fall below or above the estimated equation? (This is relative to Se.)
