Regression



Population Covariance and Correlation
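
The slide's formulas were not preserved in the transcript. For reference, the standard definitions are sketched below, assuming random variables $X$ and $Y$ with means $\mu_X$, $\mu_Y$ and standard deviations $\sigma_X$, $\sigma_Y$ (this notation is an assumption, not taken from the slide):

```latex
\operatorname{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big],
\qquad
\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y},
\qquad -1 \le \rho_{XY} \le 1
```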


Sample Correlation
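
The slide's formula was not preserved. As a minimal sketch, the usual Pearson sample correlation can be computed as follows (written in Python for illustration, not the R used on the slides; the function name `sample_correlation` and the data are made up):

```python
import math

def sample_correlation(xs, ys):
    """Pearson sample correlation r of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = sample_correlation(x, y)
```

Values of r near 1 or -1 indicate a nearly perfect linear relationship; values near 0 indicate little linear association.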


Example sample correlations: -.04, .98, -.79


Linear Model

[Figure: DATA points with the fitted REGRESSION LINE]


(Still) Linear Model

[Figure: DATA points with the fitted REGRESSION CURVE]
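
The slide's model equation was not preserved. One common example of a curve that is still a linear model, assumed here for illustration, is a quadratic in $x$. It is linear in the parameters $\beta_0, \beta_1, \beta_2$ even though the fitted curve is not a straight line:

```latex
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon
```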


Parameter Estimation

Minimize SSE over possible parameter values
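
As a minimal sketch of this idea for a straight-line model, the SSE-minimizing intercept and slope have a closed form. The code is Python for illustration (not the slides' R), and the names `fit_line` and `sse` and the data are made up:

```python
def fit_line(xs, ys):
    """Return (intercept, slope) minimizing SSE for the model y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared errors of the line on the given data."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = fit_line(x, y)
best = sse(x, y, b0, b1)
```

Moving either parameter away from the fitted values, for example `sse(x, y, b0 + 0.1, b1)`, gives an SSE at least as large as `best`.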


Fitting a linear model in R


The intercept parameter is significant at the .0623 level


The slope parameter is significant at the .001 level, so reject the null hypothesis that the slope is zero


Residual Standard Error:
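
The formula on the slide was not preserved. For a simple linear regression with $n$ observations, fitted values $\hat{y}_i$, and two estimated parameters (intercept and slope), the usual definition is:

```latex
\text{RSE} = \sqrt{\frac{\text{SSE}}{n - 2}}
           = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}
```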


R-squared is the correlation squared; it is also the % of variation explained by the linear regression
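
A small numerical check of this statement for a one-predictor model, as a sketch in Python (not the slides' R). The function name `r_squared_and_r` and the data are made up for illustration:

```python
import math

def r_squared_and_r(xs, ys):
    """Return (R-squared of the least-squares line, sample correlation r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - sse / syy  # share of variation explained by the line
    r = sxy / math.sqrt(sxx * syy)
    return r_squared, r

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.3, 3.1, 6.8, 7.4, 10.9]
r2, r = r_squared_and_r(x, y)
```

Here `r2` and `r ** 2` agree up to floating-point rounding. The identity holds for a straight-line fit with an intercept and a single predictor.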


Create a Best Fit Scatter Plot


Add X and Y Labels


Inspect Residuals
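
The slide's plot was not preserved. As a rough sketch of what is being inspected, the residuals are the observed values minus the fitted values. The code is Python (not the slides' R), and the function name `residuals` and the data are made up:

```python
def residuals(xs, ys):
    """Residuals y_i - (b0 + b1 * x_i) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
res = residuals(x, y)
```

With an intercept in the model, least-squares residuals sum to zero up to rounding, so what one typically looks for is any remaining pattern (curvature, changing spread) when they are plotted against x or the fitted values.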


Multiple Regression

Example: we could try to predict change in diameter using change in height, starting height, and fertilizer
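
Written as a model equation, this example might look like the following. The symbols are hypothetical (the slide's own notation was not preserved): $\Delta D$ is change in diameter, $\Delta H$ change in height, $H_0$ starting height, and $F$ a fertilizer indicator.

```latex
\Delta D = \beta_0 + \beta_1 \, \Delta H + \beta_2 \, H_0 + \beta_3 \, F + \varepsilon
```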


  • All variables are significant at .05 level

  • The residual standard error went down and R-squared went up (this is good)

  • Can even handle categorical variables
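
One common way a regression handles a categorical variable is to replace it with 0/1 indicator columns, leaving one level out as the baseline. The sketch below is Python (not the slides' R); the function name `dummy_code` and the fertilizer levels are hypothetical:

```python
def dummy_code(values):
    """Encode a categorical column as 0/1 indicator columns.

    The first level in sorted order is the baseline and gets no column.
    Returns (names of the indicator columns, encoded rows).
    """
    levels = sorted(set(values))
    others = levels[1:]  # every level except the baseline
    rows = [[1 if v == level else 0 for level in others] for v in values]
    return others, rows

# Hypothetical fertilizer levels, for illustration only
fertilizer = ["none", "low", "high", "low"]
columns, encoded = dummy_code(fertilizer)
```

Each indicator's coefficient is then read as the difference from the baseline level, with the other predictors held fixed.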


Regression w/ Machine Learning point of view


Predict Music Year from Timbre (90 attributes)

http://archive.ics.uci.edu/ml/datasets/YearPredictionMSD

  • Let’s “train” (fit) different models to a training data set

  • Then see how well they do at predicting a different “validation” data set (this is how ML competitions on Kaggle work)


  • Create a random sample of size 10,000 from the original 515,345 songs

  • Assign the first 5,000 to the training data set; the second 5,000 are saved for validation
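
The sampling and splitting steps above can be sketched as follows, in Python rather than the slides' R (whose code was not preserved). The function name `sample_and_split` and the seed are assumptions made for illustration:

```python
import random

def sample_and_split(n_total, sample_size, seed=0):
    """Draw sample_size distinct row indices, then split them in half.

    Returns (training indices, validation indices).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    picked = rng.sample(range(n_total), sample_size)  # without replacement
    half = sample_size // 2
    return picked[:half], picked[half:]

# 515,345 songs in the original data, 10,000 sampled, 5,000 per set
train_idx, valid_idx = sample_and_split(515345, 10000)
```

Because the sample is drawn in random order without replacement, taking the first half and second half gives two disjoint random subsets.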


  • Fit a linear model and a generalized boosted regression model (other popular choices include random forests and neural networks)

  • The period after the tilde means we use all 91 variables for training; the -V1 throws out V1 (since this is what we’re predicting)
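
The effect of that formula (use every column, but keep V1 only as the response) can be mimicked outside R. The sketch below is Python, assumes V1 is the first column of each row, and uses a made-up function name `split_response` and made-up rows:

```python
def split_response(rows, response_index=0):
    """Separate the response column (V1 in the slides) from the predictors."""
    ys = [row[response_index] for row in rows]
    xs = [row[:response_index] + row[response_index + 1:] for row in rows]
    return xs, ys

# Two made-up rows: 1 response + 3 predictors (the real data has 90 predictors)
rows = [
    [2001, 0.5, 1.2, -0.3],
    [1987, 0.1, 0.9, 0.4],
]
X, y = split_response(rows)
```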


  • Next we make predictions for the validation data set

  • We compare the models by calculating the sum of squared errors (SSE) for each model
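
The comparison step above can be sketched as follows, in Python rather than the slides' R. The model names `model_a` and `model_b` and all numbers are made up for illustration and are not results from the presentation:

```python
def sse(actual, predicted):
    """Sum of squared errors between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

# Made-up validation values and predictions, for illustration only
actual = [2001, 1995, 1987, 2007]
pred_a = [1999, 1996, 1990, 2005]  # hypothetical predictions from one model
pred_b = [2003, 1990, 1985, 2001]  # hypothetical predictions from another

scores = {"model_a": sse(actual, pred_a), "model_b": sse(actual, pred_b)}
best = min(scores, key=scores.get)  # the model with the smaller validation SSE
```

The model with the lower SSE on the validation set is the one that predicted the held-out data better.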

