CSC 4510 – Machine Learning

Presentation Transcript


  1. CSC 4510 – Machine Learning 5: Multivariate Regression Dr. Mary-Angela Papalaskari Department of Computing Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/ • The slides in this presentation are adapted from: • Andrew Ng’s ML course http://www.ml-class.org/

  2. Regression topics so far • Introduction to linear regression • Intuition – least squares approximation • Intuition – gradient descent algorithm • Hands-on: simple example using Excel • How to apply gradient descent to minimize the cost function for regression • Linear algebra refresher

  3. What’s next? • Multivariate regression • Gradient descent revisited • Feature scaling and normalization • Selecting a good value for α • Non-linear regression • Solving for θ analytically (Normal Equation) • Using Octave to solve regression problems

  4. Multiple features (variables).

  5. Multiple features (variables). Notation: • n = number of features • x^(i) = input (features) of the i-th training example • x_j^(i) = value of feature j in the i-th training example
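
A minimal Octave sketch of this notation (the matrix values are made up for illustration; the variable names are my own, not from the slides):

    % Hypothetical training set: each row of X is one training example
    X = [2104 5;       % x^(1)
         1416 3;       % x^(2)
         1534 3];      % x^(3)
    [m, n] = size(X);  % m = 3 training examples, n = 2 features
    x_2  = X(2, :);    % x^(2): the features of the 2nd training example
    x2_1 = X(2, 1);    % x_1^(2): the value of feature 1 in the 2nd training example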

  6. Multiple features (variables).

  7. Hypothesis: Previously: h_θ(x) = θ_0 + θ_1 x. Now: h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + ... + θ_n x_n. For convenience of notation, define x_0 = 1, so that x = [x_0; x_1; ...; x_n] ∈ ℝ^(n+1), θ = [θ_0; θ_1; ...; θ_n] ∈ ℝ^(n+1), and h_θ(x) = θᵀx. Multivariate linear regression.
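
A sketch of the vectorized hypothesis in Octave (reusing the hypothetical data from the previous sketch; theta is initialized to zeros only as a placeholder):

    % h_theta(x) = theta' * x, computed for all m examples at once as X * theta
    X = [2104 5; 1416 3; 1534 3];   % hypothetical feature matrix (m x n)
    m = size(X, 1);
    X = [ones(m, 1) X];             % prepend the x_0 = 1 column to every example
    theta = zeros(size(X, 2), 1);   % theta in R^(n+1)
    h = X * theta;                  % m-vector of predictions h_theta(x^(i))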

  8. Hypothesis: h_θ(x) = θᵀx = θ_0 x_0 + θ_1 x_1 + ... + θ_n x_n (with x_0 = 1). Parameters: θ_0, θ_1, ..., θ_n, i.e. the vector θ ∈ ℝ^(n+1). Cost function: J(θ) = (1/(2m)) Σ_{i=1..m} (h_θ(x^(i)) − y^(i))². Gradient descent: Repeat { θ_j := θ_j − α ∂J(θ)/∂θ_j } (simultaneously update θ_j for every j = 0, ..., n)
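
One possible Octave implementation of this update, written in vectorized form (a sketch; the function name and signature are my own, not from the slides):

    % Batch gradient descent for linear regression
    % X: m x (n+1) design matrix (first column all ones), y: m-vector of targets
    function [theta, J_history] = gradient_descent(X, y, theta, alpha, num_iters)
      m = length(y);
      J_history = zeros(num_iters, 1);
      for iter = 1:num_iters
        h = X * theta;                                % predictions for all m examples
        theta = theta - (alpha / m) * (X' * (h - y)); % simultaneous update of every theta_j
        J_history(iter) = (1 / (2 * m)) * sum((X * theta - y) .^ 2);  % J(theta) after the update
      end
    end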

  9. Gradient Descent. Previously (n = 1): Repeat { θ_0 := θ_0 − α (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)); θ_1 := θ_1 − α (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x^(i) } (simultaneously update θ_0, θ_1)

  10. Gradient Descent. New algorithm (n ≥ 1): Repeat { θ_j := θ_j − α (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x_j^(i) } (simultaneously update θ_j for j = 0, ..., n). Previously (n = 1): Repeat { θ_0 := θ_0 − α (1/m) Σ (h_θ(x^(i)) − y^(i)); θ_1 := θ_1 − α (1/m) Σ (h_θ(x^(i)) − y^(i)) x^(i) } (simultaneously update θ_0, θ_1)

  11. Gradient Descent (continued). The new rule contains the old one as a special case: since x_0^(i) = 1, the update for θ_0 is identical to the previous θ_0 update, and for n = 1 the update for θ_1 is identical to the previous θ_1 update.

  12. Feature Scaling. Idea: make sure features are on a similar scale. E.g. x_1 = size (0–2000 feet²), x_2 = number of bedrooms (1–5). Dividing each feature by its range, e.g. x_1 = size (feet²) / 2000 and x_2 = (number of bedrooms) / 5, gets every feature into approximately the range 0 ≤ x_j ≤ 1.

  13. Feature Scaling: mean normalization. Replace x_j with x_j − μ_j to make features have approximately zero mean (do not apply to x_0 = 1). E.g. x_1 = (size − 1000) / 2000, x_2 = (number of bedrooms − 2) / 5, giving roughly −0.5 ≤ x_1 ≤ 0.5 and −0.5 ≤ x_2 ≤ 0.5. In general, x_j := (x_j − μ_j) / s_j, where μ_j is the mean of feature j in the training set and s_j is its range (max − min) or standard deviation.
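
A sketch of one common way to implement this in Octave (the function name is my own; here s_j is taken to be the standard deviation):

    % Mean normalization / feature scaling
    function [X_norm, mu, sigma] = feature_normalize(X)
      mu = mean(X);                 % mean of each feature (1 x n row vector)
      sigma = std(X);               % standard deviation of each feature (1 x n)
      X_norm = (X - mu) ./ sigma;   % zero mean, roughly unit scale
    end
    % Note: add the x_0 = 1 column after normalizing; it should not be scaled.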

  14. Gradient descent • “Debugging”: how to make sure gradient descent is working correctly. • How to choose the learning rate α.

  15. Making sure gradient descent is working correctly. Plot J(θ) against the number of iterations; J(θ) should decrease after every iteration. • Example automatic convergence test: declare convergence if J(θ) decreases by less than some small threshold (e.g. 10⁻³) in one iteration. • For sufficiently small α, J(θ) should decrease on every iteration. • But if α is too small, gradient descent can be slow to converge.
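
A quick Octave sketch of this check (it assumes X, y, and the gradient_descent function from the earlier sketch; the values of α and the iteration count are arbitrary):

    % Plot the cost after every iteration to verify it keeps decreasing
    [theta, J_history] = gradient_descent(X, y, zeros(size(X, 2), 1), 0.01, 400);
    plot(1:numel(J_history), J_history, '-');
    xlabel('Number of iterations');
    ylabel('J(theta)');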

  16. Summary: choosing α • If α is too small: slow convergence. • If α is too large: J(θ) may not decrease on every iteration; may not converge. • To choose α, try a range of values roughly three times apart, e.g. ..., 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, ..., and plot J(θ) for each.
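
One way to run that comparison in Octave (again assuming X, y, and gradient_descent from the earlier sketches):

    % Try several candidate learning rates and overlay the J(theta) curves
    alphas = [0.001 0.003 0.01 0.03 0.1 0.3];
    num_iters = 100;
    hold on;
    for k = 1:numel(alphas)
      [theta_k, J_history] = gradient_descent(X, y, zeros(size(X, 2), 1), alphas(k), num_iters);
      plot(1:num_iters, J_history);   % a good alpha gives a curve that decreases quickly
    end
    hold off;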

  17. Housing prices prediction

  18. Polynomial regression (plot: housing price y vs. size x). Fit, e.g., a quadratic model h_θ(x) = θ_0 + θ_1 x + θ_2 x², or a cubic model h_θ(x) = θ_0 + θ_1 x + θ_2 x² + θ_3 x³ by defining the features x_1 = (size), x_2 = (size)², x_3 = (size)³. Feature scaling then becomes even more important, since the features take on very different ranges of values.

  19. Choice of features (plot: housing price y vs. size x). Instead of a polynomial in size, another reasonable choice is h_θ(x) = θ_0 + θ_1 (size) + θ_2 √(size), which keeps increasing as the size grows. (A sketch of building such features appears below.)
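
A small Octave sketch of constructing polynomial features from a single input (the sizes are made-up values; feature_normalize is the function sketched earlier):

    % Turn one feature (size) into polynomial features, then normalize them
    sz = [2104; 1416; 1534; 852];          % hypothetical house sizes (feet^2)
    X_poly = [sz, sz .^ 2, sz .^ 3];       % x_1 = size, x_2 = size^2, x_3 = size^3
    [X_poly, mu, sigma] = feature_normalize(X_poly);   % scaling matters: size^3 is huge
    X_poly = [ones(size(X_poly, 1), 1), X_poly];       % add the x_0 = 1 column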

  20. Normal equation. Gradient descent finds θ iteratively; the normal equation is a method to solve for θ analytically, in one step.

  21. Intuition: if θ is a scalar (1D, θ ∈ ℝ), J(θ) is a quadratic function of θ; to minimize it, set the derivative dJ(θ)/dθ = 0 and solve for θ. For θ ∈ ℝ^(n+1), set the partial derivatives ∂J(θ)/∂θ_j = 0 (for every j) and solve for θ_0, θ_1, ..., θ_n.

  22. Example: given a training set, add x_0 = 1 to each example, stack the inputs as the rows of the design matrix X and the target values as the vector y; then θ = (XᵀX)⁻¹ Xᵀ y.

  23. In general: m examples (x^(1), y^(1)), ..., (x^(m), y^(m)) and n features, so each x^(i) ∈ ℝ^(n+1) (including x_0^(i) = 1). The design matrix X is the m × (n+1) matrix whose i-th row is (x^(i))ᵀ, and y ∈ ℝ^m is the vector of target values. E.g. if n = 1, each x^(i) = [1; x_1^(i)] and X is m × 2. Setting the gradient of J(θ) to zero gives XᵀXθ = Xᵀy, hence θ = (XᵀX)⁻¹ Xᵀ y.

  24. (XᵀX)⁻¹ is the inverse of the matrix XᵀX. • Octave: pinv(X’*X)*X’*y
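
A complete Octave sketch of that one-liner in context (the data values are made up for illustration):

    % Solve for theta with the normal equation; no feature scaling or alpha needed
    X = [2104 5; 1416 3; 1534 3; 852 2];   % made-up raw features (m = 4, n = 2)
    y = [460; 232; 315; 178];              % made-up target values
    X = [ones(size(X, 1), 1) X];           % add the x_0 = 1 column
    theta = pinv(X' * X) * X' * y;         % theta = (X'X)^(-1) X'y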

  25. With m training examples and n features: Gradient Descent • Need to choose α. • Needs many iterations. • Works well even when n is large. Normal Equation • No need to choose α. • Don’t need to iterate. • Need to compute (XᵀX)⁻¹, an (n+1) × (n+1) matrix inversion. • Slow if n is very large.
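
A short sketch contrasting the two approaches on the same made-up data (it reuses the feature_normalize and gradient_descent functions from the earlier sketches):

    % Hypothetical raw training data
    X_raw = [2104 5; 1416 3; 1534 3; 852 2];
    y = [460; 232; 315; 178];

    % Normal equation: works directly on the unscaled features
    Xn = [ones(size(X_raw, 1), 1) X_raw];
    theta_ne = pinv(Xn' * Xn) * Xn' * y;

    % Gradient descent: scale the features first, then iterate
    [Xs, mu, sigma] = feature_normalize(X_raw);
    Xs = [ones(size(Xs, 1), 1) Xs];
    theta_gd = gradient_descent(Xs, y, zeros(size(Xs, 2), 1), 0.1, 400);

    % theta_ne and theta_gd are expressed in different coordinates (unscaled vs. scaled
    % features), but with enough iterations both should give essentially the same predictions.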

  26. Notes on supervised learning and regression: http://see.stanford.edu/materials/aimlcs229/cs229-notes1.pdf • Octave: http://www.gnu.org/software/octave/ • Wiki: http://www.octave.org/wiki/index.php?title=Main_Page • Documentation: http://www.gnu.org/software/octave/doc/interpreter/

  27. Exercise for next class: • Download and install Octave. (Alternative: if you have MATLAB, you can use it instead.) • Verify that it is working by typing in an Octave command window: • x = [0 1 2 3] • y = [0 2 4 6] • plot(x,y) • This example defines two vectors, x and y, and should display a plot showing a straight line (the line y = 2x). If you get an error at this point, it may be that gnuplot is not installed or cannot access your display. If you are unable to get this to work, you can still do the rest of this exercise, because it does not involve any plotting (just restart Octave). You might refer to the Octave wiki for installation help, but if you are stuck, you can get some help troubleshooting this on Friday afternoon 3–4pm in the software engineering lab (Mendel 159). • Create a few matrices and vectors, e.g.: • A = [1 2; 3 4; 5 6] • V = [3 5 -1 0 7] • Try some of the elementary matrix and vector operations from our linear algebra slides (adding, multiplying between matrices, vectors and scalars); see the sketch below for a few possibilities. • Print out a log of your session
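
A few example operations of the kind the exercise asks for (a sketch only; the exact operations you try are up to you):

    % Elementary operations on the matrices/vectors defined in the exercise
    A = [1 2; 3 4; 5 6];   % 3 x 2 matrix
    V = [3 5 -1 0 7];      % 1 x 5 row vector
    B = A'                 % transpose of A (2 x 3)
    C = A * B              % matrix-matrix product (3 x 3)
    D = 2 * A + 1          % scalar multiplication and element-wise addition
    w = 3 * V              % scaling a vector
    s = V * V'             % inner product of V with itself (a scalar)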
