
CSC 4510 – Machine Learning

CSC 4510 – Machine Learning. 4: Regression (continued). Dr. Mary-Angela Papalaskari, Department of Computing Sciences, Villanova University. Course website: www.csc.villanova.edu/~map/4510/. The slides in this presentation are adapted from the Stanford online ML course (http://www.ml-class.org/).


Presentation Transcript


  1. CSC 4510 – Machine Learning 4: Regression (continued) Dr. Mary-Angela Papalaskari Department of Computing Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/ • The slides in this presentation are adapted from: • The Stanford online ML course http://www.ml-class.org/ CSC 4510 - M.A. Papalaskari - Villanova University

  2. Last time • Introduction to linear regression • Intuition: least squares approximation • Intuition: gradient descent algorithm • Hands on: simple example using Excel

  3. Today • How to apply gradient descent to minimize the cost function for regression • Linear algebra refresher

  4. Reminder: sample problem. Housing Prices (Portland, OR). (Scatter plot: Price in $1000s vs. Size in feet².)

  5. Reminder: Notation. Training set of housing prices (Portland, OR). Notation: m = number of training examples; x's = "input" variable / features; y's = "output" variable / "target" variable

  6. Reminder: Learning algorithm for hypothesis function h. Training Set → Learning Algorithm → h. Input: size of house; output: estimated price. Linear hypothesis: hθ(x) = θ0 + θ1x (univariate linear regression)


  8. Linear Regression Model: hθ(x) = θ0 + θ1x, with cost function J(θ0, θ1) = (1/2m) Σi (hθ(x^(i)) − y^(i))². Gradient descent algorithm: repeat until convergence { θj := θj − α ∂/∂θj J(θ0, θ1) (for j = 0 and j = 1) }

  9. Today • How to apply gradient descent to minimize the cost function for regression • A closer look at the cost function • Applying gradient descent to find the minimum of the cost function • Linear algebra refresher

  10. Hypothesis: hθ(x) = θ0 + θ1x. Parameters: θ0, θ1. Cost function: J(θ0, θ1) = (1/2m) Σi (hθ(x^(i)) − y^(i))². Goal: minimize J(θ0, θ1) over θ0, θ1.
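The squared-error cost function above can be sketched directly in code. This is a minimal illustration, not part of the original deck; the function name and the tiny training set are made up for the example.

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """Squared-error cost: J(theta0, theta1) = (1/2m) * sum_i (h(x_i) - y_i)^2."""
    m = len(x)
    predictions = theta0 + theta1 * x        # h_theta(x^(i)) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Toy training set (illustrative only): inputs x and targets y
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

print(compute_cost(x, y, 0.0, 1.0))   # perfect fit: J = 0.0
print(compute_cost(x, y, 0.0, 0.5))   # worse fit: J > 0
```

With θ0 = 0 and θ1 = 1 the hypothesis passes through every point, so the cost is exactly zero; any other θ1 gives a strictly larger cost, which is what the cost-curve slides below illustrate.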

  11. Simplified case: θ0 = 0. Hypothesis: hθ(x) = θ1x. Parameter: θ1. Cost function: J(θ1) = (1/2m) Σi (hθ(x^(i)) − y^(i))². Goal: minimize J(θ1) over θ1.

  12.–14. (Plots for θ0 = 0. Left: training data y vs. x with the hypothesis hθ(x) = θ1x, shown in turn for θ1 = 1 (hθ(x) = x), θ1 = 0.5 (hθ(x) = 0.5x), and θ1 = 0 (hθ(x) = 0); for fixed θ1 this is a function of x. Right: the corresponding points on the cost curve J(θ1), a function of the parameter θ1.)

  15. What if θ0 ≠ 0? Hypothesis: hθ(x) = θ0 + θ1x. Parameters: θ0, θ1. Cost function: J(θ0, θ1) = (1/2m) Σi (hθ(x^(i)) − y^(i))². Goal: minimize J(θ0, θ1) over θ0, θ1.

  16. (Left: housing data, Price ($) in 1000's vs. Size in feet² (x), with the hypothesis hθ(x) = 10 + 0.1x; for fixed θ0, θ1 this is a function of x. Right: J as a function of the parameters θ0, θ1.)

  17. (Figure only.)

  18.–21. (Plot pairs. Left: the hypothesis hθ(x) for fixed θ0, θ1, a function of x. Right: a contour plot of J as a function of the parameters θ0, θ1, marking the point corresponding to those parameter values.)

  22. Today • How to apply gradient descent to minimize the cost function for regression • A closer look at the cost function • Applying gradient descent to find the minimum of the cost function • Linear algebra refresher

  23. Have some function J(θ0, θ1). Want: min J(θ0, θ1) over θ0, θ1. • Gradient descent algorithm outline: • Start with some θ0, θ1 • Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum

  24. Have some function J(θ0, θ1); want min J(θ0, θ1) over θ0, θ1. Gradient descent algorithm: repeat until convergence { θj := θj − α ∂/∂θj J(θ0, θ1) (simultaneously for j = 0 and j = 1) }

  25. In the update θj := θj − α ∂/∂θj J(θ0, θ1), the constant α is the learning rate: it controls how big a step each update takes.

  26. If α is too small, gradient descent can be slow. If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
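Both failure modes can be seen on a one-parameter toy cost. The function J(θ) = θ², its gradient 2θ, and the specific α values are my own illustration, not from the slides.

```python
def descend(theta, alpha, steps=50):
    """Gradient descent on the convex toy cost J(theta) = theta^2,
    whose gradient is 2 * theta."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta   # theta := theta - alpha * dJ/dtheta
    return theta

small_alpha = descend(1.0, alpha=0.1)   # each step multiplies theta by 0.8
large_alpha = descend(1.0, alpha=1.1)   # each step multiplies theta by -1.2
print(small_alpha)   # shrinks toward the minimum at 0
print(large_alpha)   # magnitude blows up: divergence
```

With α = 0.1 each step scales θ by 0.8, so it converges; with α = 1.1 each step scales θ by −1.2, so it overshoots further on every step and diverges.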

  27. At a local minimum the derivative is zero, so the update θ1 := θ1 − α · 0 leaves the current value of θ1 unchanged: once gradient descent reaches a (local) minimum, it stays there.

  28. Gradient descent can converge to a local minimum even with the learning rate α fixed: as we approach a minimum, the derivative gets smaller, so the steps automatically become smaller; there is no need to decrease α over time.

  29. Putting them together: Linear Regression Model hθ(x) = θ0 + θ1x with cost J(θ0, θ1) = (1/2m) Σi (hθ(x^(i)) − y^(i))², and the gradient descent algorithm θj := θj − α ∂/∂θj J(θ0, θ1).

  30. Gradient descent algorithm for linear regression: repeat until convergence { θ0 := θ0 − α (1/m) Σi (hθ(x^(i)) − y^(i)); θ1 := θ1 − α (1/m) Σi (hθ(x^(i)) − y^(i)) · x^(i) } Update θ0 and θ1 simultaneously.
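The two update rules above, including the simultaneous update, can be sketched as follows; the function name and the toy data (points lying exactly on y = 1 + 2x) are illustrative assumptions.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iters=1000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x.
    theta0 and theta1 are updated simultaneously, as the slide requires."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        errors = (theta0 + theta1 * x) - y      # h(x^(i)) - y^(i) for all i
        grad0 = errors.sum() / m                # (1/m) * sum of errors
        grad1 = (errors * x).sum() / m          # (1/m) * sum of errors * x^(i)
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Illustrative data lying exactly on y = 1 + 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
t0, t1 = gradient_descent(x, y)
print(t0, t1)   # approaches (1.0, 2.0)
```

The tuple assignment on the last line of the loop is what makes the update simultaneous: both gradients are computed from the old (θ0, θ1) before either parameter changes.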

  31. J(0,1) 1 0 CSC 4510 - M.A. Papalaskari - Villanova University

  32. (Figure only.)

  33.–41. (Sequence of plot pairs tracing gradient descent. Left: the hypothesis hθ(x) for the current fixed θ0, θ1, a function of x. Right: a contour plot of J as a function of the parameters θ0, θ1, with the current point moving toward the minimum at each step.)

  42. “Batch” Gradient Descent. “Batch”: each step of gradient descent uses all the training examples. Alternative: process part of the dataset for each step of the algorithm. • The slides in this presentation are adapted from: • The Stanford online ML course http://www.ml-class.org/
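The slide only says "process part of the dataset"; one common instance of that idea is stochastic gradient descent, which updates on a single example at a time. The sketch below is my own illustration of that variant, with made-up function names and toy data, not something from the deck.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_descent(x, y, alpha=0.05, epochs=200):
    """One-example-at-a-time variant: each update uses a single training
    example instead of the whole training set (contrast with "batch")."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(x)):           # shuffle each pass
            err = (theta0 + theta1 * x[i]) - y[i]   # error on one example only
            theta0, theta1 = (theta0 - alpha * err,
                              theta1 - alpha * err * x[i])
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])    # data lies exactly on y = 1 + 2x
t0, t1 = stochastic_descent(x, y)
print(t0, t1)   # near (1.0, 2.0)
```

Each step is much cheaper than a batch step (one example instead of m), at the price of noisier progress toward the minimum.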

  43. What’s next? We are not in univariate regression anymore:


  45. Today • How to apply gradient descent to minimize the cost function for regression • A closer look at the cost function • Applying gradient descent to find the minimum of the cost function • Linear algebra refresher

  46. Linear Algebra Review

  47. Matrix: rectangular array of numbers. Matrix elements (entries of matrix): the “i, j entry” is the entry in the i-th row, j-th column. Dimension of matrix: number of rows × number of columns, e.g., 4 × 2.
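A quick NumPy check of dimension and entry access; the particular numbers in the matrix are illustrative, only the 4 × 2 shape comes from the slide.

```python
import numpy as np

# A 4 x 2 matrix, matching the slide's example dimension
A = np.array([[1402,  191],
              [1371,  821],
              [ 949, 1437],
              [ 147, 1448]])

print(A.shape)    # (4, 2): number of rows x number of columns
# The slide's "i, j entry" convention is 1-indexed (A_11 is the top-left
# entry); NumPy indexes from 0, so the slide's A_12 is A[0, 1] here:
print(A[0, 1])    # 191
```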

  48. Another example: representing communication links in a network. Adjacency matrices for two graphs over nodes a, b, c, d, e:

      First graph:          Second graph:
         a b c d e             a b c d e
      a  0 1 2 0 3          a  0 1 0 0 2
      b  1 0 0 0 0          b  0 1 0 0 0
      c  2 0 0 1 1          c  1 0 0 1 0
      d  0 0 1 0 1          d  0 0 1 0 1
      e  3 0 1 1 0          e  0 0 0 0 0
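The two adjacency matrices on the slide can be entered and compared in NumPy. One property worth checking (my observation, not stated on the slide): the first matrix is symmetric, consistent with two-way links, while the second is not, consistent with one-way links.

```python
import numpy as np

# Adjacency matrices from the slide (rows/columns: nodes a, b, c, d, e)
first = np.array([[0, 1, 2, 0, 3],
                  [1, 0, 0, 0, 0],
                  [2, 0, 0, 1, 1],
                  [0, 0, 1, 0, 1],
                  [3, 0, 1, 1, 0]])

second = np.array([[0, 1, 0, 0, 2],
                   [0, 1, 0, 0, 0],
                   [1, 0, 0, 1, 0],
                   [0, 0, 1, 0, 1],
                   [0, 0, 0, 0, 0]])

# Symmetry check: A equals its transpose iff every link runs both ways
print(np.array_equal(first, first.T))    # True
print(np.array_equal(second, second.T))  # False
```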

  49. Vector: an n × 1 matrix (an n-dimensional vector); yi denotes the i-th element.

  50. Vector: an n × 1 matrix. Vector elements may be 1-indexed (y1, …, yn, the usual math convention) or 0-indexed (y0, …, yn−1, the usual programming convention).
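The 1-indexed vs. 0-indexed distinction shows up as soon as you write code; the vector values below are made up for illustration.

```python
import numpy as np

y = np.array([460, 232, 315, 178])   # an illustrative 4-dimensional vector

# Math notation on the slides is typically 1-indexed: y_1, y_2, ..., y_n.
# NumPy (like most programming languages) is 0-indexed: y[0] ... y[n-1].
print(y[0])   # the slide's y_1 -> 460
print(y[3])   # the slide's y_4 -> 178
```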
