
Understanding Logistic Regression in Machine Learning

This lecture on Logistic Regression explores its use as a classification method: a regression function is fit to binary targets (Y = 0 or 1). We derive an appropriate cost function by maximizing the probability of the data and use gradient descent to update the parameters A and b. The decision surface remains linear, illustrating the efficiency of parametric methods. A practical example of fingerprint matching shows a real-world use of Logistic Regression, and an exercise helps solidify understanding of classifier training and parameter tuning.



Presentation Transcript


  1. ICS 178: Introduction to Machine Learning & Data Mining. Instructor: Max Welling. Lecture 6: Logistic Regression

  2. Logistic Regression • This is also regression, but with binary targets Y ∈ {0, 1}, i.e. it is classification! • We will fit a regression function to P(Y=1|X). [Figure: a linear regression fit vs. a logistic regression fit to 0/1 data]
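The model equation is not reproduced in the transcript. A standard form, consistent with the parameters A and b referenced on the later slides (the exact notation on the original slide may differ), is:

P(Y=1|X) = σ(A·X + b) = 1 / (1 + exp(-(A·X + b)))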

  3. Sigmoid Function • [Figure: the sigmoid f(x), with data-points with Y=1 at the top and data-points with Y=0 at the bottom]
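As a minimal sketch (not part of the original slides), the sigmoid and the resulting class-1 probability can be written in NumPy as follows, with A and b as the weight vector and offset used on the following slides:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real-valued input into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def p_y1(X, A, b):
    # P(Y=1|X) for inputs X of shape (n_samples, n_features).
    return sigmoid(X @ A + b)
```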

  4. In 2 Dimensions • The parameters A, b determine 1) the orientation, 2) the thickness (margin), and 3) the offset of the decision surface of the sigmoid f(x).

  5. Cost Function • We want a different error measure that is better suited to 0/1 data. • Again, this can be derived by maximizing the probability of the data.
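The cost itself is not reproduced in the transcript. The standard negative log-likelihood (cross-entropy) that follows from maximizing the probability of the data, and that matches the Y / (1-Y) structure described on slide 7, is:

E(A, b) = - Σ_n [ Y_n log P(Y=1|X_n) + (1 - Y_n) log(1 - P(Y=1|X_n)) ]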

  6. Learning A, b • Again, we take the derivatives of the error with respect to the parameters. • This time, however, we cannot solve the resulting equations analytically, so we use gradient descent.
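The update rule is not shown on the slide; the usual gradient descent step, with a (hypothetical) learning rate η, is:

A ← A - η ∂E/∂A,    b ← b - η ∂E/∂b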

  7. Gradients for Logistic Regression • After the math (on the whiteboard) we find the gradients of the error with respect to A and b. Note: the first term in each equation (multiplied by Y) only sums over data with Y=1, while the second term (multiplied by (1-Y)) only sums over data with Y=0. • Follow the gradient until the change in A, b falls below a small threshold (e.g. 1e-6).
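The gradient expressions themselves were on the whiteboard and are not in the transcript. A minimal NumPy sketch of the full gradient-descent loop, assuming the cross-entropy cost above and hypothetical names (learning rate lr, threshold tol), would look like this:

```python
import numpy as np

def train_logistic(X, Y, lr=0.1, tol=1e-6, max_iters=100_000):
    # X: (N, D) inputs; Y: (N,) binary targets in {0, 1}.
    N, D = X.shape
    A, b = np.zeros(D), 0.0
    for _ in range(max_iters):
        p = 1.0 / (1.0 + np.exp(-(X @ A + b)))   # P(Y=1|X) for all data
        # Gradients of the negative log-likelihood: (Y - p) is positive
        # for Y=1 points and negative for Y=0 points.
        grad_A = -X.T @ (Y - p)
        grad_b = -np.sum(Y - p)
        A_new, b_new = A - lr * grad_A, b - lr * grad_b
        # Stop when the change in A, b falls below the small threshold.
        if max(np.max(np.abs(A_new - A)), abs(b_new - b)) < tol:
            return A_new, b_new
        A, b = A_new, b_new
    return A, b
```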

  8. Classification • Once we have found the optimal values for A, b, we classify future data by thresholding P(Y=1|X). • Least squares and logistic regression are parametric methods: all the information in the data is stored in the parameters A, b, i.e. after learning you can toss out the data. • Also, the decision surface is always linear; its complexity does not grow with the amount of data. • We have imposed our prior knowledge that the decision surface should be linear.
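The classification rule is not reproduced on the slide; the standard rule, consistent with the model above, assigns the more probable class:

Predict Y = 1 if P(Y=1|X) ≥ 0.5 (equivalently, if A·X + b ≥ 0), and Y = 0 otherwise.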

  9. A Real Example (in collaboration with S. Cole) • Fingerprints are matched against a database. • Each match is scored. • Using logistic regression we try to predict whether a future match is real or false. • Human fingerprint examiners claim 100% accuracy. Is this true?

  10. Exercise • You have laid your hands on a dataset where each data point has a single attribute and a class label (0 or 1). You train a logistic regression classifier. • A new data-case is presented. What do you do to decide which class it falls in (use an equation or pseudo-code)? • How many parameters are there to tune for this problem? • Explain what these parameters mean in terms of the function P(Y=1|X). (A possible answer sketch follows below.)
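For reference, one possible answer sketch for the single-attribute case, assuming the model above (two parameters in total, here named a and b):

```python
import math

def classify(x, a, b):
    # P(Y=1|x) for a single attribute x: a controls the steepness and
    # orientation of the sigmoid, b shifts the point where P(Y=1|x) = 0.5
    # (i.e. where a*x + b = 0).
    p = 1.0 / (1.0 + math.exp(-(a * x + b)))
    return 1 if p >= 0.5 else 0
```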
