
Patterson: Chap 1 A Review of Machine Learning

Presentation Transcript


  1. Patterson: Chap 1 A Review of Machine Learning. Dr. Charles Tappert. The information here, although greatly condensed, comes almost entirely from the chapter content.

  2. This Chapter • Because the focus of this book is on Deep Learning, this first chapter presents only a rough review of the classical methods employed in machine learning • These classical methods are covered in more detail in the Duda textbook

  3. The Learning Machines • Definition: Machine learning is the use of algorithms to extract information from raw data and represent it in some type of model • Deep learning emerged around 2006, and deep learning systems are now winning the important machine learning competitions

  4. The Learning Machines AI and Deep Learning

  5. The Learning Machines Biological Inspiration • Biological neural networks (brains) contain • Roughly 86 billion neurons • Over 500 trillion connections between neurons • Biological neural networks are much more complex than artificial neural networks (ANN) • Main properties of ANNs • Basic unit is the artificial neuron (node) • We can train ANNs to pass along only useful signals

  6. The Learning Machines What is Deep Learning? • For the purposes of this book we define deep learning as neural networks with a large number of parameters and layers in one of four fundamental network architectures • Unsupervised pretrained networks • Convolutional neural networks • Recurrent neural networks • Recursive neural networks

  7. The Learning Machines Going Down the Rabbit Hole • Deep learning has penetrated the computer science consciousness beyond most techniques in recent history • Top-flight accuracy with deep learning models • This initiates many philosophical discussions • Can machines be creative? What is creativity? • Can machines be as intelligent as humans?

  8. Framing the Questions • The basics of machine learning are best understood by asking the correct questions • What is the input data? • What kind of model is best for the data? • What kind of answer would we like to elicit from new data based on this model?

  9. Math Behind Machine Learning • Linear Algebra • Scalars, vectors, matrices, tensors, hyperplanes, solving systems of equations • Probability and Statistics • Conditional probabilities, Bayes Theorem, probability distributions • Students are expected to have the math background for this course

  10. How Does Machine Learning Work? • Fundamentally, machine learning is based on algorithmic techniques to minimize the error in Ax = b through optimization, where • A is a matrix of input row vectors • x is the weight vector • b is a column vector of output labels • Essentially, we want to determine x = A⁻¹b, but it is usually not possible to invert A
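
A minimal sketch of this idea, assuming NumPy as the tool and using invented toy data (not from the chapter): because A is generally not square or invertible, we solve for the weight vector x by minimizing the squared error ||Ax - b||² instead of computing A⁻¹b.

```python
import numpy as np

# Toy data: 4 samples (rows of A), 2 features each, with known labels b.
A = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
b = np.array([5.0, 4.0, 11.0, 10.0])

# A is not square, so x = A^-1 b is not available; minimize ||Ax - b||^2 instead.
x, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print("weight vector x:", x)
print("reconstruction Ax:", A @ x)
```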

  11. How Does Machine Learning Work? Regression, especially Linear • Attempts to find a function that describes the relationship between input x and output y • For linear regression, y = a + Bx
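
As a rough illustration with invented noisy data (not the chapter's example), the intercept a and slope B can be fit by least squares:

```python
import numpy as np

# Noisy samples around a known line y = 2 + 3x (toy data for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)

# Fit y = a + B*x by least squares (degree-1 polynomial fit).
B, a = np.polyfit(x, y, deg=1)   # polyfit returns coefficients highest power first
print(f"estimated intercept a = {a:.2f}, slope B = {B:.2f}")
```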

  12. How Does Machine Learning Work? Classification • The model attempts to find classes based on a set of input features • The dependent variable y is categorical rather than numerical • Binary classifier is the most basic • For example, someone has a disease or not
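
A hedged sketch of a binary classifier, assuming scikit-learn is available; the two-feature "disease" data below are synthetic and invented purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic feature matrix (e.g., two measurements per patient) and 0/1 labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # toy rule standing in for "has disease"

clf = LogisticRegression().fit(X, y)
print("predicted class for a new sample:", clf.predict([[0.5, 0.2]])[0])
```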

  13. How Does Machine Learning Work? Clustering • Clustering is unsupervised learning that usually involves a distance measure and iteratively moves similar items more closely together • At the end of the process, the items are clustered densely around n centroids
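
For example, a k-means style clustering (shown here via scikit-learn on invented 2-D points) alternates between assigning each point to its nearest centroid and recomputing the centroids:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two blobs of toy 2-D points (invented for illustration).
rng = np.random.default_rng(2)
points = np.vstack([rng.normal(loc=0.0, size=(50, 2)),
                    rng.normal(loc=5.0, size=(50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("centroids:\n", kmeans.cluster_centers_)
```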

  14. How Does Machine Learning Work? Underfitting and Overfitting

  15. How Does Machine Learning Work? Optimization • Parameter optimization is the process of adjusting weights to produce accurate estimates of the data • Convergence of an optimization algorithm finds the parameters providing the smallest error across the training samples • The optimization function guides the learning toward a solution of least error

  16. How Does Machine Learning Work? Convex Optimization • Convex optimization deals with convex cost functions

  17. How Does Machine Learning Work? Gradient Descent • Gradient is a vector of n partial derivatives of the function f, a generalization of the 1D derivative • Problems – local minima and non-normalized features
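
A minimal gradient-descent sketch on a least-squares cost, with a toy system chosen for illustration (not the chapter's code): the weights move repeatedly against the gradient of the error.

```python
import numpy as np

# Cost f(w) = ||Aw - b||^2 for a toy system; its gradient is 2 A^T (Aw - b).
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([3.0, 7.0, 11.0])   # exact solution here is w = [1, 1]

w = np.zeros(2)          # initial parameter vector
learning_rate = 0.01
for step in range(2000):
    grad = 2 * A.T @ (A @ w - b)   # vector of partial derivatives
    w -= learning_rate * grad      # step against the gradient
print("learned weights:", w)
```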

  18. How Does Machine Learning Work? Stochastic Gradient Descent (SGD) • Stochastic gradient descent calculates the gradient and updates the parameter vector after each training sample • Whereas gradient descent calculates the gradient and updates the parameter vector over all training samples • The SGD method speeds up learning • A variant of SGD, called mini-batch, uses more than a single training sample per iteration and leads to smoother convergence
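
A sketch of mini-batch SGD on invented linear-regression data (my own toy setup, not the book's): the parameter vector is updated after each small batch rather than once per full pass over the data.

```python
import numpy as np

# Toy regression data with known true weights.
rng = np.random.default_rng(3)
A = rng.normal(size=(500, 3))            # 500 samples, 3 features
true_x = np.array([1.0, -2.0, 0.5])
b = A @ true_x + rng.normal(scale=0.1, size=500)

x = np.zeros(3)
lr, batch_size = 0.05, 32
for epoch in range(20):
    order = rng.permutation(len(A))      # shuffle the samples each epoch
    for start in range(0, len(A), batch_size):
        idx = order[start:start + batch_size]
        Ab, bb = A[idx], b[idx]
        grad = 2 * Ab.T @ (Ab @ x - bb) / len(idx)   # gradient on the mini-batch only
        x -= lr * grad                                # update after each mini-batch
print("estimated weights:", x)
```

Setting batch_size to 1 would give plain SGD; setting it to len(A) would recover full (batch) gradient descent.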

  19. How Does Machine Learning Work? Generative vs Discriminative Models • Two major model types – generative & discriminative • Generative models understand how the data were created in order to generate an output • These models generate likely output, such as art similar to that of a well-known artist • Discriminative models simply give us a classification or category for a given input • These models are typically used for classification in machine learning

  20. Logistic Regression • Logistic regression is a well-known linear classification model • Handles binary classification as well as multiple labels • The dependent variable is categorical (e.g., classification) • We have three components to solve for our parameter vector x • A hypothesis about the data • A cost function, maximum likelihood estimation • An update function, derivative of the cost function
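
A compact from-scratch sketch of those three components on invented data: the sigmoid hypothesis, the negative log-likelihood cost, and a gradient-based update (this is the standard textbook formulation, not code taken from the book):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary data: label 1 when the feature sum is positive.
rng = np.random.default_rng(4)
A = rng.normal(size=(300, 2))
y = (A.sum(axis=1) > 0).astype(float)

x = np.zeros(2)                       # parameter vector (weights)
lr = 0.1
for step in range(500):
    p = sigmoid(A @ x)                # hypothesis: P(y = 1 | row of A)
    cost = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # negative log-likelihood
    grad = A.T @ (p - y) / len(y)     # derivative of the cost w.r.t. x
    x -= lr * grad                    # update step
print("weights:", x, "final cost:", round(cost, 4))
```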

  21. Logistic Regression The Logistic Function • The logistic function is defined as f(x) = 1 / (1 + e^(-x)) • This function is useful because it maps the input range of -infinity to +infinity into the output range 0-1, which can be interpreted as a probability
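
A quick numeric check of that mapping (illustrative only):

```python
import numpy as np

def logistic(x):
    # Maps any real input into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

print(logistic(np.array([-10.0, 0.0, 10.0])))   # roughly [0.000045, 0.5, 0.999955]
```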

  22. Logistic Regression Understanding Logistic Regression Output • The logistic function is often denoted with the Greek letter sigma because the graph representation resembles an elongated “s” whose max and min asymptotically approach 1 and 0, respectively • f(x) represents the probability that y equals 1 (i.e., true)

  23. Evaluating Models The Confusion Matrix • Various measures: e.g., Accuracy = (TP+TN)/(TP+TN+FP+FN)
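
A small sketch that tallies the confusion-matrix counts and the accuracy from invented true and predicted labels:

```python
import numpy as np

# Invented true labels and predictions for a binary classifier.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
tn = np.sum((y_pred == 0) & (y_true == 0))   # true negatives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"TP={tp} TN={tn} FP={fp} FN={fn} accuracy={accuracy:.2f}")
```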

  24. Building an Understanding of Machine Learning • In this chapter, we introduced the core concepts needed for practicing machine learning • The core mathematical concept of modeling is based around the equation Ax = b • We looked at the core ideas of getting features into the matrix A, ways to change the parameter vector x, and setting the outcomes in the vector b
