Machine Learning BITS F464

Navneet Goyal, Department of Computer Science, BITS-Pilani, Pilani Campus, India.

Presentation Transcript


  1. Machine Learning BITS F464 Navneet Goyal, Department of Computer Science, BITS-Pilani, Pilani Campus, India

  2. Introduction

  3. Introduction Let’s look at some of the incredible things that humans can do: • Identifying a song after listening to only a very small part of it • Identifying a movie from a very short clip • Identifying a person • Identifying a person even when you see them again after many years • Recollecting memories • Identifying a person from a distance • Identifying a person just by listening to their voice • Identifying a person by their chat/message signature • Our own GPS!

  4. Introduction Ever wondered how we do all this? • Pattern recognition • Information retrieval • The human brain and its neurons! Ever wondered how we can make machines learn to do all such tasks, and with the efficiency of humans?

  5. Machine Learning Humour Source – http://www.kdnuggets.com/2012/12/machine-learning-data-mining-humor.html

  6. Source - http://diegoferrin.wordpress.com

  7. Introduction Related Fields • Artificial Intelligence • Statistics • Data Mining

  8. Machine Learning Humour • What is the difference between statistics, machine learning, AI and data mining? • If there are up to 3 variables, it is statistics. • If the problem is NP-complete, it is machine learning. • If the problem is PSPACE-complete, it is AI. • If you don't know what is PSPACE-complete, it is data mining. Source – http://www.kdnuggets.com/2012/12/machine-learning-data-mining-humor.html

  9. What is Machine Learning? • From machines that DO to machines that LEARN • A shift in paradigm! • Machines can be made to learn! • How, and for what purpose? • How? By writing algorithms! • Purpose: mainly to predict and to make decisions!

  10. Types of Learning • Supervised • Unsupervised • Semi-supervised • Reinforcement • Active • Deep

  11. Introduction • Zoologists study learning in animals • Psychologists study learning in humans • In this course, we focus on “Learning in Machines” • Course Objective • Study of approaches and algorithms that can make a machine learn

  12. Introduction • Machine Learning • Subarea of AI concerned with algorithms/programs that can make a machine learn • Improve automatically with experience • For example: doctors learning from experience • Faculty learning how to control the class and be effective • We all learn from experience • Imagine computers learning from medical records and suggesting treatment (automated diagnosis & prescription)

  13. Machine Learning • A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Definition due to Tom M. Mitchell, Machine Learning, 1997)

  14. Interesting Problems • Speech and Handwriting Recognition • Robotics (training moving robots) • Search Engines (context aware) • Learning to drive an autonomous vehicle • Medical Diagnosis • Detecting credit card fraud • Computational Bioinformatics • Game Playing

  15. What is Machine Learning? • To solve a problem, we need an algorithm! • For example: sorting a list of numbers • Input: list of numbers • Output: sorted list of numbers • For some tasks, like filtering spam mails • Input: an email • Output: Y/N • We do not know how to transform Input to Output • The definition of spam changes with time and from one individual to another • What to DO? Reference: E. Alpaydin, Introduction to Machine Learning, MIT Press, 2010

  16. What is Machine Learning? • Collect lots of emails (both genuine and spam) • “Learn” what constitutes a spam mail (or, for that matter, a genuine mail) • Learn from DATA!! • For many similar problems, we may not have an algorithm, but we do have example data (called Training Data) • The ability to process large amounts of training data has been made possible by advances in computer technology
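
As a rough illustration of "learning from data" for the spam example, the sketch below counts how often each word appears in labelled spam and genuine ("ham") emails and classifies a new email by whichever class its words appeared in more often. The function names, the tiny training set, and the word-count scoring rule are all assumptions made for illustration; this is a toy, not a real spam filter.

```python
from collections import Counter

def train(emails):
    """Count word occurrences per class from (text, label) training pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in emails:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Label an email by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

# Hypothetical training data: the "experience" the program learns from.
training_data = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training_data)
print(classify(model, "claim your free money"))  # spam
print(classify(model, "monday meeting"))         # ham
```

No rule such as "the word free means spam" was written by hand; the behaviour comes entirely from the training data, so retraining on new emails changes the filter.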

  17. What is Machine Learning? • Face Recognition!!! • We humans are so good at it!!! • Ever thought about how we do it, despite • Different lighting conditions, pose, hair style, make-up, glasses, ageing etc. • Since we do not know how we do it, we cannot write a program to do it • ML is about making inferences from a sample

  18. Machine Learning Applications • What kind of data would I require for learning? • Credit card transactions • Face Recognition • Spam filter • Handwriting/Character Recognition

  19. Handwriting Recognition • Task T • recognizing and classifying handwritten words within images • Performance measure P • percent of words correctly classified • Training experience E • a database of handwritten words with given classifications
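
The performance measure P on this slide, the percent of words correctly classified, can be computed as in the minimal sketch below. The function name and the sample word lists are hypothetical.

```python
def percent_correct(predicted, actual):
    """Performance measure P: percentage of items classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical recognizer output versus the true labels of four word images.
predicted = ["cat", "dog", "sun", "map"]
actual = ["cat", "dog", "son", "map"]
print(percent_correct(predicted, actual))  # 75.0
```

A program "learns" in the sense of the earlier definition if this number goes up as the database of labelled handwritten words (experience E) grows.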

  20. Handwriting Recognition

  21. Pattern Recognition Example • Handwritten Digit Recognition Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006

  22. Pattern Recognition Example • Handwritten Digit Recognition • Non-trivial problem due to variability in handwriting • What about using handcrafted rules or heuristics for distinguishing the digits based on the shapes of strokes? • Not such a good idea!! • Proliferation of rules • Exceptions to the rules, and so on… • Adopt an ML approach!! Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006

  23. Pattern Recognition Example • Handwritten Digit Recognition • Each digit is represented by a 28x28 pixel image • Can be represented by a vector of 784 real numbers • Objective: to have an algorithm that takes such a vector as input and identifies the digit it represents • Take images of a large number of digits (N): the training set • Use the training set to tune the parameters of an adaptive model • Each digit in the training set has been identified by a target vector t, which represents the identity of the corresponding digit • The result of running an ML algorithm can be expressed as a function y(x), which takes a new digit x as input and outputs a vector y. Vector y is encoded in the same way as t • The form of y(x) is determined through the learning (training) phase Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
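
To make the representation concrete, the sketch below flattens a 28x28 image into a 784-element vector x and builds a target vector t for a digit. The slide does not say how t is encoded; the one-hot encoding used here (a 1 in the position of the digit, 0 elsewhere) is an assumption, as are the function names.

```python
def flatten_image(image):
    """Turn a 28x28 grid (a list of rows) into a flat vector of 784 numbers."""
    return [float(pixel) for row in image for pixel in row]

def one_hot(digit, num_classes=10):
    """Assumed encoding of the target vector t: 1 at the digit's index."""
    t = [0.0] * num_classes
    t[digit] = 1.0
    return t

def decode(y):
    """Read a digit back from an output vector y encoded the same way as t."""
    return max(range(len(y)), key=lambda k: y[k])

# A blank 28x28 image with a single bright pixel, standing in for real data.
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255

x = flatten_image(image)
t = one_hot(3)
print(len(x))     # 784
print(decode(t))  # 3
```

A trained model y(x) would map x to a vector of the same length as t, and `decode` would then pick the digit with the largest output.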

  24. Pattern Recognition Example • Generalization • The ability to categorize correctly new examples that differ from those used in training • Generalization is a central goal in pattern recognition • Preprocessing • Input variables are preprocessed to transform them into some new space of variables where it is hoped that the problem will be easier to solve (see fig.) • Images of digits are translated and scaled so that each digit is contained within a box of fixed size. This reduces variability. • The preprocessing stage is sometimes also called feature extraction • New test data must be preprocessed using the same steps as the training data Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
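
The translate-and-scale step can be sketched as: crop the image to the bounding box of the ink, then resize the crop to a fixed size. The helper names, the ink threshold, and the nearest-neighbour resizing are illustrative assumptions, and this simple version does not preserve the aspect ratio of the digit.

```python
def crop_to_ink(image, threshold=0):
    """Keep only the bounding box of pixels brighter than the threshold."""
    rows = [r for r, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] > threshold for row in image)]
    if not rows:
        return image  # blank image: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]

def resize_nearest(image, size):
    """Rescale to size x size by nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def preprocess(image, size=8):
    """The same function must be applied to training and test images."""
    return resize_nearest(crop_to_ink(image), size)

def blank(n):
    return [[0] * n for _ in range(n)]

# Two hypothetical images of the same 2x2 blob at different positions.
a = blank(6); a[0][0] = a[0][1] = a[1][0] = a[1][1] = 1
b = blank(6); b[3][1] = b[3][2] = b[4][1] = b[4][2] = 1
print(preprocess(a, size=4) == preprocess(b, size=4))  # True
```

After preprocessing, the two shifted copies become identical, which is the reduction in variability the slide refers to.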

  25. Linear Classifiers in High-Dimensional Spaces [Figure: the same data plotted on the original axes Var1 and Var2, and on the axes Constructed Feature 1 and Constructed Feature 2] Find a function φ(x) to map to a different space
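
A small sketch of this idea: points inside and outside a circle cannot be separated by a straight line in the original variables, but after mapping each point to squared coordinates a linear rule is enough. The particular feature map, weights, and bias below are hypothetical choices for illustration and are not taken from the slide's figure.

```python
def phi(var1, var2):
    """Hypothetical feature map: (Var1, Var2) -> two constructed features."""
    return (var1 * var1, var2 * var2)

def linear_classify(features, weights, bias):
    """Linear classifier: sign of a weighted sum of the features."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0

# In the constructed space, "outside the unit circle" is the half-plane
# f1 + f2 - 1 > 0, so a linear boundary separates the two classes.
weights, bias = (1.0, 1.0), -1.0
print(linear_classify(phi(0.2, 0.3), weights, bias))   # 0 (inside)
print(linear_classify(phi(1.5, 0.0), weights, bias))   # 1 (outside)
print(linear_classify(phi(-1.5, 0.0), weights, bias))  # 1 (outside)
```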

  26. A word about Preprocessing!! • Preprocessing can also speed up computations • For example: face detection in a high-resolution video stream • Find useful features that are fast to compute and yet preserve the discriminatory information that enables faces to be distinguished from non-faces • The average value of image intensity over a rectangular sub-region can be evaluated extremely efficiently, and a set of such features is very effective in fast face detection • Because such features are fewer in number than the pixels, this is a form of Dimensionality Reduction • Care must be taken so that important information is not discarded during preprocessing Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
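
One well-known way to make rectangle averages cheap is an integral image (summed-area table): after one pass over the image, the sum over any rectangle needs only four lookups. The slide does not name this technique, so treat the sketch below, including its function names, as an assumed illustration of why such features are fast.

```python
def integral_image(img):
    """ii[r][c] holds the sum of img over rows < r and columns < c."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_mean(ii, top, left, bottom, right):
    """Mean intensity over rows top..bottom-1 and columns left..right-1."""
    total = ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
    return total / ((bottom - top) * (right - left))

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_mean(ii, 0, 0, 3, 3))  # 5.0 (whole image)
print(rect_mean(ii, 1, 1, 3, 3))  # 7.0 (bottom-right 2x2 block)
```

The cost of `rect_mean` does not depend on the size of the rectangle, which matters when many sub-regions must be evaluated per video frame.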

  27. Curse of Dimensionality!! • Poses serious challenges! • An important factor influencing the design of pattern recognition techniques • Example data: mixture of oil, water & gas, with three classes (homogeneous, annular & laminar) • Each data point is a point in a 12-dimensional space • The figure shows 100 points plotted along only two of the dimensions, x6 & x7 • A new point x: predict its class? Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006

  28. Curse of Dimensionality!! • Unlikely that it belongs to the blue class! • Surrounded by a lot of red points • Also, many green points nearby • Intuition: the identity of x should be determined strongly by nearby points and less strongly by more distant points • How can we turn this intuition into a learning algorithm? Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
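
One common way to turn the "nearby points matter most" intuition into an algorithm is k-nearest-neighbour voting, sketched below. This is offered as an illustration of the intuition and is not the approach the slides develop next; the function name, the value of k, and the sample points are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label held by the majority of the k closest training points."""
    by_distance = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set with three classes.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (3.0, 3.0), (3.1, 2.9), (1.5, 1.4)]
labels = ["red", "red", "red", "blue", "blue", "green"]

print(knn_predict(points, labels, (1.1, 1.0), k=3))  # red
```

Distant points get no vote at all here; a softer variant would weight each vote by inverse distance.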

  29. Curse of Dimensionality!! • Make grid lines: divide the input space into regular cells • Use majority voting among the training points in the same cell • Problems?? Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
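
A minimal sketch of the grid approach: assign each training point to a cell, count labels per cell, and predict the majority label of the cell the query falls in. The function names, cell size, and data are hypothetical.

```python
from collections import Counter, defaultdict

def cell_of(point, cell_size):
    """Index of the grid cell containing the point, one integer per dimension."""
    return tuple(int(c // cell_size) for c in point)

def fit_grid(points, labels, cell_size):
    """Count the training labels that fall in each cell."""
    cells = defaultdict(Counter)
    for p, label in zip(points, labels):
        cells[cell_of(p, cell_size)][label] += 1
    return cells

def predict_grid(cells, query, cell_size):
    """Majority label in the query's cell, or None if the cell is empty."""
    votes = cells.get(cell_of(query, cell_size))
    if not votes:
        return None
    return votes.most_common(1)[0][0]

points = [(0.1, 0.2), (0.3, 0.4), (0.45, 0.1), (0.9, 0.9)]
labels = ["red", "red", "green", "blue"]
grid = fit_grid(points, labels, cell_size=0.5)

print(predict_grid(grid, (0.2, 0.2), 0.5))  # red
print(predict_grid(grid, (0.7, 0.2), 0.5))  # None: no training point in that cell
```

The empty cell in the second query already hints at the problem: every cell needs training points before it can vote.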

  30. Curse of Dimensionality • The number of cells grows exponentially with the dimensionality D • Need an exponentially large number of training data points • Not a good approach for more than a few dimensions!
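
The growth is easy to quantify: with M divisions along each of D dimensions there are M^D cells. The sketch below prints the count for a few values of D; the choice of 10 divisions per dimension is an arbitrary assumption.

```python
def num_cells(bins_per_dim, dims):
    """Number of grid cells when each of `dims` axes is split into bins."""
    return bins_per_dim ** dims

for dims in (1, 2, 3, 12):
    print(dims, num_cells(10, dims))
```

For the 12-dimensional data set mentioned earlier, 10 divisions per axis already give 10^12 cells, far more than any realistic number of training points could fill.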

  31. Curse of Dimensionality • Solutions • Principal Component Analysis • Singular Value Decomposition • Brush up on your linear algebra…
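
As a taste of what Principal Component Analysis does, the sketch below finds the single direction of greatest variance in a small data set and projects points onto it, reducing two dimensions to one. It uses power iteration on the covariance matrix to stay within the standard library; practical implementations typically use an eigendecomposition or SVD routine from a numerical library. The function names and data are assumptions.

```python
import math

def top_principal_component(data, iterations=100):
    """Return (means, unit vector) for the direction of greatest variance."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the top component.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return means, v

def project(point, means, v):
    """Coordinate of a point along the principal direction."""
    return sum((p - m) * c for p, m, c in zip(point, means, v))

# Hypothetical 2-D data lying exactly on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, v = top_principal_component(data)
print(v)                                       # direction proportional to (1, 2)
print([project(p, means, v) for p in data])    # 1-D coordinates of the points
```

Here one coordinate per point captures all the variation in the data, so nothing is lost by dropping from two dimensions to one.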
