##### Machine Learning Overview


**Machine Learning Overview**
Tamara Berg
CS 590-133 Artificial Intelligence
Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore, Percy Liang, Luke Zettlemoyer, Rob Pless, Killian Weinberger, Deva Ramanan

**Announcements**
• HW4 is due April 3
• Reminder: Midterm2 next Thursday
  • Next Tuesday’s lecture topics will not be included (but material will be on the final so attend!)
• Midterm review
  • Monday, 5pm in FB009

**Midterm Topic List**
Be able to define the following terms and answer basic questions about them:

Reinforcement learning
• Passive vs Active RL
• Model-based vs model-free approaches
• Direct utility estimation
• TD Learning and TD Q-learning
• Exploration vs exploitation
• Policy Search
• Application to Backgammon/Aibos/helicopters (at a high level)

Probability
• Random variables
• Axioms of probability
• Joint, marginal, conditional probability distributions
• Independence and conditional independence
• Product rule, chain rule, Bayes rule

Bayesian Networks

General
• Structure and parameters
• Calculating joint and conditional probabilities
• Independence in Bayes Nets (Bayes Ball)

Bayesian Inference
• Exact Inference (Inference by Enumeration, Variable Elimination)
• Approximate Inference (Forward Sampling, Rejection Sampling, Likelihood Weighting)
• Networks for which efficient inference is possible

Naïve Bayes
• Parameter learning including Laplace smoothing
• Likelihood, prior, posterior
• Maximum likelihood (ML), maximum a posteriori (MAP) inference
• Application to spam/ham classification
• Application to image classification (at a high level)

HMMs
• Markov Property
• Markov Chains
• Hidden Markov Model (initial distribution, transitions, emissions)
• Filtering (forward algorithm)

Machine Learning
• Unsupervised/supervised/semi-supervised learning
• K-means clustering
• Training, tuning, testing, generalization

**Machine learning**
Image source: https://www.coursera.org/course/ml
• Definition
  • Getting a computer to do well on a task without explicitly programming it
  • Improving performance on a task based on experience

**What is machine learning?**
• Computer programs that can learn from data
• Two key components
  • Representation: how should we represent the data?
  • Generalization: the system should generalize from its past experience (observed data items) to perform well on unseen data items.

**Types of ML algorithms**
• Unsupervised
  • Algorithms operate on unlabeled examples
• Supervised
  • Algorithms operate on labeled examples
• Semi/Partially-supervised
  • Algorithms combine both labeled and unlabeled examples

**Clustering**
• The assignment of objects into groups (aka clusters) so that objects in the same cluster are more similar to each other than objects in different clusters.
• Clustering is a common technique for statistical data analysis, used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics.

**K-means clustering**
• Want to minimize sum of squared Euclidean distances between points x_i and their nearest cluster centers m_k

**Hierarchical clustering strategies**
• Agglomerative clustering
  • Start with each data point in a separate cluster
  • At each iteration, merge two of the “closest” clusters
• Divisive clustering
  • Start with all data points grouped into a single cluster
  • At each iteration, split the “largest” cluster

Produces a hierarchy of clusterings

**Divisive Clustering**
• Top-down (instead of bottom-up as in Agglomerative Clustering)
• Start with all data points in one big cluster
• Then recursively split clusters
• Eventually each data point forms a cluster on its own.

**Flat or hierarchical clustering?**
• For high efficiency, use flat clustering (e.g. K-means)
• For deterministic results: hierarchical clustering
• When a hierarchical structure is desired: hierarchical algorithm
• Hierarchical clustering can also be applied if K cannot be predetermined (can start without knowing K)
Source: Hinrich Schutze

**Recall: Bag of Words Representation**
• Represent document as a “bag of words”

**Bag-of-features models**
Slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

**Bags of features for image classification**
• Extract features
• Learn “visual vocabulary”
• Represent images by frequencies of “visual words”

**1. Feature extraction**

**2. Learning the visual vocabulary**
[Figure labels: clustering, visual vocabulary]

**Example visual vocabulary**
Fei-Fei et al. 2005

**3. Image representation**
[Figure: histogram; axis labels: frequency, visual words]
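
**Sketch: K-means and a visual-word histogram**
The K-means slide states only the objective: minimize the sum of squared Euclidean distances between the points and their nearest cluster centers. The slides do not spell out a procedure, so the sketch below uses the standard alternating assign/update iteration (often called Lloyd's algorithm) as one plausible way to reduce that objective. It then reuses the learned centers as a toy "visual vocabulary" and counts how often each "visual word" occurs, as in the bag-of-features steps. The function names (`squared_distance`, `kmeans`, `word_histogram`), the random initialization from data points, and the toy 2-D data are all assumptions made for illustration; they are not taken from the lecture.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` into `k` groups; returns (centers, labels, objective).

    Assumes iterations >= 1 and k <= number of points.
    """
    rng = random.Random(seed)
    # Assumption: initialise the centers with k data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers would too
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    # The quantity from the slide: sum of squared distances to assigned centers.
    objective = sum(
        squared_distance(p, centers[lab]) for p, lab in zip(points, labels)
    )
    return centers, labels, objective


def word_histogram(features, centers):
    """Count how many features fall nearest to each center ("visual word")."""
    counts = [0] * len(centers)
    for f in features:
        nearest = min(
            range(len(centers)), key=lambda j: squared_distance(f, centers[j])
        )
        counts[nearest] += 1
    return counts


# Toy data (hypothetical): two well-separated groups of 2-D "features".
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centers, labels, objective = kmeans(points, k=2)
histogram = word_histogram(points, centers)
print("centers:", centers)
print("labels:", labels)
print("objective:", objective)
print("visual-word histogram:", histogram)
```

In a real bag-of-features pipeline the points would be local image descriptors rather than 2-D toy coordinates, and the histogram would be computed per image. Note also that this iteration only finds a local minimum of the objective, so results can depend on the initialization for less cleanly separated data.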