# Machine Learning

## Machine Learning

Usman Roshan

Dept. of Computer Science

NJIT

### What is Machine Learning?

• “Machine learning is programming computers to optimize a performance criterion using example data or past experience.” Intro to Machine Learning, Alpaydin, 2010

• Examples:

• Facial recognition

• Digit recognition

• Molecular classification

### A little history

• 1946: ENIAC, one of the first general-purpose electronic computers, is unveiled to perform numerical computations

• 1950: Alan Turing proposes the Turing test. Can machines think?

• 1952: Arthur Samuel at IBM writes the first game-playing program for checkers. Knowledge-based systems such as ELIZA (1960s) and MYCIN (1970s) follow later.

• 1957: Perceptron developed by Frank Rosenblatt. Perceptrons can be combined to form a neural network.

• Early 1990s: Statistical learning theory emphasizes learning from data instead of rule-based inference.

• Current status: Used widely in industry. Various approaches are combined, but data-driven methods are prevalent.

### Example up-close

• Problem: Recognize images representing digits 0 through 9

• Input: High dimensional vectors representing images

• Output: 0 through 9 indicating the digit the image represents

• Learning: Build a model from “training data”

• Predict “test data” with model
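The workflow above can be sketched on a toy scale. This is a minimal illustration, not the method used in the course: it assumes tiny 3x3 binary "images" flattened into 9-dimensional vectors and uses a 1-nearest-neighbor rule as the model. The data and the function names (`train`, `predict`) are made up for this example.

```python
# Toy illustration of the train/predict workflow (hypothetical data).
# Each "image" is a 3x3 binary grid flattened into a 9-dimensional vector.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(vectors, labels):
    """For 1-nearest-neighbor, building the model just stores the training data."""
    return list(zip(vectors, labels))

def predict(model, vector):
    """Return the label of the stored training vector closest to `vector`."""
    _, label = min(model, key=lambda pair: squared_distance(pair[0], vector))
    return label

# Training data: crude 3x3 drawings of the digits 0 and 1.
train_vectors = [
    [1, 1, 1,
     1, 0, 1,
     1, 1, 1],   # a "0"
    [0, 1, 0,
     0, 1, 0,
     0, 1, 0],   # a "1"
]
train_labels = [0, 1]

# Test data: noisy versions that the model has not seen.
test_vectors = [
    [1, 1, 1,
     1, 0, 1,
     1, 1, 0],   # "0" with one pixel missing
    [0, 1, 0,
     0, 1, 0,
     1, 1, 0],   # "1" with one extra pixel
]
test_labels = [0, 1]

model = train(train_vectors, train_labels)
predictions = [predict(model, v) for v in test_vectors]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)  # prints [0, 1] 1.0
```

Real digit images have far more pixels (hence "high dimensional"), but the split into training data, a model, and held-out test data stays the same.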

### Data model

• We assume that the data is represented by a set of vectors each of fixed dimensionality.

• Vector: an ordered list of numbers

• We may refer to each vector as a datapoint and each dimension as a feature

• Example:

• A bank wishes to classify loan applicants as risky or safe

• Each human is a datapoint and represented by a vector

• Features may be age, income, mortgage/rent, education, family, current loans, and so on
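As a rough sketch of this data model, the bank example could be stored as a list of fixed-length vectors with one label per vector. The feature names, the numeric encodings, and all values below are invented for illustration.

```python
# Hypothetical loan applicants; every name and value here is made up.
feature_names = ["age", "income", "monthly_housing_cost",
                 "years_of_education", "family_size", "current_loans"]

# Each row is one datapoint (one applicant); each column is one feature.
datapoints = [
    [25, 32000, 900, 12, 1, 2],
    [47, 85000, 1500, 16, 4, 0],
    [36, 54000, 1100, 14, 3, 1],
]
labels = ["risky", "safe", "safe"]  # one class label per datapoint

dimensionality = len(feature_names)

# Every vector must have the same fixed dimensionality.
assert all(len(row) == dimensionality for row in datapoints)

# Look up a single feature of a single datapoint by name.
income_index = feature_names.index("income")
print(datapoints[1][income_index])  # prints 85000
```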

### Machine learning resources

• Data

• NIPS 2003 feature selection contest

• mldata.org

• UCI machine learning repository

• Contests

• Kaggle

• Software

• Python scikit-learn

• R

### Textbook

• Not required but highly recommended for beginners

• Introduction to Machine Learning by Ethem Alpaydin (2nd edition, 2010, MIT Press). Written by a computer scientist; the material is accessible with a basic background in probability and linear algebra.

• Applied Predictive Modeling by Kuhn and Johnson (2013, Springer). A more recent book that focuses on practical modeling.

### Some practical techniques

• Combination of various methods

• Parameter tuning

• Trade-off between error and model complexity

• Data pre-processing

• Normalization

• Standardization

• Feature selection
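The two pre-processing steps named above can be sketched for a single feature column. The slides do not define the terms, so this example assumes one common reading: "normalization" as min-max rescaling to the range [0, 1] and "standardization" as shifting to zero mean and scaling to unit standard deviation. The function names and the data are illustrative.

```python
import math

def min_max_normalize(column):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def standardize(column):
    """Shift to mean 0 and scale to population standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]

incomes = [20000, 40000, 60000, 80000]  # made-up feature column
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
print(normalized)
print(standardized)
```

Either transform puts features measured on very different scales (for example age and income) on a comparable footing, which matters for distance-based classifiers.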

### Background

• Basic linear algebra and probability

• Vectors

• Dot products

• Eigenvectors and eigenvalues

• See Appendix of textbook for probability background

• Mean

• Variance

• Gaussian/Normal distribution
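For readers who want to check their background, the quantities listed above can be computed in a few lines of plain Python. This is only a refresher sketch; the helper names and sample numbers are chosen for illustration, and the variance shown is the population variance (dividing by n).

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mat_vec(matrix, v):
    """Matrix-vector product; each row of the matrix is dotted with v."""
    return [dot(row, v) for row in matrix]

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Population variance: average squared deviation from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def gaussian_pdf(x, mu, sigma2):
    """Density of the normal distribution with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

print(dot([1, 2, 3], [4, 5, 6]))  # prints 32

# v is an eigenvector of A with eigenvalue 3 because A v equals 3 v.
A = [[2, 1],
     [1, 2]]
v = [1, 1]
print(mat_vec(A, v))  # prints [3, 3]

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data), variance(data))  # prints 5.0 4.0

# The standard normal density at 0 is 1 / sqrt(2 * pi).
print(gaussian_pdf(0.0, 0.0, 1.0))
```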

### Assignments

• Implementation of basic classification algorithms with Perl and Python

• Nearest Means

• Naïve Bayes

• K-nearest neighbors

• Cross validation scripts

• Experiment with various algorithms on assigned datasets
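To give a sense of the scale of these assignments, here is a minimal sketch of a nearest-means classifier evaluated with k-fold cross validation. It is not the assignment solution or a required interface: the function names, the fold-assignment scheme (every k-th datapoint, no shuffling), and the two-class toy data are all assumptions made for this example.

```python
def fit_nearest_means(vectors, labels):
    """Compute the mean vector of each class from the training data."""
    sums, counts = {}, {}
    for vec, lab in zip(vectors, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + x for s, x in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict_nearest_means(means, vec):
    """Assign `vec` to the class whose mean is closest in Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(means, key=lambda lab: sq_dist(means[lab], vec))

def cross_validate(vectors, labels, k):
    """k-fold cross validation: each datapoint is held out for testing exactly once."""
    n = len(vectors)
    correct = 0
    for fold in range(k):
        # Simplification: fold membership is every k-th index, with no shuffling.
        test_idx = set(range(fold, n, k))
        train_x = [vectors[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        means = fit_nearest_means(train_x, train_y)
        correct += sum(
            predict_nearest_means(means, vectors[i]) == labels[i] for i in test_idx
        )
    return correct / n  # fraction of held-out datapoints classified correctly

# Made-up, well-separated two-class data.
vectors = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [1.1, 1.2],
           [5.0, 5.0], [5.2, 4.8], [4.9, 5.1], [5.1, 5.2]]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

means = fit_nearest_means(vectors, labels)
accuracy = cross_validate(vectors, labels, k=4)
print(predict_nearest_means(means, [1.0, 0.9]), accuracy)  # prints a 1.0
```

A real cross-validation script would typically shuffle the data or stratify the folds by class before splitting; that step is left out here to keep the sketch short.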

### Project

• Some ideas:

• Experiment with Kaggle and NIPS 2003 feature selection datasets

• Experimental performance study of various machine learning techniques on a given dataset, for example a comparison of feature selection methods with a fixed classifier.

### Exams

• One midterm exam

• Final exam

• What to expect on the exams:

• Basic conceptual understanding of machine learning techniques

• Be able to apply techniques to simple datasets

• Basic runtime and memory requirements

• Simple modifications of the techniques