Machine learning

Usman Roshan

Dept. of Computer Science

NJIT


What is machine learning

  • “Machine learning is programming computers to optimize a performance criterion using example data or past experience.” (Introduction to Machine Learning, Alpaydin, 2010)

  • Examples:

    • Facial recognition

    • Digit recognition

    • Molecular classification


A little history

  • 1946: ENIAC, one of the first general-purpose electronic computers, is unveiled and used for numerical computations

  • 1950: Alan Turing proposes the Turing test. Can machines think?

  • 1952: Arthur Samuel at IBM writes an early game-playing program for checkers. Knowledge-based systems such as ELIZA (1960s) and MYCIN (1970s) follow later.

  • 1957: Perceptron developed by Frank Rosenblatt. Perceptrons can be combined to form a neural network.

  • Early 1990s: Statistical learning theory emphasizes learning from data instead of rule-based inference.

  • Current status: Used widely in industry. Various approaches are combined, but data-driven methods are prevalent.


Example up close

  • Problem: Recognize images representing digits 0 through 9

  • Input: High dimensional vectors representing images

  • Output: 0 through 9 indicating the digit the image represents

  • Learning: Build a model from “training data”

  • Prediction: Apply the model to “test data” it has not seen during learning
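
The learn-then-predict workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3x3 "images" and their labels are made-up toy data covering just the digits 0 and 1, and the 1-nearest-neighbor rule is one possible model, not necessarily the one used in the course. The names `TRAIN`, `TEST`, `distance`, and `predict` are hypothetical.

```python
import math

# Hypothetical 3x3 "images" flattened into 9-dimensional vectors
# (1 = ink, 0 = blank). Real digit images have far more dimensions.
TRAIN = [
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], 0),  # ring shape, labeled as digit 0
    ([1, 1, 1, 1, 0, 1, 1, 1, 0], 0),  # ring with one pixel missing
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], 1),  # vertical bar, labeled as digit 1
    ([0, 1, 0, 1, 1, 0, 0, 1, 0], 1),  # bar with a small flag
]

# Held-out test data: images the model never saw while "learning".
TEST = [
    ([1, 1, 1, 1, 0, 1, 0, 1, 1], 0),
    ([0, 1, 0, 0, 1, 0, 1, 1, 1], 1),
]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(train, image):
    """Return the label of the closest training image (1-nearest neighbor)."""
    _, label = min(train, key=lambda pair: distance(pair[0], image))
    return label


# Here the "model" is simply the stored training data.
predictions = [predict(TRAIN, image) for image, _ in TEST]
accuracy = sum(p == label for p, (_, label) in zip(predictions, TEST)) / len(TEST)
print(predictions, accuracy)
```

The test labels are used only to score the predictions, never to make them.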


Data model

  • We assume that the data is represented by a set of vectors, all of the same fixed dimensionality.

  • Vector: an ordered list of numbers

  • We may refer to each vector as a datapoint and each dimension as a feature

  • Example:

    • A bank wishes to classify people as risky or safe for a loan

    • Each person is a datapoint, represented by a vector

    • Features may be age, income, mortgage/rent, education, family, current loans, and so on
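
As a rough sketch of the bank example, each applicant can be turned into a fixed-length vector with one entry per feature. The field names, the numeric encodings (for instance, education as years of schooling), and all values below are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    # Hypothetical numeric encodings of the features named above.
    age: int
    income: float
    monthly_housing: float   # mortgage or rent payment
    years_education: int
    dependents: int          # one possible way to encode "family"
    current_loans: int


# The feature order is fixed so that every vector has the same layout.
FEATURES = ["age", "income", "monthly_housing",
            "years_education", "dependents", "current_loans"]


def to_vector(applicant):
    """Map one applicant (a datapoint) to a vector of floats."""
    return [float(getattr(applicant, name)) for name in FEATURES]


# Made-up toy applicants and labels, for illustration only.
applicants = [
    Applicant(34, 58000.0, 1400.0, 16, 2, 1),
    Applicant(22, 21000.0, 900.0, 12, 0, 3),
]
labels = ["safe", "risky"]

X = [to_vector(a) for a in applicants]
print(X)
```

Every row of `X` has the same dimensionality, `len(FEATURES)`, which is the fixed-dimension assumption stated in the data model.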


Machine learning resources

  • Data

    • NIPS 2003 feature selection contest

    • mldata.org

    • UCI machine learning repository

  • Contests

    • Kaggle

  • Software

    • Python scikit-learn

    • R

    • Your own code


Machine learning techniques we will learn in the course


Textbook

  • Not required but highly recommended for beginners

  • Introduction to Machine Learning by Ethem Alpaydin (2nd edition, 2010, MIT Press). Written by a computer scientist; the material is accessible with a basic background in probability and linear algebra.

  • Applied Predictive Modeling by Kuhn and Johnson (2013, Springer). A more recent book that focuses on practical modeling.


Some practical techniques

  • Combination of various methods

  • Parameter tuning

    • Trade-off between error and model complexity

  • Data pre-processing

    • Normalization

    • Standardization

  • Feature selection

    • Discarding noisy features
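
The two pre-processing steps can be sketched as follows. The slides do not define the terms, so this assumes that normalization means min-max scaling of a feature to the range [0, 1] and that standardization means rescaling to zero mean and unit variance; other conventions exist. The function names and the `incomes` column are hypothetical.

```python
import statistics


def min_max_normalize(column):
    """Rescale one feature column to [0, 1] (assumed meaning of normalization)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - lo) / (hi - lo) for x in column]


def standardize(column):
    """Rescale one feature column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Made-up values for a single feature, for illustration only.
incomes = [20000.0, 40000.0, 60000.0, 80000.0]
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
print(normalized)
print(standardized)
```

Both functions operate on one feature at a time, so features measured on very different scales (age versus income, say) end up comparable.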


Background

  • Basic linear algebra and probability

    • Vectors

    • Dot products

    • Eigenvectors and eigenvalues

  • See the appendix of the textbook for probability background

    • Mean

    • Variance

    • Gaussian/Normal distribution
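
For readers who want to check their background, the quantities listed above can be computed directly. This is a minimal sketch using plain Python lists; the function names are hypothetical, and the variance shown is the population variance (dividing by n), which is one of two common conventions.

```python
import math


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, v):
    """Matrix-vector product, with A given as a list of rows."""
    return [dot(row, v) for row in A]


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Population variance: average squared deviation from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def gaussian_pdf(x, mu, sigma2):
    """Density of a normal distribution with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


# v is an eigenvector of A with eigenvalue lam when A v equals lam * v.
A = [[2.0, 1.0], [1.0, 2.0]]
v = [1.0, 1.0]
lam = 3.0
is_eigenpair = mat_vec(A, v) == [lam * x for x in v]

print(dot([1, 2, 3], [4, 5, 6]), mean([2, 4, 6]), variance([2, 4, 6]))
print(is_eigenpair, gaussian_pdf(0.0, 0.0, 1.0))
```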


Assignments

  • Implementation of basic classification algorithms in Perl and Python

    • Nearest Means

    • Naïve Bayes

    • K-nearest neighbors

    • Cross-validation scripts

  • Experiment with various algorithms on assigned datasets
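
To give a flavor of what such an assignment might involve, here is a minimal sketch of a nearest means classifier evaluated with k-fold cross-validation. It is not the course's reference solution: the function names, the way folds are formed (every k-th point), and the two-cluster toy data are all assumptions made for illustration.

```python
def fit_nearest_means(X, y):
    """Compute the mean vector of each class from the training data."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict_nearest_mean(means, x):
    """Assign x to the class whose mean is closest in squared Euclidean distance."""
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(m, x))
    return min(means, key=lambda label: sq_dist(means[label]))


def k_fold_indices(n, k):
    """Yield (train, test) index lists; fold i tests every k-th point from i."""
    for i in range(k):
        test = list(range(i, n, k))
        train = [j for j in range(n) if j % k != i]
        yield train, test


def cross_validate(X, y, k):
    """Return the fraction of points classified correctly when held out."""
    correct = 0
    for train, test in k_fold_indices(len(X), k):
        means = fit_nearest_means([X[j] for j in train], [y[j] for j in train])
        correct += sum(predict_nearest_mean(means, X[j]) == y[j] for j in test)
    return correct / len(X)


# Made-up, well-separated toy data with two classes.
X = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y = ["a", "a", "a", "b", "b", "b"]
accuracy = cross_validate(X, y, k=3)
print(accuracy)
```

Each datapoint is held out exactly once, so the reported accuracy is measured only on points the model did not see while computing the class means. In practice the data would usually be shuffled before forming folds.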


Project

  • Some ideas:

    • Experiment with Kaggle and NIPS 2003 feature selection datasets

    • Experimental performance study of various machine learning techniques on a given dataset. For example, a comparison of feature selection methods with a fixed classifier.


Exams

  • One midterm exam

  • Final exam

  • What to expect on the exams:

    • Basic conceptual understanding of machine learning techniques

    • Ability to apply techniques to simple datasets

    • Basic runtime and memory requirements of the techniques

    • Simple modifications of the techniques


Grade breakdown

  • Assignments and project worth 50%

  • Exams worth 50%

