Machine Learning


PowerPoint Slideshow about 'Machine Learning' - delu


Presentation Transcript

Machine Learning

Lecture # 1

Contents
  • Why is machine learning (ML) useful?
  • What is ML?
  • Key steps of learning
  • Types of ML algorithms
Why Machine Learning
  • Computational power is available (Resource)
  • Recent progress in algorithms and theory (Resource)
  • Growing flood of online data (Requirement)
  • Three niches of ML
    • Data mining: using historical data to improve decisions, e.g. turning medical records into medical knowledge
    • Software applications we can’t program by hand, e.g. speech recognition, handwriting recognition, autonomous driving
    • Self-customizing programs, e.g. Amazon or newsreaders that learn user interests
Problems Too Difficult to Program by Hand
  • Speech recognition
  • Face recognition
  • Robotics control
Problems Too Difficult to Program by Hand
  • It is very hard to write programs that solve problems like recognizing a face.
    • We don’t know what program to write because we don’t know how our brain does it.
    • Even if we had a good idea about how to do it, the program might be horrendously complicated.
  • Instead of writing a program by hand, we collect lots of examples that specify the correct output for a given input.
  • A machine learning algorithm then takes these examples and produces a program that does the job.
    • The program produced by the learning algorithm may look very different from a typical hand-written program. It may contain millions of numbers.
    • If we do it right, the program works for new cases as well as the ones we trained it on.
Software that Customizes to User

www.Amazon.com

www.Netflix.com

What is ML? (1/2)
  • Field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959)
  • Study of algorithms that improve their performance P at some task T with experience E (Tom Mitchell, 1997)
  • Example (checkers):
    • T: playing checkers
    • P: percentage of games won
    • E: playing games against itself
  • Well-defined learning task: <P, T, E>
What is ML? (2/2)
  • Handwriting recognition
    • Task T: recognizing and classifying handwritten words within images
    • Performance P: percent of words correctly classified
    • Training experience E: a database of handwritten words with given classifications
  • ML course grade prediction
    • Task T: predicting student grades for the ML course
    • Performance P: percent of grades correctly predicted
    • Training experience E: previous courses taken by the students and the corresponding grades
Learning: Key Steps (1/4)
  • Data: what past experience can we rely on?
    • Names and grades of students in past ML courses
    • Academic records of past and current students

(Figure labels: training data, current data)

Learning: Key Steps (2/4)
  • Assumptions: made to simplify the learning problem
    • The course has remained roughly the same over the years
    • Each student performs independently of the others
  • Representation
    • Academic records are rather diverse, so we might limit the summaries to a select few courses. For example, we summarize the i-th student (say, Peter) with the vector x_i = [A C B], where each grade may correspond to a numeric value.
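A minimal sketch of this representation step. The grade-to-number table and the function name are illustrative assumptions; the slides do not specify which numeric values to use.

```python
# Hypothetical mapping from letter grades to numbers (an assumption,
# not taken from the slides; any consistent ordering would do).
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def encode_grades(grades):
    """Turn a list of letter grades into a numeric feature vector."""
    return [GRADE_POINTS[g] for g in grades]


# The i-th student (say, Peter) summarized by grades in three chosen courses.
x_i = encode_grades(["A", "C", "B"])
print(x_i)  # [4.0, 2.0, 3.0]
```

Once grades are numbers, distances between students can be computed, which the nearest neighbor classifier relies on.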

Learning: Key Steps (3/4)
  • Estimation
    • Given the training data, we need to find a mapping from “input vectors” x to “labels” y encoding the grades for the ML course.
    • Possible solution (nearest neighbor classifier): for any student x, find the “closest” student x_i in the training set and predict y_i, the grade of that closest student.
  • Evaluation: how can we tell how good our predictions are?
    • We can wait until the end of this course
    • We can try to assess the accuracy based on the available data
    • Possible solution
      • Divide the training set into training and test subsets
      • Train the classifier on the training subset and evaluate it on the test subset
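The estimation and evaluation steps can be sketched as follows. The student records are made up for illustration, and the 75/25 split ratio, the fixed random seed, and the function names are assumptions rather than anything prescribed by the lecture.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def predict_nearest(train, x):
    """Return the label of the training example closest to x."""
    _, label = min(train, key=lambda pair: euclidean(pair[0], x))
    return label


def accuracy(train, test):
    """Fraction of test examples whose label is predicted correctly."""
    correct = sum(predict_nearest(train, x) == y for x, y in test)
    return correct / len(test)


# Made-up records: (numeric grades in three earlier courses, ML course grade).
records = [
    ([4.0, 2.0, 3.0], "B"),
    ([4.0, 4.0, 4.0], "A"),
    ([2.0, 1.0, 2.0], "C"),
    ([3.0, 3.0, 3.0], "B"),
    ([4.0, 3.0, 4.0], "A"),
    ([1.0, 2.0, 1.0], "C"),
    ([3.0, 2.0, 3.0], "B"),
    ([2.0, 2.0, 1.0], "C"),
]

# Shuffle, then hold out a quarter of the records as the test subset.
rng = random.Random(0)
shuffled = records[:]
rng.shuffle(shuffled)
split = int(0.75 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]

print("prediction for a new student:", predict_nearest(train, [4.0, 4.0, 3.0]))
print("accuracy on the test subset:", accuracy(train, test))
```

With so few records the accuracy estimate is very noisy; the point is only that the test subset is never used when making the prediction rule.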
Learning: Key Steps (4/4)
  • Model selection
    • Refinement
      • Choose another classifier (instead of nearest neighbor)
      • Choose a different representation (e.g. base the summaries on a different set of courses)
      • Relax the assumptions (e.g. perhaps students work in groups)
    • Analysing the performance: we have to rely on our method of evaluating the accuracy of our predictions to select among the possible refinements
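One way to picture model selection is to score several candidate classifiers on held-out data and keep the best one. The sketch below compares k-nearest-neighbor classifiers for a few values of k; the choice of k-NN as the family of candidates, the data, and all names are assumptions made for illustration.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority vote among the k training examples closest to x."""
    by_distance = sorted(
        train,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def held_out_accuracy(train, held_out, k):
    """Fraction of held-out examples that k-NN labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return hits / len(held_out)


# Made-up data: (numeric grades in two earlier courses, ML course grade).
train = [
    ([4.0, 4.0], "A"), ([4.0, 3.0], "A"),
    ([3.0, 3.0], "B"), ([3.0, 2.0], "B"),
    ([1.0, 2.0], "C"), ([2.0, 1.0], "C"),
]
held_out = [([4.0, 3.5], "A"), ([2.5, 2.5], "B"), ([1.5, 1.5], "C")]

# Evaluate each candidate and keep the one with the best held-out accuracy.
scores = {k: held_out_accuracy(train, held_out, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, "selected k:", best_k)
```

The same loop works for other refinements, such as alternative representations: each candidate is scored with the same evaluation method and the best is kept.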

Types of ML Algorithms

The main types are:

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
  • Semi-supervised learning
Supervised Learning
  • A process of finding a model that describes and distinguishes data classes or concepts, so that it can predict the class of objects whose class label is unknown.
  • Given a collection of records (training set)
    • Each record contains a set of attributes; one of the attributes is the class.
  • Goal: previously unseen records should be assigned a class as accurately as possible.
    • A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
Supervised Learning

(Figure: training set, with attributes of input data and a class variable or output → learn classifier → model → test set)

Supervised Learning: Application
  • Direct Marketing
    • Goal: Reduce cost of mailing by targeting a set of consumers likely to buy a new cell-phone product.
    • Approach:
      • Use the data for a similar product introduced before.
      • We know which customers decided to buy and which decided otherwise. This {buy, don’t buy} decision forms the class attribute.
      • Collect various demographic, lifestyle, and company-interaction related information about all such customers.
        • Type of business, where they stay, how much they earn, etc.
      • Use this information as input attributes to learn a classifier model.
Regression
  • Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.
  • Extensively studied in the statistics and neural network fields.
  • Examples:
    • Predicting sales amounts of a new product based on advertising expenditure.
    • Predicting wind velocities as a function of temperature, humidity, air pressure, etc.
    • Time series prediction of stock market indices.
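As a small illustration of the linear case, the sketch below fits a straight line by ordinary least squares to made-up advertising and sales figures. The data, the units, and the function name are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: advertising expenditure and resulting sales (arbitrary units).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(spend, sales)
predicted = slope * 6.0 + intercept  # predicted sales at an unseen spend level
print(f"sales = {slope:.2f} * spend + {intercept:.2f}; prediction at 6.0: {predicted:.2f}")
```

Unlike classification, the predicted quantity here is a real number rather than one of a fixed set of classes.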

(Figure: plot with axes labeled Rupees and feet²)

Unsupervised Learning
  • Unlike supervised learning, which analyses class-labeled data objects, clustering analyses data objects without consulting class labels. In fact, class labels are not present in the data because they are not known.
  • Major questions in clustering are:
    • Are there any “groups” in the data?
    • What is each group?
    • How many groups are there?
    • How can we identify them?

Clustering Definition
  • Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that
    • Data points in one cluster are more similar to one another.
    • Data points in separate clusters are less similar to one another.
  • Similarity Measures:
    • Euclidean Distance if attributes are continuous.
    • Other Problem-specific Measures.
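The definition above does not name an algorithm. One common choice for continuous attributes with Euclidean distance is k-means, sketched here under the assumptions that the number of clusters and the initial centers are supplied by the caller; the data points are made up.

```python
import math


def closest(point, centers):
    """Index of the center nearest to point under Euclidean distance."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, centers, iterations=10):
    """Alternate between assigning points to centers and recomputing centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[closest(p, centers)].append(p)
        # New center = mean of its group; an empty group keeps its old center.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    labels = [closest(p, centers) for p in points]
    return centers, labels


# Made-up 2-D points forming two visibly separate groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Points that share a label are closer to their own center than to the other one, which matches the goal that points within a cluster are more similar to one another than to points in other clusters.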
Illustrating Clustering

(Figure: Euclidean distance based clustering in 3-D space. Intracluster distances are minimized; intercluster distances are maximized.)

Clustering: Application
  • Market Segmentation:
    • Goal: subdivide a market into distinct subsets of customers where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix.
    • Approach:
      • Collect different attributes of customers based on their geographical and lifestyle related information.
      • Find clusters of similar customers.
      • Measure the clustering quality by observing buying patterns of customers in same cluster vs. those from different clusters.
Types of ML Algorithms
  • Reinforcement learning, contrasted with supervised learning
    • Supervised learning:
      • The correct output for each training input is available
    • Reinforcement learning:
      • Some evaluation of the learner's output for an input is available, but not the exact correct output
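To make the contrast concrete, here is a minimal sketch of learning from evaluative feedback only: a two-action problem in which the learner sees a reward for the action it tried but is never told which action was correct. The epsilon-greedy strategy, the reward probabilities, and all names are assumptions chosen for illustration; the slides do not prescribe a particular method.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Estimate the average reward of each action by trial and error."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    estimates = [0.0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(estimates)), key=estimates.__getitem__)
        # The environment returns only a reward (1 or 0), not the right answer.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the running average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical environment: action 1 pays off more often than action 0.
estimates, counts = run_bandit([0.2, 0.8])
print("estimated rewards:", estimates)
print("times each action was tried:", counts)
```

A supervised learner would instead be given the label "action 1" directly for each training input.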
Reference Literature
  • Textbook: Machine Learning by Tom Mitchell
  • Reference book: Pattern Recognition and Machine Learning by C. Bishop