CS 657/790 Machine Learning and Data Mining Course Introduction


Student Survey

  • Please hand in sheet of paper with:

    • Your name and email address

    • Your classification (e.g., 2nd year computer science PhD student)

    • Your experience with MATLAB (none, some or much)

    • Your undergraduate degree (when, what, where)

    • Your AI experience (courses at UWM or elsewhere)

    • Your programming experience


Course Information

  • Course Instructor: Joe Bockhorst

    • email: [email protected]

    • office: 1155 EMS

    • Course webpage: http://www.uwm.edu/~joebock/790.html

    • office hours: ???

      • Possible times:

        • before class on Monday (3:30-5:30)

        • Monday morning

        • Wednesday morning

        • after class Monday (7:00-9:00)


Textbook & Reading Assignment

  • Machine Learning (Tom Mitchell)

    • Bookstore in union, $140 new

    • Amazon.com hard cover: $125 new, $80 used

    • Amazon.com soft cover: < $30

  • Read (posted on class web page)

    • Preface

    • Chapter 1

    • Sections 6.1, 6.2, 6.9, 6.10

    • Sections 8.1, 8.2


PowerPoint vs. Whiteboard

  • PowerPoint encourages words over pictures (not good)

  • But PowerPoint can be saved, tweaked, easily shared, …

    • Notes posted on course website following lecture

  • Your thoughts?


Full Disclosure

  • Slides are a combination of

    • Jude Shavlik’s notes from the UW-Madison machine learning course (a professor I had)

    • Textbook Slides (Google “machine learning textbook”)

    • My notes


Class Email List

  • Is there one?


Course Outline

  • 1st half covers supervised learning

    • Algorithms: support vector machines, neural networks, probabilistic models …

    • Methodology

  • 2nd half covers graphical probability models

    • Powerful statistical models very useful for learning in complex and/or noisy settings


Course "Style"

  • Primarily algorithmic & experimental

  • Some theory, both mathematical & conceptual (much on statistics)

  • "Hands on" experience, interactive lectures/discussions

  • Broad survey of many ML subfields

    • "symbolic" (rules, decision trees)

    • "connectionist" (neural nets)

    • Support Vector Machines

    • statistical ("Bayes rule")

    • genetic algorithms (if time)


Two Major Goals

  • to understand what a learning system should do

  • to understand how (and how well) existing systems work


Background Assumed

  • Programming

    • Data structures and algorithms

      • CS 535

  • Math

    • Calculus (partial derivatives)

    • Simple probability & statistics


Programming Assignments in MATLAB

  • Why MATLAB?

    • Fast prototyping

    • Integrated plotting

    • Widely used in academia (industry too?)

    • Will save you time in the long run

  • Why not MATLAB?

    • Proprietary software

    • Harder to work from home

  • Optional Assignment: familiarize yourself with MATLAB, use MATLAB help system


Student Computer Labs

  • E256, E280, E285, E384, E270

  • All have MATLAB installed under Windows XP


Requirements

  • Bi-weekly programming plus perhaps some “paper & pencil” homework

    • "hands on" experience valuable

    • HW0 – build a dataset

    • HW1 & HW2 supervised learning algorithms

    • HW3 & HW4 graphical probability models

  • Midterm exam (after about 8-10 weeks)

  • Final exam

  • Find project of your choosing

    • during last 4-5 weeks of class


Grading

  • HW's: 25%

  • Project: 20%

  • Midterm: 20%

  • Final: 30%

  • Quality Discussion: 5%
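
As a worked example of how these weights combine, here is a minimal Python sketch. The function name `course_grade`, the component keys, and the sample scores are hypothetical, and it assumes every component is scored on a 0-100 scale.

```python
# Weights taken from the Grading slide (they sum to 100%).
WEIGHTS = {
    "homework": 0.25,
    "project": 0.20,
    "midterm": 0.20,
    "final": 0.30,
    "discussion": 0.05,
}


def course_grade(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90, "project": 80, "midterm": 70, "final": 85, "discussion": 100}
print(round(course_grade(example), 2))
```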


Late HW's Policy

  • HW's due @ 4pm

  • you have 5 late days to use over the semester

    • (Fri 4pm → Mon 4pm is 1 late "day")

  • SAVE UP late days!

    • extensions only for extreme cases

  • Penalty points after late days exhausted

    • 10% per day

  • Can't be more than one week late
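
The late-day arithmetic can be sketched as follows. This is one possible reading of the policy, not an official rule: it assumes the 10% penalty is taken off the homework's own score, that "one week" means 7 late days, and that a submission past that limit scores 0. The function name `apply_late_policy` is hypothetical.

```python
LATE_DAYS_ALLOWED = 5    # free late days per semester (from the slide)
PENALTY_PER_DAY = 0.10   # 10% per late day once free days are exhausted
MAX_DAYS_LATE = 7        # assumption: "one week" counted as 7 late days


def apply_late_policy(score, days_late, free_days_left=LATE_DAYS_ALLOWED):
    """Return (adjusted_score, free_days_left) for one homework.

    days_late counts late "days" as defined on the slide, so a
    Fri 4pm -> Mon 4pm slip is passed in as 1.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_DAYS_LATE:
        # Assumed outcome: not accepted, so no credit and no late days spent.
        return 0.0, free_days_left
    used = min(days_late, free_days_left)   # spend free late days first
    penalized = days_late - used            # remaining days cost 10% each
    factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized)
    return score * factor, free_days_left - used
```

For instance, a homework handed in 3 late days after the deadline with only 1 free day left would lose 20% under these assumptions.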


Machine Learning Vs Data Mining

  • Machine Learning: computer algorithms that improve automatically through experience [Mitchell].

  • Data Mining: Extracting knowledge from large amounts of data. [Han & Kamber] (synonym: knowledge discovery in databases (KDD))


What’s the difference? Topics in ML and DM texts (Mitchell vs. Han & Kamber)

  • Both ML and DM: supervised learning, decision trees, neural nets, Bayesian networks, k-nearest neighbor, genetic algorithms, unsupervised learning (clustering in DM jargon), …

  • ML only: reinforcement learning, learning theory, evaluating learning systems, using domain knowledge, inductive logic programming, …

  • DM only: data warehouse, OLAP, query languages, association rules, presentation, …

We’ll try to cover topics in red


The Learning Problem

  • Learning = improving with experience

  • Example: learn to play checkers

  • Improve over task T,

  • with respect to performance measure P,

  • based on experience E

  • T: Play Checkers

  • P: % of games won

  • E: games played against self


Famous Example: Discovering Genes

  • T: find genes in DNA sequences

    • ACGTGCATGTGTGAACGTGTGGGTCTGATGATGT…

  • P: % of genes found

  • E: experimentally verified genes

* Prediction of Complete Gene Structures in Human Genomic DNA, Burge & Karlin, J. Molecular Biology, 1997, 268:78-94


Famous Example 2: Autonomous Vehicles Driving

  • T: drive vehicle

  • P: reach destination

  • E: machine observation of human driver


ML key to winning DARPA Grand Challenge

Stanford team won 2005 driverless vehicle race across Mojave Desert

“The robot's software system relied predominately on state-of-the-art AI technologies, such as machine learning and probabilistic reasoning.”

[Winning the DARPA Grand Challenge, Thrun et al., Journal of Field Robotics, 2006]


Why study machine learning (data mining)?

  • Data is plentiful

    • Retail, video, images, speech, text, DNA, bio-medical measurements, …

  • Computational power is available

  • Budding Industry

  • ML has great applications

  • ML still relatively immature


Next Time: HW0 – Create Your Own Dataset

  • Think about this

    • will need to create it by week after next

  • Google to find:

    • UCI archive (or UCI KDD archive)

    • UCI ML archive (UCI machine learning repository)


HW0 – Your “Personal Concept”

  • Step 1: Choose a Boolean (true/false) concept

    • Subjective Judgement

      • Books I like/dislike

      • Movies I like/dislike

      • Web pages I like/dislike

    • “Time will tell” concepts

      • Stocks to buy

      • Medical outcomes

    • Sensory interpretation

      • Face recognition (See text)

      • Handwritten digit recognition

      • Sound recognition


HW0 – Your “Personal Concept”

  • Step 2: Choose a feature space

    • We will use fixed-length feature vectors

      • Choose N features (this defines a space)

      • Each feature has V_i possible values

      • Each example is represented by a vector of N feature values

        (i.e., is a point in the feature space)

        e.g.: <red, 50, round> for the features <color, weight, shape>

    • Feature Types (in HW0 we will use a subset; see next slide)

      • Boolean

      • Nominal

      • Ordered

      • Hierarchical

  • Step 3: Collect examples (“I/O” pairs)
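
A fixed-length feature vector can be sketched in Python as below. This is only an illustration: the `Feature` class, the `make_example` helper, and the choice of which type each feature gets are assumptions, built around the slide's <red, 50, round> example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    kind: str  # one of "boolean", "nominal", "ordered", "hierarchical"


# Hypothetical feature space with N = 3 features, following the slide's example.
FEATURES = (
    Feature("color", "nominal"),
    Feature("weight", "ordered"),
    Feature("shape", "hierarchical"),
)


def make_example(values, label):
    """Pair a fixed-length feature vector with its Boolean label (an "I/O" pair)."""
    if len(values) != len(FEATURES):
        raise ValueError(f"expected {len(FEATURES)} feature values, got {len(values)}")
    return tuple(values), bool(label)


# <red, 50, round>, labeled here as a positive example of some concept.
vector, label = make_example(["red", 50, "round"], True)
print(dict(zip((f.name for f in FEATURES), vector)), label)
```

Every example has exactly N values in the same order, which is what makes it a point in the feature space.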


Standard Feature Types for representing training examples – source of “domain knowledge”

  • Nominal

    • No relationship among possible values

      e.g., color ∈ {red, blue, green} (vs. color = 1000 Hertz)

  • Linear (or Ordered)

    • Possible values of the feature are totally ordered

      e.g., size ∈ {small, medium, large} ← discrete

      weight ∈ [0…500] ← continuous

  • Hierarchical

    • Possible values are partially ordered in an ISA hierarchy

      e.g., for shape: closed → {polygon, continuous}; polygon → {square, triangle}; continuous → {circle, ellipse}
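
An ISA hierarchy for one feature can be sketched as a child-to-parent map. The links below are read off the labels in the slide's shape figure (closed, polygon, continuous, square, triangle, circle, ellipse), so the exact tree is an assumption, and the helper names `is_a` and `leaves` are hypothetical.

```python
# Child -> parent links for the shape hierarchy (assumed from the figure labels).
PARENT = {
    "polygon": "closed",
    "continuous": "closed",
    "square": "polygon",
    "triangle": "polygon",
    "circle": "continuous",
    "ellipse": "continuous",
}


def is_a(value, ancestor):
    """True if value equals ancestor or lies below it in the ISA hierarchy."""
    while value is not None:
        if value == ancestor:
            return True
        value = PARENT.get(value)  # None once we step past the root
    return False


def leaves():
    """Values with no children; a flat encoding would keep only these."""
    return sorted(set(PARENT) - set(PARENT.values()))
```

The ordering is only partial: a square is a polygon and a closed shape, but square and circle are not comparable with each other.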


Example Hierarchy (KDD* Journal, Vol 5, No. 1-2, 2001, page 17)

  • Structure of one feature!

    • Product → 99 product classes (e.g., Pet Foods, Tea) → 2302 product subclasses (e.g., Dried Cat Food, Canned Cat Food) → ~30k products (e.g., Friskies Liver, 250g)

  • “the need to be able to incorporate hierarchical (knowledge about data types) is shown in every paper.”

    - From the editors' intro to the special issue (on applications) of KDD journal, Vol 5, 2001

* Officially, “Data Mining and Knowledge Discovery”, Kluwer Publishers


Our Feature Types (for homeworks)

  • Discrete

    • tokens (char strings, w/o quote marks and spaces)

  • Continuous

    • numbers (ints or floats)

      • If only a few possible values (e.g., 0 & 1) use discrete

    • i.e., merge nominal and discrete-ordered

      (or convert discrete-ordered into 1,2,…)

    • We will ignore hierarchy info and

      only use the leaf values (it is rare anyway)
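
The discrete/continuous split can be sketched as a small heuristic. This is an assumption-laden illustration, not the homework's required format: the function names are hypothetical, and the cutoff for "only a few possible values" (`min_distinct_numeric`) is an arbitrary choice.

```python
def is_number(token):
    """True if the token parses as an int or float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def infer_feature_type(column, min_distinct_numeric=3):
    """Classify a column of string tokens as "discrete" or "continuous".

    A column is continuous only if every value is numeric and there are
    at least min_distinct_numeric distinct values; numeric columns with
    only a few values (e.g., 0 and 1) are treated as discrete.
    """
    distinct = set(column)
    all_numeric = all(is_number(tok) for tok in distinct)
    if all_numeric and len(distinct) >= min_distinct_numeric:
        return "continuous"
    return "discrete"
```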


Today’s Topics

  • Creating a dataset of fixed-length feature vectors

  • HW0 out on-line

    • Due next Monday


Some Famous Examples

  • Car Steering (Pomerleau)

    • Digitized camera image → Learned Function → Steering Angle

  • Medical Diagnosis (Quinlan)

    • Medical record (age = 13, sex = M, wgt = 18) → Learned Function → ill vs healthy

  • DNA Categorization

  • TV-pilot rating

  • Chemical-plant control

  • Backgammon playing

  • WWW page scoring

  • Credit application scoring


HW0: Creating your dataset

  • Choose a dataset

    • based on interest/familiarity

    • meets basic requirements

      • >1000 examples

      • category (function) learned should be binary valued

      • ~500 examples labeled class A,

        other 500 labeled class B

        → Internet Movie Database (IMD)
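
The basic requirements can be checked mechanically once the labels are collected. A minimal sketch, with several assumptions: the function name `check_dataset` is hypothetical, "at least 1000 examples" is used because the 500/500 split on the slide sums to exactly 1000, and the tolerance for "~500 each" (`max_imbalance`) is an arbitrary choice.

```python
from collections import Counter


def check_dataset(labels, min_examples=1000, max_imbalance=0.10):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    counts = Counter(labels)
    n = len(labels)
    if n < min_examples:
        problems.append(f"only {n} examples, need about {min_examples}")
    if len(counts) != 2:
        problems.append(f"target has {len(counts)} classes, must be binary")
    else:
        minority_share = min(counts.values()) / n
        if 0.5 - minority_share > max_imbalance:
            problems.append(f"classes unbalanced: {dict(counts)}")
    return problems
```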


HW0: Creating your dataset

  • IMD has a lot of data that are not discrete or continuous, or not binary-valued for the target function (category)

  • Studio: Name, Country, List of movies

  • Actor: Name, Year of birth, Gender, Oscar nominations, List of movies

  • Director/Producer: Name, Year of birth, List of movies

  • Movie: Title, Genre, Year, Opening Wkend BO receipts, List of actors/actresses, Release season

  • Relations: Studio made Movie; Director directed Movie; Actor acted in Movie; Producer produced Movie


HW0: Creating your dataset

  • Choose a boolean or binary-valued target function (category)

    • Opening weekend box office receipts > $2 million

    • Movie is drama? (action, sci-fi,…)

    • Movies I like/dislike (e.g., TiVo)
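
Each candidate target is just a function from a movie record to true/false. A minimal sketch, assuming movies are stored as dictionaries; the field names (`opening_weekend_receipts`, `genre`) and the sample record are hypothetical.

```python
def opened_big(movie, threshold=2_000_000):
    """Boolean target: opening-weekend box office receipts above $2 million."""
    return movie["opening_weekend_receipts"] > threshold


def is_drama(movie):
    """Boolean target: the movie's genre is drama (vs. action, sci-fi, ...)."""
    return movie["genre"].strip().lower() == "drama"


# Made-up record, for illustration only.
sample = {"title": "Example Movie", "genre": "Drama", "opening_weekend_receipts": 3_500_000}
print(opened_big(sample), is_drama(sample))
```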


HW0: Creating your dataset

  • How to transform the available attributes:

    derive other example attributes (select predictive features)

    • Movie

      • Average age of actors

      • Number of producers

      • Percent female actors

    • Studio

      • Number of movies made

      • Average movie gross

      • Percent movies released in US
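
Derived attributes like these are aggregates over related records. A minimal sketch for two of the movie-level ones, assuming each actor is a dictionary with hypothetical keys `year_of_birth` and `gender` (coded "F"/"M"), and that an actor's age is measured in the movie's release year.

```python
def movie_features(movie_year, actors):
    """Aggregate a movie's cast into fixed-length, movie-level features."""
    if not actors:
        raise ValueError("need at least one actor to compute averages")
    ages = [movie_year - a["year_of_birth"] for a in actors]
    n_female = sum(1 for a in actors if a["gender"] == "F")
    return {
        "avg_actor_age": sum(ages) / len(ages),
        "pct_female_actors": 100.0 * n_female / len(actors),
    }


# Made-up cast, for illustration only.
cast = [
    {"year_of_birth": 1960, "gender": "F"},
    {"year_of_birth": 1970, "gender": "M"},
]
print(movie_features(2000, cast))
```

The variable-length "list of actors" becomes two numbers, so every movie ends up with the same number of features.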


HW0: Creating your dataset

  • Director/Producer

    • Years of experience

    • Most prevalent genre

    • Number of award winning movies

    • Average movie gross

  • Actor

    • Gender

    • Has previous Oscar award or nominations

    • Most prevalent genre


HW0: Creating your dataset

David Jensen’s group at UMass used Naïve Bayes (NB) to predict the following based on attributes they selected and a novel way of sampling from the data:

  • Opening weekend box office receipts > $2 million

    • 25 attributes

    • Accuracy = 83.3%

    • Default accuracy = 56%

  • Movie is drama?

    • 12 attributes

    • Accuracy = 71.9%

    • Default accuracy = 51%

  • http://kdl.cs.umass.edu/proximity/about.html
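
To read these numbers, it helps to compare a classifier's accuracy with a baseline. The sketch below assumes "default accuracy" means the accuracy of always predicting the most common class; the slide does not define the term, and the function names and labels are hypothetical.

```python
from collections import Counter


def default_accuracy(labels):
    """Accuracy of always guessing the majority class."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up labels with a 56/44 split, matching the first task's baseline.
labels = [True] * 56 + [False] * 44
print(default_accuracy(labels))
```

Under that reading, 83.3% against a 56% baseline is a substantial gain, while a classifier scoring close to the baseline would have learned little.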


What Do You Think Machine Learning Means?


What is Learning?

Learning denotes changes in the system that … enable the system to do the same task … more effectively the next time.

- Herbert Simon

Learning is making useful changes in our minds.

- Marvin Minsky


Major Paradigms of Machine Learning

Not in Mitchell’s textbook (will spend 0-2 lectures on this – but also in CS776)

  • Inducing Functions from I/O Pairs

    • Decision trees (e.g., Quinlan’s C4.5 [1993])

    • Connectionism / neural networks (e.g., backprop)

    • Nearest-neighbor methods

    • Genetic algorithms

    • SVM’s

  • Learning without a Teacher

    • Conceptual clustering

    • Self-organizing systems

    • Discovery systems


Major Paradigms of Machine Learning

Will be covered briefly

  • Improving a Multi-Step Problem Solver

    • Explanation-based learning

    • Reinforcement learning

  • Using Preexisting Domain Knowledge Inductively

    • Analogical learning

    • Case-based reasoning

    • Inductive/explanatory hybrids

