
Machine Learning

Machine Learning. Spring 2010 Rong Jin. CSE847 Machine Learning. Instructor: Rong Jin Office Hour: Tuesday 4:00pm-5:00pm Thursday 4:00pm-5:00pm Textbook Machine Learning The Elements of Statistical Learning Pattern Recognition and Machine Learning Many subjects are from papers



  1. Machine Learning Spring 2010 Rong Jin

  2. CSE847 Machine Learning • Instructor: Rong Jin • Office Hour: • Tuesday 4:00pm-5:00pm • Thursday 4:00pm-5:00pm • Textbook • Machine Learning • The Elements of Statistical Learning • Pattern Recognition and Machine Learning • Many subjects are from papers • Web site: http://www.cse.msu.edu/~cse847

  3. Requirements • ~10 homework assignments • Course project • Topic: visual object recognition • Data: over one million images with extracted visual features • Objective: build a classifier that automatically identifies the class of objects in images • Midterm exam & final exam

  4. Goal • Familiarize you with the state of the art in machine learning • Breadth: many different techniques • Depth: project • Hands-on experience • Develop a machine learning way of thinking • Learn how to model real-world problems with machine learning techniques • Learn how to deal with practical issues

  5. Course Outline • Theoretical Aspects • Information Theory • Optimization Theory • Probability Theory • Learning Theory • Practical Aspects • Supervised Learning Algorithms • Unsupervised Learning Algorithms • Important Practical Issues • Applications

  6. Today’s Topics • Why machine learning? • Example: learning to play backgammon • General issues in machine learning

  7. Why Machine Learning? • Past: most computer programs were written by hand • Future: computers should be able to program themselves through interaction with their environment

  8. Recent Trends • Recent progress in algorithms and theory • Growing flood of online data • Computational power is available • Growing industry

  9. Three Niches for Machine Learning • Data mining: using historical data to improve decisions • Medical records → medical knowledge • Software applications that are difficult to program by hand • Autonomous driving • Image classification • User modeling • Automatic recommender systems

  10. Typical Data Mining Task • Given: • 9147 patient records, each describing a pregnancy and birth • Each patient record contains 215 features • Task: • Identify classes of future patients at high risk for emergency Cesarean section

  11. Data Mining Results • One of 18 learned rules: • If no previous vaginal delivery, and abnormal 2nd trimester ultrasound, and malpresentation at admission • Then probability of emergency C-section is 0.6

  12. Credit Risk Analysis • Learned rules: • If Other-Delinquent-Account > 2 and Number-Delinquent-Billing-Cycles > 1, then Profitable-Customer? = no • If Other-Delinquent-Account = 0 and (Income > $30K or Years-of-Credit > 3), then Profitable-Customer? = yes
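The two learned rules on this slide can be read as a small decision procedure. The following is a minimal sketch, assuming the conditions inside each rule are joined by AND (the slide lists them without connectives) and that a customer matched by neither rule is left unclassified. The function name `profitable_customer` and its parameter names are hypothetical, chosen only for illustration.

```python
from typing import Optional


def profitable_customer(other_delinquent_accounts: int,
                        delinquent_billing_cycles: int,
                        income: float,
                        years_of_credit: float) -> Optional[bool]:
    """Apply the two learned rules; return None when neither rule fires."""
    # Rule 1: several delinquent accounts and billing cycles -> not profitable.
    if other_delinquent_accounts > 2 and delinquent_billing_cycles > 1:
        return False
    # Rule 2: no delinquent accounts and (sufficient income or credit history).
    if other_delinquent_accounts == 0 and (income > 30_000 or years_of_credit > 3):
        return True
    # Assumption: the slide shows only two rules, so other cases stay undecided.
    return None


print(profitable_customer(3, 2, 50_000, 5))   # rule 1 fires
print(profitable_customer(0, 0, 20_000, 4))   # rule 2 fires via years of credit
print(profitable_customer(1, 0, 50_000, 5))   # neither rule fires
```

Returning `None` rather than a default label keeps the sketch honest about what the two rules do not cover.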

  13. Programs too Difficult to Program By Hand • ALVINN drives 70mph on highways

  14. Programs too Difficult to Program By Hand • ALVINN drives 70mph on highways

  15. Programs too Difficult to Program By Hand • Visual object recognition

  16. Image Retrieval using Texts

  17. What to Recommend? (Software that Models Users' History) • Description: A high-school boy is given the chance to write a story about an up-and-coming rock band as he accompanies it on their concert tour. Recommend: ? • Description: A homicide detective and a fire marshal must stop a pair of murderers who commit videotaped crimes to become media darlings. Rating: • Description: A biography of sports legend Muhammad Ali, from his early days to his days in the ring. Rating: No • Description: A young adventurer named Milo Thatch joins an intrepid group of explorers to find the mysterious lost continent of Atlantis. Recommend: ? • Description: Benjamin Martin is drawn into the American Revolutionary War against his will when a brutal British commander kills his son. Rating: Yes

  18. Netflix Contest

  19. Relevant Disciplines • Artificial Intelligence • Statistics (particularly Bayesian Stat.) • Computational complexity theory • Information theory • Optimization theory • Philosophy • Psychology • …

  20. Today’s Topics • Why machine learning? • Example: learning to play backgammon • General issues in machine learning

  21. What is the Learning Problem • Learning = Improving with experience at some task • Improve over task T • With respect to performance measure P • Based on experience E • Example: Learning to Play Backgammon • T: Play backgammon • P: % of games won in world tournament • E: opportunity to play against itself

  22. Backgammon • More than 10^20 states (boards) • The best human players see only a small fraction of all boards during a lifetime • Searching is hard because of the dice (branching factor > 100)

  23. TD-Gammon by Tesauro (1995) • Trained by playing against itself • Now plays approximately as well as the best human players

  24. Learn to Play Chess • Task T: Play chess • Performance P: Percent of games won in the world tournament • Experience E: • What experience? • How shall it be represented? • What exactly should be learned? • What specific algorithm to learn it?

  25. Choose a Target Function • Goal: • Policy π: b → m • Choice of value function • V: b, m → ℝ • (b = board, m = move, ℝ = real values)

  26. Choose a Target Function • Goal: • Policy π: b → m • Choice of value function • V: b, m → ℝ • V: b → ℝ • (b = board, m = move, ℝ = real values)

  27. Value Function V(b): Example Definition • If b is a final board that is won: V(b) = 1 • If b is a final board that is lost: V(b) = -1 • If b is not a final board: V(b) = E[V(b*)], where b* is the final board reached from b by playing optimally

  28. Representation of Target Function V(b) • Same value for each board: no learning • Lookup table (one entry for each board): no generalization • Summarize experience into • Polynomials • Neural networks

  29. Example: Linear Feature Representation • Features: • pb(b), pw(b) = number of black (white) pieces on board b • ub(b), uw(b) = number of unprotected black (white) pieces • tb(b), tw(b) = number of black (white) pieces threatened by the opponent • Linear function: • V(b) = w0·pb(b) + w1·pw(b) + w2·ub(b) + w3·uw(b) + w4·tb(b) + w5·tw(b) • Learning: • Estimation of parameters w0, …, w5
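The linear value function on this slide can be sketched in a few lines. This is only an illustration: the `BoardFeatures` container, the `evaluate` function, and the weight values below are hypothetical, and the sketch assumes the six feature counts have already been extracted from a board.

```python
from dataclasses import dataclass, astuple
from typing import Sequence


@dataclass
class BoardFeatures:
    """Six hand-crafted feature counts for one board (names follow the slide)."""
    pb: int  # black pieces on the board
    pw: int  # white pieces on the board
    ub: int  # unprotected black pieces
    uw: int  # unprotected white pieces
    tb: int  # black pieces threatened by the opponent
    tw: int  # white pieces threatened by the opponent


def evaluate(weights: Sequence[float], features: BoardFeatures) -> float:
    """V(b) = w0*pb + w1*pw + w2*ub + w3*uw + w4*tb + w5*tw."""
    values = astuple(features)
    if len(weights) != len(values):
        raise ValueError("expected one weight per feature")
    return sum(w * x for w, x in zip(weights, values))


# Illustrative weights only; in the slides these are what learning estimates.
weights = [1.0, -1.0, -0.5, 0.5, -0.25, 0.25]
board = BoardFeatures(pb=12, pw=10, ub=2, uw=3, tb=1, tw=4)
print(evaluate(weights, board))
```

Because the function is linear in the weights, learning reduces to estimating six numbers rather than one value per board.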

  30. Tuning Weights (Gradient Descent Optimization) • Given: • board b • Predicted value V(b) • Desired value V*(b) • Calculate error(b) = (V*(b) - V(b))^2 • For each board feature f_i: w_i ← w_i + c · (V*(b) - V(b)) · f_i • Stochastically minimizes Σ_b (V*(b) - V(b))^2
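The weight-tuning step can be sketched as a stochastic gradient update on the squared error. One assumption to flag: the sketch multiplies each feature by the signed difference V*(b) - V(b), which is what the gradient of (V*(b) - V(b))^2 gives up to a constant factor absorbed into the step size c. The function names `predict` and `lms_update` are hypothetical.

```python
from typing import List, Sequence


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear value estimate V(b) = sum_i w_i * f_i."""
    return sum(w * f for w, f in zip(weights, features))


def lms_update(weights: Sequence[float],
               features: Sequence[float],
               target: float,
               c: float = 0.01) -> List[float]:
    """One stochastic gradient step toward the desired value V*(b)."""
    residual = target - predict(weights, features)  # signed V*(b) - V(b)
    return [w + c * residual * f for w, f in zip(weights, features)]


# Repeated updates on a single board pull the prediction toward the target.
weights = [0.0, 0.0]
features = [1.0, 2.0]
for _ in range(100):
    weights = lms_update(weights, features, target=1.0, c=0.1)
print(predict(weights, features))
```

The step size c must be small relative to the feature magnitudes; with large features and a large c the updates overshoot and diverge.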

  31. Obtain Boards • Random boards • Games played by beginners • Games played by professionals

  32. Obtain Target Values • A person provides the value V(b) • Play until termination. If the outcome is • Win: V(b) ← 1 for all boards • Loss: V(b) ← -1 for all boards • Draw: V(b) ← 0 for all boards • Play one move: b → b’, V(b) ← V(b’) • Play n moves: b → b’ → … → b(n) • V(b) ← V(b(n))
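Two of the options on this slide, labeling every board with the final outcome and labeling a board with the current estimate of a later board, can be sketched as follows. The function names and the string board identifiers are hypothetical, and clamping to the last board when fewer than n moves remain is an assumption made for the sketch, not something the slides specify.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Board = Hashable  # placeholder type; a real board representation is assumed


def outcome_targets(game: Sequence[Board], outcome: str) -> List[Tuple[Board, float]]:
    """Assign the final result (+1 win, -1 loss, 0 draw) to every board of a game."""
    value = {"win": 1.0, "loss": -1.0, "draw": 0.0}[outcome]
    return [(board, value) for board in game]


def lookahead_targets(game: Sequence[Board],
                      value_fn: Callable[[Board], float],
                      n: int = 1) -> List[Tuple[Board, float]]:
    """Assign V(b) <- V(board reached n moves later), clamped to the last board."""
    last = len(game) - 1
    return [(board, value_fn(game[min(i + n, last)]))
            for i, board in enumerate(game)]


game = ["b0", "b1", "b2"]
current_estimate = {"b0": 0.1, "b1": 0.4, "b2": 1.0}
print(outcome_targets(game, "loss"))
print(lookahead_targets(game, current_estimate.__getitem__, n=1))
```

The outcome-based targets need a finished game, while the lookahead targets can be produced after every move from the current value estimate.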

  33. A General Framework • Machine Learning = Statistics (mathematical modeling) + Optimization (finding optimal parameters)

  34. Today’s Topics • Why machine learning? • Example: learning to play backgammon • General issues in machine learning

  35. Important Issues in Machine Learning • Obtaining experience • How to obtain experience? • Supervised learning vs. unsupervised learning • How many examples are enough? • PAC learning theory • Learning algorithms • Which algorithms can approximate the target function well, and when? • How does the complexity of a learning algorithm affect its accuracy? • Is the target function learnable? • Representing inputs • How to represent the inputs? • How to remove irrelevant information from the input representation? • How to reduce the redundancy of the input representation?
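As one concrete instance of the "How many examples are enough?" question, the standard PAC bound for a finite hypothesis class and a learner that outputs a hypothesis consistent with the training data is m ≥ (1/ε)(ln|H| + ln(1/δ)). The sketch below only evaluates that formula; the function name `pac_sample_bound` is hypothetical, and the slides themselves do not state which PAC results the course covers.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Examples sufficient for a consistent learner over a finite class H.

    With probability at least 1 - delta, any hypothesis consistent with
    m >= (ln|H| + ln(1/delta)) / epsilon examples has true error <= epsilon.
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need |H| >= 1 and epsilon, delta in (0, 1)")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


# Conjunctions over 10 Boolean variables: each variable appears positively,
# negatively, or not at all, so |H| = 3**10.
print(pac_sample_bound(3 ** 10, epsilon=0.1, delta=0.05))
```

The bound grows only logarithmically with the size of the hypothesis class but linearly with 1/ε, which is one way to see how the complexity of the hypothesis space trades off against the amount of data.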
