Networks for Learning: Regression and Classification


Presentation Transcript


  1. 9.520 Networks for Learning: Regression and Classification. Tomaso Poggio and Alessandro Verri

  2. Multidisciplinary Approach to the Learning Problem
     LEARNING THEORY AND ALGORITHMS
     Classification + Regression:
     • Information extraction (text classification, …)
     • Computer vision (object recognition)
     • Computer graphics (TTVS)
     • Sound classification
     • Bioinformatics (DNA arrays)
     • Artificial financial markets (society of learning agents)
     ENGINEERING APPLICATIONS, PLAUSIBILITY PROOFS
     NEUROSCIENCE: MODELS AND EXPERIMENTS

  3. Learning: Brains and Machines
     CBCL: about 20 people...

  4. Overview of overview
     o Supervised learning: the problem and how to frame it within classical math
     o Examples of in-house applications
     o Learning and the brain

  5. Learning from Examples
     [diagram: example pairs (INPUT1, OUTPUT1), (INPUT2, OUTPUT2), …, (INPUTn, OUTPUTn) train a function f that maps a new INPUT to an OUTPUT]

  6. Learning from Examples: formal setting
     Given a set of l examples (past data) {(x_1, y_1), …, (x_l, y_l)}
     Question: find a function f such that f(x) is a good predictor of y for a future input x

  7. Neural Networks By the way… the term Neural Network has been overused. Look at the following definition of Neural Networks (from the Wall Street Journal, Nov 2, 1998) “...the investment method is based on a set of neural networks, that is a complex series of mathematical equations and algorithms incorporated into a software program.”

  8. Classical equivalent view: supervised learning as a problem of multivariate function approximation
     [figure: the examples are data sampled from a function f; the learned curve is an approximation of f, giving y for each x]
     Generalization: estimating the value of the function where there are no data
     Regression: the function is real-valued
     Classification: the function is binary

  9. Statistical learning theory: key questions and foundations
     Key questions: when is generalization possible? What bounds hold on the generalization error?
     The problem of multivariate function approximation is ill-posed for a finite sample of "examples".
     Generalization requires prior assumptions (regularization, capacity control) that restrict the space of functions/hypotheses/architectures, for example by smoothness (the telephone-directory example).
     Thus there is a trade-off between capacity control and sample size: the theory characterizes this trade-off.

  10. A unified theory
      • Regularization networks -- such as Gaussian Radial Basis Functions -- and also Support Vector Machines for regression (SVMR) and for classification (SVMC) can be justified in terms of a new, fundamental statistical theory of learning.
      • The theory -- developed mainly by Vapnik -- deals with the problem of learning from finite and small sample sizes.
      • Its focus is on the problem of capacity control, i.e. the learning and statistical version of regularization.
      • Its key motivation is to do classification and regression without density estimation.

  11. Statistical learning theory: specific algorithms
      To restrict the space of hypotheses there is a "classical" solution: Regularization Networks.
      The function f that minimizes
          H[f] = (1/l) sum_{i=1..l} (y_i - f(x_i))^2 + lambda ||f||_K^2
      has the form
          f(x) = sum_{i=1..l} c_i K(x, x_i)
      where K is the basis function associated with the RKHS norm (the regularizer term).
      Wahba 1990; Poggio and Girosi, 1989; Smale and Cucker, 2001
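
A minimal sketch of this solution in code, assuming a Gaussian RBF kernel and the square loss (so the coefficients solve the linear system (K + lambda*l*I) c = y); the function and variable names are illustrative, not from the lecture:

```python
# A regularization network with a Gaussian (RBF) kernel, using the square
# loss so the coefficients c solve the linear system (K + lambda*l*I) c = y.
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    # K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def fit_regularization_network(X, y, lam=1e-2, sigma=1.0):
    # Minimizing H[f] with the square loss gives (K + lam*l*I) c = y.
    l = len(X)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * l * np.eye(l), y)

def predict(X_train, c, X_new, sigma=1.0):
    # f(x) = sum_i c_i K(x, x_i)
    return gaussian_kernel(X_new, X_train, sigma) @ c

# Toy usage: learn a smooth 1-D function from l = 20 noisy examples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)
c = fit_regularization_network(X, y)
print(predict(X, c, np.array([[0.5]])))  # close to sin(0.5) ≈ 0.48
```

The regularization parameter lambda implements the capacity control discussed on the previous slide: larger values enforce smoother solutions, smaller values fit the data more closely.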

  12. Non-classical framework: more general loss function
      We will see how a "small" extension of the classical framework can be made to include RN, SVMC, SVMR…
      Girosi, Caprile, Poggio, 1990

  13. Equivalence to networks
      [diagram: a one-hidden-layer network with kernel units K(x, x_1), …, K(x, x_n) weighted by coefficients c_1, …, c_n and summed to give f(x)]
      The three techniques (RN, SVMR, SVMC) admit the same form of solution…
      …and can all be "written" as the same type of network.

  14. Unified framework: RN, SVMR and SVMC
      "New" result: the same equation includes Regularization Networks (e.g. Radial Basis Functions), Support Vector Machines (classification and regression), and some multilayer perceptrons. Statistical learning theory applies.
      Review by Evgeniou, Pontil and Poggio, Advances in Computational Mathematics, 2000

  15. Theory summary
      We will introduce:
      • classical regularization
      • Vapnik's theory, with the notion of VC dimension
      • an extension of VC dimension (in order to apply the theory to both RN and SVM, and to regression as well as classification)
      • SVM and properties such as: SVMC is a special case of SVMR
      • relations between Regularization, SVM and BPD
      • Bayes interpretation of Regularization, SVM, BPD
      • beyond SVM and boosting

  16. Overview of overview
      o Supervised learning: the problem and how to frame it within classical math
      o Examples of in-house applications
      o Learning and the brain

  17. Learning from Examples: Applications
      • Object identification
      • Object categorization
      • Graphics
      • Finance
      • Bioinformatics
      • …

  18. Face identification
      A view-based system: 15 views
      Performance: 98% on a 68-person database
      Beymer, 1995

  19. Learning from Examples: Applications
      • Object identification
      • Object categorization
      • Face expression
      • Graphics
      • Finance
      • Bioinformatics
      • …

  20. Application: A Trainable Object Detection System
      [pipeline: scanning in x, y, and scale; preprocessing with an overcomplete dictionary of Haar wavelets; a training database fed to a QP solver, which produces the SVM classifier]
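
A minimal sketch of the scanning strategy, with placeholder `features` and `classifier` callables standing in for the Haar-wavelet preprocessing and the trained SVM (both are assumptions here, not the actual system):

```python
# Slide a fixed-size window over the image at several scales and run a
# trained classifier on each window.
import numpy as np

def detect(image, classifier, features, window=(64, 64),
           scales=(1.0, 0.8, 0.64), stride=8):
    detections = []
    h, w = window
    for s in scales:
        # Crude rescaling by subsampling; a real system would resize properly.
        step = max(1, int(round(1 / s)))
        scaled = image[::step, ::step]
        for y in range(0, scaled.shape[0] - h + 1, stride):
            for x in range(0, scaled.shape[1] - w + 1, stride):
                patch = scaled[y:y + h, x:x + w]
                if classifier(features(patch)) > 0:  # positive SVM margin
                    detections.append((x * step, y * step, s))
    return detections

# Toy usage with a hypothetical classifier that thresholds mean intensity.
img = np.random.rand(128, 128)
hits = detect(img, classifier=lambda f: f - 0.7, features=lambda p: p.mean())
print(len(hits))
```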

  21. Learning Object Detection: Finding Frontal Faces
      Training database: 1,000+ real and 3,000+ virtual face patterns; 50,000+ non-face patterns
      Sung, Poggio 1995

  22. Support Vectors are Sparse!
      [images: faces and non-faces: some of the support vectors found by SVM training with thousands of face and non-face examples]
      …that is, the coefficients c_i in f(x) = sum_i c_i K(x, x_i) are sparse!
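
A minimal sketch of this sparsity claim on synthetic data, using scikit-learn's SVC as a convenient stand-in for the original QP-based trainer:

```python
# Train an SVM on synthetic two-class data and count how many of the
# training examples end up with nonzero coefficients (support vectors).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 1000
X = rng.standard_normal((n, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)  # toy two-class split

clf = SVC(kernel="rbf", C=10.0).fit(X, y)
print(f"{len(clf.support_)} support vectors out of {n} training examples")
# Only the support vectors contribute nonzero c_i to f(x) = sum_i c_i K(x, x_i).
```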

  23. Recent work on face detection
      • Detection of faces in images
      • Robustness against slight rotations in depth and in the image plane
      • Full-face vs. component-based classifier
      Heisele, Pontil, Poggio, 2000

  24. New Classifier: Combining Component Detectors
      Heisele, Pontil, Poggio, 2000

  25. The best existing system for face detection? Heisele, Poggio et al., 2000

  26. Trainable System for Object Detection: Pedestrian detection (training)
      Papageorgiou and Poggio, 1998

  27. System installed in an experimental Mercedes
      A fast version, integrated with a real-time obstacle detection system
      [MPEG video]
      Constantine Papageorgiou

  28. Learning from Examples: Applications
      • Object identification
      • Object categorization
      • Face expression
      • Graphics
      • Finance
      • Bioinformatics
      • …

  29. Image Analysis
      IMAGE ANALYSIS: OBJECT RECOGNITION AND POSE ESTIMATION
      [images: image ⇒ "Bear (0° view)"; image ⇒ "Bear (45° view)"]

  30. The problem
      The main goal is to estimate basic facial parameters, e.g. the degree of mouth openness, through learning

  31. The Three Stages
      [diagram: face detection, then localization of facial features, then analysis of facial parts]

  32. Learning from Examples: Applications
      • Object identification
      • Object categorization
      • Face expression
      • Graphics
      • Finance
      • Bioinformatics
      • …

  33. Image Synthesis
      UNCONVENTIONAL GRAPHICS
      [images: Θ = 0° view ⇒ synthesized image; Θ = 45° view ⇒ synthesized image]

  34. nFX Interactive: nFX Toono (S. Librande, R. Belfer)

  35. Supermodels [MPEG video] (Steve Lines)

  36. A trainable system for TTVS (text-to-visual speech)
      • Input: text
      • Output: photorealistic talking face uttering the text
      Tony Ezzat

  37. TTVS: video
      Tony Ezzat, T. Poggio

  38. Reconstructed 3D Face Models from 1 image Blanz and Vetter, MPI SigGraph ‘99

  39. Reconstructed 3D Face Models from 1 image Blanz and Vetter, MPI SigGraph ‘99

  40. Learning from Examples: Applications
      • Object identification
      • Object categorization
      • Face expression
      • Graphics
      • Finance
      • Bioinformatics
      • …

  41. Example of a subproject: the Electronic Market Maker (EMM)
      Artificial agents (learning algorithms) buy and sell stocks.
      [diagram: public limit/market orders, bid/ask prices, bid/ask sizes, buy/sell inventory control, and all available market information feed the EMM, which sits in a learning feedback loop with user control/calibration]
      Nicholas Chang

  42. Learning from Examples: Applications
      • Object identification
      • Object categorization
      • Face expression
      • Graphics
      • Finance
      • Bioinformatics
      • …

  43. Bioinformatics application: predicting the type of cancer from DNA chip signals
      Learning-from-examples paradigm: [diagram: examples feed a statistical learning algorithm, which outputs a prediction for a new sample]

  44. Bioinformatics application: predicting the type of cancer from DNA chips
      New feature-selection SVM: only 38 training examples, 7,100 features.
      AML vs ALL: with 40 genes, 34/34 correct, 0 rejects; with 5 genes, 31/31 correct, 3 rejects of which 1 is an error.
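
A hypothetical sketch of this classify-with-reject setup; the slide does not specify the feature-selection rule or the reject rule, so both are assumptions here, and the data below are random stand-ins with the slide's dimensions:

```python
# Rank genes by a simple signal-to-noise score, keep a small subset, train a
# linear SVM on the 38 examples, and reject samples whose decision value is
# too close to the boundary.
import numpy as np
from sklearn.svm import LinearSVC

def select_genes(X, y, k):
    # Separation of class means relative to spread (an assumed criterion).
    mu0, mu1 = X[y == 0].mean(0), X[y == 1].mean(0)
    sd = X[y == 0].std(0) + X[y == 1].std(0) + 1e-8
    return np.argsort(-np.abs(mu0 - mu1) / sd)[:k]

def predict_with_reject(clf, X, threshold=0.25):
    scores = clf.decision_function(X)
    labels = (scores > 0).astype(int)
    labels[np.abs(scores) < threshold] = -1  # -1 marks a reject
    return labels

# Random stand-in data with the dimensions from the slide (38 x 7100).
rng = np.random.default_rng(0)
X_train = rng.standard_normal((38, 7100))
y_train = rng.integers(0, 2, 38)
genes = select_genes(X_train, y_train, k=40)
clf = LinearSVC().fit(X_train[:, genes], y_train)
print(predict_with_reject(clf, X_train[:, genes]))
```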

  45. Overview of overview
      o Supervised learning: the problem and how to frame it within classical math
      o Examples of in-house applications
      o Learning and the brain

  46. Model of view-invariant identification
      A graphical rewriting of a Regularization Network (GRBF), a learning technique
      [diagram: view-tuned units responding as a function of VIEW ANGLE]
      Poggio, Edelman, Nature, 1990.
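
A minimal sketch of the GRBF idea: each "view-tuned" unit is a Gaussian RBF centered on one stored example view, and a view-invariant output unit combines them. The 2-D encoding of a view by its angle is a toy assumption for illustration:

```python
# View-tuned units as Gaussian RBFs over stored example views; a weighted
# sum gives a roughly view-invariant response once enough views are stored.
import numpy as np

def view_tuned_responses(view, stored_views, sigma=1.0):
    # One Gaussian unit per stored view; each peaks when the input matches it.
    d2 = ((stored_views - view) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * sigma**2))

def view_invariant_output(view, stored_views, weights, sigma=1.0):
    # Weighted sum of the view-tuned units.
    return weights @ view_tuned_responses(view, stored_views, sigma)

# Toy usage: four stored views of one object at 0, 45, 90, 135 degrees.
angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
stored = np.stack([np.cos(angles), np.sin(angles)], axis=1)
w = np.ones(len(stored))
for a in np.deg2rad([0.0, 30.0, 60.0, 120.0]):
    v = np.array([np.cos(a), np.sin(a)])
    print(round(float(view_invariant_output(v, stored, w, sigma=0.8)), 2))
```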

  47. The Visual System Simplified: Two Visual Pathways, "What" and "Where"
      [diagram: dorsal stream: "where"; ventral stream: "what"]
      Desimone & Ungerleider, 1989

  48. Recording Sites in Anterior IT
      Logothetis, Pauls, and Poggio, 1995; Logothetis, Pauls, 1995

  49. Model’s predictions: View-tuned Neurons VIEW-TUNED UNITS VIEW ANGLE

  50. A "View-Tuned" IT Cell
      [figure: spike rasters and firing rates (scale: 60 spikes/sec, 800 msec) for target views from -168° to +168° in 12° steps and for distractors]
      Logothetis et al., 1995
