
LS-SVMlab & Large scale modeling


Presentation Transcript


  1. LS-SVMlab & Large scale modeling Kristiaan Pelckmans, ESAT-SCD/SISTA J.A.K. Suykens, B. De Moor

  2. Content: I. Overview II. Classification III. Regression IV. Unsupervised Learning V. Time-series VI. Conclusions and Outlook

  3. People • Contributors to LS-SVMlab: • Kristiaan Pelckmans • Johan Suykens • Tony Van Gestel • Jos De Brabanter • Lukas Lukas • Bart Hamers • Emmanuel Lambert • Supervisors: • Bart De Moor • Johan Suykens • Joos Vandewalle Acknowledgements Our research is supported by grants from several funding agencies and sources: Research Council K.U.Leuven: Concerted Research Action GOA-Mefisto 666 (Mathematical Engineering), IDO (IOTA Oncology, Genetic networks), several PhD/postdoc & fellow grants; Flemish Government: Fund for Scientific Research FWO Flanders (several PhD/postdoc grants, projects G.0407.02 (support vector machines), G.0080.01 (collective intelligence), G.0256.97 (subspace), G.0115.01 (bio-i and microarrays), G.0240.99 (multilinear algebra), G.0197.02 (power islands), research communities ICCoS, ANMMM), AWI (Bil. Int. Collaboration South Africa, Hungary and Poland), IWT (Soft4s (softsensors), STWW-Genprom (gene promotor prediction), GBOU McKnow (knowledge management algorithms), Eureka-Impact (MPC control), Eureka-FLiTE (flutter modeling), several PhD grants); Belgian Federal Government: DWTC (IUAP IV-02 (1996-2001) and IUAP V-10-29 (2002-2006): Dynamical Systems and Control: Computation, Identification & Modelling), Program Sustainable Development PODO-II (CP-TR-18: Sustainability effects of Traffic Management Systems); Direct contract research: Verhaert, Electrabel, Elia, Data4s, IPCOS. JS is a professor at K.U.Leuven Belgium and a postdoctoral researcher with FWO Flanders. BDM and JV are full professors at K.U.Leuven Belgium.

  4. I. Overview: Goal of the Presentation • Overview & intuition • Demonstration of LS-SVMlab • Pinpoint research challenges • Preparation for NIPS 2002 • Research results and challenges • Towards applications • Overview of LS-SVMlab

  5. I.1 Research overview “Learning, generalization, extrapolation, identification, smoothing, modeling” • Prediction (black box modeling) • Points of view: Statistical Learning, Machine Learning, Neural Networks, Optimization, SVM

  6. I.2 Type, Target, Topic

  7. I.3 Towards applications • System identification • Financial engineering • Biomedical signal processing • Data mining • Bio-informatics • Text mining • Adaptive signal processing

  8. I.4 LS-SVMlab

  9. I.4 LS-SVMlab (2) • Starting points: • Modularity • Object-oriented & functional interface (both call styles are sketched below) • Basic building blocks for advanced research • Website and tutorial • Reproducibility (preprocessing)
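
As an illustration of the two call styles, a minimal MATLAB sketch (the variables X, Y, Xtest and the hyperparameter values are assumptions; the function names follow the LS-SVMlab tutorial):

    gam = 10; sig2 = 0.4;                       % assumed example hyperparameters
    % Functional interface: the model is a cell array of data and parameters.
    [alpha, b] = trainlssvm({X, Y, 'c', gam, sig2, 'RBF_kernel'});
    Ytest = simlssvm({X, Y, 'c', gam, sig2, 'RBF_kernel'}, {alpha, b}, Xtest);
    % Object-oriented interface: the model is a structure passed between calls.
    model = initlssvm(X, Y, 'c', gam, sig2, 'RBF_kernel');
    model = trainlssvm(model);
    Ytest = simlssvm(model, Xtest);

The cell-array form keeps an experiment reproducible from a single expression; the structure form carries state such as preprocessing across calls.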

  10. II. Classification “Learn the decision function associated with a set of labeled data points to predict the labels of unseen data” • Least Squares Support Vector Machines • Bayesian framework • Different norms • Coding schemes

  11. II.1 Least Squares Support Vector Machines (LS-SVM(γ, σ²)) • Least-squares cost function + regularization & equality constraints • Non-linearity by Mercer kernels • Primal-dual interpretation (Lagrange multipliers) • Primal parametric model ↔ dual non-parametric model (formulas reconstructed below)
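
The primal and dual model formulas on this slide were images and did not survive extraction; the standard LS-SVM classifier formulation they refer to (Suykens et al., 2002, listed on slide 34) is:

    Primal:  \min_{w,b,e} \; \tfrac{1}{2} w^T w + \tfrac{\gamma}{2} \sum_{k=1}^{N} e_k^2
             \quad \text{s.t.} \quad y_k \, [\, w^T \varphi(x_k) + b \,] = 1 - e_k, \qquad k = 1, \dots, N

    Dual:    y(x) = \mathrm{sign}\Big[ \sum_{k=1}^{N} \alpha_k \, y_k \, K(x, x_k) + b \Big],
             \quad \text{with Mercer kernel } K(x_k, x_l) = \varphi(x_k)^T \varphi(x_l)

The equality constraints make the dual conditions a linear system in (α, b) rather than a quadratic program, which is the computational point of LS-SVMs.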

  12. II.1 LS-SVM (,) “Learning representations from relations”

  13. II.2 Bayesian Inference • Bayes rule (MAP) • Closed-form formulas • Approximations: Hessian in the optimum, Gaussian distribution • Three levels of posteriors (sketched below)
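
A sketch of the three-level hierarchy (the slide's formulas were lost; the notation is generic, with D the data, H the model/kernel choice, and γ standing in for the hyperparameters):

    Level 1 (parameters):       p(w, b \mid D, \gamma, H) \propto p(D \mid w, b, \gamma, H) \, p(w, b \mid \gamma, H)
    Level 2 (hyperparameters):  p(\gamma \mid D, H) \propto p(D \mid \gamma, H) \, p(\gamma \mid H)
    Level 3 (model comparison): p(H \mid D) \propto p(D \mid H) \, p(H)

The evidence (denominator) of each level is the likelihood of the next; with a Gaussian approximation around the MAP solution (Hessian in the optimum), every level admits closed-form expressions.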

  14. II.3 SVM formulations & norms • 1-norm + inequality constraints: SVM; extensions to any convex cost function • 2-norm + equality constraints: LS-SVM; weighted versions
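
Side by side, in standard notation (reconstructed, not taken from the slide):

    SVM (1-norm slacks, inequalities):
      \min_{w,b,\xi} \; \tfrac{1}{2} w^T w + C \sum_k \xi_k
      \quad \text{s.t.} \quad y_k \, [\, w^T \varphi(x_k) + b \,] \ge 1 - \xi_k, \; \xi_k \ge 0

    LS-SVM (2-norm errors, equalities): replace C \sum_k \xi_k by \tfrac{\gamma}{2} \sum_k e_k^2
      and the inequalities by y_k \, [\, w^T \varphi(x_k) + b \,] = 1 - e_k

The 1-norm/inequality route yields a QP with sparse support values; the 2-norm/equality route yields a linear system, with weighted versions (slide 24) restoring robustness.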

  15. II.4 Coding schemes Multi-class classification task → (multiple) binary classifiers. [Slide figure: class labels such as 1 2 4 6 2 1 3 are encoded into ±1 codewords, one binary classifier per bit, and the binary outputs are decoded back to class labels.]
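
A minimal one-vs-all encode/decode sketch in plain MATLAB (LS-SVMlab itself ships dedicated coding routines such as code and codelssvm; the variable latent below is a hypothetical N×m matrix of binary-classifier outputs):

    labels  = [1 2 4 6 2 1 3]';                % multi-class labels, as on the slide
    classes = unique(labels);                  % m distinct classes
    Ybin = -ones(numel(labels), numel(classes));
    for j = 1:numel(classes)
        Ybin(labels == classes(j), j) = 1;     % encode: +1 for "this class", -1 otherwise
    end
    % Decode: column j of `latent` holds classifier j's output for each point;
    % assign each point to the class whose classifier is most confident.
    [~, idx] = max(latent, [], 2);
    Ypred = classes(idx);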

  16. III. Regression “Learn the underlying function from a set of data points and their corresponding noisy targets in order to predict the values of unseen data” • LS-SVM(γ, σ²) • Cross-validation (CV) • Bayesian inference • Robustness

  17. III.1 LS-SVM(,) • Least Squares cost-function + Regularization & Equality constraints • Mercer kernels • Lagrange multipliers: Primal Parametric  Dual Non-parametric

  18. III.1 LS-SVM(,) (2) • Regularization parameter: • Do not fit noise (overfitting)! • trade-off noise and information

  19. III.2 Cross-validation (CV) “How to estimate the generalization power of a model?” • Division into training set and test set • Repeated division: leave-one-out CV (fast implementation) • L-fold cross-validation • Generalized cross-validation (GCV) • Complexity criteria: AIC, BIC, … [Slide figure: index sequences 1 … n showing which points are held out: a single point t (leave-one-out) or a window t-l … t+l (block/l-fold splits).]
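
The L-fold idea spelled out as a plain MATLAB loop (LS-SVMlab also provides crossvalidate and leaveoneout routines; X, Y, gam, sig2 are assumed given):

    L = 10;  n = size(X, 1);
    folds = mod(randperm(n), L) + 1;           % random assignment of points to L folds
    mse = zeros(L, 1);
    for i = 1:L
        tr = (folds ~= i);                     % train on L-1 folds ...
        [alpha, b] = trainlssvm({X(tr,:), Y(tr), 'f', gam, sig2, 'RBF_kernel'});
        Yp = simlssvm({X(tr,:), Y(tr), 'f', gam, sig2, 'RBF_kernel'}, {alpha, b}, X(~tr,:));
        mse(i) = mean((Y(~tr) - Yp).^2);       % ... and test on the held-out fold
    end
    cvcost = mean(mse);                        % estimate of the generalization error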

  20. III.2 Cross-validation procedure (CVP) “How to tune the model for optimal generalization performance?” • Trade-off between fit and model complexity • Kernel parameters • Which optimization routine? (a bare grid-search sketch follows)
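
A bare grid search over (gam, sig2), scoring each pair with the cross-validation cost above (cvcost_lssvm is a hypothetical helper wrapping that L-fold loop; LS-SVMlab offers its own gridsearch routine):

    best = Inf;
    for gam = 10.^(-2:2)                       % logarithmic grids for both
        for sig2 = 10.^(-2:2)                  % hyperparameters
            c = cvcost_lssvm(X, Y, gam, sig2); % hypothetical: returns the L-fold CV cost
            if c < best
                best = c;  gam_opt = gam;  sig2_opt = sig2;
            end
        end
    end

Derivative-free routines are common here because the CV cost is a noisy, non-convex function of the hyperparameters.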

  21. III.1 LS-SVM(,) (3) • Kernel type and parameter “Zoölogy as elephantism and non-elephantism” • Model Comparison • By cross-validation or Bayesian Inference

  22. III.3 Applications “OK, but does it work?” • Soft4s • Together with O. Barrero, L. Hoegaerts, IPCOS (ISMC), BASF, B. De Moor • Soft-sensor • ELIA • Together with O. Barrero, I. Goethals, L. Hoegaerts, I. Markovsky, T. Van Gestel, ELIA, B. De Moor • Prediction of short- and long-term electricity consumption

  23. III.2 Bayesian Inference • Bayes rule (MAP) • Closed-form formulas • Three levels of posteriors (cf. slide 13)

  24. III.4 Robustness “How to build good models in the case of non-Gaussian noise or outliers” • Influence function • Breakdown point • How: down-weighting the influence of large residuals (the weighted LS-SVM objective is given below) • Mean → trimmed mean → median • Robust CV, GCV, AIC, …
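
The weighted LS-SVM behind “down-weighting large residuals” (standard formulation, reconstructed):

    \min_{w,b,e} \; \tfrac{1}{2} w^T w + \tfrac{\gamma}{2} \sum_{k=1}^{N} v_k \, e_k^2
    \quad \text{s.t.} \quad y_k = w^T \varphi(x_k) + b + e_k

The weights v_k are computed from the residuals of an unweighted first fit: v_k ≈ 1 for small residuals and v_k → 0 for outlying ones, mirroring the mean → trimmed mean → median progression.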

  25. IV. Unsupervised Learning “Extract important features from unlabeled data” • Kernel PCA and related methods • Nyström approximation • From dual to primal • Fixed size LS-SVM

  26. IV.1 Kernel PCA Principal component analysis vs. kernel-based PCA. [Slide figure: 3-D (x, y, z) scatter plots contrasting linear PCA with kernel PCA.]

  27. IV.1 Kernel PCA (2) • Primal-dual LS-SVM style formulations • For kernel PCA, CCA, PLS (the dual eigenvalue problem is given below)
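
The dual eigenvalue problem behind kernel PCA (standard result; the slide's formulas were images):

    \Omega_c \, \alpha = \lambda \, \alpha, \qquad
    (\Omega_c)_{kl} = \big( \varphi(x_k) - \hat{\mu}_\varphi \big)^T \big( \varphi(x_l) - \hat{\mu}_\varphi \big)

i.e. an eigendecomposition of the centered kernel matrix; the score of a point along a component is z(x) = \sum_k \alpha_k K_c(x, x_k) with K_c the centered kernel. A linear kernel recovers ordinary PCA, and CCA and PLS admit the same primal-dual treatment.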

  28. IV.2 Nyström approximation • Sampling of an integral equation • Approximating the feature map of a Mercer kernel (sketched below)
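
In formulas (reconstructed; this is the standard Nyström construction of Williams and Seeger):

    \int K(x, x') \, \phi_i(x') \, p(x') \, dx' = \lambda_i \, \phi_i(x)

Sampling the equation at the N training points turns it into the kernel-matrix eigenproblem \Omega u_i = \lambda_i^{(N)} u_i, with

    \lambda_i \approx \lambda_i^{(N)} / N, \qquad
    \phi_i(x) \approx \frac{\sqrt{N}}{\lambda_i^{(N)}} \sum_{k=1}^{N} u_{ki} \, K(x_k, x)

This gives an explicit finite-dimensional approximation of the feature map, which is what allows moving from the dual back to the primal for large data sets (fixed size LS-SVM, next slide).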

  29. IV.3 Fixed Size LS-SVM? (Idea: estimate the feature map on a fixed-size subsample via the Nyström approximation, then solve in the primal, making large scale modeling tractable.)

  30. V. Time-series “Learn to predict future values given a sequence of past values” • NARX • Recurrent vs. feedforward

  31. V.1 NARX • Reducible to static regression • CV and complexity criteria • Predicting in recurrent mode • Fixed size LS-SVM (sparse representation) [Slide figure: block diagram of the NARX map f with delayed feedback.]
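
A sketch of the reduction and of recurrent prediction for a NAR model ŷ(t) = f(y(t-1), …, y(t-p)) (the series y, hyperparameters and horizon are assumptions; LS-SVMlab provides windowize and predict for the same purpose):

    p = 5;  n = numel(y);  horizon = 20;       % assumed lag order and horizon
    Xlag = zeros(n - p, p);  Ylag = y(p+1:n);  % targets y(p+1), ..., y(n)
    for k = 1:p
        Xlag(:, k) = y(p+1-k : n-k);           % column k = series delayed by k steps
    end
    [alpha, b] = trainlssvm({Xlag, Ylag, 'f', gam, sig2, 'RBF_kernel'});
    % Recurrent mode: feed each prediction back in as an input.
    xw = y(n:-1:n-p+1)';                       % latest window [y(n), ..., y(n-p+1)]
    yhat = zeros(horizon, 1);
    for t = 1:horizon
        yhat(t) = simlssvm({Xlag, Ylag, 'f', gam, sig2, 'RBF_kernel'}, {alpha, b}, xw);
        xw = [yhat(t), xw(1:end-1)];           % shift the window by one step
    end

Training on one-step-ahead errors while evaluating in recurrent mode is exactly the tension raised on slide 33.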

  32. V.1 NARX (2) Santa Fe Time-series competition

  33. V.2 Recurrent models? “How to learn recurrent dynamical models?” • Training cost = Prediction cost? • Non-parametric model class? • Convex or non-convex? • Hyper-parameters?

  34. VI.0 References • J. A. K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor & J. Vandewalle (2002), Least Squares Support Vector Machines, World Scientific. • V. Vapnik (1995), The Nature of Statistical Learning Theory, Springer-Verlag. • B. Schölkopf & A. Smola (2002), Learning with Kernels, MIT Press. • T. Poggio & F. Girosi (1990), “Networks for approximation and learning”, Proceedings of the IEEE, 78, 1481-1497. • N. Cristianini & J. Shawe-Taylor (2000), An Introduction to Support Vector Machines, Cambridge University Press.

  35. VI. Conclusions “Non-linear non-parametric learning as a generalized methodology” • Non-parametric learning • Intuition & formulations • Hyper-parameters • LS-SVMlab. Questions?
