- 251 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'Support Vector Machines: Hype or Hallelujah?' - gavril

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

### Support Vector Machines:Hype or Hallelujah?

Kristin Bennett

Math Sciences Dept

Rensselaer Polytechnic Inst.

http://www.rpi.edu/~bennek

M2000

Outline

- Support Vector Machines for Classification
- Linear Discrimination
- Nonlinear Discrimination
- Extensions
- Application in Drug Design
- Hallelujah
- Hype

M2000

Support Vector Machines (SVM)

Key Ideas:

- “Maximize Margins”
- “Do the Dual”
- “Construct Kernels”

A methodology for inference based on Vapnik’s

Statistical Learning Theory.

M2000

Best Linear Separator:Supporting Plane Method

Maximize distance

Between two parallel

supporting planes

Distance

= “Margin”

=

M2000

Statistical Learning Theory

- Misclassification error and the function complexity bound generalization error.
- Maximizing margins minimizes complexity.
- “Eliminates” overfitting.
- Solution depends only on Support Vectors not number of attributes.

M2000

Nonlinear Classification: Map to higher dimensional space

IDEA: Map each point to higher dimensional feature space and

construct linear discriminant in the higher dimensional space.

Dual SVM becomes:

M2000

Generalized Inner Product

By Hilbert-Schmidt Kernels (Courant and Hilbert 1953)

for certain and K, e.g.

M2000

Final SVM Algorithm

- Solve Dual SVM QP
- Recover primal variable b
- Classify new x

Solution only depends on support vectors:

M2000

Support Vector Machines (SVM)

- Key Formulation Ideas:
- “Maximize Margins”
- “Do the Dual”
- “Construct Kernels”
- Generalization Error Bounds
- Practical Algorithms

M2000

SVM Extensions

- Regression
- Variable Selection
- Boosting
- Density Estimation
- Unsupervised Learning
- Novelty/Outlier Detection
- Feature Detection
- Clustering

M2000

Example in Drug Design

- Goal to predict bio-reactivity of molecules to decrease drug development time.
- Target is to predict the logarithm of inhibition concentration for site "A" on the Cholecystokinin (CCK) molecule.
- Constructs quantitative structure activity relationship (QSAR) model.

M2000

LCCKA Problem

- Training data – 66 molecules
- 323 original attributes are wavelet coefficients of TAE Descriptors.
- 39 subset of attributes selected by linear 1-norm SVM (with no kernels).
- For details see DDASSL project link off of http://www.rpi.edu/~bennek.
- Testing set results reported.

M2000

Many Other Applications

- Speech Recognition
- Data Base Marketing
- Quark Flavors in High Energy Physics
- Dynamic Object Recognition
- Knock Detection in Engines
- Protein Sequence Problem
- Text Categorization
- Breast Cancer Diagnosis
- See: http://www.clopinet.com/isabelle/Projects/

SVM/applist.html

M2000

Hallelujah!

- Generalization theory and practice meet
- General methodology for many types of problems
- Same Program + New Kernel = New method
- No problems with local minima
- Few model parameters. Selects capacity.
- Robust optimization methods.
- Successful Applications

BUT…

M2000

HYPE?

- Will SVMs beat my best hand-tuned method Z for X?
- Do SVM scale to massive datasets?
- How to chose C and Kernel?
- What is the effect of attribute scaling?
- How to handle categorical variables?
- How to incorporate domain knowledge?
- How to interpret results?

M2000

Support Vector Machine Resources

- http://www.support-vector.net/
- http://www.kernel-machines.org/
- Links off my web page:

http://www.rpi.edu/~bennek

M2000

Download Presentation

Connecting to Server..