
An Introduction to Support Vector Machine Classification




    1. An Introduction to Support Vector Machine Classification

    2. Outline: What do we mean by classification, and why is it useful? Machine learning: basic concepts. Support Vector Machines (SVM). Linear SVM: basic terminology and some formulas. Non-linear SVM: the kernel trick. An example: predicting protein subcellular location with SVM. Performance measurements.

    3. Classification: Every day, all the time, we classify things. E.g. when crossing the street: Is there a car coming? At what speed? How far is it to the other side? Classification: safe to walk or not!

    5. Classification tasks in Bioinformatics

    6. Problems in classifying biological data: The data are often high-dimensional, so it is hard to set up simple rules. The amount of data is large, so we need automated ways to deal with it. Use computers for data processing and statistical analysis, and try to learn patterns from the data (machine learning).

    8. Black box view of Machine Learning

    9. Tennis example 2

    10. Linear Support Vector Machines

    11. Linear SVM 2

    12. Definitions

    13. Maximizing the margin

    14. The Lagrangian trick

    15. Problems with linear SVM

    16. Non-linear SVM 1

    17. Non-linear SVM 2

    18. Solving the optimization problem: In many cases any general-purpose optimization package that solves linearly constrained optimization problems will do, e.g. Newton's method or conjugate gradient descent. Other methods involve nonlinear programming techniques.
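For orientation, the problem handed to such a solver is commonly written in its dual form. The block below is the standard textbook formulation for a hard-margin linear SVM, not a formula recovered from the slides. The symbols are assumptions: $x_i$ are the training vectors, $y_i \in \{-1, +1\}$ their class labels, and $\alpha_i$ the Lagrange multipliers.

```latex
\max_{\alpha}\;\; \sum_{i=1}^{n} \alpha_i
  \;-\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
        \alpha_i \alpha_j \, y_i y_j \, x_i^{\top} x_j
\qquad \text{subject to} \qquad
\alpha_i \ge 0 \;\; (i = 1, \dots, n), \qquad
\sum_{i=1}^{n} \alpha_i y_i = 0
```

The objective is quadratic in $\alpha$ and every constraint is linear, which is why a general-purpose solver for linearly constrained problems is usually sufficient.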

    19. Overtraining/overfitting

    20. Overtraining/overfitting 2: Example with a gardener.

    21. A practical example, protein localization: Proteins are synthesized in the cytosol and then transported into different subcellular locations, where they carry out their functions. Aim: to predict in which location a certain protein will end up!

    22. Subcellular Locations

    23. Method: Hypothesis: the amino acid composition of proteins from different compartments should differ. Extract proteins with known subcellular location from SWISSPROT. Calculate the amino acid composition of the proteins. Try to differentiate between cytosolic, extracellular, mitochondrial and nuclear proteins by using SVM.
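The composition step described above can be sketched as follows. This is a minimal illustration, assuming the composition is the fraction of each of the 20 standard amino acids in a one-letter-code sequence. The function name `amino_acid_composition` and the toy sequence are hypothetical, and how the talk handled non-standard residues is not stated.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids (fixed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return 20 fractions, one per standard amino acid, in AMINO_ACIDS order.

    Assumption: non-standard letters (e.g. X, B, Z) are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence, not a real protein: M x1, K x2, L x1, A x2.
example = amino_acid_composition("MKKLAA")
```

Each protein becomes a 20-dimensional vector whose entries sum to 1, which is a form an SVM can take as input regardless of the protein's length.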

    24. Input encoding

    25. Cross-validation
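A minimal sketch of k-fold cross-validation in general: the data are split into k parts, and each part is used once for testing while the rest are used for training. The helper name `k_fold_indices` is hypothetical, and the number of folds used in the talk is not stated, so `k=5` below is only an assumption for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Assumed values for illustration: 10 samples, 5 folds.
folds = list(k_fold_indices(10, 5))
```

In practice the samples would be shuffled (and often stratified by class) before splitting. The point of the procedure is that every sample is tested exactly once by a model that never saw it during training.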

    26. Performance measurements

    27. Results: We definitely get some predictive power out of our models. There seems to be a difference in the composition of proteins from different subcellular locations. Another question: what about nuclear proteins? Is there a difference between DNA-binding proteins and others?

    28. Conclusions: We have (hopefully) learned some basic concepts and terminology of SVM. We know about the risk of overtraining and how to measure the risk of poor generalization. SVMs can be useful, for example, in predicting the subcellular location of proteins.

    29. You can’t input anything into a learning machine!!!

    30. References
