
A distributed PSO – SVM hybrid system with feature selection and parameter optimization


Presentation Transcript


  1. A distributed PSO–SVM hybrid system with feature selection and parameter optimization. Cheng-Lung Huang & Jian-Fan Dun, Soft Computing, 2008.

  2. Introduction • Hybridizes particle swarm optimization (PSO) with support vector machines (SVM) to improve classification accuracy with a small, appropriate feature subset. • Combines the discrete (binary) PSO with the continuous-valued PSO. • Is implemented on a distributed architecture using web service technology to reduce computation time.

  3. Introduction • The continuous-valued version is used to optimize the SVM model parameters. • The discrete version is used to search for the optimal feature subset. • PSO is easily adapted to parallel processing on a distributed system, as sketched below.
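  A minimal Python sketch of that last point, assuming a local multiprocessing worker pool rather than the paper's web services; evaluate_particle is a hypothetical stand-in for the SVM fitness evaluation:

    from multiprocessing import Pool

    def evaluate_particle(position):
        # Hypothetical stand-in: the real system trains an SVM with the
        # feature mask and (C, gamma) encoded in `position` and returns a
        # weighted validation accuracy. A dummy value keeps the sketch runnable.
        mask, C, gamma = position
        return sum(mask)

    def evaluate_swarm(positions, workers=4):
        # Each particle's fitness is independent of the others, so one PSO
        # iteration maps cleanly onto a pool of workers (the paper farms the
        # evaluations out to remote machines via web services instead).
        with Pool(workers) as pool:
            return pool.map(evaluate_particle, positions)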

  4. Support Vector Machine • Kernel function: RBF, with penalty parameter C and kernel parameter gamma. • Multi-class strategies: one-against-one (adopted in this study) and one-against-all.
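  A minimal sketch of such a classifier, assuming scikit-learn's SVC (the paper's own implementation is not shown here); the RBF kernel exposes exactly the two hyperparameters the PSO tunes, and SVC uses the one-against-one multi-class strategy by default:

    from sklearn.svm import SVC

    def make_svm(C, gamma):
        # RBF-kernel SVM; C (penalty) and gamma (kernel width) are the two
        # parameters the continuous-valued PSO optimizes. scikit-learn's SVC
        # handles multi-class problems with the one-against-one strategy,
        # the same strategy adopted in this study.
        return SVC(kernel="rbf", C=C, gamma=gamma)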

  5. Particle swarm optimization • rnd( ) is a random function in the range [0, 1]. • The positive constants c1 and c2 are the personal and social learning factors. • w is the inertia weight, which balances global exploration against local exploitation. • Pi,d denotes the best previous position encountered by the ith particle. • Pg,d denotes the global best position found so far. • t denotes the iteration counter.
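  The update formula itself is on a slide image not captured in this transcript. The standard PSO velocity update, consistent with the symbols defined above, is:

    vi,d(t+1) = w · vi,d(t) + c1 · rnd( ) · (Pi,d − xi,d(t)) + c2 · rnd( ) · (Pg,d − xi,d(t))

  where xi,d(t) is the current position of the ith particle in dimension d.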

  6. Particle swarm optimization • The new position of a particle is calculated using the following formula:
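  The formula is on a slide image not captured in this transcript; consistent with the velocity update above, the standard PSO position update is:

    xi,d(t+1) = xi,d(t) + vi,d(t+1)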

  7. Binary PSO • The function S(v) is a sigmoid limiting transformation and rnd( ) is a random number selected from a uniform distribution in [0, 1].
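  The formulas are on a slide image not captured in this transcript. In the standard binary PSO (Kennedy and Eberhart, 1997), consistent with the description above, each velocity component is squashed into [0, 1] and treated as the probability that the corresponding bit is set:

    S(v) = 1 / (1 + e^(−v))
    xi,d(t+1) = 1 if rnd( ) < S(vi,d(t+1)), otherwise 0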

  8. Particle representation • Feature mask (discrete-valued) • C (continuous-valued) • Gamma (continuous-valued)
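  A hedged sketch of this mixed encoding (names and layout are illustrative assumptions, not the paper's code): the first nF dimensions hold the binary feature mask and the last two hold the continuous SVM parameters:

    def decode_particle(particle, n_features):
        # particle = [m1, ..., m_nF, C, gamma]: the first n_features entries
        # are the discrete feature-mask bits; the last two are continuous.
        mask = [int(round(b)) for b in particle[:n_features]]
        C, gamma = particle[n_features], particle[n_features + 1]
        return mask, C, gamma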

  9. Fitness definition • WA: weight of the SVM classification accuracy. • acci: SVM classification accuracy. • WF: weight of the selected features. • fj: value of feature mask bit j; ‘‘1’’ means feature j is selected and ‘‘0’’ means it is not. • nF: the total number of features.
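  The fitness formula itself is on a slide image not captured in this transcript. A hedged sketch of one plausible weighted-sum form consistent with the definitions above, rewarding accuracy and penalizing large feature subsets (the weights and the exact penalty term are illustrative assumptions, not the paper's published formula):

    def fitness(acc_i, feature_mask, w_a=0.8, w_f=0.2):
        # w_a weights the SVM classification accuracy; w_f weights the size
        # of the selected feature subset. Both values and the penalty form
        # are assumptions for illustration only.
        n_f = len(feature_mask)
        selected = sum(feature_mask)   # f_j = 1 means feature j is selected
        return w_a * acc_i + w_f * (1 - selected / n_f)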

  10. Strategies for setting the inertia weight
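  The strategies themselves are on a slide image not captured in this transcript. A widely used example is the linearly decreasing inertia weight,

    w(t) = wmax − (wmax − wmin) · t / tmax

  which starts large to favor global exploration and shrinks over the run to favor local exploitation.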

  11. Data descriptions • There are eight target classes to be classified in this data set. • The data set has 30 features, of which only five (f5, f10, f15, f20, and f25) are relevant to the eight classes.

  12. Experimental procedures • Randomly split the data into ten groups using stratified 10-fold cross-validation. • Each group contains training, validation, and test sets. • The training set is used to build the SVM model. • The validation set is used to determine the proper number of training iterations and avoid overtraining. • The test set is used to evaluate the model’s classification accuracy.
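  A minimal sketch of this splitting scheme, assuming scikit-learn and NumPy arrays for X and y (the paper's tooling is unspecified); the 80/20 carve-out of a validation set from each training portion is an illustrative assumption:

    from sklearn.model_selection import StratifiedKFold, train_test_split

    def make_splits(X, y, seed=0):
        # Stratified 10-fold CV: each fold preserves the class proportions
        # of the full data set. Within each fold, part of the training
        # portion is held out as the validation set used to stop training.
        skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
        for train_idx, test_idx in skf.split(X, y):
            X_tr, X_val, y_tr, y_val = train_test_split(
                X[train_idx], y[train_idx], test_size=0.2,
                stratify=y[train_idx], random_state=seed)
            yield (X_tr, y_tr), (X_val, y_val), (X[test_idx], y[test_idx])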

  13. Setting of the system parameters

  14. Experimental procedures

  15. Experimental procedures

  16. Experimental procedures

  17. Experimental results

  18. Experimental results • HITF: the number of hits on correct features. • COVERF: the number of times the selected feature subset covered the correct features. • RATIOF: the ratio of correct features over the ten experiments (10-fold CV).

  19. Experimental results • f: denotes the feature subset selected by the PSO. • F: denotes the correct discriminating features (f5, f10, f15, f20, and f25 in this experiment).
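  A hedged sketch of how these counts could be computed over the ten folds, given the definitions on the two slides above (the set-level reading of "hit" and "cover" is an interpretive assumption):

    def feature_metrics(selected_per_fold, correct):
        # selected_per_fold: one set of selected features per CV fold, e.g.
        # [{'f5', 'f10', 'f23'}, ...]; correct: {'f5','f10','f15','f20','f25'}.
        hit_f = sum(len(f & correct) for f in selected_per_fold)    # HITF
        cover_f = sum(f >= correct for f in selected_per_fold)      # COVERF
        ratio_f = hit_f / (len(correct) * len(selected_per_fold))   # RATIOF
        return hit_f, cover_f, ratio_f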

  20. Experimental results

  21. Fitness

  22. Distributed architectures

  23. CPU Time

  24. Conclusions • Input feature subset selection and kernel parameter setting are crucial problems. • This study proposed a new hybrid PSO–SVM system to solve these two problems. • To overcome the long training time on large-scale data sets, the PSO–SVM can be implemented on a distributed parallel architecture.

  25. Thank You
