
Daniel J. Garcia, Mentors: Dr. Lawrence Hall, Dr. Dmitry Goldgof, Kurt Kramer

Start. Generate 200 random feature sets. Run 10-fold cross-validation on each random set. Sort the results of the 10-fold cross-validation by training time. Select the 9 fastest random feature sets. Create three new sets: the union of the fastest 3, the union of the fastest 5, and the union of the fastest 9.
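The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the fixed size of each random set, and the `cv_training_time` callback (a hypothetical function assumed to run 10-fold cross-validation on one feature set and return its training time) are not given in the source.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Sequence

FeatureSet = FrozenSet[int]


def generate_random_sets(n_features: int, n_sets: int, set_size: int,
                         rng: random.Random) -> List[FeatureSet]:
    """Draw n_sets random subsets of feature indices.

    Assumption: every subset has the same size; the source does not say
    how large each random set is.
    """
    return [frozenset(rng.sample(range(n_features), set_size))
            for _ in range(n_sets)]


def select_by_training_time(
        feature_sets: Sequence[FeatureSet],
        cv_training_time: Callable[[FeatureSet], float],
        keep: int = 9,
        union_sizes: Sequence[int] = (3, 5, 9)) -> Dict[int, FeatureSet]:
    """Rank feature sets by training time and build unions of the fastest.

    cv_training_time is a hypothetical callback: it is assumed to run
    10-fold cross-validation on one feature set and return the measured
    training time. sorted() calls it once per feature set.
    """
    ranked = sorted(feature_sets, key=cv_training_time)  # fastest first
    fastest = ranked[:keep]
    # Union of the fastest k sets, for each requested k (3, 5 and 9).
    return {k: frozenset().union(*fastest[:k]) for k in union_sizes}


if __name__ == "__main__":
    rng = random.Random(0)
    candidate_sets = generate_random_sets(
        n_features=40, n_sets=200, set_size=8, rng=rng)
    # Stand-in timing function for demonstration only; a real run would
    # train and time a classifier here.
    unions = select_by_training_time(candidate_sets, lambda s: float(sum(s)))
    for k, features in unions.items():
        print(f"union of fastest {k}: {len(features)} features")
```

Keeping the timing step behind a callback leaves the classifier and the cross-validation routine open, since each random set is evaluated independently of the others.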


Presentation Transcript


  1. REU 2006-Feature Selection Algorithm from Random Subsets
     Daniel J. Garcia, Mentors: Dr. Lawrence Hall, Dr. Dmitry Goldgof, Kurt Kramer
     Department of Computer Science & Engineering

     Introduction
     Feature selection methods are used to find the set of features that yields the best classification accuracy for a given data set. This results in better training and classification time for a classifier, in addition to better classification accuracy. Feature selection, however, is a time-consuming process unfit for real-time applications.

     The Big Picture
     [Figure not included in the transcript]

     Random Sets Method Flowchart
     • Start
     • Generate 200 random feature sets
     • Run 10-fold cross-validation on each random set
     • Sort the results of the 10-fold cross-validation by training time
     • Select the 9 fastest random feature sets
     • Create three new sets: the union of the fastest 3, the union of the fastest 5, and the union of the fastest 9
     • Finish

     Results (Comparison between the Random Sets Method and the well-known Wrappers Method)
     Speed Comparison:
     • The Random Sets method is considerably faster than the Wrapper approach.
     • The average difference in feature selection time between the Random Sets method and the Wrapper method is 2 hours and 11 minutes.
     Accuracy Comparison:
     • 3.4% less accurate than the best achieved accuracy
     • 1.96% less accurate than the best achieved accuracy
     • 1.54% less accurate than the best achieved accuracy

     Testing the Hypothesis
     • Features from the fastest random sets are unequivocally better than features from the slower sets. This supports our hypothesis.
     • The superiority of the features is clearly seen by comparing the unions of 3 sets: the union of the fastest 3 sets is more accurate than the union of the slowest 3 sets, despite having fewer features to work with.

     Conclusion
     As has been shown, using random feature sets as a feature selection tool provides benefits for learning algorithms. Real-time application is one of the greatest benefits, perhaps allowing a limited feature selection algorithm to be run as new data is gathered. The random sets approach is fast, is very accurate in certain situations, and takes great advantage of parallel processing.

     References
     • Tong Luo, Kurt Kramer, Dmitry B. Goldgof, Lawrence O. Hall, Scott Samson, Andrew Remsen, Thomas Hopkins, "Recognizing Plankton from Shadow Image Particle Profiling Evaluation Recorder", IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 4, August 2004.
     • S. Samson, T. Hopkins, A. Remsen, L. Langebrake, T. Sutton, J. Patten, "A system for high resolution zooplankton imaging", IEEE Journal of Oceanic Engineering, vol. 26, no. 4, pages 671-676, 2001.
     • Ron Kohavi and George H. John, "Wrappers for Feature Subset Selection", Artificial Intelligence, vol. 97, pages 273-324, December 1997.
     • Kurt A. Kramer, "Identifying Plankton from Grayscale Silhouette Images", Master's Thesis, USF, October 2005.
     • Chih-Chung Chang and Chih-Jen Lin, "A Library for Support Vector Machines", libsvm, http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
     • T. Luo, K. Kramer, D. Goldgof, L. Hall, S. Samson, A. Remsen, T. Hopkins, "Active Learning to Recognize Multiple Types of Plankton", International Conference on Pattern Recognition (ICPR), Cambridge, UK, August 2004.
