Gene Expression Programming for Data Mining and Knowledge Discovery

Investigators: Peter Nelson, CS; Xin Li, CS; Chi Zhou, Motorola Inc. Prime Grant Support: Physical Realization Research Center of Motorola Labs. Genotype: sqrt.*.+.*.a.*.sqrt.a.b.c./.1.-.c.d


Presentation Transcript


  1. Gene Expression Programming for Data Mining and Knowledge Discovery
     Investigators: Peter Nelson, CS; Xin Li, CS; Chi Zhou, Motorola Inc.
     Prime Grant Support: Physical Realization Research Center of Motorola Labs

     • Real-world data mining tasks: large data sets, high-dimensional feature sets, and non-linear forms of hidden knowledge; effective algorithms are needed.
     • Gene Expression Programming (GEP): a new evolutionary computation technique for the creation of computer programs; capable of producing solutions of any possible form.
     • Research goal: applying and enhancing the GEP algorithm to fulfill complex data mining tasks.

     [Figure 1. Representations of solutions in GEP. Genotype: sqrt.*.+.*.a.*.sqrt.a.b.c./.1.-.c.d. The phenotype and mathematical form panels are not reproduced in this transcript.]

     • Overview: improving the problem-solving ability of the GEP algorithm by preserving and utilizing the self-emergence of structures during its evolutionary process.
     • Constant creation methods for GEP: local optimization of constant coefficients, given the evolved solution structures, to speed up the learning process.
     • A new hierarchical genotype representation: a natural hierarchy in forming the solution and more protective genetic operations for functional components.
     • Dynamic substructure library: defining and reusing self-emergent substructures in the evolutionary process.

     • The initial implementation of the proposed approaches is finished.
     • Preliminary testing has demonstrated the feasibility and effectiveness of the implemented methods: the constant creation methods have achieved significant improvement in the fitness of the best solutions; the dynamic substructure library helps identify meaningful building blocks that incrementally form the final solution, following a faster fitness convergence curve.
     • Future work includes investigating parametric constants, exploring higher-level emergent structures, and running comprehensive benchmark studies.
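
The genotype string on the slide can be turned into an expression tree (the phenotype) with a short decoder. The sketch below is illustrative only and is not the investigators' implementation. It assumes the genotype is a dot-separated K-expression read breadth-first, as is usual in GEP, and that `sqrt` takes one argument while `+`, `-`, `*` and `/` take two. The names `ARITY`, `decode`, `to_infix` and `evaluate` are hypothetical.

```python
import math
from collections import deque

# Assumed arities of the function symbols; any other symbol is a terminal.
ARITY = {"sqrt": 1, "+": 2, "-": 2, "*": 2, "/": 2}

GENOTYPE = "sqrt.*.+.*.a.*.sqrt.a.b.c./.1.-.c.d"


def decode(genotype):
    """Build a nested-list tree [symbol, child, ...] by reading the
    genotype left to right and filling the tree level by level."""
    symbols = genotype.split(".")
    root = [symbols[0]]
    queue = deque([root])
    pos = 1
    while queue:
        node = queue.popleft()
        for _ in range(ARITY.get(node[0], 0)):
            child = [symbols[pos]]
            pos += 1
            node.append(child)
            queue.append(child)
    return root


def to_infix(node):
    """Render the tree as a fully parenthesized infix string."""
    sym, children = node[0], node[1:]
    if not children:
        return sym
    if len(children) == 1:
        return f"{sym}({to_infix(children[0])})"
    return f"({to_infix(children[0])} {sym} {to_infix(children[1])})"


def evaluate(node, env):
    """Evaluate the tree; terminals are looked up in env or parsed as numbers.
    No protection against division by zero or negative square roots."""
    sym = node[0]
    args = [evaluate(child, env) for child in node[1:]]
    if sym == "sqrt":
        return math.sqrt(args[0])
    if sym == "+":
        return args[0] + args[1]
    if sym == "-":
        return args[0] - args[1]
    if sym == "*":
        return args[0] * args[1]
    if sym == "/":
        return args[0] / args[1]
    if sym in env:
        return env[sym]
    return float(sym)


tree = decode(GENOTYPE)
print(to_infix(tree))
print(evaluate(tree, {"a": 2, "b": 1, "c": 7, "d": 3}))
```

Under these assumptions the slide's genotype decodes to `sqrt(((a + (b * c)) * (sqrt((1 / (c - d))) * a)))`, which evaluates to 3.0 for a=2, b=1, c=7, d=3. The decoder stops once every function node has its arguments, so any unused trailing symbols in a longer gene are ignored.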
