
Transformation of Input Space using Statistical Moments : EA-Based Approach

Transformation of Input Space using Statistical Moments: EA-Based Approach. Ahmed Kattan: Um Al Qura University, Saudi Arabia; Michael Kampouridis: University of Kent, UK; Yew-Soon Ong: Nanyang Technological University, Singapore


Presentation Transcript


  1. Transformation of Input Space using Statistical Moments: EA-Based Approach. Ahmed Kattan: Um Al Qura University, Saudi Arabia; Michael Kampouridis: University of Kent, UK; Yew-Soon Ong: Nanyang Technological University, Singapore; Khalid Mehamdi: Um Al Qura University, Saudi Arabia

  2. The problem • Standard regression models are presented with observational data of the form (xi, yi), i = 1…n. • Each xi denotes a k-dimensional input vector of design variables and yi is the response. • When k ≫ n, high variance and over-fitting become a major concern.

  3. The problem High dimensional regression problem Regression Model Poor approximation

  4. Solutions • The curse of dimensionality is addressed by: • Reducing the number of dimensions by selecting important features (e.g., PCA, FDA, etc.) • Transforming the input space (e.g., GP, FFX, etc.) The idea of transforming the input space to reduce the number of design variables in regression problems, and so improve generalisation, is relatively little explored thus far. The majority of work on this topic has been done for classification problems.

  5. Contributions of this work • A novel evolutionary approach to transform the high-dimensional input space of regression models using only statistical moments. • An analysis to understand the impact of different statistical moments on the evolved transformation procedure. • The transformation dramatically improves LR’s generalisation and makes it competitive with other state-of-the-art regression models.

  6. The proposed transformation We transform the input vector x into a vector called z, so that each pair (xi, yi) becomes (zi, yi). The vector z is smaller than x and easier for standard regression models to approximate. [Diagram: inputs x0, x1, …, xk enter the transformation, which outputs z0, z1, …, zn]

  7. The proposed transformation We used a standard Genetic Algorithm (GA).

  8. Genetic Algorithm Population representation
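
The representation figure is not part of this transcript. As a minimal sketch, assume (this is not confirmed by the slides) that an individual is a sequence of genes op0 … opg, that each gene pairs one statistical moment with a list of input indices a_j, and that each gene produces one component of z. The names `MOMENTS`, `random_gene`, `random_individual`, and `transform` are hypothetical.

```python
import math
import random
import statistics

def average_deviation(values):
    """Mean absolute deviation from the mean."""
    m = statistics.fmean(values)
    return sum(abs(v - m) for v in values) / len(values)

def geometric_mean(values):
    # Assumption: absolute values plus a tiny offset keep the log defined.
    return math.exp(sum(math.log(abs(v) + 1e-12) for v in values) / len(values))

# Hypothetical pool of moments; the real pool used by the authors may differ.
MOMENTS = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
    "average_deviation": average_deviation,
    "geometric_mean": geometric_mean,
}

def random_gene(k, rng, max_params=5):
    """One gene: a moment name plus the input indices it is computed over."""
    size = rng.randint(1, min(max_params, k))
    return (rng.choice(sorted(MOMENTS)), [rng.randrange(k) for _ in range(size)])

def random_individual(k, n_genes, rng):
    """An individual is a list of genes; its length is the dimension of z."""
    return [random_gene(k, rng) for _ in range(n_genes)]

def transform(individual, x):
    """Map a k-dimensional input x to z, one component per gene."""
    return [MOMENTS[name]([x[i] for i in idx]) for name, idx in individual]

# A hand-written individual that maps 4 inputs to 3 outputs.
individual = [("mean", [0, 1, 2, 3]), ("max", [1, 3]), ("min", [0, 2])]
z = transform(individual, [1.0, 4.0, -2.0, 5.0])
```

Under this encoding the number of genes directly sets the dimensionality of z, which is one way the search could shrink the input space.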

  9. Genetic Algorithm – Search operators Crossover, in which two individuals exchange statistical moments and their parameters at random. [Diagram: two parent individuals, each a sequence of operators op0, op1, op2, …, opg with parameter lists such as a0 a2 a3 a7 …, shown before and after the exchange]
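
A minimal sketch of such a crossover, assuming each gene is a `(moment_name, parameter_list)` pair, both parents have the same number of genes, and genes are swapped position by position with probability 0.5. The exact exchange scheme is not given on the slide, so `crossover` and `swap_prob` are illustrative names only.

```python
import random

def crossover(parent_a, parent_b, rng, swap_prob=0.5):
    """Exchange whole genes (a moment and its parameters) between two parents."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < swap_prob:
            gene_a, gene_b = gene_b, gene_a
        # Copy the parameter lists so children do not share state with parents.
        child_a.append((gene_a[0], list(gene_a[1])))
        child_b.append((gene_b[0], list(gene_b[1])))
    return child_a, child_b

parent_a = [("mean", [0, 2, 3]), ("max", [7, 5]), ("min", [8, 2])]
parent_b = [("min", [0, 2, 7]), ("mean", [0, 5, 6]), ("max", [7, 9])]
child_a, child_b = crossover(parent_a, parent_b, random.Random(1))
```

Swapping a moment together with its parameters keeps every gene internally consistent, so no repair step is needed after the exchange.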

  10. Genetic Algorithm – Search operators Aggressive mutation operator, which replaces a randomly selected statistical moment and its parameters with another moment drawn at random from the pool of statistical moments. [Diagram: an individual op0, op1, op2, …, opg in which op0 and its parameters are replaced by a new op0 with parameters a4 a3 a9 …]
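
A minimal sketch of the aggressive mutation, assuming genes are `(moment_name, parameter_list)` pairs and parameters are input indices in `range(k)`. The pool contents and the cap `max_params` are assumptions made for illustration.

```python
import random

# Hypothetical pool of moment names; the authors' pool is not listed in full.
MOMENT_POOL = ["mean", "min", "max", "average_deviation", "geometric_mean"]

def aggressive_mutation(individual, k, rng, max_params=5):
    """Replace one randomly chosen gene with a freshly sampled moment and parameters."""
    mutant = [(name, list(idx)) for name, idx in individual]  # copy, leave parent intact
    position = rng.randrange(len(mutant))
    size = rng.randint(1, min(max_params, k))
    mutant[position] = (rng.choice(MOMENT_POOL),
                        [rng.randrange(k) for _ in range(size)])
    return mutant

parent = [("mean", [0, 2, 3]), ("max", [7, 5]), ("min", [8, 2])]
mutant = aggressive_mutation(parent, k=10, rng=random.Random(3))
```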

  11. Genetic Algorithm – Search operators Smooth mutation operator, where a parameter of a randomly selected statistical moment is mutated into a new parameter. [Diagram: an individual op0, op1, op2, …, opg in which a single parameter of one operator is replaced by a4]
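
A minimal sketch of the smooth mutation under the same assumed encoding (genes are `(moment_name, parameter_list)` pairs, parameters are input indices in `range(k)`). Only one parameter changes; the moment itself is kept.

```python
import random

def smooth_mutation(individual, k, rng):
    """Change a single parameter of one randomly chosen gene."""
    mutant = [(name, list(idx)) for name, idx in individual]  # copy, leave parent intact
    _, params = mutant[rng.randrange(len(mutant))]
    # Overwrite one parameter in place with a newly drawn input index.
    params[rng.randrange(len(params))] = rng.randrange(k)
    return mutant

parent = [("mean", [0, 2, 3]), ("max", [7, 5])]
mutant = smooth_mutation(parent, k=10, rng=random.Random(5))
```

Compared with the aggressive operator, this makes a small local step: the structure of the individual is unchanged and at most one index differs.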

  12. Genetic Algorithm – Fitness measure We used the average prediction error of Linear Regression (LR) as the fitness measure for the GA. LR is a very simple algorithm that considers the family of linear hypotheses.
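
The formula from the slide did not survive in this transcript. In its usual textbook form, the family of linear hypotheses over a transformed input z can be written as below; the symbols w (weights) and b (intercept) are chosen here for illustration and may not match the authors' notation.

```latex
h_{\mathbf{w},b}(\mathbf{z}) \;=\; \mathbf{w}^{\top}\mathbf{z} + b \;=\; \sum_{j} w_j z_j + b
```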

  13. Genetic Algorithm – Fitness measure • Why LR? • Hence, given these features, LR can push the GA’s evolutionary process to linearly align the transformed inputs with their outputs and to minimise the dimensionality of the new space.

  14. Genetic Algorithm – Fitness measure • The GA aims to minimise the following fitness function:

  15. Genetic Algorithm – Training • The data is split into two disjoint sets: training and validation. • LR is trained and evaluated on the training set using a two-fold cross-validation approach. • The best individual in each generation is further tested with the validation set. • We select the individual that yields the best performance on the validation set across the run.
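
A minimal sketch of a fitness evaluation in this spirit, assuming the fitness is the mean absolute error of LR averaged over two folds. The actual fitness function is not reproduced in this transcript, and any term it may contain for the size of z is not modelled here. All function names are hypothetical, and the tiny ridge term is an assumption added only for numerical stability.

```python
import random

def fit_linear_regression(Z, y, ridge=1e-9):
    """Least-squares fit of y ~ w.z + b via the normal equations."""
    rows = [list(z) + [1.0] for z in Z]  # append the intercept column
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, d):
            factor = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= factor * A[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution; the last coefficient is the intercept.
    coef = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, d))
        coef[i] = (rhs[i] - tail) / A[i][i]
    return coef

def predict(coef, z):
    return sum(w * v for w, v in zip(coef, z)) + coef[-1]

def mean_abs_error(coef, Z, y):
    return sum(abs(predict(coef, z) - t) for z, t in zip(Z, y)) / len(y)

def fitness(transform, X, y):
    """Average LR error over two folds of the transformed training data."""
    Z = [transform(x) for x in X]
    half = len(Z) // 2
    first, second = slice(0, half), slice(half, None)
    errors = []
    for train, test in ((first, second), (second, first)):
        coef = fit_linear_regression(Z[train], y[train])
        errors.append(mean_abs_error(coef, Z[test], y[test]))
    return sum(errors) / len(errors)

# Toy data: y is the sum of 6 inputs, so the mean of x is linearly aligned with y.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(20)]
y = [sum(x) for x in X]
good = fitness(lambda x: [sum(x) / len(x)], X, y)  # one moment over all inputs
bad = fitness(lambda x: [x[0]], X, y)              # copies a single input
```

In the toy example, a single moment (the mean) reduces six inputs to one and still lets LR fit the target almost exactly, whereas copying one input does not.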

  16. Empirical tests • We tested the effects of the transformation procedure on LR and compared the results against five regression models, some of them also combined with PCA, namely: • RBFN • RBFN + PCA • Kriging • Kriging + PCA • LR • LR + PCA • Piecewise LR • Genetic Programming • Genetic Programming + PCA

  17. Empirical tests We tested 5 benchmark functions: F1 = Rastrigin function, F2 = Schwefel function

  18. Empirical tests F3 = Michalewicz function, F4 = Sphere function, F5 = Dixon & Price function
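
The function definitions from the slides are not part of this transcript. The sketch below uses the commonly cited textbook forms of the five benchmarks; the authors' exact variants, domains, and constants (for example the Michalewicz steepness `m=10`) are assumptions.

```python
import math

def rastrigin(x):
    """F1: 10*d + sum(x_i^2 - 10*cos(2*pi*x_i)); minimum 0 at the origin."""
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

def schwefel(x):
    """F2: 418.9829*d - sum(x_i * sin(sqrt(|x_i|)))."""
    return 418.9829 * len(x) - sum(v * math.sin(math.sqrt(abs(v))) for v in x)

def michalewicz(x, m=10):
    """F3: -sum(sin(x_i) * sin(i * x_i^2 / pi)^(2m)), with i starting at 1."""
    return -sum(math.sin(v) * math.sin(i * v * v / math.pi) ** (2 * m)
                for i, v in enumerate(x, start=1))

def sphere(x):
    """F4: sum(x_i^2); minimum 0 at the origin."""
    return sum(v * v for v in x)

def dixon_price(x):
    """F5: (x_1 - 1)^2 + sum_{i=2..d} i * (2*x_i^2 - x_{i-1})^2."""
    return (x[0] - 1) ** 2 + sum(i * (2 * x[i - 1] ** 2 - x[i - 2]) ** 2
                                 for i in range(2, len(x) + 1))

BENCHMARKS = {"F1": rastrigin, "F2": schwefel, "F3": michalewicz,
              "F4": sphere, "F5": dixon_price}
```

Each function accepts a vector of any length, so the same definitions can be evaluated with 100, 500, or 1000 variables.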

  19. Empirical tests • For each test function, we trained all regression models to approximate the given function when the number of variables is: • 100 • 500 • 1000

  20. Empirical tests

  21. Empirical tests Approximation Quality Sphere function for 2 variables

  22. Empirical tests [Figure panels: LR approximation of the Sphere function after input transformation; LR approximation of the Sphere function]

  23. Learn from evolution

  24. Learn from evolution • It is clear from the heat maps that each problem has its unique characteristics. • Interestingly, there is a consensus among all maps that the following operators do not contribute to the construction of good transformation procedures. • copy • copy × intercept.

  25. Learn from evolution • Also, all maps agree that the following are important across all problems. • Average Deviation • Geometric Mean • Min • Max • We still do not have a full understanding of the effect of these moments on the transformed space. In future research we will focus on this aspect.

  26. Conclusions • In this work we presented: • A novel evolutionary approach to transform the high-dimensional input space of regression models using only statistical moments. • An analysis to understand the impact of different statistical moments on the evolved transformation procedure. • A transformation that dramatically improves LR’s generalisation and makes it competitive with other state-of-the-art regression models. We hope our results will inspire other researchers to build a deeper understanding and to discover the relations between simple statistical moments and good transformations.

  27. Thank you for paying attention!
