
System Identification and Curve Fitting with a Genetic Algorithm Hierarchy

Alice E. Smith and Mehmet Gulsen, Department of Industrial Engineering, University of Pittsburgh. INFORMS Fall 1997.


Presentation Transcript


  1. System Identification and Curve Fitting with a Genetic Algorithm Hierarchy. Alice E. Smith and Mehmet Gulsen, Department of Industrial Engineering, University of Pittsburgh. INFORMS Fall 1997.

  2. Curve Fitting • The process of fitting a closed-form function to a data set of independent variables and a dependent variable. It involves variable selection, closed-form function selection, and coefficient estimation. Used for: • System identification • Judging the strength of a relationship • Identifying the main variables and the interactions between variables • Interpolating/extrapolating to new data

  3. Conventional Approaches • Various regression techniques • Time series analysis • Spline fitting • Neural networks

  4. Genetic Algorithm Hierarchy • Upper Module: function and variable selection; passes candidate functions to the lower module • Lower Module: coefficient estimation; returns optimized coefficients for those functions

  5. Search Structure [Diagram labels: Upper GA Population (1 ... n), Lower GA Population (1 ... n1, 1 ... n2), Data, Lower GA Search, Upper GA Search]

  6. Genetic Search Process [Diagram labels: Initial Population; Top Half Selection; Uniform Selection; Offspring; Mutants; best (n); Final Population]
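
One plausible reading of the search process diagram is sketched below: offspring come from parents in the top half, mutants come from parents chosen uniformly, and the best n of the combined pool survive. The wiring between the diagram's boxes is an assumption, and `next_generation`, the toy error function, and the operator choices are illustrative names, not taken from the slides.

```python
import random

def next_generation(population, error, breed, mutate, n_offspring, n_mutants, rng):
    """One generation; smaller error is better. All names here are illustrative."""
    n = len(population)
    ranked = sorted(population, key=error)
    top_half = ranked[: max(2, n // 2)]
    # Offspring: both parents drawn uniformly from the top half (assumed wiring).
    offspring = [breed(*rng.sample(top_half, 2)) for _ in range(n_offspring)]
    # Mutants: parents drawn uniformly from the whole population (assumed wiring).
    mutants = [mutate(rng.choice(population), rng) for _ in range(n_mutants)]
    # Keep the best n of the combined pool as the next population.
    return sorted(population + offspring + mutants, key=error)[:n]

# Toy usage: minimize x**2 over real numbers.
rng = random.Random(0)
population = [rng.uniform(-10.0, 10.0) for _ in range(20)]
error = lambda x: x * x
breed = lambda a, b: (a + b) / 2.0
mutate = lambda x, rng: x + rng.uniform(-1.0, 1.0)

best_before = min(error(x) for x in population)
for _ in range(50):
    population = next_generation(population, error, breed, mutate, 10, 5, rng)
best_after = min(error(x) for x in population)
```

Because survivors are the best n of parents, offspring, and mutants together, the best error in this sketch can never get worse from one generation to the next.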

  7. Upper GA - Function Selection • Explore the possible functional forms that could represent the underlying relationship between the independent and dependent variables of a data set • Objective Function: Minimize the “adjusted” total error corresponding to the functional form. The adjustment penalizes more complex representations (more variables, higher order terms) • Stopping Criterion: Search is terminated when no improvement is observed for a specified number of generations

  8. Upper GA Function Selection - Encoding • Tree Structure [Figure: expression tree with nodes +, +, +, cos, *, 1, *, *]
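
A tree encoding of a candidate function can be sketched as nested tuples. The representation below is an assumption for illustration (the slides only show the tree picture): an inner node is `(operator, child, ...)`, and a leaf is a string such as `"x0"` (independent variable 0) or `"c2"` (coefficient slot 2, to be filled by the lower GA).

```python
import math

UNARY = {"cos": math.cos, "sin": math.sin, "exp": math.exp}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def evaluate(tree, x, coeffs):
    """Evaluate an expression tree at point x with the given coefficients."""
    if isinstance(tree, str):
        # Leaf: "x<i>" reads a variable, "c<i>" reads a coefficient.
        index = int(tree[1:])
        return x[index] if tree[0] == "x" else coeffs[index]
    op, *children = tree
    args = [evaluate(child, x, coeffs) for child in children]
    return UNARY[op](*args) if op in UNARY else BINARY[op](*args)

# Hypothetical candidate: c0 + c1 * cos(c2 * x0)
tree = ("+", "c0", ("*", "c1", ("cos", ("*", "c2", "x0"))))
value = evaluate(tree, x=[0.0], coeffs=[1.0, 2.0, 3.0])  # 1 + 2*cos(0)
```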

  9. Upper GA Function Selection - Penalty Function • Penalty Factor = 0.05 [Figure: expression tree with nodes +, +, +, cos, *, 1, *, *]
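
The slides give the penalty factor (0.05) but not the penalty formula itself. As a sketch only, the block below assumes a multiplicative penalty that grows with the number of nodes in the tree; both the functional form and the use of node count as the complexity measure are assumptions, and the trees are the nested-tuple form with string leaves.

```python
PENALTY_FACTOR = 0.05  # value shown on the slide

def count_nodes(tree):
    """Number of nodes in a nested-tuple tree; leaves are strings."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(count_nodes(child) for child in tree[1:])

def adjusted_error(raw_error, tree, penalty_factor=PENALTY_FACTOR):
    # Assumed form: error is inflated in proportion to tree size.
    return raw_error * (1.0 + penalty_factor * count_nodes(tree))

small = ("+", "c0", "x0")                        # 3 nodes
large = ("+", "c0", ("*", "c1", ("cos", "x0")))  # 6 nodes
```

With equal raw error, the smaller tree receives the lower adjusted error, which is the behavior the objective function slide describes.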

  10. Upper GA Function Selection - Crossover • Before: Parent 1, Parent 2 • After: Offspring 1, Offspring 2 [Figure: two parent expression trees exchanging subtrees; node labels include +, sin, cos, *, 1, /, ln]
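
Subtree crossover on expression trees can be sketched as follows. This is a generic subtree swap on nested tuples with string leaves, written under the assumption that the crossover points are chosen uniformly over all nodes; the slides do not state how the points are picked.

```python
import random

def subtree_paths(tree, path=()):
    """Paths (tuples of child indices) to every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent1, parent2, rng):
    """Swap one randomly chosen subtree between the two parents."""
    path1 = rng.choice(subtree_paths(parent1))
    path2 = rng.choice(subtree_paths(parent2))
    sub1, sub2 = get_subtree(parent1, path1), get_subtree(parent2, path2)
    return replace_subtree(parent1, path1, sub2), replace_subtree(parent2, path2, sub1)

def labels(tree):
    """Flat list of all node labels, used to check nothing is lost."""
    if isinstance(tree, str):
        return [tree]
    return [tree[0]] + [label for child in tree[1:] for label in labels(child)]

parent1 = ("+", ("cos", "x0"), "c0")
parent2 = ("*", "c1", ("ln", "x1"))
child1, child2 = crossover(parent1, parent2, random.Random(0))
```

The swap only rearranges material: the two offspring together contain exactly the node labels of the two parents.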

  11. Upper GA Function Selection - Mutation • Before: Parent 1 and a randomly generated tree • After: Mutant [Figure: expression trees with node labels +, exp, cos, *, 1]
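
Mutation by grafting a randomly generated tree onto a parent can be sketched like this. The operator sets, leaf names, depth limit, and the 0.3 probabilities are placeholders chosen for the example, not values from the slides.

```python
import random

UNARY = ["sin", "cos", "exp"]
BINARY = ["+", "*"]
LEAVES = ["x0", "c0"]

def random_tree(rng, depth):
    """Grow a random nested-tuple tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    if rng.random() < 0.5:
        return (rng.choice(UNARY), random_tree(rng, depth - 1))
    return (rng.choice(BINARY), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def mutate(tree, rng, max_depth=2):
    """Replace one randomly chosen subtree with a freshly generated tree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, max_depth)
    i = rng.randrange(1, len(tree))  # descend into one child
    return tree[:i] + (mutate(tree[i], rng, max_depth),) + tree[i + 1:]

def is_valid(tree):
    """Check that every operator has the right number of children."""
    if isinstance(tree, str):
        return tree in LEAVES
    op, *children = tree
    arity_ok = (op in UNARY and len(children) == 1) or (op in BINARY and len(children) == 2)
    return arity_ok and all(is_valid(child) for child in children)

parent = ("+", ("cos", "x0"), ("*", "c0", "x0"))
mutant = mutate(parent, random.Random(1))
```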

  12. Lower GA - Coefficient Estimation • Estimate the coefficients of a given closed-form function that minimize the total error over the set of data points • Objective Function: Minimize the total squared error summed over the K data points (K: number of data points) • Stopping Criterion: Search is terminated when no improvement is observed for a specified number of generations • Detailed results are published in the International Journal of Production Research, Vol. 33, No. 7, 1995
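
The lower GA objective, total squared error over the data points, can be written as a short function. The model `line` and the data below are made up for the example; any callable `f(x, coeffs)` would do.

```python
def total_squared_error(f, coeffs, data):
    """Sum of squared residuals of f over (x, y) pairs."""
    return sum((y - f(x, coeffs)) ** 2 for x, y in data)

# Hypothetical closed-form function: y = c0 + c1 * x
def line(x, coeffs):
    return coeffs[0] + coeffs[1] * x

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # generated from y = 1 + 2x
exact = total_squared_error(line, [1.0, 2.0], data)  # perfect fit
off = total_squared_error(line, [0.0, 2.0], data)    # each residual is 1
```

Other error metrics (absolute or maximum error) could be substituted by changing only the body of `total_squared_error`.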

  13. Lower GA Coefficient Estimation - Encoding [Figure: chromosome as a vector of coefficients C1 C2 C3 C4 C5]

  14. Lower GA - Selection/Breeding • Parents are selected for breeding uniformly from the superior half of the population • Each offspring coefficient is the arithmetic mean of the corresponding coefficients of the two parents
      Parent A:   45.876  32.958  12.098   -3.892  0.2356
      Parent B:   12.988  35.832   0.234  -12.984  2.4576
      Offspring:  29.432  34.395   6.166   -8.438  1.3466
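
The selection and breeding rule can be checked against the slide's own numbers. In this sketch, `select_parents` and `breed` are illustrative names; drawing two distinct parents without replacement is an assumption.

```python
import random

def select_parents(population, errors, rng):
    """Pick two distinct parents uniformly from the better half (lower error)."""
    order = sorted(range(len(population)), key=lambda i: errors[i])
    top_half = order[: max(2, len(order) // 2)]
    i, j = rng.sample(top_half, 2)
    return population[i], population[j]

def breed(parent_a, parent_b):
    """Offspring coefficients are the arithmetic mean of the parents' coefficients."""
    return [(a + b) / 2.0 for a, b in zip(parent_a, parent_b)]

# Parent values from the slide.
parent_a = [45.876, 32.958, 12.098, -3.892, 0.2356]
parent_b = [12.988, 35.832, 0.234, -12.984, 2.4576]
offspring = breed(parent_a, parent_b)
```

Up to floating-point rounding, `offspring` matches the slide's values 29.432, 34.395, 6.166, -8.438, 1.3466.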

  15. Lower GA - Mutation • Existing solutions are perturbed to explore new regions of the search space • The perturbation value is obtained by multiplying the current population range by a random factor [Figure: coefficient vector C1 C2 C3 C4 C5]
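
A range-scaled perturbation can be sketched as below. The slide says only that the perturbation is the population range times a random factor; taking the range per coefficient and drawing the factor uniformly from [-1, 1] are assumptions made for this example.

```python
import random

def coefficient_ranges(population):
    """Per-coefficient spread (max minus min) across the current population."""
    return [max(column) - min(column) for column in zip(*population)]

def mutate(individual, population, rng):
    """Perturb each coefficient by (population range) * (random factor)."""
    ranges = coefficient_ranges(population)
    # Assumed: factor drawn uniformly from [-1, 1].
    return [c + r * rng.uniform(-1.0, 1.0) for c, r in zip(individual, ranges)]

population = [[0.0, 10.0], [2.0, 10.0], [1.0, 14.0]]
ranges = coefficient_ranges(population)  # [2.0, 4.0]
mutant = mutate(population[0], population, random.Random(0))
```

A useful side effect of scaling by the population range is that the perturbations shrink on their own as the population converges.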

  16. Test Problem
      C     Run 1   Run 2   Run 3   Run 4   Run 5   Run 6    Mean  Sd.Dv.
      1     9.986   9.998  10.002  10.000   9.996  10.001   9.997   0.005
      2     9.999  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      3    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      4    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      5    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      6    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      7    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      8    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      9    10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      10   10.000  10.000  10.000  10.000  10.000  10.000  10.000   0.000
      SE.  0.0017   0.000  0.0000   0.000   0.000   0.000   0.000       -

  17. Test Problem: Different Error Metrics [Figure: Log10 of Squared Error versus Number of Generations (0 to 1500), with curves for the Squared Error, Absolute Error, and Maximum Error metrics]

  18. Test Problem Different Numbers of Data Points

  19. Empirical Data Sets • Five benchmark problems from the literature 1. onion growth 2. children growth 3. sunspots 4. chemical plant 5. slip casting • Single variable/50 observations to 13 variables/1000 observations • Nonlinear regression, time series analysis, model identification

  20. Test Problem 3, Sunspot Data • Sunspot data from 1700 to 1995 • Highly cyclic, with peaks and troughs recurring approximately every 11.1 years • The cycle is not symmetric: the count rises to its maximum faster than it falls to a minimum • Training range: 1700-1979 • Validation range: 1980-1995

  21. Functions Identified

  22. Model D

  23. Extrapolation of Model D

  24. Conclusions • A unique approach for curve fitting problems • Provides a closed-form function for the given data set • Can handle non-linear, discontinuous functions • Flexible in terms of error metric • Can be used separately for function selection and coefficient optimization • Computationally intensive and needs a priori setting of search parameters and penalty function components • Forthcoming paper: “A hierarchical genetic algorithm for system identification and curve fitting with a supercomputer implementation,” Mehmet Gulsen and Alice E. Smith, Institute for Mathematics and its Applications, Volumes in Mathematics and its Applications, Volume on Evolutionary Computing.
