Genetic Programming With Boosting for Ambiguities in Regression Problem

Grégory Paris

Laboratoire d’informatique du Littoral

Université du Littoral-côte d’Opale

62228 Calais Cedex, France

What Are Ambiguities?

For a given x, several values are possible for f(x).

Contents

- Boosting to get several values
- Boosting in a few words
- GPboost: our algorithm for regression problems
- Boosting deals with ambiguities, clusters the data
- Dealing with several values: Dendrograms
- Presentation
- Application
- Results and conclusion

Presentation of Boosting

- Introduced by Freund and Schapire in the 1990s
- Improves machine learning methods
- Designed for weak learners (methods that perform better than a random search)
- A decrease of the error on the learning set is guaranteed
- Builds several hypotheses on different distributions
- Makes them vote to get a final hypothesis

Boosting and GP

- Iba’s version (1999): distributions are used to build the fitness set
- Our version (2001): the distribution is included in the fitness function
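The difference between the two versions can be sketched as follows. This is only an illustration: the function names, the absolute-error measure, and the toy data are assumptions, not taken from the slides.

```python
import random

def resampled_fitness_set(examples, weights, rng):
    """Distribution used to BUILD the fitness set:
    draw examples with replacement, in proportion to their weights."""
    return rng.choices(examples, weights=weights, k=len(examples))

def weighted_fitness(f, examples, weights):
    """Distribution INCLUDED in the fitness function:
    every example is kept, and its absolute error is scaled by its weight."""
    return sum(w * abs(f(x) - y) for (x, y), w in zip(examples, weights))

# Hypothetical toy data: three (x, y) examples and their weights.
examples = [(0.0, 0.0), (1.0, 1.0), (2.0, 5.0)]
weights = [0.25, 0.25, 0.5]

def identity(x):
    return x

print(resampled_fitness_set(examples, weights, random.Random(0)))
print(weighted_fitness(identity, examples, weights))  # only the third example errs: 0.5 * 3.0
```

With resampling, an example with a small weight may be absent from a given fitness set; with the weighted fitness, every example always contributes, in proportion to its weight.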

GPboost (notation)

Fitness set: the set of examples used for learning

Distribution: each example has a weight

The initial weight is the same for each example

« Weak learner »: a GP algorithm that includes the distribution in its fitness function

The weak learner will be run T times (T rounds of boosting) with different distributions

GPboost (main loop)

For each round t = 1, ..., T do

Run the weak learner using the distribution of round t

Keep the best-of-run function of round t

Compute the error of this function on the fitness set and the confidence given to it

Update the distribution for the next round, dividing by a normalization factor

End For
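The loop can be sketched as follows. The exact formulas for the error, the confidence, and the normalization factor are not given here, so this sketch assumes an AdaBoost.R2-style update, which may differ from GPboost. The weak learner is a weighted least-squares line standing in for a GP run, and all names are hypothetical.

```python
import math

def weighted_line_fit(xs, ys, weights):
    """Stand-in weak learner (NOT genetic programming): weighted least-squares line."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = cov / var if var > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def boost(xs, ys, weak_learner, rounds):
    """Assumed AdaBoost.R2-style loop; returns functions, confidences, final weights."""
    n = len(xs)
    dist = [1.0 / n] * n                      # uniform initial weights
    functions, confidences = [], []
    for _ in range(rounds):
        f = weak_learner(xs, ys, dist)        # best-of-run of this round
        errors = [abs(f(x) - y) for x, y in zip(xs, ys)]
        max_error = max(errors)
        if max_error == 0.0:                  # perfect fit: nothing to reweight
            functions.append(f)
            confidences.append(1.0)           # arbitrary confidence (assumption)
            break
        losses = [e / max_error for e in errors]
        avg_loss = sum(d * l for d, l in zip(dist, losses))
        if avg_loss >= 0.5:                   # AdaBoost.R2 stops on a too-weak round
            break
        beta = avg_loss / (1.0 - avg_loss)
        functions.append(f)
        confidences.append(math.log(1.0 / beta))
        # Well-matched points (small loss) are scaled down the most.
        dist = [d * beta ** (1.0 - l) for d, l in zip(dist, losses)]
        z = sum(dist)                         # normalization factor
        dist = [d / z for d in dist]
    return functions, confidences, dist

# Hypothetical data: a straight line with one point far off it.
xs = [float(i) for i in range(10)]
ys = list(xs)
ys[5] = 15.0
functions, confidences, dist = boost(xs, ys, weighted_line_fit, rounds=3)
print(len(functions), confidences, dist)
```

After the first round, the badly matched point carries the largest weight, so the next run of the weak learner is pushed towards it.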

GPboost (final hypothesis)

Each function gives a value for x

A median weighted by the confidence values is computed

Other medians provide similar results
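A confidence-weighted median can be computed as in the sketch below. It assumes the "lower" weighted median (the smallest value at which the cumulative confidence reaches half of the total) and non-negative confidences; the function name is hypothetical.

```python
def weighted_median(values, confidences):
    """Smallest value whose cumulative confidence reaches half of the total.

    Assumes non-negative confidences, one per value.
    """
    pairs = sorted(zip(values, confidences))
    half = sum(confidences) / 2.0
    running = 0.0
    for value, confidence in pairs:
        running += confidence
        if running >= half:
            return value
    raise ValueError("no values given")

# Three functions give 1.0, 2.0 and 10.0 for the same x.
print(weighted_median([1.0, 2.0, 10.0], [1.0, 1.0, 1.0]))  # equal confidences: 2.0
print(weighted_median([1.0, 2.0, 10.0], [1.0, 1.0, 3.0]))  # the third dominates: 10.0
```

Unlike a weighted mean, the result is always one of the values actually produced by a function, and a single very confident function can decide the outcome.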

Using Boosting (1)

- The principle of boosting is to focus on the points which were not matched in the previous rounds
- With ambiguities, not all the points can be matched by a single function
- The weights are used to focus alternately on the ambiguities

[Figure: "Target" and "rms" curves]

- We are seeking a fitness function which will focus on extrema rather than average points
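A small numeric check shows why an rms fitness is a poor fit for ambiguities. It assumes a hypothetical ambiguity with two valid targets, -1 and +1, for the same x, and searches for the constant prediction with the lowest rms error.

```python
import math

def rms(prediction, targets):
    """Root mean square error of one prediction against several valid targets."""
    return math.sqrt(sum((prediction - t) ** 2 for t in targets) / len(targets))

targets = [-1.0, 1.0]                            # hypothetical ambiguity at one x
candidates = [c / 10.0 for c in range(-15, 16)]  # -1.5, -1.4, ..., 1.5
best = min(candidates, key=lambda c: rms(c, targets))
print(best, rms(best, targets))                  # 0.0 1.0
```

The rms criterion prefers the average point 0, which matches neither target, over either extremum.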

Application

- We run GPboost on this ambiguity problem
- We use our fitness function
- We set T=6, the number of rounds

Merging the data

- We are given 6 functions
- For a given x, we can provide 6 values
- We have to find a way to pick 2 values among the 6
- We propose dendrograms to solve this problem

Dendrogram

- T values
- Cluster the set of values and take the median of each cluster
- To cluster the values, we build a dendrogram
- Start with T clusters
- At each step, group the two nearest clusters

Dendrogram (Building)

S={-1.1; -1; 0; 0.15; 1; 1.05}

Cut the dendrogram

- The dendrogram must be cut off at a height corresponding to the number of values we want.
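Building and cutting the dendrogram can be sketched on the set S. The distance between two clusters is not specified, so this sketch assumes single linkage (the gap between their closest members); the function name is hypothetical.

```python
from statistics import median

def cluster_values(values, k):
    """Group the two nearest clusters until k clusters remain.

    Assumes single linkage. In one dimension the two nearest clusters
    are always neighbours once the values are sorted.
    """
    clusters = [[v] for v in sorted(values)]   # start with one cluster per value
    while len(clusters) > k:
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))              # position of the smallest gap
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters

S = [-1.1, -1.0, 0.0, 0.15, 1.0, 1.05]
for k in (3, 2):
    groups = cluster_values(S, k)
    print(k, groups, [median(g) for g in groups])
```

Cutting at 3 clusters gives the pairs {-1.1, -1}, {0, 0.15} and {1, 1.05}, with medians of about -1.05, 0.075 and 1.025. Cutting at 2 clusters merges the last two groups.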

Computing cut-off values

- A fixed cut-off value gives better results but needs a priori knowledge of the problem
- Dynamic cut-off value
- The number of values is computed so as to reduce the error made on the fitness set for each ambiguity

Other Benchmarks

- Inverting

Conclusion and Future Work

- Good results on classical and simple problems

To do

- Improving cut-off value
- Applying to real problems
