
Genetic Programming With Boosting for Ambiguities in Regression Problem

Grégory Paris

Laboratoire d’informatique du Littoral

Université du Littoral-côte d’Opale

62228 Calais Cedex, France

[email protected]


What Are Ambiguities?

For a given x, several values are possible for f(x).


Contents

  • Boosting to get several values

    • Boosting in a few words

    • GPboost: our algorithm for regression problems

    • Boosting deals with ambiguities and clusters the data

  • Dealing with several values: Dendrograms

    • Presentation

    • Application

  • Results and conclusion


Presentation of Boosting

  • Introduced by Freund and Schapire in the 1990s

  • Improves machine learning methods

  • Applies to weak learners (methods that perform better than a random search)

  • A decrease of the error on the learning set is guaranteed

  • Builds several hypotheses on different distributions

  • Makes them vote to get a final hypothesis


Boosting and GP

  • Iba’s version (1999)

    • Distributions are used to build the fitness set

  • Our version (2001)

    • The distribution is included in the fitness function


GPboost (Notation)

Fitness set: the m learning examples

Distribution: each example has a weight

The initial weight is 1/m for each example

« Weak Learner »: a GP algorithm including the distribution in its fitness

It will be run T times (T rounds of boosting) with different distributions

Fitness function: computed on the fitness set, taking the weight of each example into account

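GPboost (Main Loop)

For t = 1, ..., T do

Run the weak learner using the distribution of round t

The best-of-run is denoted f_t

The confidence given to function f_t is computed from its error on the fitness set

Update the distribution for the next round (the weights are divided by a normalization factor so that they sum to 1)

End For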
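The loop can be sketched in a few lines. The update rule below follows the AdaBoost.R2 style (losses scaled by the largest error, weights multiplied by a power of beta, then normalised); whether GPboost uses exactly these formulas is an assumption, since the slide formulas are not available. `constant_learner` is a toy stand-in for a full GP run, and all names are hypothetical.

```python
def weighted_error(f, examples, weights):
    """Sum of absolute errors, each multiplied by its example weight."""
    return sum(abs(f(x) - y) * w for (x, y), w in zip(examples, weights))


def constant_learner(examples, weights, candidates=(-1.0, 1.0)):
    """Toy stand-in for the GP run: best constant function among candidates."""
    best = min(candidates,
               key=lambda c: weighted_error(lambda x: c, examples, weights))
    return lambda x: best


def boost(weak_learner, examples, rounds):
    """Run `rounds` rounds of boosting; return functions, betas, last weights."""
    m = len(examples)
    weights = [1.0 / m] * m                 # uniform initial distribution
    functions, betas = [], []
    for _ in range(rounds):
        f = weak_learner(examples, weights)  # best-of-run of this round
        errors = [abs(f(x) - y) for x, y in examples]
        max_error = max(errors)
        if max_error == 0.0:                 # perfect fit: nothing left to reweight
            functions.append(f)
            betas.append(1e-12)
            break
        losses = [e / max_error for e in errors]          # each loss in [0, 1]
        avg_loss = sum(l * w for l, w in zip(losses, weights))
        avg_loss = min(max(avg_loss, 1e-12), 1.0 - 1e-12)  # keep beta finite
        beta = avg_loss / (1.0 - avg_loss)   # smaller beta = more confidence
        functions.append(f)
        betas.append(beta)
        # Well-fitted examples (small loss) lose weight; badly fitted ones keep it.
        weights = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
        z = sum(weights)                     # normalization factor
        weights = [w / z for w in weights]
    return functions, betas, weights


# Ambiguous toy data: both +1 and -1 occur as targets.
examples = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (0.0, -1.0), (1.0, -1.0)]
functions, betas, weights = boost(constant_learner, examples, rounds=1)
```

After one round the learner has picked the constant +1, and the two examples with target -1 carry more weight than before, so the next round is pushed toward the other branch.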


GPboost (Final Hypothesis)

Each function gives a value for x

A median weighted by the confidence values is computed

Other medians provide similar results
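A weighted median can be computed as in the sketch below. It assumes each function's confidence is available as a non-negative number (how GPboost derives it is not shown here) and that at least one value is given; the name `weighted_median` is illustrative.

```python
def weighted_median(values, confidences):
    """Smallest value whose cumulated confidence reaches half of the total."""
    pairs = sorted(zip(values, confidences))
    half = 0.5 * sum(confidences)
    running = 0.0
    for value, confidence in pairs:
        running += confidence
        if running >= half:
            return value
    return pairs[-1][0]  # only reached through floating-point rounding


votes = [0.9, 1.1, -1.0]  # values given by three functions for the same x
equal = weighted_median(votes, [1.0, 1.0, 1.0])
trusted = weighted_median(votes, [1.0, 1.0, 5.0])
```

With equal confidences the result is the ordinary median (0.9); when the third function is trusted much more, its value (-1.0) wins the vote.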


Using Boosting (1)

  • The principle of boosting is to focus on the points that were not matched in the previous rounds

  • With ambiguities, no single function can match all the points

  • Weights are used to focus alternately on the different ambiguous values


Using Boosting (2)

  • We are seeking a fitness function which will focus on extrema rather than average points

  • e.g.

[Figure legend: Target, rms]
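The following small sketch shows why an rms fitness tends to favour average points on ambiguous data. The data set (targets +1 and -1 for every x) and the two constant candidates are made up for illustration.

```python
import math


def rms(predict, examples):
    """Root mean square error of predict over (x, y) examples."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in examples)
                     / len(examples))


# Ambiguous data: every x has two possible values, -1 and +1.
examples = [(x, s) for x in (0.0, 0.5, 1.0) for s in (-1.0, 1.0)]

middle = rms(lambda x: 0.0, examples)  # average curve, matches no point
upper = rms(lambda x: 1.0, examples)   # matches exactly half of the points
```

Here `middle` is smaller than `upper`: the rms fitness prefers the curve that passes between the two branches, although it matches none of the target points.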


Application

  • We run GPboost on this ambiguity problem

  • We use our fitness function

  • We set the number of rounds to T = 6


Run of Boosting


Merging the Data

  • We are given 6 functions

  • For a given x, we can provide 6 values

  • We have to find a way to pick 2 values among the 6

  • We propose dendrograms to solve this problem


Dendrogram

  • T values

  • Cluster the set of values and take the median of each cluster

  • To cluster the values, we build a dendrogram

  • Start with T clusters

  • At each step, group the two nearest clusters


Dendrogram (Building)

S={-1.1; -1; 0; 0.15; 1; 1.05}
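The building steps can be sketched on this set S. The sketch assumes the distance between two clusters is the distance between their closest members (single linkage); the slides do not say which distance is used, and `build_dendrogram` is a hypothetical name.

```python
def build_dendrogram(values):
    """Agglomerative clustering of 1-D values; returns the list of merges.

    Each merge is (height, left_cluster, right_cluster), where height is the
    distance between the two clusters (single linkage: closest members).
    """
    clusters = [[v] for v in sorted(values)]  # start with one cluster per value
    merges = []
    while len(clusters) > 1:
        # In sorted 1-D data the two nearest clusters are always adjacent.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        merges.append((gaps[i], clusters[i], clusters[i + 1]))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return merges


S = [-1.1, -1.0, 0.0, 0.15, 1.0, 1.05]
merges = build_dendrogram(S)
for height, left, right in merges:
    print(f"{height:.2f}: {left} + {right}")
```

The first merge groups 1 and 1.05, the closest pair; the last merge, at the greatest height, joins the cluster {-1.1, -1} with everything else.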


Cut the Dendrogram

  • The dendrogram must be cut off at a height corresponding to the number of values we want.
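Cutting the dendrogram so that k clusters remain, then taking the median of each cluster, can be sketched as follows. Single linkage is again an assumption, `cluster_medians` is a hypothetical name, and k is expected to lie between 1 and the number of values.

```python
from statistics import median


def cluster_medians(values, k):
    """Merge the two nearest clusters until k remain; return their medians."""
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > k:
        # Distance between adjacent clusters = gap between closest members.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return [median(c) for c in clusters]


S = [-1.1, -1.0, 0.0, 0.15, 1.0, 1.05]
two = cluster_medians(S, 2)
three = cluster_medians(S, 3)
```

For this set, asking for three values gives roughly -1.05, 0.075 and 1.025, while asking for two gives roughly -1.05 and 0.575, which shows how strongly the answer depends on where the dendrogram is cut.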


Computing Cut-off Values

  • A fixed cut-off value gives better results but needs a priori knowledge of the problem

  • Dynamic cut-off value

  • The number of values is computed so as to reduce the error made on the fitness set for each ambiguity


Computing Cut-off Value


Results

With dynamic cut-off value

With static cut-off value


Other Benchmarks

  • Inverting


Other Benchmarks

  • Inverting


Conclusion and Future Work

  • Good results on classical and simple problems

    To do

  • Improving cut-off value

  • Applying to real problems

