

Genetic Programming With Boosting for Ambiguities in Regression Problem

Grégory Paris

Laboratoire d’informatique du Littoral

Université du Littoral Côte d’Opale

62228 Calais Cedex, France

[email protected]


What Are Ambiguities?

For a given x, several values are possible for f(x): the data to fit form a relation rather than a function. Inverting y = x², for instance, yields two possible values, +√x and −√x, for every positive x.
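A minimal sketch of building such an ambiguous fitness set (my own example, assuming NumPy):

    import numpy as np

    # Invert y = x**2 on [-1, 1]: for each input x = y**2, both y and -y
    # are valid outputs, so the target "function" is two-valued.
    rng = np.random.default_rng(0)
    y = rng.uniform(-1.0, 1.0, size=200)
    x = y ** 2

    # The fitness set pairs each x with only ONE of its possible values,
    # yet both branches +sqrt(x) and -sqrt(x) are present in the data.
    fitness_set = list(zip(x, y))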


Contents

  • Boosting to get several values
    • Boosting in a few words
    • GPboost: our algorithm for regression problems
    • How boosting deals with ambiguities and clusters the data
  • Dealing with several values: dendrograms
    • Presentation
    • Application
  • Results and conclusion


Presentation of Boosting

  • Introduced by Freund and Schapire in the 1990s
  • Improves existing machine-learning methods
  • Applies to weak learners (methods that perform better than random guessing)
  • The decrease of the error on the learning set is guaranteed
  • Builds several hypotheses on different distributions of the examples
  • Makes them vote to obtain a final hypothesis


Boosting and GP

  • Iba’s version (1999): the distributions are used to build the fitness set
  • Our version (2001): the distribution is included in the fitness function


GPboost (notation)

Fitness set: S = {(x_1, y_1), …, (x_m, y_m)}

Distribution: each example (x_i, y_i) has a weight D_t(i); the initial weight is D_1(i) = 1/m for each example

« Weak learner » W: a GP algorithm including the distribution in its fitness function; W will be run T times (T rounds of boosting) with different distributions

Fitness function: the GP error measure weighted by the distribution (a sketch follows)
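The weighted fitness can take several forms; a natural one (an assumption on my part, not necessarily the exact formula from the talk) is a distribution-weighted RMS error:

    \mathrm{fitness}(f, D_t) = \sqrt{\sum_{i=1}^{m} D_t(i)\,\bigl(f(x_i) - y_i\bigr)^2}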


GPboost (main loop)

For t = 1 to T do

  Run W using the distribution D_t; the best-of-run function is denoted f_t

  ε_t: the error of f_t on the fitness set under D_t

  α_t: the confidence given to function f_t (computed from ε_t)

  Z_t: normalization factor

  Update the distribution for the next round so that the examples poorly matched by f_t gain weight, and divide by Z_t to normalize (a sketch follows)

End For
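A compact Python sketch of the loop, written in the style of Drucker's AdaBoost.R2 (the loss, confidence, and update formulas below are my assumptions, and run_gp stands in for the GP weak learner W):

    import numpy as np

    def gpboost(run_gp, X, y, T=6):
        """AdaBoost.R2-flavoured sketch of a GPboost-style loop."""
        m = len(X)
        D = np.full(m, 1.0 / m)        # initial weight 1/m for each example
        hypotheses, confidences = [], []
        for t in range(T):
            f_t = run_gp(X, y, D)      # run W using D_t; best-of-run is f_t
            residuals = np.abs(f_t(X) - y)
            loss = residuals / residuals.max()   # per-example loss in [0, 1]
            eps = float(np.dot(D, loss))         # error of f_t under D_t
            beta = eps / (1.0 - eps)
            alpha = np.log(1.0 / beta)           # confidence alpha_t given to f_t
            D = D * beta ** (1.0 - loss)         # badly matched points gain weight
            D /= D.sum()                         # Z_t: normalization factor
            hypotheses.append(f_t)
            confidences.append(alpha)
        return hypotheses, confidences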


GPboost (final hypothesis)

Each function f_t gives a value f_t(x) for a given x

A median of these T values, weighted by the confidence values α_t, is computed

Other medians provide similar results
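The confidence-weighted median itself is short; a minimal sketch (names are mine):

    import numpy as np

    def weighted_median(values, weights):
        """Value whose cumulative weight first reaches half the total weight."""
        order = np.argsort(values)
        v, w = np.asarray(values)[order], np.asarray(weights)[order]
        return v[np.searchsorted(np.cumsum(w), 0.5 * w.sum())]

    # e.g. combining the outputs of T = 3 hypotheses at one point x:
    print(weighted_median([0.9, 1.1, -1.0], [2.0, 1.0, 0.5]))   # -> 0.9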


Using Boosting (1)

  • The principle of boosting is to focus on the points that were not matched in previous rounds
  • With ambiguities, all the points cannot be matched by a single function
  • The weights can thus be used to alternately focus on the different values of an ambiguity


Using Boosting (2)

[Figure: e.g. the target function and its rms fit]

  • We are seeking a fitness function that focuses on extrema rather than on average points
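The drift toward average points is easy to verify: under squared error, the best single prediction at an ambiguous x is the mean of its possible values. A tiny check (my own example):

    import numpy as np

    # Two valid outputs at the same x: y = -1 and y = +1.
    ys = np.array([-1.0, 1.0])
    candidates = np.linspace(-1.5, 1.5, 301)
    rms = [np.sqrt(np.mean((c - ys) ** 2)) for c in candidates]
    print(candidates[int(np.argmin(rms))])   # -> 0.0, the average of the branches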


Application

  • We run GPboost on this ambiguous problem
  • We use our extrema-focused fitness function
  • We set the number of rounds to T = 6


Run of Boosting


Merging the data

  • We are given 6 functions
  • For a given x, we can therefore provide 6 values
  • We have to find a way to pick 2 values among the 6
  • We propose dendrograms to solve this problem


Dendrogram

  • For a given x, we have T values
  • Cluster the set of values and take the median of each cluster
  • To cluster the values, we build a dendrogram:
    • Start with T clusters
    • At each step, group the two nearest clusters


Dendrogram (Building)

S={-1.1; -1; 0; 0.15; 1; 1.05}
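A minimal sketch of the construction on this set (my own illustration; the linkage criterion, distance between cluster medians, is an assumption):

    S = [-1.1, -1, 0, 0.15, 1, 1.05]

    def median(c):
        c = sorted(c)
        n = len(c)
        return c[n // 2] if n % 2 else 0.5 * (c[n // 2 - 1] + c[n // 2])

    # Start with one cluster per value; repeatedly merge the two nearest clusters.
    clusters = [[v] for v in S]
    while len(clusters) > 1:
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda p: abs(median(clusters[p[0]]) - median(clusters[p[1]])),
        )
        print("merge", clusters[i], "+", clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [clusters[i] + clusters[j]]

The tight pairs {-1.1; -1}, {0; 0.15} and {1; 1.05} are merged first, so cutting the dendrogram at the right height recovers them as clusters.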


Cut the dendrogram

  • The dendrogram must be cut off at a height corresponding to the number of values we want.


Computing cut-off values

  • A fixed cut-off value gives better results but requires a priori knowledge of the problem
  • Dynamic cut-off value: the number of values is computed so as to reduce the error made on the fitness set at each ambiguity (see the sketch below)
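One way to make that choice concrete, as a hedged sketch (my reading of the slide; cut_to_k splits at the largest gaps, which is an assumption, and median is reused from the sketch above):

    def cut_to_k(values, k):
        """Cut the dendrogram into k clusters: split sorted values at the k-1 largest gaps."""
        vs = sorted(values)
        gaps = sorted(range(len(vs) - 1), key=lambda i: vs[i + 1] - vs[i], reverse=True)
        bounds = sorted(i + 1 for i in gaps[: k - 1])
        return [vs[a:b] for a, b in zip([0] + bounds, bounds + [len(vs)])]

    def dynamic_cut(values, ys_at_x, candidates=(1, 2, 3)):
        """Keep the cluster count whose medians best match the example outputs at x."""
        best_k, best_err = 1, float("inf")
        for k in candidates:
            medians = [median(c) for c in cut_to_k(values, k)]
            err = sum(min(abs(y_ex - m) for m in medians) for y_ex in ys_at_x)
            if err < best_err:
                best_k, best_err = k, err
        return best_k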




Results

[Figures: results with the dynamic cut-off value and with the static cut-off value]


Other Benchmarks

  • Inverting [figures: two further inversion benchmarks]


Conclusion and Future Work

  • Good results on classical, simple problems

To do:

  • Improving the cut-off value
  • Applying the method to real problems

