Digression: Symbolic Regression
  • Suppose you are a criminologist, and you have some data about recidivism.

Years in Prison   Holds Ph.D.   IQ    Injects Heroin in Eyeballs   Recidivist
10                0             87    1                            1
4                 1             86    0                            0
22                1             186   1                            1
6                 0             108   0                            1
8                 0             143   0                            0
:                 :             :     :                            :

Criminology 101
  • You want a formula that predicts if someone will go back to jail after being released.
  • The formula will be based on the data collected, so the “independent variables” are
    • x1 = number of years in jail
    • x2 = holds Ph.D.
    • x3 = IQ
    • etc.
  • This is usually done with “regression”. Here is a simpler example, with one independent variable.
Symbolic Regression
  • A simple data set with one independent variable, called x. What’s the relationship between x and y?

x    y
1    2.1
2    3.3
4    3.1
5    1.8
7    3.2
:    :

[Plot of y against x]

Symbolic Regression
  • You might try “linear regression:”

y = mx + b

[Plot of y against x]
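For the straight-line case the coefficients have a closed form, so a short sketch may help make "regression" concrete. The code below fits y = mx + b by ordinary least squares to the five (x, y) pairs shown on the earlier slide. The class and method names are made up for illustration, and using only the five visible points (the slide's table continues) is an assumption.

```java
// Ordinary least-squares fit of y = m*x + b.
// LeastSquaresDemo and fitLine are illustrative names, not course code.
class LeastSquaresDemo {
    /** Returns {m, b} minimising the sum of squared vertical errors. */
    static double[] fitLine(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxy += x[i] * y[i];
            sxx += x[i] * x[i];
        }
        double m = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double b = (sy - m * sx) / n;
        return new double[] {m, b};
    }

    public static void main(String[] args) {
        // Assumption: only the five points visible on the slide are used.
        double[] x = {1, 2, 4, 5, 7};
        double[] y = {2.1, 3.3, 3.1, 1.8, 3.2};
        double[] mb = fitLine(x, y);
        System.out.println("m = " + mb[0] + ", b = " + mb[1]);
    }
}
```

On these five points the slope works out to 6/114 (about 0.053) and the intercept to 2.5, so a straight line explains very little of the scatter here.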

Symbolic Regression
  • You might try “quadratic regression:”

y = ax^2 + bx + c

[Plot of y against x]

Symbolic Regression
  • You might try “exponential regression:”

y = ax^b + c

[Plot of y against x]

Symbolic Regression
  • How would you choose?
  • Maybe there is some underlying “mechanism” that produced the data.
  • But you may not know…
  • “Symbolic regression” finds the form of the equation, and the coefficients, simultaneously.
How To Do Symbolic Regression?
  • One way: genetic programming.
  • “The evolution of computer programs through natural selection.”
  • The brainchild of John Koza, extending work by John Holland.
  • A very bizarre idea that actually works!
  • We will do this.
Regression via Genetic Programming
  • We know how to produce “algebraic expression trees.”
  • We can even form them randomly.
  • Koza says “Make a generation of random trees, evaluate their fitnesses, then let the more fit have sex to produce children.”
  • Maybe the children will be more fit?
Expression Trees Again
  • A one-variable tree is a regression equation:

            +
          /   \
         -     *
        / \   / \
       +   x 2   x
      / \
     x   0.5

y = (((x + 0.5) - x) + (2 * x))
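To see how such a tree doubles as an equation, here is a minimal sketch of an expression tree in Java. The ExprNode class, its factory methods, and the single-character node kinds are assumptions made for this example; they are not the classes used in the course implementation.

```java
// A minimal one-variable expression tree. All names here are illustrative.
class ExprNode {
    final char kind;            // '+', '-', '*', 'x' (the variable) or 'c' (a constant)
    final double value;         // only meaningful when kind == 'c'
    final ExprNode left, right; // both null for leaves

    private ExprNode(char kind, double value, ExprNode left, ExprNode right) {
        this.kind = kind;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    static ExprNode variable() { return new ExprNode('x', 0, null, null); }
    static ExprNode constant(double v) { return new ExprNode('c', v, null, null); }
    static ExprNode binary(char op, ExprNode l, ExprNode r) { return new ExprNode(op, 0, l, r); }

    /** Evaluates the tree recursively for a given value of x. */
    double eval(double x) {
        switch (kind) {
            case 'x': return x;
            case 'c': return value;
            case '+': return left.eval(x) + right.eval(x);
            case '-': return left.eval(x) - right.eval(x);
            case '*': return left.eval(x) * right.eval(x);
            default: throw new IllegalStateException("unknown node kind: " + kind);
        }
    }

    /** Prints the tree as a fully parenthesised infix expression. */
    @Override
    public String toString() {
        if (kind == 'x') return "x";
        if (kind == 'c') return String.valueOf(value);
        return "(" + left + " " + kind + " " + right + ")";
    }
}

class ExprTreeDemo {
    /** Builds the tree drawn on the slide: ((x + 0.5) - x) + (2 * x). */
    static ExprNode slideTree() {
        ExprNode sum = ExprNode.binary('+', ExprNode.variable(), ExprNode.constant(0.5));
        ExprNode diff = ExprNode.binary('-', sum, ExprNode.variable());
        ExprNode prod = ExprNode.binary('*', ExprNode.constant(2), ExprNode.variable());
        return ExprNode.binary('+', diff, prod);
    }

    public static void main(String[] args) {
        ExprNode tree = slideTree();
        System.out.println("y = " + tree);
        System.out.println("y(1) = " + tree.eval(1.0));
    }
}
```

Evaluation and printing are both simple recursions over the same structure, which is part of why trees are a convenient representation here.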

Evaluating Expression Trees

y^p = (((x + 0.5) - x) + (2 * x))

Superscripts: “o” for “observed”, “p” for “predicted”

x    y^o   y^p    |y^o - y^p|^2
1    2.1   2.5    0.16
2    3.3   4.5    1.44
4    3.1   8.5    29.16
5    1.8   10.5   75.69
7    3.2   14.5   127.69

Sum: 234.14 = “fitness” (the sum of squared errors, so smaller is better)
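The arithmetic in this table can be reproduced with a few lines of Java. This is a sketch only: the FitnessDemo class and its method names are invented for the example, and the candidate equation is passed in as a lambda rather than as a tree to keep the block short.

```java
import java.util.function.DoubleUnaryOperator;

// Sum-of-squared-errors "fitness" for one candidate equation.
// FitnessDemo and its methods are illustrative names, not course code.
class FitnessDemo {
    /** Adds up (observed - predicted)^2 over all data points. */
    static double sumSquaredError(DoubleUnaryOperator model, double[] xs, double[] observed) {
        double total = 0;
        for (int i = 0; i < xs.length; i++) {
            double diff = observed[i] - model.applyAsDouble(xs[i]);
            total += diff * diff;
        }
        return total;
    }

    /** Scores the slide's equation against the slide's five data points. */
    static double slideFitness() {
        double[] xs = {1, 2, 4, 5, 7};
        double[] yo = {2.1, 3.3, 3.1, 1.8, 3.2};
        return sumSquaredError(x -> ((x + 0.5) - x) + (2 * x), xs, yo);
    }

    public static void main(String[] args) {
        System.out.println("fitness = " + slideFitness());
    }
}
```

Up to floating-point rounding, the result matches the 234.14 in the table.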

A Generation of Random Trees

[Figures: Tree 1, Tree 2, Tree 3, Tree 4]

Tree   Fitness
1      335
2      1530
3      950
4      1462
:      :

(most of these are really rotten!)

Choosing Parents

[Figures: Tree 1, Tree 2, Tree 3, Tree 4]

Generation 1

Tree   Fitness
1      335
2      1530
3      950
4      1462
:      :

Choose two of these as parents, randomly, “proportional to their fitness”.
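One common way to choose "proportional to fitness" is roulette-wheel selection, sketched below. Because the fitness numbers on these slides are errors (smaller is better), this sketch first turns each error into a weight of 1 / (1 + error), so that better trees get bigger slices of the wheel. That conversion is an assumption for illustration, as are the class and method names; the course implementation may do something different.

```java
import java.util.Random;

// Roulette-wheel selection over a table of error scores.
// RouletteDemo, weightsFromErrors and spin are illustrative names.
class RouletteDemo {
    /** Assumed conversion: a smaller error gives a larger selection weight. */
    static double[] weightsFromErrors(double[] errors) {
        double[] weights = new double[errors.length];
        for (int i = 0; i < errors.length; i++) {
            weights[i] = 1.0 / (1.0 + errors[i]);
        }
        return weights;
    }

    /** Returns the index whose slice of the wheel contains r, where 0 <= r < 1. */
    static int spin(double[] weights, double r) {
        double total = 0;
        for (double w : weights) total += w;
        double target = r * total;
        double running = 0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i];
            if (target < running) return i;
        }
        return weights.length - 1; // guards against rounding at the top of the wheel
    }

    public static void main(String[] args) {
        double[] errors = {335, 1530, 950, 1462}; // the four scores shown on the slide
        double[] weights = weightsFromErrors(errors);
        Random rng = new Random();
        int[] counts = new int[errors.length];
        for (int i = 0; i < 10000; i++) {
            counts[spin(weights, rng.nextDouble())]++;
        }
        for (int i = 0; i < counts.length; i++) {
            System.out.println("Tree " + (i + 1) + " chosen " + counts[i] + " times");
        }
    }
}
```

Passing the random number in as a parameter keeps spin deterministic, which makes it easy to check by hand. Even a rotten tree keeps a small chance of being picked, which helps keep variety in the population.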

“Sexual Reproduction”

Choose “crossover points” at random:

[Figure: Generation 1 parent trees]

Then swap the subtrees to make two new child trees:

[Figure: Generation 2 child trees]
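The subtree swap can be sketched in Java as follows. In this sketch a crossover point is identified by its preorder index (0 is the root, then the left subtree, then the right), and the trees are immutable, so the children are new trees and the parents are left untouched. The Tree class, the indexing scheme, and the method names are all assumptions for illustration.

```java
// Subtree crossover on immutable binary expression trees.
// Tree, subtree, withSubtree and crossover are illustrative names.
class Tree {
    final String label;     // an operator, "x", or a numeric constant
    final Tree left, right; // both null for leaves

    Tree(String label, Tree left, Tree right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }

    static Tree leaf(String label) { return new Tree(label, null, null); }

    int size() { return left == null ? 1 : 1 + left.size() + right.size(); }

    /** Subtree rooted at the given preorder index; index must be less than size(). */
    Tree subtree(int index) {
        if (index == 0) return this;
        int leftSize = left.size();
        return index <= leftSize ? left.subtree(index - 1)
                                 : right.subtree(index - 1 - leftSize);
    }

    /** Copy of this tree with the subtree at the given preorder index replaced. */
    Tree withSubtree(int index, Tree replacement) {
        if (index == 0) return replacement;
        int leftSize = left.size();
        return index <= leftSize
                ? new Tree(label, left.withSubtree(index - 1, replacement), right)
                : new Tree(label, left, right.withSubtree(index - 1 - leftSize, replacement));
    }

    @Override
    public String toString() {
        return left == null ? label : "(" + left + " " + label + " " + right + ")";
    }
}

class CrossoverDemo {
    /** Swaps the subtree at pointA in a with the subtree at pointB in b. */
    static Tree[] crossover(Tree a, int pointA, Tree b, int pointB) {
        Tree fromA = a.subtree(pointA);
        Tree fromB = b.subtree(pointB);
        return new Tree[] {a.withSubtree(pointA, fromB), b.withSubtree(pointB, fromA)};
    }

    public static void main(String[] args) {
        Tree a = new Tree("-", new Tree("+", Tree.leaf("x"), Tree.leaf("0.5")), Tree.leaf("x"));
        Tree b = new Tree("*", Tree.leaf("2"), Tree.leaf("x"));
        Tree[] kids = crossover(a, 1, b, 1); // swap (x + 0.5) with the constant 2
        System.out.println(kids[0]);         // prints (2 - x)
        System.out.println(kids[1]);         // prints ((x + 0.5) * x)
    }
}
```

In a real run the two crossover points would be drawn at random, for example with nextInt(a.size()) and nextInt(b.size()) on a java.util.Random.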

The Steps
  • Create Generation 1 by randomly generating 500 trees.
  • Find the fitness of each tree.
  • Choose pairs of parent trees, proportional to their fitness.
  • Crossover to make two child trees, adding them to Generation 2.
  • Continue until there are 500 child trees in Generation 2.
  • Repeat for 50 generations, keeping the best (most fit) tree over all generations.
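The steps above can be strung together in under a hundred lines. What follows is a rough sketch under several assumptions, not the course implementation: the GpSketch class and all its names are invented, random trees are limited to depth 4 with integer constants 0 to 9, selection uses a weight of 1 / (1 + error) because the slides' fitness is an error, and children larger than 200 nodes are discarded in favour of their parent to stop trees growing without bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical end-to-end sketch; names, limits and the selection weight are assumptions.
class GpSketch {
    static final double[] XS = {1, 2, 4, 5, 7};           // the five x values from the slides
    static final double[] YS = {2.1, 3.3, 3.1, 1.8, 3.2}; // the matching observed y values
    static final int MAX_SIZE = 200;                      // assumed cap on tree size

    static final class Node {
        final char kind;        // '+', '-', '*', 'x' (variable) or 'c' (constant)
        final double value;     // only meaningful when kind == 'c'
        final Node left, right; // both null for leaves
        Node(char kind, double value, Node left, Node right) {
            this.kind = kind; this.value = value; this.left = left; this.right = right;
        }
        double eval(double x) {
            switch (kind) {
                case 'x': return x;
                case 'c': return value;
                case '+': return left.eval(x) + right.eval(x);
                case '-': return left.eval(x) - right.eval(x);
                default:  return left.eval(x) * right.eval(x);
            }
        }
        int size() { return left == null ? 1 : 1 + left.size() + right.size(); }
        Node subtree(int i) { // node at preorder index i
            if (i == 0) return this;
            int ls = left.size();
            return i <= ls ? left.subtree(i - 1) : right.subtree(i - 1 - ls);
        }
        Node withSubtree(int i, Node r) { // copy with the subtree at index i replaced
            if (i == 0) return r;
            int ls = left.size();
            return i <= ls ? new Node(kind, value, left.withSubtree(i - 1, r), right)
                           : new Node(kind, value, left, right.withSubtree(i - 1 - ls, r));
        }
    }

    static Node randomTree(Random rng, int depth) {
        if (depth == 0 || rng.nextInt(3) == 0) { // make a leaf at the depth limit, or one time in three
            return rng.nextBoolean() ? new Node('x', 0, null, null)
                                     : new Node('c', rng.nextInt(10), null, null);
        }
        char op = "+-*".charAt(rng.nextInt(3));
        return new Node(op, 0, randomTree(rng, depth - 1), randomTree(rng, depth - 1));
    }

    static double error(Node t) { // sum of squared errors; smaller is better
        double total = 0;
        for (int i = 0; i < XS.length; i++) {
            double d = YS[i] - t.eval(XS[i]);
            total += d * d;
        }
        return Double.isFinite(total) ? total : Double.MAX_VALUE; // overflow counts as worst
    }

    static Node select(List<Node> pop, double[] w, double total, Random rng) {
        double target = rng.nextDouble() * total, running = 0;
        for (int i = 0; i < w.length; i++) {
            running += w[i];
            if (target < running) return pop.get(i);
        }
        return pop.get(pop.size() - 1); // rounding guard
    }

    static Node evolve(int popSize, int generations, long seed) {
        Random rng = new Random(seed);
        List<Node> pop = new ArrayList<>();
        for (int i = 0; i < popSize; i++) pop.add(randomTree(rng, 4));
        Node best = pop.get(0);
        double bestError = error(best);
        for (int g = 0; g < generations; g++) {
            double[] w = new double[popSize];
            double total = 0;
            for (int i = 0; i < popSize; i++) { // score every tree, remember the best so far
                double e = error(pop.get(i));
                if (e < bestError) { best = pop.get(i); bestError = e; }
                w[i] = 1.0 / (1.0 + e);
                total += w[i];
            }
            if (g == generations - 1) break; // the last generation is scored but not bred
            List<Node> next = new ArrayList<>();
            while (next.size() < popSize) {
                Node a = select(pop, w, total, rng), b = select(pop, w, total, rng);
                int pa = rng.nextInt(a.size()), pb = rng.nextInt(b.size());
                Node c1 = a.withSubtree(pa, b.subtree(pb));
                Node c2 = b.withSubtree(pb, a.subtree(pa));
                next.add(c1.size() > MAX_SIZE ? a : c1);
                if (next.size() < popSize) next.add(c2.size() > MAX_SIZE ? b : c2);
            }
            pop = next;
        }
        return best;
    }
}
```

Calling GpSketch.evolve(500, 50, seed) with any long seed uses the population size and generation count from the steps above; GpSketch.error then reports how well the returned tree fits. There is no mutation in this sketch, matching the steps as listed.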
How Could This Possibly Work?
  • No one seems to be able to say…
  • John Holland proved something called the “schema theorem,” but it really doesn’t explain much.
  • It’s a highly “parallel” process that recombines “good” building blocks.
  • It really does work very well for a huge variety of hard problems!
Why This, in a Java Course?
  • Because we’re going to implement it!
  • Because writing code to implement this isn’t too hard.
  • Because it illustrates a large number of O-O and Java ideas.
  • Because it’s fun!
  • Here is what my implementation looks like: