
Automated discovery in math

Automated discovery in math. Machine learning techniques (GP, ILP, etc.) have been successfully applied in science. How about mathematics? Can they be used to discover interesting relationships in mathematical “data”? This is an exploration of using GP for that purpose.

annot

Presentation Transcript


  1. Automated discovery in math • Machine learning techniques (GP, ILP, etc.) have been successfully applied in science • How about mathematics? Can they be used to discover interesting relationships in mathematical “data”? • This is an exploration of using GP for that purpose • Specifically, using GP to automatically discover Euler’s identity (V – E + F = 2) from a fairly limited amount of data

  2. Cubes V = 8 E = 12 F = 6 V – E + F = 8 – 12 + 6 = 2

  3. Tetrahedra V = 4 E = 6 F = 4 V – E + F = 4 – 6 + 4 = 2

  4. Octahedra V = 6 E = 12 F = 8 V – E + F = 6 – 12 + 8 = 2
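The three worked examples above can be checked mechanically. The sketch below is illustrative only; the dictionary `solids` and the helper `euler_characteristic` are names introduced here, not part of the presentation.

```python
# (V, E, F) counts as given on the cube, tetrahedron, and octahedron slides.
solids = {
    "cube": (8, 12, 6),
    "tetrahedron": (4, 6, 4),
    "octahedron": (6, 12, 8),
}

def euler_characteristic(v, e, f):
    """Return V - E + F for one polyhedron."""
    return v - e + f

results = {name: euler_characteristic(*counts) for name, counts in solids.items()}
print(results)
```

Each of the three solids yields the same value, which is the regularity the GP run is meant to rediscover.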

  5. Data for Euler’s identity

  6. At a glance • 50 generations • Population: 4000 ASTs • New individuals per generation (G): 3600 (90% of population) • Maximum AST depth: 13 • Ramped half-and-half initialization • 3 non-terminals: +, -, * • 12 terminals: V, E, F, 1, 2, …, 9 • Crossover, no mutation

  7. Genetic algorithms (GA) • Search a space of solution attempts (“individuals”) • Use natural selection to guide the search • Must have a fitness function that can evaluate any given individual • Individuals procreate by exchanging (recombining) “genetic material”

  8. Example: SAT solving • Problem: Given a CNF formula P over n variables x1,…,xn, find a satisfying assignment • Search space: all n-bit strings • Fitness measure for a given individual b1 … bn: # of satisfied clauses in P • Genetic operations: crossover and mutation

  9. Crossover: a1 … aj-1 | aj … an + b1 … bj-1 | bj … bn → a1 … aj-1 | bj … bn and b1 … bj-1 | aj … an • Mutation: 0 1 1 0 1 0 0 1 → 0 1 1 0 0 0 0 1

  10. Generic GA algorithm • Parameterized over: N, P, G • (1) Construct a random initial population • (2) Set i := 1 • (3) If i > N then halt • (4) Compute the fitness of each individual; if the fittest solves the problem, halt • (5) Create a new population: pick P – G individuals and copy them, then create G new individuals by repeated applications of genetic operations • (6) Set i := i + 1 and go to step 3
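The loop on this slide might be sketched as below, reading N as the generation limit, P as the population size, and G as the number of new individuals per generation. Everything passed in as a callback (`new_individual`, `fitness`, `solved`, `breed`, `pick`) is a hypothetical name, and the toy problem (maximize the 1 bits in an 8-bit string) is chosen only to make the sketch runnable. Returning the best individual of the last population when the budget runs out is also an assumption; the slide only says "halt".

```python
import random

def run_ga(N, P, G, new_individual, fitness, solved, breed, pick, rng):
    """Generic GA loop: at most N generations, population size P, G new individuals each time."""
    population = [new_individual(rng) for _ in range(P)]
    for _ in range(N):
        best = max(population, key=fitness)
        if solved(best):
            return best
        # Copy P - G selected individuals unchanged.
        copies = [pick(population, fitness, rng) for _ in range(P - G)]
        # Create G new individuals by repeatedly applying a genetic operation.
        children = []
        while len(children) < G:
            mother = pick(population, fitness, rng)
            father = pick(population, fitness, rng)
            children.extend(breed(mother, father, rng))
        population = copies + children[:G]
    # Assumption: once the budget is spent, report the best individual in the last population.
    return max(population, key=fitness)

# Toy demo (hypothetical): maximize the number of 1 bits in an 8-bit string.
def new_bits(rng):
    return [rng.randint(0, 1) for _ in range(8)]

def crossover(a, b, rng):
    j = rng.randrange(1, len(a))
    return [a[:j] + b[j:], b[:j] + a[j:]]

def tournament(population, fitness, rng, k=3):
    return max(rng.sample(population, k), key=fitness)

best = run_ga(N=30, P=20, G=18, new_individual=new_bits, fitness=sum,
              solved=lambda bits: sum(bits) == 8, breed=crossover,
              pick=tournament, rng=random.Random(0))
print(best, sum(best))
```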

  11. Selection • How is an individual “picked” for reproduction or copying? • Main idea: the probability that an individual is selected should be proportional to the individual’s fitness • Many ways to ensure that. One method is tournament selection: • Pick 0 < k <= P individuals randomly • Select the fittest of the k • When k = 1: No selection pressure • When k = P: Too much selection pressure
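Tournament selection as described above fits in a few lines. In this sketch the function name and the toy population are assumptions, and the k contestants are drawn without replacement, which the slide does not specify.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Pick k distinct individuals at random and return the fittest of them."""
    if not 0 < k <= len(population):
        raise ValueError("k must satisfy 0 < k <= P")
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

rng = random.Random(1)
population = [3, 9, 1, 7, 5]   # toy individuals whose fitness is their own value

def identity(individual):
    return individual

print(tournament_select(population, identity, 1, rng))                 # uniform pick: no pressure
print(tournament_select(population, identity, len(population), rng))   # always the fittest
```

With k = 1 every individual is equally likely to be chosen, and with k = P the single fittest individual always wins, which matches the two extremes named on the slide.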

  12. Genetic Programming (GP) • An instance of the generic GA scheme • Individuals are now programs, i.e., syntactic objects • Search space is kept finite by bounding program size • Programs are represented as ASTs (abstract syntax trees)

  13. Programs as ASTs • Source: if x > 0 then y := x * x else y := z + 1 • Parsing yields the AST (written as a nested term): if( >(x, 0), :=(y, *(x, x)), :=(y, +(z, 1)) )

  14. Program structure in GP • Programs are usually simple Herbrand terms, i.e., functional expressions • AST leaves are called terminals • Internal nodes are non-terminals • Non-terminals are function symbols (e.g. +) • Terminals are constants and variables • Terminals + non-terminals must be sufficient for expressing solutions

  15. Viewing a functional AST as a “program” • AST (written as a nested term): +( *(x, 2), y ) • The program has two “inputs”, x and y. Given specific values for these, it produces a unique result as output
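One way to make "an AST is a program" concrete is a recursive evaluator. The nested-tuple representation and the name `evaluate` are assumptions of this sketch, and the example tree is one reading of the slide's figure, namely (x * 2) + y.

```python
import operator

# The three function symbols used in the presentation's GP run.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(ast, env):
    """Evaluate a nested-tuple AST: (op, left, right), a variable name, or a number."""
    if isinstance(ast, tuple):
        op, left, right = ast
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(ast, str):
        return env[ast]      # variable: look up its input value
    return ast               # numeric constant

program = ("+", ("*", "x", 2), "y")   # assumed reading of the slide's tree
print(evaluate(program, {"x": 3, "y": 4}))
```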

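  16. AST Crossover • [Figure: two parent ASTs, rooted at + and –, built from subtrees T1 … T6, with one crossover point marked in each; the two children are obtained by exchanging the subtrees rooted at the crossover points]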
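Subtree crossover can be sketched on nested-tuple ASTs by addressing nodes with paths of child indices. The helper names and the parent shapes below are hypothetical; they are not a reconstruction of the slide's figure, only an instance of the same operation.

```python
def get_subtree(ast, path):
    """Follow a path of child indices (1 = left, 2 = right) down from the root."""
    for index in path:
        ast = ast[index]
    return ast

def replace_subtree(ast, path, new):
    """Return a copy of ast with the node at path replaced by new."""
    if not path:
        return new
    parts = list(ast)
    parts[path[0]] = replace_subtree(ast[path[0]], path[1:], new)
    return tuple(parts)

def crossover(parent1, path1, parent2, path2):
    """Exchange the subtrees rooted at the two crossover points."""
    sub1 = get_subtree(parent1, path1)
    sub2 = get_subtree(parent2, path2)
    return replace_subtree(parent1, path1, sub2), replace_subtree(parent2, path2, sub1)

# Hypothetical parents; T1 ... T6 stand for arbitrary subtrees.
parent1 = ("+", ("+", "T1", "T2"), "T4")
parent2 = ("-", ("*", "T5", "T6"), "T3")
child1, child2 = crossover(parent1, (1, 2), parent2, (1,))
print(child1)
print(child2)
```

Because the operation builds new tuples instead of mutating in place, both parents remain available, which is convenient when a child has to be discarded (for example for exceeding a depth limit).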

  17. Initial population • Built randomly • Two methods for building a random AST: • Full method: All branches are equally long • Grow method: Different subtrees can have different sizes (but less than the maximum) • More usual: ramped half-and-half initialization: half of the trees are built with one method, the other half with the other method
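The two tree-building methods and their combination might look like this. It is a sketch under stated assumptions: the function names, the 0.3 early-leaf probability in the grow method, and the way depth limits are "ramped" over 1 … max_depth are choices made here (ramping over a range of depths is the usual convention, but the slide does not spell it out).

```python
import random

FUNCTIONS = ["+", "-", "*"]
TERMINALS = ["V", "E", "F"] + list(range(1, 10))

def full_tree(depth, rng):
    """Full method: every branch reaches exactly the given depth."""
    if depth == 0:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), full_tree(depth - 1, rng), full_tree(depth - 1, rng))

def grow_tree(depth, rng, leaf_prob=0.3):
    """Grow method: a branch may stop early, so subtrees can differ in size."""
    if depth == 0 or rng.random() < leaf_prob:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            grow_tree(depth - 1, rng, leaf_prob),
            grow_tree(depth - 1, rng, leaf_prob))

def ramped_half_and_half(size, max_depth, rng):
    """Alternate full and grow trees while cycling the depth limit over 1..max_depth."""
    population = []
    for i in range(size):
        depth = 1 + (i // 2) % max_depth
        builder = full_tree if i % 2 == 0 else grow_tree
        population.append(builder(depth, rng))
    return population

def depth_of(tree):
    """Depth of a nested-tuple tree; a lone terminal has depth 0."""
    return 1 + max(depth_of(tree[1]), depth_of(tree[2])) if isinstance(tree, tuple) else 0

population = ramped_half_and_half(20, 4, random.Random(0))
print(population[0], depth_of(population[0]))
```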

  18. Problem formulation • Can cast it as a standard symbolic regression problem • View F as a function of E and V, and search the space of all rational functions of two variables (up to a max depth) • Error function: difference between the actual # of faces and the result produced by the program • Optimization: minimize the error • Quick convergence
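The regression view can be illustrated with a small error function. This sketch assumes the error is the sum of absolute differences over the data, which the slide does not specify, and it uses only the three triples shown on the earlier slides; `regression_error` and the two candidate programs are names introduced here.

```python
def regression_error(program, triples):
    """Sum of |actual F - predicted F| over all (V, E, F) data triples."""
    return sum(abs(f - program(v, e)) for v, e, f in triples)

# (V, E, F) for the cube, tetrahedron, and octahedron.
triples = [(8, 12, 6), (4, 6, 4), (6, 12, 8)]

def candidate_a(v, e):
    return e - v + 2      # F = E - V + 2, i.e. Euler's identity solved for F

def candidate_b(v, e):
    return e - v          # off by 2 on every triple

print(regression_error(candidate_a, triples))
print(regression_error(candidate_b, triples))
```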

  19. Another approach • Search space of all identities • Generated as follows: I → T1 = T2 • T → L | T1 + T2 | T1 – T2 | T1 * T2 • L → V | E | F | 1 | 2 | … | 9 • Any other integer can be built from 1,…, 9 and the given non-terminals • Identity is not a non-terminal; it can only appear at the root of an AST

  20. Details • Generate P identities randomly (using ramped half-and-half initialization) • Crossover on two identities S1 = S2 and T1 = T2: • Mate two random subterms Si and Tj from each identity, producing two new subterms Si’ and Tj’ • If either new term is deeper than the max depth, then use one of the original parents • Replace Si and Tj in the identities by Si’ and Tj’ • No mutation

  21. Fitness • An identity is evaluated on a given triple of values for V, E, and F • Computing the fitness of an identity S = T: • For each of the k data triples p: • If S = T holds for p, then give the identity a point • Higher score, greater fitness • Maximum fitness: 9, minimum: 0

  22. Problem • Trivially true identities can get perfect scores, e.g.: • V = V • 1 + 2 = 5 – 2 • E – E + E = E • Solution: negative triples, e.g.: • V = 0, E = 0, F = 1 • Trivial identities will hold for such negative triples, but plausible identities will not

  23. Fitness computation • To evaluate an identity S = T: • For each of the k data triples p: • Allocate a point if S = T holds for p • Allocate a second point if S = T does not hold for the negative triple • Maximum score: 18, minimum: 0 • Also impose a penalty of ⌊n/20⌋ points for an identity of length n (to discourage excessively long expressions)
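The scoring scheme above might be sketched as follows. Several details are assumptions of this sketch rather than facts from the slides: the second point is awarded only when the first one was (otherwise a plainly false identity would collect points for failing on the negative triple), the "length" n is taken to be the node count including the root "=", and only the three data triples visible in the transcript are used, so the maximum here is 6 instead of 18.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(term, triple):
    """Evaluate a nested-tuple term on a dict with keys V, E, F."""
    if isinstance(term, tuple):
        op, left, right = term
        return OPS[op](evaluate(left, triple), evaluate(right, triple))
    if isinstance(term, str):
        return triple[term]
    return term

def size(term):
    """Number of nodes in a term (assumed meaning of the identity's 'length')."""
    return 1 + size(term[1]) + size(term[2]) if isinstance(term, tuple) else 1

def fitness(lhs, rhs, data, negative):
    """Score the identity lhs = rhs against the data triples and one negative triple."""
    def holds(triple):
        return evaluate(lhs, triple) == evaluate(rhs, triple)
    score = 0
    for triple in data:
        if holds(triple):
            score += 1              # first point: holds on the data triple
            if not holds(negative):
                score += 1          # second point: fails on the negative triple
    n = size(lhs) + size(rhs) + 1   # +1 counts the root "=" node
    return score - n // 20          # length penalty of floor(n / 20)

data = [{"V": 8, "E": 12, "F": 6}, {"V": 4, "E": 6, "F": 4}, {"V": 6, "E": 12, "F": 8}]
negative = {"V": 0, "E": 0, "F": 1}          # the negative triple given on slide 22

euler_lhs = ("+", ("-", "V", "E"), "F")      # V - E + F
print(fitness(euler_lhs, 2, data, negative))   # Euler's identity
print(fitness("V", "V", data, negative))       # trivial identity V = V
```

Under these assumptions Euler's identity earns both points on every triple, while V = V earns only one, so the negative triple separates the two as the slides intend.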
