
Evolving Hyper-Heuristics using Genetic Programming



  1. Evolving Hyper-Heuristics using Genetic Programming Supervisor: Moshe Sipper Achiya Elyasaf

  2. Overview • Introduction • Searching Games State-Graphs • Uninformed Search • Heuristics • Informed Search • Evolving Heuristics • Previous Work • Rush Hour • FreeCell

  3. Representing Games as State-Graphs • Every puzzle/game can be represented as a state graph: • In puzzles, board games, etc., every piece move leads to a different state • In computer war games, etc., the positions of the player and the enemy, together with all the parameters (health, shield, …), define a state (a minimal state interface is sketched below)
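To ground the search sketches that follow, here is a minimal, hypothetical state interface; the class and method names are illustrative and not part of the original deck:

```python
# A minimal, hypothetical state interface assumed by the search sketches
# below: states are hashable, know whether they are a goal, and can
# enumerate their successors (one per legal move).

class State:
    def is_goal(self) -> bool:
        """True if this state solves the puzzle."""
        raise NotImplementedError

    def successors(self) -> "list[State]":
        """All states reachable from here by a single legal move."""
        raise NotImplementedError

    # States must also implement __hash__ and __eq__ so that
    # visited/closed sets can hold them.
```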

  4. Rush-Hour as a state-graph

  5. Searching Games State-Graphs: Uninformed Search • BFS – exponential in the search depth • DFS – linear in the length of the current search path. BUT: • We might “never” track down the right path • Games usually contain cycles • Iterative Deepening: a combination of BFS & DFS (sketched below) • In each iteration, a DFS with a depth limit is performed • The limit grows from one iteration to the next • Worst case: traverse the entire graph
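A minimal sketch of iterative deepening over the state interface above, assuming unit-depth moves; depth_limited_dfs and iterative_deepening are illustrative names:

```python
def depth_limited_dfs(state, limit, path):
    """DFS that gives up beyond the depth limit; returns a goal state or None."""
    if state.is_goal():
        return state
    if limit == 0:
        return None
    path.add(state)                      # guard against cycles on the current path
    for child in state.successors():
        if child not in path:
            result = depth_limited_dfs(child, limit - 1, path)
            if result is not None:
                return result
    path.remove(state)
    return None

def iterative_deepening(start, max_depth=100):
    """Grow the depth limit until a solution is found (or max_depth is hit)."""
    for limit in range(1, max_depth + 1):
        result = depth_limited_dfs(start, limit, path=set())
        if result is not None:
            return result
    return None
```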

  6. Searching Games State-Graphs: Uninformed Search • Most game domains are PSPACE-complete! • Worst case: traverse the entire graph • We need an informed search!

  7. Searching Games State-Graphs: Heuristics • h: States → ℝ • For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution • If h is perfect, an informed search that tries states with the lowest h-score first will simply stroll to the solution • For hard problems, finding a good h is hard • A bad heuristic means the search might never track down the solution • We need a good heuristic function to guide the informed search

  8. Searching Games State-Graphs: Informed Search • Best-First search: like DFS, but expand the node with the best (lowest) heuristic value first (sketched below) • Not necessarily optimal • Might enter cycles (get stuck at a local extremum) • A*: holds a closed list and an open list sorted by f-value, where f(s) = g(s) + h(s); the best of all open nodes is selected • Maintaining the open and closed lists is costly, and their size can become prohibitive
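A best-first search sketch along these lines, assuming the state interface from earlier and a heuristic h where lower values mean closer to a solution:

```python
import heapq
import itertools

def best_first_search(start, h):
    tie = itertools.count()              # tie-breaker so the heap never compares states
    open_list = [(h(start), next(tie), start)]
    closed = set()
    while open_list:
        _, _, state = heapq.heappop(open_list)   # best (lowest-h) open node
        if state in closed:
            continue
        if state.is_goal():
            return state
        closed.add(state)
        for child in state.successors():
            if child not in closed:
                heapq.heappush(open_list, (h(child), next(tie), child))
    return None                          # open list exhausted: no solution found
```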

  9. Searching Games State-Graphs: Informed Search (Cont.) • IDA*: Iterative Deepening with A* (sketched below) • The expanded nodes are pushed onto the DFS stack in descending order of heuristic value • Let g(s) be the minimal depth of state s (the cost so far): only nodes with f(s) = g(s) + h(s) ≤ depth-limit are visited • Near-optimal solution (depends on the path limit) • The heuristic needs to be admissible
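An IDA* sketch under the same assumptions, with unit move costs (so g is simply the depth); the f-limit grows each iteration to the smallest f-value that exceeded the previous one:

```python
import math

def ida_star(start, h):
    limit = h(start)
    while limit < math.inf:
        result, next_limit = _bounded_dfs(start, 0, limit, h, path={start})
        if result is not None:
            return result
        limit = next_limit               # raise the limit and search again
    return None

def _bounded_dfs(state, g, limit, h, path):
    f = g + h(state)
    if f > limit:
        return None, f                   # report the f-value that broke the limit
    if state.is_goal():
        return state, f
    next_limit = math.inf
    for child in state.successors():
        if child in path:                # avoid cycles on the current path
            continue
        path.add(child)
        result, t = _bounded_dfs(child, g + 1, limit, h, path)
        path.remove(child)
        if result is not None:
            return result, t
        next_limit = min(next_limit, t)
    return None, next_limit
```

With an admissible h, the first solution found is optimal, since no cheaper f-value was pruned in an earlier iteration.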

  10. Overview • Introduction • Searching Games State-Graphs • Uninformed Search • Heuristics • Informed Search • Evolving Heuristics • Previous Work • Rush Hour • FreeCell

  11. Evolving Heuristics • Given building blocks H1, …, Hn (not necessarily admissible or on the same scale), how should we choose the fittest heuristic? • Minimum? Maximum? A linear combination? • GA/GP may be used for: • Building new heuristics from existing building blocks • Finding weights for each heuristic (for applying a linear combination) • Finding conditions for applying each heuristic • H should probably fit the stage of the search • E.g., “goal” heuristics when we assume we’re close

  12. Evolving Heuristics: GA • Genotype – a vector of weights (w1, …, wn), one weight per heuristic building block • Phenotype – the combined heuristic H(s) = w1·H1(s) + … + wn·Hn(s) (a decoding sketch follows below)
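A minimal sketch of this representation, assuming (as the previous slide suggests) that the genotype is a weight vector over the building blocks and the phenotype is their linear combination; building_blocks is a hypothetical list of heuristic functions:

```python
import random

def random_genotype(n):
    """A genotype is simply a vector of n weights, one per building block."""
    return [random.random() for _ in range(n)]

def decode(genotype, building_blocks):
    """The phenotype: a heuristic H(s) = sum of wi * Hi(s)."""
    def h(state):
        return sum(w * hi(state) for w, hi in zip(genotype, building_blocks))
    return h
```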

  13. Evolving Heuristics: GP [Figure: an example GP tree – an If node whose Condition combines comparisons and logic (≤, ≥, And) over heuristic values and constants (H1, 0.1, 0.4, H2, 0.7), and whose True/False branches combine heuristics (H2, H5) with arithmetic operators (*, +, /); an evaluation sketch follows below]
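A sketch of how such a GP tree might be represented and evaluated; the node classes are illustrative, not the deck's actual implementation:

```python
class Const:
    def __init__(self, value): self.value = value
    def eval(self, state): return self.value

class Block:
    """A leaf wrapping one heuristic building block Hi."""
    def __init__(self, h): self.h = h
    def eval(self, state): return self.h(state)

class Op:
    """A binary node: arithmetic (*, +, /) or comparison/logic (<=, >=, and)."""
    def __init__(self, fn, left, right): self.fn, self.left, self.right = fn, left, right
    def eval(self, state): return self.fn(self.left.eval(state), self.right.eval(state))

class If:
    def __init__(self, cond, then, other): self.cond, self.then, self.other = cond, then, other
    def eval(self, state):
        return self.then.eval(state) if self.cond.eval(state) else self.other.eval(state)

# Hypothetical tree: if H1(s) <= 0.1 then H2(s) * 0.7 else H1(s) + H2(s)
# tree = If(Op(lambda a, b: a <= b, Block(H1), Const(0.1)),
#           Op(lambda a, b: a * b, Block(H2), Const(0.7)),
#           Op(lambda a, b: a + b, Block(H1), Block(H2)))
```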

  14. Evolving Heuristics: Policies

  15. Evolving Heuristics: Fitness Function

  16. Overview • Introduction • Searching Games State-Graphs • Uninformed Search • Heuristics • Informed Search • Evolving Heuristics • Previous Work • Rush Hour • FreeCell

  17. Rush Hour • GP-Rush [Hauptman et al, 2009] • Bronze Humie award

  18. Domain-Specific Heuristics • Hand-crafted heuristics / guides: • Blocker estimation – a lower bound on the solution length (admissible); a sketch follows below • Goal distance – Manhattan distance • Hybrid blockers distance – combines the above two • Is Move To Secluded – did the car enter a secluded area? • Is Releasing Move
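A hypothetical sketch of the blocker-estimation idea: every car standing between the red car and the exit must move at least once, so their count is a lower bound on the remaining moves (hence admissible). The 6x6 grid representation (car ids, 0 for empty, red car on its standard row, exit on the right) is an assumption, not the paper's code:

```python
def blockers_lower_bound(board, red_row=2):
    """Count distinct cars between the red car and the right-hand exit."""
    red_col = max(c for c in range(6) if board[red_row][c] == 'RED')
    blocking = {board[red_row][c] for c in range(red_col + 1, 6)}
    blocking.discard(0)                  # empty cells do not block
    return len(blocking)
```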

  19. Policy “Ingredients” Functions & Terminals:

  20. Coevolving (Hard) 8x8 Boards [Figure: an evolved 8x8 board, with vehicles labeled F, G, H, I, K, M, P, S and the red car marked RED]

  21. Results Average reduction of nodes required to solve test problems, with respect to the number of nodes scanned by a blind search:

  22. Results (cont’d) Time (in seconds) required to solve problems JAM01 . . . JAM40:

  23. FreeCell • FreeCell remained relatively obscure until Windows 95 • The Microsoft 32K suite contains 32,000 deals, all solvable except game #11982, which has been proven to be unsolvable • Evolving hyper heuristic-based solvers for Rush-Hour and FreeCell [Hauptman et al, SOCS 2010] • GA-FreeCell: Evolving Solvers for the Game of FreeCell [Elyasaf et al, GECCO 2011]

  24. FreeCell (cont’d) • As opposed to Rush Hour, blind search failed miserably • The best published solver to date solves 96% of Microsoft 32K • Reasons: • High branching factor • Hard to generate a good heuristic

  25. Learning Methods: Random Deals Which deals should we use for training? First method tested - random deals • This is what we did in Rush Hour • Here it yielded poor results • Very hard domain

  26. Learning Methods: Gradual Difficulty Second method tested - gradual difficulty • Sort the problems by difficulty • Each generation test solvers against 5 deals from the current difficulty level + 1 random deal

  27. Learning Methods: Hillis-Style Coevolution Third method tested - Hillis-style coevolution using a “Hall-of-Fame”: • The deal population is composed of 40 deals (= 40 individuals) + 10 deals that represent a hall-of-fame • Each hyper-heuristic is tested against 4 deal individuals and 2 hall-of-fame deals • The evolved hyper-heuristics failed to solve almost all of Microsoft 32K! Why?

  28. Learning Methods: Rosin-style Coevolution Fourth method tested - Rosin-style coevolution: • Each deal individual consists of 6 deals • Mutation and crossover recombine two parent individuals p1 and p2 (a sketch follows below) [Figure: crossover between parents p1 and p2]
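A minimal sketch of these deal individuals under the assumptions above: each individual is a list of 6 deal numbers, crossover swaps a random prefix between the parents, and mutation replaces one deal; DEAL_POOL is a hypothetical pool of FreeCell deal numbers:

```python
import random

DEAL_POOL = range(1, 32001)              # hypothetical: the Microsoft 32K deals

def crossover(p1, p2):
    """One-point crossover over two 6-deal parents."""
    cut = random.randint(1, 5)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(individual):
    """Replace one randomly chosen deal with a fresh one from the pool."""
    child = list(individual)
    child[random.randrange(6)] = random.choice(DEAL_POOL)
    return child
```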

  29. Results

  30. Thank you for listening. Any questions?
