
Evolving Multimodal Networks for Multitask Games

Jacob Schrum (schrum2@cs.utexas.edu) and Risto Miikkulainen (risto@cs.utexas.edu), Department of Computer Science, University of Texas at Austin


Presentation Transcript


  1. Evolving Multimodal Networks for Multitask Games Jacob Schrum – schrum2@cs.utexas.edu Risto Miikkulainen – risto@cs.utexas.edu University of Texas at Austin Department of Computer Science

  2. Evolution in videogames • Automatically learn interesting behavior • Complex but controlled environments • Stepping stone to real world • Robots • Training simulators • Complexity issues • Multiple contradictory objectives • Multiple challenging tasks

  3. Multitask Games • NPCs perform two or more separate tasks • Each task has own performance measures • Task linkage • Independent • Dependent • Not blended • Inherently multiobjective

  4. Test Domains • Designed to study multimodal behavior • Two tasks in similar environments • Different behavior needed to succeed • Main challenge: perform well in both [Figure panels: Back Ramming, Front Ramming]

  5. Front/Back Ramming • Same goal, opposite embodiments • Front Ramming: attack with front ram, avoid counterattacks • Back Ramming: attack with back ram, avoid counterattacks

  6. Predator/Prey • Same embodiment, opposite goals • Predator: attack prey, prevent escape • Prey: avoid attack, stay alive

  7. Multiobjective Optimization • Game with two objectives: • Damage Dealt • Remaining Health • A dominates B iff A is strictly better in at least one objective and at least as good in the others • The non-dominated points are the best solutions: the Pareto front • Weighted sum provably incapable of capturing a non-convex front [Figure: tradeoff between objectives; one end of the front has high health but dealt little damage, the other dealt a lot of damage but lost a lot of health]
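
The dominance test on this slide can be sketched in a few lines of Python. This is an illustrative sketch only: the names `dominates` and `pareto_front` and the sample scores are made up for the example, and both objectives are assumed to be maximized.

```python
from typing import List, Tuple

Point = Tuple[float, ...]  # objective scores, all assumed to be maximized


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (damage dealt, remaining health) scores for five individuals.
scores = [(10.0, 90.0), (50.0, 50.0), (90.0, 10.0), (40.0, 40.0), (10.0, 50.0)]
front = pareto_front(scores)
```

Here `(40.0, 40.0)` and `(10.0, 50.0)` are both dominated by `(50.0, 50.0)`, so only the other three points remain on the front.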

  8. NSGA-II • Evolution: natural approach for finding optimal population • Non-Dominated Sorting Genetic Algorithm II* • Population P with size N; Evaluate P • Use mutation to get P′ of size N; Evaluate P′ • Calculate non-dominated fronts of {P ∪ P′}, size 2N • New population of size N from highest fronts of {P ∪ P′} *K. Deb et al. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. Evol. Comp. 2002
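
One generation of the loop above might look roughly like the following Python sketch. Several things here are assumptions rather than details from the slides: each parent produces exactly one mutated child, the sorting routine is a simple quadratic-per-front version rather than Deb's fast sort, and the front that does not fit entirely is truncated by crowding distance as in standard NSGA-II. The toy genome (a pair of floats scored by its own values) is purely hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Scores = Tuple[float, ...]  # objective values, all assumed to be maximized


def dominates(a: Scores, b: Scores) -> bool:
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def sort_into_fronts(scores: Sequence[Scores]) -> List[List[int]]:
    """Peel off successive non-dominated fronts; returns lists of indices."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(scores[j], scores[i]) for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(front: List[int], scores: Sequence[Scores]) -> Dict[int, float]:
    """Larger distance means a less crowded region of the front."""
    dist = {i: 0.0 for i in front}
    for m in range(len(scores[front[0]])):
        ordered = sorted(front, key=lambda i: scores[i][m])
        lo, hi = scores[ordered[0]][m], scores[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")  # keep the extremes
        if hi == lo:
            continue
        for k in range(1, len(ordered) - 1):
            gap = scores[ordered[k + 1]][m] - scores[ordered[k - 1]][m]
            dist[ordered[k]] += gap / (hi - lo)
    return dist


def next_generation(parents: list, evaluate: Callable, mutate: Callable,
                    rng: random.Random) -> list:
    """Build P' by mutation, then keep the best N of P and P' combined."""
    n = len(parents)
    combined = parents + [mutate(p, rng) for p in parents]  # size 2N
    scores = [evaluate(g) for g in combined]
    survivors: List[int] = []
    for front in sort_into_fronts(scores):
        if len(survivors) + len(front) > n:
            # Front does not fit: prefer individuals in less crowded regions.
            dist = crowding_distance(front, scores)
            front = sorted(front, key=lambda i: dist[i], reverse=True)[: n - len(survivors)]
        survivors.extend(front)
        if len(survivors) == n:
            break
    return [combined[i] for i in survivors]


def toy_mutate(genome, rng):
    """Hypothetical mutation: Gaussian noise on each gene, clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in genome)


rng = random.Random(0)
population = [(rng.random(), rng.random()) for _ in range(8)]
start_best = max(x for x, _ in population)
for _ in range(20):
    population = next_generation(population, lambda g: g, toy_mutate, rng)
```

Because parents compete with their children for survival, the best value seen for either objective is never lost from one generation to the next, which is the elitism the NSGA-II paper title refers to.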

  9. Constructive Neuroevolution • Genetic Algorithms + Neural Networks • Build structure incrementally (complexification) • Good at generating control policies • Three basic mutations (no crossover used) Perturb Weight Add Connection Add Node
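
The three mutations named on this slide can be illustrated with a minimal genome representation. The sketch below is hypothetical: the `Genome` and `Link` classes, the weight ranges, and the choice to add a node by splitting an existing link (as NEAT-style methods do) are assumptions, not details taken from the presentation.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    src: int
    dst: int
    weight: float


@dataclass
class Genome:
    num_nodes: int
    links: List[Link] = field(default_factory=list)


def perturb_weight(g: Genome, rng: random.Random, sigma: float = 0.5) -> None:
    """Add Gaussian noise to the weight of one randomly chosen link."""
    link = rng.choice(g.links)
    link.weight += rng.gauss(0.0, sigma)


def add_connection(g: Genome, rng: random.Random) -> None:
    """Connect two random nodes (may create recurrent or duplicate links)."""
    src = rng.randrange(g.num_nodes)
    dst = rng.randrange(g.num_nodes)
    g.links.append(Link(src, dst, rng.uniform(-1.0, 1.0)))


def add_node(g: Genome, rng: random.Random) -> None:
    """Split a random link with a new node (assumed NEAT-style splicing)."""
    old = rng.choice(g.links)
    g.links.remove(old)
    new_id = g.num_nodes
    g.num_nodes += 1
    g.links.append(Link(old.src, new_id, 1.0))
    g.links.append(Link(new_id, old.dst, old.weight))


rng = random.Random(0)
genome = Genome(num_nodes=3, links=[Link(0, 2, 0.5)])
add_connection(genome, rng)   # 2 links
add_node(genome, rng)         # 4 nodes, 3 links
perturb_weight(genome, rng)   # structure unchanged, one weight nudged
```

Starting from a small network and only ever adding structure is what the slide calls complexification.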

  10. Multimodal Networks (1) • Multitask Learning* • One mode per task • Shared hidden layer • Knows current task • Previous work • Supervised learning context • Multiple tasks learned quicker than individual • Not tried with evolution yet * R. A. Caruana, "Multitask learning: A knowledge-based source of inductive bias" ICML 1993
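
A multitask network in the sense of this slide can be sketched as one shared hidden layer feeding a separate output mode per task, with the caller supplying the current task. The class name, the `tanh` activation, and the weights below are all assumptions made for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def layer(inputs: List[float], weights: Matrix) -> List[float]:
    """One fully connected layer with tanh activation (activation is assumed)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


class MultitaskNetwork:
    def __init__(self, hidden_weights: Matrix, mode_weights: List[Matrix]):
        self.hidden_weights = hidden_weights  # shared by every task
        self.mode_weights = mode_weights      # one output weight matrix per task

    def act(self, inputs: List[float], task: int) -> List[float]:
        hidden = layer(inputs, self.hidden_weights)
        # The network is told which task it is in and uses that task's mode.
        return layer(hidden, self.mode_weights[task])


# Hypothetical weights: two inputs, two hidden units, one output per mode.
net = MultitaskNetwork(
    hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
    mode_weights=[[[1.0, 0.0]], [[-1.0, 0.0]]],
)
out_task0 = net.act([0.5, 0.2], task=0)
out_task1 = net.act([0.5, 0.2], task=1)
```

With these weights the same sensor reading yields opposite outputs in the two tasks, even though the hidden layer is identical.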

  11. Multimodal Networks (2) • Mode Mutation • Extra modes evolved • Networks choose mode • Chosen via preference neurons • MM Previous, MM(P) • Links from previous mode • Weights = 1.0 • MM Random, MM(R) • Links from random sources • Random weights • Supports mode deletion [Figure: starting network with one mode, and the result of MM(P) and MM(R)]
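
Mode arbitration via preference neurons could work along these lines. The sketch assumes each mode's output vector ends with its preference neuron and that the mode with the highest preference controls the agent on that time step; the function name and the sample activations are hypothetical, and the mode mutations themselves are not modeled.

```python
from typing import List, Sequence, Tuple


def choose_mode(mode_outputs: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Pick the mode whose preference neuron (last value) is highest.

    Returns the chosen mode index and that mode's policy outputs.
    Ties go to the lowest-numbered mode.
    """
    best = max(range(len(mode_outputs)), key=lambda m: mode_outputs[m][-1])
    return best, list(mode_outputs[best][:-1])


# Hypothetical activations for a two-mode network:
# two policy outputs followed by one preference neuron per mode.
outputs = [
    [0.2, -0.5, 0.1],  # mode 0, preference 0.1
    [0.9, 0.3, 0.7],   # mode 1, preference 0.7
]
mode, action = choose_mode(outputs)
```

Unlike the multitask network, nothing tells the network which task it is in; the choice of mode is itself an evolved behavior.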

  12. Experiment • Compare 4 conditions: • Control: Unimodal networks • Multitask: One mode per task • MM(P): Mode Mutation Previous • MM(R): Mode Mutation Random + Delete Mutation • 500 generations • Population size 52 • “Player” behavior scripted • Network controls homogeneous team of 4

  13. MO Performance Assessment • Reduce Pareto front to single number • Hypervolume of dominated region • Pareto compliant • Front A dominates front B implies HV(A) > HV(B) • Standard statistical comparisons of average HV
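
For two maximized objectives, the hypervolume of the dominated region can be computed with a simple sweep. The sketch below assumes a reference point at the origin (the slides do not say which reference point was used) and uses made-up fronts to show the Pareto-compliance property.

```python
from typing import Iterable, Tuple


def hypervolume_2d(front: Iterable[Tuple[float, float]],
                   reference: Tuple[float, float] = (0.0, 0.0)) -> float:
    """Area dominated by the front and bounded below by the reference point."""
    rx, ry = reference
    area = 0.0
    best_y = ry
    # Sweep from the largest first objective to the smallest; each point that
    # improves on the best second objective so far adds a new horizontal strip.
    for x, y in sorted(front, key=lambda p: p[0], reverse=True):
        if x > rx and y > best_y:
            area += (x - rx) * (y - best_y)
            best_y = y
    return area


# Hypothetical fronts: every point of front_b is dominated by a point of front_a.
front_a = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
front_b = [(2.0, 1.0), (1.0, 2.0)]
hv_a = hypervolume_2d(front_a)
hv_b = hypervolume_2d(front_b)
```

Dominated points contribute nothing to the total, so the function can be applied to a whole population without first extracting its Pareto front.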

  14. [Results figure: 20 runs]

  15. Front/Back Ramming Behaviors [Figure labels: Multitask, MM(R); Back Ramming, Front Ramming]

  16. [Results figure: 20 runs]

  17. Predator/Prey Behaviors [Figure labels: Multitask, MM(R); Predator, Prey]

  18. Discussion (1) • Front/Back Ramming • Control < MM(P), MM(R) < Multitask • Multiple modes help • Explicit knowledge of task helps

  19. Discussion (2) • Predator/Prey • MM(P), Control, Multitask < MM(R) • Multiple modes not necessarily helpful • Disparity in relative difficulty of tasks • Multitask ends up wasting effort • Mode deletion aids search for one good mode

  20. How To Apply • Multitask good if: • Task division known, and • Tasks are comparably difficult • Mode mutation good if: • Task division is unknown, or • “Obvious” task division is misleading

  21. Future Work • Games with more tasks • Does method scale? • Control mode bloat • Games with independent tasks • Ms. Pac-Man • Collect pills while avoiding ghosts • Eat ghosts after eating power pill • Games with blended tasks • Unreal Tournament 2004 • Fight while avoiding damage • Fight or run away? • Collect items or seek opponents?

  22. Conclusion • Domains with multiple tasks are common • Both in real world and games • Multimodal networks improve learning in multitask games • Will allow interesting/complex behavior to be developed in future

  23. Questions? Jacob Schrum – schrum2@cs.utexas.edu Risto Miikkulainen – risto@cs.utexas.edu University of Texas at Austin Department of Computer Science

  24. Auxiliary Slides
