UCT for Tactical Assault Battles in Real-Time Strategy Games




  1. UCT for Tactical Assault Battles in Real-Time Strategy Games
  Radha-Krishna Balla
  19 February, 2009

  2. Overview Introduction Related Work Method Experiments & Results Conclusion

  3. Introduction Related Work Method Experiments & Results Conclusion

  4. Domain
  • RTS games
    • Resource Production
    • Tactical Planning
      • Tactical Assault battles

  5. RTS game – Wargus
  Screenshot of a typical battle scenario in Wargus

  6. Planning problem
  • Large state space
  • Temporal actions
  • Spatial reasoning
  • Concurrency
  • Stochastic actions
  • Changing goals

  7. Introduction Related Work Method Experiments & Results Conclusion

  8. Related Work
  • Board games – bridge, poker, Go, etc.
    • Monte Carlo simulations
  • RTS games
    • Resource Production
      • Means-ends analysis – Chan et al.
    • Tactical Planning
      • Monte Carlo simulations – Chung et al.
      • Nash strategies – Sailer et al.
      • Reinforcement learning – Wilson et al.
  • Bandit-based problems, Go
    • UCT – Kocsis et al., Gelly et al.

  9. Our Approach
  • Monte Carlo simulations
  • UCT algorithm
  Advantages:
  • Complex plans from simple abstract actions
  • Exploration/Exploitation tradeoff
  • Changing goals

  10. Introduction Related Work Method Experiments & Results Conclusion

  11. Method Planning architecture UCT Algorithm Search space formulation Monte Carlo simulations Challenges

  12. Planning Architecture
  • Online Planner
  • State space abstraction
    • Grouping of units
  • Abstract actions
    • Join(G)
    • Attack(f,e)
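The abstraction on this slide — proximity-grouped units and the two abstract actions Join(G) and Attack(f,e) — can be sketched as plain data types. This is a minimal illustration; the class and field names are hypothetical, not taken from the planner itself:

```python
from dataclasses import dataclass

# Sketch of the abstract state/action layer on this slide: units are clustered
# into groups, and the planner reasons only over whole-group actions Join(G)
# and Attack(f, e). All class and field names here are hypothetical.

@dataclass
class Group:
    ids: frozenset    # unit ids merged into this group
    x: float          # group centroid position
    y: float
    hp: int           # total remaining hit points

@dataclass
class Join:
    groups: tuple     # friendly groups to merge into one

@dataclass
class Attack:
    friendly: Group
    enemy: Group

def legal_actions(friendly, enemy):
    """Abstract action set: every Attack(f, e) pairing, plus a single Join
    of all friendly groups whenever more than one friendly group exists."""
    actions = [Attack(f, e) for f in friendly for e in enemy]
    if len(friendly) > 1:
        actions.append(Join(tuple(friendly)))
    return actions
```

Working over a handful of such abstract actions, rather than per-unit commands, is what keeps the search tree tractable.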

  13. UCT Algorithm
  • Exploration/Exploitation tradeoff
  • Monte Carlo simulation – get subsequent states
  • Search tree
    • Root node – current state
    • Edges – available actions
    • Intermediate nodes – subsequent states
    • Leaf nodes – terminal states
  • Rollout-based construction
  • Value estimates

  14. UCT Algorithm – Pseudo Code 1

  At each interesting time point in the game:
      build_UCT_tree(current state);
      choose argmax action(s) based on the UCT policy;
      execute the aggregated actions in the actual game;
      wait until one of the actions gets executed;

  build_UCT_tree(state):
      for each UCT pass do
          run UCT_rollout(state);

  (.. continued)

  15. UCT Algorithm – Pseudo Code 2

  UCT_rollout(state):    // recursive
      if leaf node reached then
          estimate final reward;
          propagate reward up the tree and update value functions;
          return;
      populate possible actions;
      if all actions have been explored at least once then
          choose the action with the best value function;
      else
          choose an unexplored action by random sampling;
      run a Monte Carlo simulation to get the next state from the current state and action;
      call UCT_rollout(next state);
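The two pseudocode slides above can be sketched as a runnable rollout loop. The node statistics and selection rule follow standard UCT (Kocsis et al.); the game interface and `CountdownGame` below are made-up toy stand-ins for the real Wargus simulator, and all identifiers are illustrative:

```python
import math
import random

C = 1.4  # exploration constant (an assumed value)

class Node:
    def __init__(self, state):
        self.state = state
        self.n = 0           # visit count n(s)
        self.children = {}   # action -> [child Node, n(s,a), Q(s,a)]

def uct_rollout(node, game):
    """One rollout: descend the tree, simulate, propagate the reward back up."""
    if game.terminal(node.state):
        return game.reward(node.state)      # leaf node: estimate final reward
    untried = [a for a in game.actions(node.state) if a not in node.children]
    if untried:
        a = random.choice(untried)          # random sampling of an unexplored action
        node.children[a] = [Node(game.simulate(node.state, a)), 0, 0.0]
    else:
        # all actions explored: pick by Q(s,a) + C * sqrt(ln n(s) / n(s,a))
        a = max(node.children, key=lambda a: node.children[a][2]
                + C * math.sqrt(math.log(node.n) / node.children[a][1]))
    child, na, q = node.children[a]
    r = uct_rollout(child, game)
    node.n += 1                             # update value functions on the way up
    node.children[a] = [child, na + 1, q + (r - q) / (na + 1)]
    return r

class CountdownGame:
    """Toy game: the state counts down to 0; every terminal state pays reward 1."""
    def terminal(self, s): return s == 0
    def reward(self, s): return 1.0
    def actions(self, s): return [0, 1]
    def simulate(self, s, a): return s - 1
```

Calling `uct_rollout(root, game)` repeatedly grows the tree one node per pass, matching the rollout-based construction described on slide 13.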

  16. UCT Algorithm – Formulae

  Value update (running average of rollout rewards):
      Q(s,a) ← Q(s,a) + [R − Q(s,a)] / n(s,a)

  Action selection (UCB rule, standard UCT per Kocsis et al.):
      π(s) = argmax_a [ Q(s,a) + c · √( ln n(s) / n(s,a) ) ]

  where n(s) is the visit count of state s, n(s,a) the count of action a in s, R the rollout reward, and c the exploration constant.

  17. Search Space Formulation

  18. Monte Carlo Simulations

  19. Domain-specific Challenges
  • State space abstraction – grouping of units (proximity-based)
  • Concurrency of actions – aggregation of actions
    • Join actions – simple
    • Attack actions – complex (partial simulations)
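The proximity-based grouping mentioned above can be sketched as single-linkage clustering with a union-find structure: units whose pairwise distances chain together under some radius end up in one group. The radius value and function names here are illustrative assumptions, not the planner's actual parameters:

```python
import math

RADIUS = 3.0  # assumed grouping distance threshold

def group_units(positions, radius=RADIUS):
    """positions: list of (x, y) unit coordinates.
    Returns a list of groups, each a list of unit indices."""
    parent = list(range(len(positions)))

    def find(i):
        # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # union every pair of units within the grouping radius
    for i, (xi, yi) in enumerate(positions):
        for j in range(i + 1, len(positions)):
            xj, yj = positions[j]
            if math.hypot(xi - xj, yi - yj) <= radius:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Grouping nearby units this way is what turns a many-unit battle state into the small number of abstract groups the Join and Attack actions operate on.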

  20. Planning problem – revisited
  • Large state space – abstraction
  • Temporal actions – Monte Carlo simulations
  • Spatial reasoning – Monte Carlo simulations
  • Concurrency – aggregation of actions
  • Stochastic actions – UCT (online planning)
  • Changing goals – UCT (different objective functions)

  21. Introduction Related Work Method Experiments & Results Conclusion

  22. Experiments Table 1: Details of the different game scenarios

  23. Planners
  • UCT Planners
    • UCT(t)
    • UCT(hp)
  • Number of rollouts – 5000
  • Averaged over – 5 runs

  24. Planners
  • Baseline Planners
    • Random
    • Attack-Closest
    • Attack-Weakest
    • Stratagus-AI
    • Human

  25. Video – Planning in action Simple scenario <add video> Complex scenario <add video>

  26. Results Figure 1: Time results for UCT(t) and baselines.

  27. Results Figure 2: Hit point results for UCT(t) and baselines.

  28. Results Figure 3: Time results for UCT(hp) and baselines.

  29. Results Figure 4: Hit point results for UCT(hp) and baselines.

  30. Results – Comparison
  Figures 1, 2, 3 & 4: side-by-side comparison of UCT(t) and UCT(hp) on the time and hit point metrics

  31. Results Figure 5: Time results for UCT(t) with varying rollouts.

  32. Introduction Related Work Method Experiments & Results Conclusion

  33. Conclusion
  • Conclusion
  • Future Work
    • Engineering aspects
    • Machine Learning techniques
    • Beyond Tactical Assault

  34. Thank you
