
JuKeCB




Presentation Transcript


  1. JuKeCB Justin Karneeb

  2. What is JuKeCB? • A Case Based Reasoning system developed for use in DOM, a Domination style game • A research project which has been under development over the past three years by: • Justin Karneeb • Kellen Gillespie • Stephan Lee-Urban • Professor Muñoz-Avila

  3. How about some more detail • JuKeCB is a CBR system that learns stochastic policies by observation • Stochastic Policy: A non-deterministic “strategy” • Imitates winning strategies observed in past games in order to win future battles • Continues to learn as it observes more games

  4. A short aside: DOM Game • In order to understand JuKeCB, you first need to understand the game it is playing. • SCREEN SHOT OF DOM

  5. DOM: The Rules • DOM is a Domination style game • Team based gameplay • Scoring based on holding or “dominating” key points on the map • Easily visible abstract strategies • Basic Strategy • Capture enemy Dom points • Defend owned Dom points • Own more Dom points than opponent

  6. DOM: Winning • Score is updated every five game turns • Each team is awarded one point per domination point it holds (1 × NumberDomPointsHeld) • Two possible game modes • Score Limit: Game ends when one team’s score exceeds X points • Turn Limit: Game ends when X turns have passed
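
The scoring rule above can be sketched in a few lines of Python; `update_scores`, the team names, and the dict layout are illustrative, not taken from the DOM source.

```python
def update_scores(scores, dom_points, turn):
    """Award each team one point per held domination point every five turns.
    `dom_points` maps point id -> owning team name (or None if unowned)."""
    if turn % 5 == 0:
        for owner in dom_points.values():
            if owner is not None:
                scores[owner] = scores.get(owner, 0) + 1
    return scores

# Turn 5: Team0 holds points 0 and 2, Team1 holds point 1.
scores = update_scores({}, {0: "Team0", 1: "Team1", 2: "Team0"}, turn=5)
```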

  7. DOM: Meet the easy teams! • DomOneHuggerTeam • All Bots go to domination Point 1 • FirstHalfOfDomPointsTeam • Evenly distribute all bots to go to the first half of domination points • SecondHalfOfDomPointsTeam • Evenly distribute all bots to go to the second half of domination Points

  8. DOM: Meet the tough teams… • EachBotToOneDomTeam • Sends each bot to a different domination point • GreedyDistanceTeam • Sends each bot to its closest unowned domination point • SmartOpportunisticTeam • Sends each bot to a different unowned domination point

  9. Questions about DOM Game? • DOM PICTURE

  10. Meanwhile… Back at JuKeCB • What was all that stochastic policy nonsense you were talking about? • Each case in JuKeCB stores two stochastic policies • WinningStrategy • LosingStrategy • JuKeCB can employ a winning strategy against similar losing strategies • Example: SecondHalf beats DomOneHugger • Why use stochastic policies at all? • Why not plans, single actions, scripts, etc.?

  11. So what makes a good Policy? • Feature selection can be very difficult • It took us almost a full semester to find a set of features that seemed to work • Each feature must supply information about the strategy or game state • Do not include unnecessary information • Each feature must be reproducible • Able to produce similar results when run again • As a whole, the features must (ideally) completely identify a strategy and the game state

  12. Brainstorm! • SCREENSHOT OF DOM GAME

  13. Here’s what we came up with • All features based on a timeframe or window • Domination Point Destinations • Probability each bot went to specific dom points • Unowned • Probability each bot went to an unowned point • Closest • Probability each bot went to its closest point • Score Difference • Difference in scores during the time window • Is that enough?
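
The per-window features above can be sketched as follows. `window_features`, the event-tuple layout, and the field names are illustrative, not the actual JuKeCB data model.

```python
from collections import Counter

def window_features(events, num_dom_points):
    """Turn one observation window into per-bot features. `events` is a list of
    (bot, dest_point, went_unowned, went_closest) tuples observed during the
    window; booleans count as 0/1 when summed."""
    per_bot = {}
    for bot, dest, unowned, closest in events:
        f = per_bot.setdefault(bot, {"dest": Counter(), "unowned": 0, "closest": 0, "n": 0})
        f["dest"][dest] += 1
        f["unowned"] += unowned
        f["closest"] += closest
        f["n"] += 1
    return {
        bot: {
            "dest": {p: f["dest"][p] / f["n"] for p in range(num_dom_points)},
            "to_unowned": f["unowned"] / f["n"],
            "to_closest": f["closest"] / f["n"],
        }
        for bot, f in per_bot.items()
    }

# Four observed moves by bot 0 on a three-point map.
feats = window_features(
    [(0, 1, True, False), (0, 1, True, True), (0, 2, False, False), (0, 0, True, True)], 3
)
```

The score-difference feature is computed once at the end of the window rather than per event, so it is left out of this sketch.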

  14. Feature Failure • The DomOneHugger issue • Who owns those other points? • Not enough information on game state • Makes us think a strategy is similar when in fact it is not • Needed more features! • Domination Point Held Ratios • The probability that Team0 held a given dom point • Still not perfect, but good enough

  15. The Case Base Cycle • Retain • Observe game state • Store in case base • Retrieve • Observe game state • Forward similar case to JuKeCBTeam for reuse • Reuse • Enact strategy found in case

  16. Observation: Retain • JuKeCB does all of its learning by observation • Game Window Monitoring • Most features built over the course of the window • DomPointDest • DomHeldRatios • Unowned/Closest • Some features are created at the very end • DeltaScore • Some are static for a game • Num Domination points • Num Bots per team • Dom Point Distances

  17. Retain Continued • Once the window ends, the case is created

  18. Case Retrieval • JuKeCB uses a three-stage retrieval process to reduce search time • Stage One • Runs only at game start • Remove all cases that do not pertain to the map • Stage Two • Runs at every retrieve update • Get all cases that are 95% similar or higher • Stage Three • Runs after stage two • Gets the case with the highest delta score

  19. Stage One • Left out some features earlier… • Number of Domination points • Number of Bots per team • Distance between Domination points • These features only pertain to Stage One similarity • Temporarily remove all cases that • Have a different number of dom points • Have a different number of bots per team • Do not have a similar set of dom point distances

  20. Stage Two • Responsible for finding cases pertinent to the situation • Computes similarity between the enemy’s current strategy and all losing strategies • If no case is more than 60% similar, run randomly • If no case is more than 95% similar, return the most similar case • Otherwise, return all cases more than 95% similar
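
The stage-two thresholds can be sketched like this; `stage_two` and the parallel `sims` list are my own framing, and an empty result stands in for “run randomly”.

```python
def stage_two(cases, sims):
    """Stage-two filtering sketch: `sims` holds each case's similarity to the
    enemy's current strategy. Below 60% best similarity the team acts randomly
    (empty result here); between 60% and 95% the single most similar case is
    returned; otherwise every case at or above 95% goes on to stage three."""
    if not cases:
        return []
    best = max(sims)
    if best < 0.60:
        return []                           # signal: run randomly
    if best < 0.95:
        return [cases[sims.index(best)]]    # most similar case only
    return [c for c, s in zip(cases, sims) if s >= 0.95]
```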

  21. Stage Two Similarity • Compares only the following features • Dom point destinations (per bot) • ToUnowned (per bot) • ToClosest (per bot) • Dom Held Ratios • All features are real numbers • Similarity formula: Weight * (1 - |(V1 - V2) / (VMax - VMin)|) • VMax and VMin are 1 and 0 for all features
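
The per-feature formula can be checked with a small Python sketch; the function names and the two-feature example are mine, not from the JuKeCB source.

```python
def feature_similarity(v1, v2, weight, vmax=1.0, vmin=0.0):
    """The slide's per-feature formula: Weight * (1 - |(V1 - V2) / (VMax - VMin)|)."""
    return weight * (1.0 - abs((v1 - v2) / (vmax - vmin)))

def case_similarity(pairs):
    """Overall similarity as the sum of weighted per-feature terms; with the
    weights summing to 1, the result lies in [0, 1]. `pairs` holds
    (observed_value, stored_value, weight) triples."""
    return sum(feature_similarity(v1, v2, w) for v1, v2, w in pairs)

# Two features at weight 0.5 each: one off by 0.1, one identical.
sim = case_similarity([(0.4, 0.5, 0.5), (0.9, 0.9, 0.5)])
```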

  22. Stage Two Similarity Cont • Weights were something we tweaked for a long time • Currently, they are as follows • Destinations: 40% (.4/numdompoints/numbots) • Dom Point Ratios: 20% (.2/numdompoints) • ToUnowned: 20% (.2/numbots) • ToClosest: 20% (.2/numbots) • Could probably still be refined further

  23. Stage Three • Only run if stage two returns more than one case • Looks at the DeltaScoreWinning feature of each returned case • Returns the case with the highest score • This prevents a perfect similarity match from beating a nearly-as-similar case with superior results

  24. Reuse: JuKeCB Team • JuKeCBTeam receives a strategy as input • Not a full case, just the winning policy of the case • Uses a random number generator to follow the given distribution as closely as possible • Randomly roll numbers for each feature and act accordingly • Rank each destination by how many criteria it meets • If tied, choose one at random

  25. Reuse: Example • Bot0: • To Dom0: 20% -- Is owned • To Dom1: 20% -- Is unowned • To Dom2: 60% -- Is owned • To Unowned: 80% • To Closest: 10% • Dom Roll (0-100): 68 • Unowned Roll (0-100): 27 • Closest Roll (0-100): 92 • Dom0 Score = 0 • Dom1 score = 1 • Dom2 score = 1
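
The slide-25 example can be reproduced with a small sketch. The three rolls are passed in as fractions, and the function names and argument layout are illustrative; Dom1 is taken as both the unowned and the closest point here (the slide does not say which is closest, and the closest roll fails anyway).

```python
def score_destinations(dest_probs, p_unowned, p_closest, unowned_points, closest_point, rolls):
    """Reuse-phase sketch: given three pre-rolled numbers in [0, 1) for the
    destination, ToUnowned, and ToClosest features, rank each domination point
    by how many rolled criteria it meets."""
    dest_roll, unowned_roll, closest_roll = rolls
    scores = {p: 0 for p in dest_probs}
    # Destination roll: walk the cumulative per-point distribution.
    acc = 0.0
    for p in sorted(dest_probs):
        acc += dest_probs[p]
        if dest_roll < acc:
            scores[p] += 1
            break
    # Unowned criterion: if it fires, every unowned point gets a vote.
    if unowned_roll < p_unowned:
        for p in unowned_points:
            scores[p] += 1
    # Closest criterion: if it fires, the bot's closest point gets a vote.
    if closest_roll < p_closest:
        scores[closest_point] += 1
    return scores

# Slide 25: destination probs 20/20/60, ToUnowned 80%, ToClosest 10%, rolls 68/27/92.
scores = score_destinations({0: 0.2, 1: 0.2, 2: 0.6}, 0.8, 0.1, {1}, 1, (0.68, 0.27, 0.92))
```

The bot would then head for one of the highest-scoring points, with ties broken at random.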

  26. Maintenance • At the end of every game, JuKeCB compiles the list of all recently created cases • Attempts to add them to the case base • If no case is 95% similar, add it • If a case is 95% similar and the new case has a higher delta score, swap them • On demand, run a full check • Over time, swapping cases can leave redundant cases in the case base; running a full check can be very time intensive
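
The add-or-swap rule can be sketched as follows; cases are plain dicts with a `delta` key here, unlike the real JuKeCB case class, and the `similarity` callback stands in for the stage-two similarity measure.

```python
def maintain(case_base, new_cases, similarity, threshold=0.95):
    """End-of-game maintenance sketch: add a new case if nothing in the base is
    at least 95% similar; otherwise keep whichever of the pair has the higher
    delta score."""
    for new in new_cases:
        match = next((i for i, old in enumerate(case_base)
                      if similarity(old, new) >= threshold), None)
        if match is None:
            case_base.append(new)
        elif new["delta"] > case_base[match]["delta"]:
            case_base[match] = new     # swap in the stronger case
    return case_base
```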

  27. Performance • Able to beat all ‘easy’ teams with ease • DomOneHugger • FirstHalfDomPoints • SecondHalfDomPoints • Able to win or be competitive against ‘hard’ teams • EachBotToOneDom • GreedyDistance • SmartOpportunistic

  28. Performance • The following results were run on this map • IMAGE OF MAP

  29. Untrained GreedyDistance

  30. Trained GreedyDistance

  31. Untrained DynamicTeam

  32. Trained DynamicTeam

  33. Problems • Retrieval can take a long time • Num Cases: 129, Average Time Taken: 221ms • Num Cases: 258, Average Time Taken: 721ms • Num Cases: 516, Average Time Taken: 3,063ms • Num Cases: 1032, Average Time Taken: 12,231ms • Can’t beat SmartOpportunistic • We lack the features to properly define its strategy • No ‘defend’ features • All its cases appear to be random

  34. Additional Work • Parallelizing retrieval • Direct speed up by using more CPUs • Clustering the Case Base • Greedy clustering • Similarity clustering • Using Asynchronous retrieval • Hide the delay

  35. Clustering JuKeCB • Speed up retrieval by dividing up the parts of the case base needed for any given retrieval • Greedy Clustering • Create new ‘clusters’ depending on the greedy policy • 010: Bot0-Dom0, Bot1-Dom1, Bot2-Dom0 • This clustering scheme got us very poor results • Too much data loss

  36. Clustering JuKeCB • Similarity Clustering • Each cluster gets a representative case • New cases are added to a cluster if their similarity to the representative is over a certain threshold • A new cluster is created if no similar cluster is found • Quite good results • Moderate speedup (sorry, can’t find the numbers!) • Only a slight performance drop
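
The representative-based scheme can be sketched in a few lines; `cluster_add`, the (representative, members) tuple layout, and the 0.8 threshold are illustrative.

```python
def cluster_add(clusters, case, similarity, threshold=0.8):
    """Similarity-clustering sketch: each cluster is (representative, members).
    A new case joins the first cluster whose representative is similar enough;
    otherwise it seeds a new cluster with itself as representative."""
    for rep, members in clusters:
        if similarity(rep, case) >= threshold:
            members.append(case)
            return clusters
    clusters.append((case, [case]))
    return clusters

# Toy one-dimensional cases with similarity 1 - |a - b|.
clusters = []
for x in (0.1, 0.15, 0.9):
    cluster_add(clusters, x, lambda a, b: 1 - abs(a - b))
```

Retrieval then only needs to scan the cluster whose representative best matches the query, which is where the speedup comes from.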

  37. Parallelizing JuKeCB • Divide the case base into X chunks • X = number of processors on the machine • Have each processor run stage two on its own chunk • Run stage three on the combined results of all chunks • Speedup was almost linear: OldRetrievalTime/NumProcessors
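
The chunk-and-merge structure can be sketched with threads; the real system spreads work across CPUs, and its stage three compares delta scores rather than raw similarity, so this is only the shape of the idea.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_retrieve(case_base, query, similarity, workers=None):
    """Split the case base into one chunk per worker, scan each chunk
    concurrently (stage two), then pick the overall best across the
    per-chunk winners (stage three)."""
    workers = workers or os.cpu_count() or 1
    chunks = [case_base[i::workers] for i in range(workers)]

    def best_in(chunk):
        # Per-chunk stage two: most similar case in this chunk, if any.
        return max(chunk, key=lambda c: similarity(c, query), default=None)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        winners = [w for w in pool.map(best_in, chunks) if w is not None]
    return max(winners, key=lambda c: similarity(c, query), default=None)

# Toy case base of integers; similarity is negative distance to the query.
best = parallel_retrieve(list(range(20)), 7, lambda c, q: -abs(c - q), workers=4)
```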

  38. Asynchronous Retrieval • Hide the retrieval delay by running retrieval in a separate thread • Works only if the game is running at ‘human playable speed’ • Gets near-identical results to the normal system with no visible delay • As with parallelizing, sacrifices a slight speedup for better responsiveness

  39. Possible additional work • Combining all the previous methods into one ultra-fast case-based reasoning machine! • A clustered case base whose retrieval is done asynchronously in parallel • Optimising JuKeCB, JuKeCBTeam, and the DOM Game • Some things were coded somewhat sloppily and could easily be improved, such as the reuse phase • Adding more features • As discussed earlier, we do not have enough features to properly define some strategies

  40. Closing/Questions • Overall, JuKeCB was a great system for me to work on • It gave me substantial knowledge of the CBR field • A paper, “Imitating Inscrutable Enemies: Learning from Stochastic Policy Observation, Retrieval and Reuse”, was published and presented at the ICCBR 2010 conference • ….I swear I did not name the system….
