
Game Theory in Q-Learning

A presentation by Bill Jarvis, Andrew Harbor, and Jonathon Wickens. Game theory attempts to mathematically model the interactions between players in “games,” where each player’s success depends heavily on the others’ behavior.


Presentation Transcript


  1. Game Theory in Q-Learning • Bill Jarvis, Andrew Harbor, and Jonathon Wickens

  2. Defining Game Theory • Game theory attempts to mathematically model the interactions between players in “games,” where each player’s success depends heavily on the others’ behavior.

  3. How the program works • A number of territories are created, and each is given to a single farmer. • Within each territory, a single plant is provided for sustenance.

  4. The Farmer’s options • During the first steps, the farmer’s only options are to eat or devastate the plant, which lets it learn a strategy for optimal sustenance. • When devastated, a plant gives the farmer very little strength. (Image: a devastated plant)

  5. Phase 1 • During phase 1, all territories are isolated from one another. • This is implemented so that the farmers may learn to sustain their plants. (Image: one step)
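As a rough illustration of what phase 1 learning could look like, here is a minimal tabular Q-learning sketch for one isolated farmer. The slides give no numbers, so the reward values, the regrowth probability, the learning parameters, and the names `step` and `train_phase1` are all assumptions made for this example, not the authors' implementation.

```python
import random

STATES = ("thriving", "devastated")
ACTIONS = ("eat", "devastate")

# Hypothetical constants; the presentation does not state any values.
REWARD_EAT = 1.0               # modest strength, plant stays thriving
REWARD_DEVASTATE = 2.0         # larger one-off strength, plant is devastated
REWARD_DEVASTATED_PLANT = 0.1  # "very low" strength from a devastated plant
REGROW_PROBABILITY = 0.2       # assumed chance a devastated plant recovers


def step(state, action, rng):
    """Return (reward, next_state) for one step in an isolated territory."""
    if state == "devastated":
        # Either action yields very little; the plant may regrow on its own.
        recovered = rng.random() < REGROW_PROBABILITY
        return REWARD_DEVASTATED_PLANT, "thriving" if recovered else "devastated"
    if action == "eat":
        return REWARD_EAT, "thriving"
    return REWARD_DEVASTATE, "devastated"


def train_phase1(steps=50_000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Run epsilon-greedy tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = "thriving"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        reward, next_state = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


if __name__ == "__main__":
    for key, value in sorted(train_phase1().items()):
        print(key, round(value, 2))
```

Under these assumed numbers, the farmer should come to value eating from a thriving plant above devastating it, because the short-term gain from devastation is outweighed by the steps spent waiting for regrowth.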

  6. Phase 2 • During phase 2, the farmers are given the option to invade each other’s territories, where an invading farmer may eat the territory’s food. (Image: an invading farmer and a home farmer)

  7. Phase 2 cont. • At this point, the farmer has the option to: eat a lot, eat a little, invade, return home, or punish an invader of its home territory.

  8. Sensory states • The farmer automatically detects: • Where it is currently located (at home or abroad) • Whether its territory is invaded or safe • The state of the current territory’s food (thriving or devastated) • How much the current territory’s owner enjoys punishing invaders (N/A if at home) • Example: [Home, Invaded, Devastated, N/A]
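The four sensed values map naturally onto a small discrete state that can index a Q-table. The sketch below is one possible encoding, not the authors' code: the `SensoryState` type, the two-level "low"/"high" bucketing of how much an owner enjoys punishing, the action names, and the rules in `available_actions` about which action is legal where are all assumptions made for illustration.

```python
from itertools import product
from typing import NamedTuple


class SensoryState(NamedTuple):
    location: str        # "home" or "abroad"
    territory: str       # "invaded" or "safe" (the farmer's own territory)
    food: str            # "thriving" or "devastated" (current territory's plant)
    owner_punishes: str  # "n/a" at home, otherwise an assumed "low"/"high" bucket


PUNISH_LEVELS = ("low", "high")  # assumption: the slides give no scale


def all_states():
    """Enumerate every state a farmer can sense under this encoding."""
    states = []
    for territory, food in product(("invaded", "safe"), ("thriving", "devastated")):
        states.append(SensoryState("home", territory, food, "n/a"))
        for level in PUNISH_LEVELS:
            states.append(SensoryState("abroad", territory, food, level))
    return states


def available_actions(state):
    """Assumed legality rules: invade/punish only at home, return only abroad."""
    actions = ["eat_a_lot", "eat_a_little"]
    if state.location == "home":
        actions.append("invade")
        if state.territory == "invaded":
            actions.append("punish_invader")
    else:
        actions.append("return_home")
    return actions


def make_q_table():
    """One zero-initialized Q-value per legal (state, action) pair."""
    return {(s, a): 0.0 for s in all_states() for a in available_actions(s)}


if __name__ == "__main__":
    # The slide's example: [Home, Invaded, Devastated, N/A]
    example = SensoryState("home", "invaded", "devastated", "n/a")
    print(example, available_actions(example))
```

Keeping the state as a named tuple makes it hashable, so it can be used directly as a dictionary key, and the field names document what each position of the slide's example list means.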

  9. The User Interface • To create variety and tweak the simulation, the values of the appropriate constants may be changed at the start of, or during, the simulation’s initialization. (Image: an invader being punished)

  10. What we’ve learned about learning • Q-learning is very adaptable and general enough to apply to many situations. • Social interactions, in the form of game theory, vastly complicate the decision-making process, so that individuality emerges in the form of farmer behavior or “personality.”
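For reference, the generality noted above follows from the form of the standard tabular Q-learning update, which needs only a state, an action, a reward, and the next state. The slides do not show the rule or give parameter values, so the symbols below follow the usual textbook convention: $\alpha$ is the learning rate, $\gamma$ the discount factor, $s_t$ the sensed state, $a_t$ the chosen action, and $r_{t+1}$ the strength gained.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

Nothing in the update refers to the other farmers directly. Their behavior enters only through the rewards and next states each farmer observes, which is one way to see why the same rule can yield different "personalities" once territories interact.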
