
Reinforcement Learning

Reinforcement Learning. Ruti Glick, Bar-Ilan University. Learning is the result of interaction between an agent and the world. Percepts received by an agent should be used not only for acting, but also for improving the agent’s ability to behave optimally in the future to achieve its goal.


Presentation Transcript


  1. Reinforcement Learning • Ruti Glick, Bar-Ilan University

  2. Learning • Result of interaction between an agent and the world • Percepts received by an agent should be used not only for acting, but also for improving the agent’s ability to behave optimally in the future to achieve its goal.

  3. Supervised Learning • Learns from examples • E.g. decision trees • The environment provides input / output pairs • Learns functions and probability models • You can think of it as having a kind teacher

  4. Supervised Learning • Problems: • Difficult to supply a large number of examples • Examples: • Training a robot to juggle • Board states in chess

  5. Reinforcement learning • The agent receives some evaluation of its action • It is not told which action is the correct one to achieve its goal

  6. Example – playing chess • Supervised learning • Gets examples of a board state + the best move in that state • Reinforcement learning • Tries random moves • Learns about the environment • What the board will look like after performing an action • What the opponent will do • Must get rewards / reinforcement

  7. Types of Reinforcement 1 • Positive Reinforcement • pleasurable consequence administered after a desired behavior • strengthens behavior • e.g. praising a dog after it performs a trick • Extinction • withholding positive reinforcement following an undesirable behavior • reduces behavior • e.g. imposing early curfew on a child who stayed out too late

  8. Types of Reinforcement 2 • Punishment: • an unpleasant consequence administered following an undesirable behavior • reduces behavior • e.g. a choke chain for a dog • Negative Reinforcement: • withholding an unpleasant consequence following a correct behavior • strengthens behavior • e.g. a boxer learning to block a jab

  9. Reinforcement Schedules 1 • Continuous Reinforcement: • every behavior is reinforced • Partial Reinforcement: • not every behavior is reinforced: • Fixed interval: a fixed time interval passes between reinforcements • Variable interval: the time interval varies between a min and a max • Final Reinforcement: • given only at the terminal states
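
The schedules on this slide can be illustrated with a small Python sketch that lists the time steps at which a reinforcement is delivered. It is not taken from the lecture: the function names, the 0-based step numbering, and the use of a uniform random gap for the variable-interval case are all assumptions made for illustration.

```python
import random

# Hypothetical helpers: each returns the 0-based steps at which the
# agent is reinforced during an episode of n_steps behaviors.

def continuous_schedule(n_steps):
    """Continuous reinforcement: every behavior is reinforced."""
    return list(range(n_steps))

def fixed_interval_schedule(n_steps, interval):
    """Partial reinforcement with a fixed gap between reinforcements."""
    return list(range(interval - 1, n_steps, interval))

def variable_interval_schedule(n_steps, min_gap, max_gap, rng):
    """Partial reinforcement whose gap varies between min_gap and max_gap.

    Assumption: gaps are drawn uniformly at random from the integer range.
    """
    steps = []
    t = rng.randint(min_gap, max_gap) - 1
    while t < n_steps:
        steps.append(t)
        t += rng.randint(min_gap, max_gap)
    return steps

def final_schedule(n_steps):
    """Final reinforcement: only the terminal step is reinforced."""
    return [n_steps - 1] if n_steps > 0 else []

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is repeatable
    print("continuous:", continuous_schedule(6))
    print("fixed:     ", fixed_interval_schedule(6, 3))
    print("variable:  ", variable_interval_schedule(20, 2, 5, rng))
    print("final:     ", final_schedule(6))
```

Chess is a natural case of the final schedule: the only unambiguous reward (win, lose, or draw) arrives at the terminal state.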

  10. Reinforcement Schedules 2 • Continuous schedule: • results in faster learning • but also the fastest extinction once a reinforcement is missed • Variable schedule: • most effective for developing more permanent behavior

  11. Properties 1 • Accessible / inaccessible environment • If accessible – the state can be identified from the percept • Otherwise – the agent must keep track of the environment's state • Existing knowledge • Does the agent have knowledge of the environment and of its actions’ effects? • If not – it has to learn the model • Rewards – schedules • When does the agent get them?

  12. Properties 2 • Rewards – type • Components of actual utility • Score in ping pong • Dollars in betting • Hints for utility • “bad dog” • “hot…” • Learning type • Passive or active

  13. Passive / active learner • Passive learner • Watches the world go by • Tries to learn the utility of being in various states • Active learner • Acts using its learned information • Must experience as much of the environment as possible
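
One simple way a passive learner could estimate the utility of states is to average the total reward observed from each state onward, over many watched episodes. The Python sketch below shows that idea under stated assumptions: the slides do not specify an algorithm, and the episode format (a list of `(state, reward)` pairs), the function name `estimate_utilities`, and the discount parameter `gamma` are all hypothetical.

```python
from collections import defaultdict

def estimate_utilities(episodes, gamma=1.0):
    """Average the observed reward-to-go for every visited state.

    episodes: list of trajectories, each a list of (state, reward) pairs
              in the order the passive learner observed them.
    gamma:    discount factor (assumption: 1.0 means no discounting).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        reward_to_go = 0.0
        # Walk backwards so the reward-to-go accumulates from the end.
        for state, reward in reversed(episode):
            reward_to_go = reward + gamma * reward_to_go
            totals[state] += reward_to_go
            counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}

if __name__ == "__main__":
    # Two watched episodes in a made-up world: one ends well, one badly.
    watched = [
        [("A", 0.0), ("B", 0.0), ("goal", 1.0)],
        [("A", 0.0), ("C", 0.0), ("pit", -1.0)],
    ]
    for state, value in sorted(estimate_utilities(watched).items()):
        print(state, value)
```

The learner here never chooses an action; it only records what it sees. An active learner would also have to decide which states to visit so that its estimates cover enough of the environment.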

  14. Agent types • Utility-based agent • Learns the utility of states • Uses it to select the best action • Must know the model of the environment • E.g. in backgammon it must know the legal moves and their effects • Q-learning agent • Learns the expected utility of taking a given action in a given state • Doesn’t need to know the effects of its actions, only the legal moves • Can’t look ahead
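
The contrast between the two agent types can be sketched in Python. This is an illustration under assumptions, not the lecture's code: the utility-based selector assumes a deterministic `model(state, action)` that returns the next state, and the Q-learning part uses the standard tabular update with a learning rate `alpha` and discount `gamma`, neither of which appears in the slides. All function names are hypothetical.

```python
from collections import defaultdict

def best_action_from_utilities(state, legal_moves, model, utility):
    """Utility-based agent: looks one step ahead using the model.

    Assumption: model(state, action) deterministically returns the next state.
    """
    return max(legal_moves(state), key=lambda a: utility[model(state, a)])

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update for one observed transition."""
    # Value of the best action available afterwards (0.0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])

def best_action_from_q(q, state, actions):
    """Q-learning agent: needs only the legal moves, no model."""
    return max(actions, key=lambda a: q[(state, a)])

if __name__ == "__main__":
    # Utility-based choice in a made-up two-move world.
    utility = {"a": 0.0, "b": 1.0}
    model = lambda s, a: {"left": "a", "right": "b"}[a]
    legal = lambda s: ["left", "right"]
    print(best_action_from_utilities("s", legal, model, utility))

    # Q-learning from a single rewarded transition into a terminal state.
    q = defaultdict(float)
    q_update(q, "s", "right", 1.0, "t", [], alpha=0.5)
    print(best_action_from_q(q, "s", ["left", "right"]), q[("s", "right")])
```

The Q-learning functions never call `model`: the agent compares stored action values directly, which is why it needs only the legal moves but cannot look ahead.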
