
Building an Artificial Cerebellum using a System of Distributed Q-Learning Agents






Presentation Transcript


  1. Building an Artificial Cerebellum using a System of Distributed Q-Learning Agents Miguel Angel Soto Santibanez

  2. Overview Why they are important Previous Work Advantages and Shortcomings

  3. Overview Advantages and Shortcomings New Technique Illustration

  4. Overview Illustration Centralized Learning Agent Issue New Techniques

  5. Overview New Techniques Evaluate Set of Techniques Summary of Contributions

  6. Overview Summary of Contributions Software Tool

  7. Early Sensorimotor Neural Networks Motor System (motor executors) Sensory System (receptors)

  8. Modern Sensorimotor Neural Networks Sensory System Motor System Cerebellum

  9. Natural Cerebellum Fast Precisely Synchronized

  10. Artificial Cerebellum

  11. Previous Work • Cerebellatron • CMAC • FOX • SpikeFORCE • SENSOPAC

  12. Previous work strengths • Cerebellatron improved movement smoothness in robots.

  13. Previous work strengths • CMAC provided function approximation with fast convergence.
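As a rough illustration of why CMAC converges quickly, the sketch below implements a minimal CMAC-style tile coder in Python. The class name, parameters, and the sine-approximation usage are illustrative assumptions, not code from the presentation.

```python
import numpy as np

class SimpleCMAC:
    """Minimal CMAC-style tile coder over a 1-D input (illustrative sketch).

    Several overlapping tilings map an input to a small set of active tiles;
    the prediction is the sum of the active tiles' weights, and training
    spreads the error evenly across them, which is what gives CMAC-like
    coders their fast convergence on smooth functions.
    """

    def __init__(self, n_tilings=8, tiles_per_tiling=16, lo=0.0, hi=1.0, lr=0.1):
        self.n_tilings = n_tilings
        self.tiles = tiles_per_tiling
        self.lo, self.hi = lo, hi
        self.lr = lr
        self.weights = np.zeros((n_tilings, tiles_per_tiling + 1))

    def _active_tiles(self, x):
        # Each tiling is offset by a fraction of a tile width.
        scaled = (x - self.lo) / (self.hi - self.lo) * self.tiles
        return [int(scaled + t / self.n_tilings) for t in range(self.n_tilings)]

    def predict(self, x):
        return sum(self.weights[t, idx] for t, idx in enumerate(self._active_tiles(x)))

    def train(self, x, target):
        error = target - self.predict(x)
        for t, idx in enumerate(self._active_tiles(x)):
            self.weights[t, idx] += self.lr * error / self.n_tilings


# Usage: approximate sin(2*pi*x) on [0, 1].
cmac = SimpleCMAC()
for _ in range(2000):
    x = np.random.rand()
    cmac.train(x, np.sin(2 * np.pi * x))
print(cmac.predict(0.25))  # approximately 1.0 after training
```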

  14. Previous work strengths • FOX improved convergence by making use of eligibility vectors.
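The eligibility-vector idea credited to FOX can be illustrated with a generic TD(lambda)-style update for a linear value function. The sketch below is an assumed, simplified formulation of that general idea, not FOX's exact algorithm; all names and constants are placeholders.

```python
import numpy as np

# Generic TD(lambda)-style update with an eligibility vector. The vector e
# spreads each TD error back over recently visited features, which speeds up
# convergence compared with one-step updates.

n_features = 10
w = np.zeros(n_features)          # linear value-function weights
e = np.zeros(n_features)          # eligibility vector
alpha, gamma, lam = 0.1, 0.95, 0.9

def td_lambda_step(phi, reward, phi_next):
    """One update given feature vectors for the current and next state."""
    global w, e
    delta = reward + gamma * w @ phi_next - w @ phi   # TD error
    e = gamma * lam * e + phi                         # decay, then accumulate
    w = w + alpha * delta * e                         # credit recent features

# Example step with one-hot state features (purely illustrative):
phi_s, phi_s2 = np.eye(n_features)[2], np.eye(n_features)[3]
td_lambda_step(phi_s, reward=1.0, phi_next=phi_s2)
```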

  15. Previous work strengths • SpikeFORCE and SENSOPAC have improved our understanding of the natural cerebellum.

  16. Previous work strengths • May allow us to: Treat nervous system ailments. Discover biological mechanisms.

  17. Previous work strengths • LWPR is a step in the right direction towards tackling the scalability issue.

  18. Previous work issues • Cerebellatron is difficult to use and requires very complicated control input

  19. Previous work issues • CMAC and FOX depend on fixed-size tiles and therefore do not scale well.

  20. Previous work issues • Methods proposed by SpikeFORCE and SENSOPAC require rare skills.

  21. Previous work issues • The LWPR method proposed by SENSOPAC only works well if the problem has just a few non-redundant, non-irrelevant dimensions.

  22. Previous work issues Two Categories: 1) Framework Usability Issues: 2) Building Blocks Incompatibility Issues:

  23. Previous work issues Two Categories: 1) Framework Usability Issues: very difficult to use; requires very specialized skills

  24. Previous work issues Two Categories: 2) Building Blocks Incompatibility Issues: memory incompatibility; processing incompatibility

  25. Proposed Technique Two Categories: • Framework Usability Issues: new development framework

  26. Proposed Technique Two Categories: • Building Blocks Incompatibility Issues: new I/O mapping algorithm

  27. Proposed Technique Provides a shorthand notation.

  28. Proposed Technique Provides a recipe.

  29. Proposed Technique Provides simplification rules.

  30. Proposed Technique Provides a more compatible I/O mapping algorithm: Moving Prototypes.

  31. Proposed Technique The shorthand notation symbols: a sensor:

  32. Proposed Technique The shorthand notation symbols: an actuator:

  33. Proposed Technique The shorthand notation symbols: a master learning agent:

  34. Proposed Technique The shorthand notation symbols: the simplest artificial cerebellum:

  35. Proposed Technique The shorthand notation symbols: an encoder:

  36. Proposed Technique The shorthand notation symbols: a decoder:

  37. Proposed Technique The shorthand notation symbols: an agent with sensors and actuators:

  38. Proposed Technique The shorthand notation symbols: a sanity point: S

  39. Proposed Technique The shorthand notation symbols: a slave sensor learning agent:

  40. Proposed Technique The shorthand notation symbols: a slave actuator learning agent:
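One way to make the shorthand notation concrete is to mirror its symbols in code. The sketch below is a hypothetical Python data model for the symbols listed above (sensor, actuator, encoder, decoder, master and slave learning agents, sanity point); every class and attribute name is an assumption chosen for illustration, not part of the original notation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Sensor:
    name: str

@dataclass
class Actuator:
    name: str

@dataclass
class Encoder:
    inputs: List[str]   # several input signals merged into one
    output: str

@dataclass
class Decoder:
    input: str          # one input signal split into several
    outputs: List[str]

@dataclass
class SanityPoint:
    label: str          # a point in the diagram that must be preserved

@dataclass
class LearningAgent:
    name: str
    role: str = "master"   # "master", "slave-sensor", or "slave-actuator"
    sensors: List[Sensor] = field(default_factory=list)
    actuators: List[Actuator] = field(default_factory=list)

# The simplest artificial cerebellum: one sensor, one master agent, one actuator.
simplest = LearningAgent("cerebellum",
                         sensors=[Sensor("s0")],
                         actuators=[Actuator("a0")])
```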

  41. Proposed Technique The Recipe: 1) Become familiar with the problem at hand. 2) Enumerate the significant factors. 3) Categorize each factor as a sensor, an actuator, or both. 4) Specify sanity points.

  42. Proposed Technique The Recipe: 5) Simplify overloaded agents. 6) Describe system using proposed shorthand notation. 7) Apply simplification rules. 8) Specify reward function for each agent.

  43. Proposed Technique The simplification rules: 1) Two agents in series can be merged into a single agent:

  44. Proposed Technique The simplification rules: 2) It is OK to apply simplification rules as long as no sanity point is destroyed.

  45. Proposed Technique The simplification rules: 3) If a decoder and an encoder share the same output and input signals, respectively, they can be deleted.

  46. Proposed Technique The simplification rules: 4) Decoders with a single output signal can be deleted.

  47. Proposed Technique The simplification rules: 5) Encoders with a single input signal can be deleted.

  48. Proposed Technique The simplification rules: 6) If several agents receive signals from a single decoder and send their signals to a single encoder:
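Simplification rule 1, guarded by rule 2, can be sketched in a few lines of Python. Agents are plain dictionaries here, and the merge semantics (the merged agent keeps the upstream agent's inputs and the downstream agent's outputs) are an assumption for illustration, not the presentation's exact procedure.

```python
def merge_in_series(upstream, downstream, sanity_points_between=()):
    """Return one agent that replaces two agents connected in series (rule 1)."""
    # Rule 2: never merge if a sanity point would be destroyed.
    if sanity_points_between:
        raise ValueError("cannot merge: a sanity point would be destroyed")
    return {
        "name": upstream["name"] + "+" + downstream["name"],
        "inputs": upstream["inputs"],       # keep the upstream agent's sensors
        "outputs": downstream["outputs"],   # keep the downstream agent's actuators
    }

# Hypothetical example: a posture agent feeding a balance agent.
posture = {"name": "posture", "inputs": ["gyro"], "outputs": ["posture_signal"]}
balance = {"name": "balance", "inputs": ["posture_signal"], "outputs": ["ankle_motor"]}
print(merge_in_series(posture, balance))
# {'name': 'posture+balance', 'inputs': ['gyro'], 'outputs': ['ankle_motor']}
```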

  49. Proposed Technique • Q-learning: • Off-policy control algorithm. • Temporal-Difference algorithm. • Can be applied online.
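A minimal tabular Q-learning sketch makes the three listed properties concrete: the max over actions in the target makes it off-policy, bootstrapping from the next state's estimate makes it a temporal-difference method, and updating after every single interaction lets it run online. The state and action encodings and all constants below are placeholders.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = ["left", "right", "stay"]
Q = defaultdict(float)                 # Q[(state, action)] -> value estimate

def choose_action(state):
    """Epsilon-greedy behaviour policy (may differ from the greedy target policy)."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    """One online Q-learning step."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)   # off-policy target
    td_target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])
```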

  50. Proposed Technique Q-learning: [diagram: learning agent (L. A.) and reward (R)]
