
Information Processing Technology Office Learning Workshop April 12, 2004 Seedling Overview


Presentation Transcript


  1. Information Processing Technology Office Learning Workshop April 12, 2004 Seedling Overview Learning in the Large MIT CSAIL PIs: Leslie Pack Kaelbling, Tomás Lozano-Pérez, Tommi Jaakkola

  2. Three Subprojects • Learning to behave in huge domains • Transfer of learned knowledge across problems and domains • Learning to recognize objects and interpret scenes

  3. Three Subprojects • Learning to behave in huge domains • Transfer of learned knowledge across problems and domains • Learning to recognize objects and interpret scenes

  4. Learning Objective • Learn to act effectively in highly complex dynamic domains • Learn models of complex world dynamics involving objects, properties, and relations • Learn “meta-cognition” strategies for deciding how to focus computational attention for action selection • Learning is crucial for both problems because human designers are unable to build appropriate models by hand

  5. What Is Being Learned? • Learning probabilistic dynamic rules, e.g. pickup(X): on(X,Y), clear(X), table(Z), inhand-nil → 0.8: inhand(X), ¬on(X,Y), clear(Y), ¬clear(X), ¬inhand-nil | 0.2: ¬on(X,Y), clear(Y), on(X,Z) • An important goal is to learn partial models: some aspects will be easy to learn to predict, others will take longer • Take advantage of partial models as soon as they’re learned
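
A minimal Python sketch of how the pickup(X) rule on this slide could be represented and applied. It assumes the atoms before the probabilities are the rule's context and that each probability introduces one outcome (positive literals are added, negated literals are deleted); the slide does not spell this out. The names `Outcome`, `Rule`, `ground_pickup`, and `apply_rule` are hypothetical and not taken from the project.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    prob: float          # probability of this outcome
    add: frozenset       # ground atoms made true
    delete: frozenset    # ground atoms made false


@dataclass(frozen=True)
class Rule:
    action: str
    context: frozenset   # ground atoms that must hold for the rule to apply
    outcomes: tuple


def ground_pickup(x, y, z):
    """Ground the slide's pickup(X) rule for block x on block y, table z."""
    on_xy = f"on({x},{y})"
    return Rule(
        action=f"pickup({x})",
        context=frozenset({on_xy, f"clear({x})", f"table({z})", "inhand-nil"}),
        outcomes=(
            # 0.8: the block ends up in the hand.
            Outcome(0.8,
                    add=frozenset({f"inhand({x})", f"clear({y})"}),
                    delete=frozenset({on_xy, f"clear({x})", "inhand-nil"})),
            # 0.2: the block falls onto the table.
            Outcome(0.2,
                    add=frozenset({f"clear({y})", f"on({x},{z})"}),
                    delete=frozenset({on_xy})),
        ),
    )


def apply_rule(rule, state, rng):
    """Sample a successor state; a partial model leaves the state unchanged
    when the rule's context does not hold."""
    if not rule.context <= state:
        return state
    weights = [o.prob for o in rule.outcomes]
    outcome = rng.choices(rule.outcomes, weights=weights, k=1)[0]
    return (state - outcome.delete) | outcome.add
```

For example, `apply_rule(ground_pickup("a", "b", "t"), state, random.Random(0))` returns either the "picked up" or the "fell onto the table" successor of a state in which block a sits on block b.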

  6. How is it Being Learned? • Search in rule space • logic-based methods for learning structure • convex optimization for probabilities • Effectiveness of learned models tested using planner to select actions • Learning is automatic • Amount of data needed depends on the frequency and reliability of phenomenon being modeled
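
The slide says outcome probabilities are fit by convex optimization but does not name a method. As a stand-in under that assumption, the sketch below fixes a rule's structure and maximizes the log-likelihood of the observed transitions, where each example is consistent with ("covered by") one or more outcomes. That objective is concave in the probabilities, and the EM-style update used here is one simple way to climb it. The function name `estimate_outcome_probs` and the `covers` encoding are hypothetical.

```python
def estimate_outcome_probs(covers, n_outcomes, iters=200):
    """Maximum-likelihood outcome probabilities for a fixed rule structure.

    covers[i] is the non-empty set of outcome indices consistent with
    training example i. Maximizes sum_i log(sum_{o in covers[i]} p[o])
    subject to p lying on the probability simplex.
    """
    if any(len(c) == 0 for c in covers):
        raise ValueError("every example must be covered by some outcome")
    p = [1.0 / n_outcomes] * n_outcomes
    for _ in range(iters):
        counts = [0.0] * n_outcomes
        for c in covers:
            total = sum(p[o] for o in c)
            for o in c:
                # Fractional credit for an example that several outcomes explain.
                counts[o] += p[o] / total
        p = [cnt / len(covers) for cnt in counts]
    return p
```

When every example is covered by exactly one outcome, the estimate reduces to relative frequencies; ambiguity only matters for examples that more than one outcome can explain.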

  7. How is the Knowledge Represented? • Probabilistic dynamics rules • No background knowledge currently, but it would be easy to build in some rules • Knowledge is task-independent (though we may use utility to focus learning) • Models can account for only parts of the state evolution; and they’re probabilistic • Currently, no

  8. What is the Domain? • Currently: physics simulator of blocks world • Would like simulation of more complex environment, e.g., • battlefield • disaster relief • making breakfast

  9. How is Progress Being Measured? • First, by human inspection of rules for plausibility • Second, by performance of an agent that uses the rules for planning • Nothing changes in the experimental set-up except the learned rules • Metrics: • utility gained by the agent • computation speed • Easily done overnight on a workstation
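
A rough sketch of the measurement loop described on this slide: the rule set is the only thing swapped between runs, and the two reported numbers are utility and time spent selecting actions. The callables `select_action`, `step`, and `reward` are hypothetical placeholders for the planner, the simulator, and the utility function; none of them come from the project.

```python
import time


def evaluate(select_action, step, reward, start_state, rules,
             horizon=20, episodes=10):
    """Return (mean utility per episode, mean planning seconds per episode)."""
    total_utility = 0.0
    planning_seconds = 0.0
    for _ in range(episodes):
        state = start_state
        for _ in range(horizon):
            t0 = time.perf_counter()
            action = select_action(state, rules)   # planner uses the learned rules
            planning_seconds += time.perf_counter() - t0
            state = step(state, action)            # simulator advances the world
            total_utility += reward(state)
    return total_utility / episodes, planning_seconds / episodes
```

Calling `evaluate` twice with the same planner, simulator, and start state but different `rules` gives a like-for-like comparison of two learned models.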

  10. What are the Technical Milestones? • Defined by model sophistication rather than overt performance in the task • Learn rules with quantifiers • Learn to ground symbolic predicates in perception • Learn rules in partially observable environments • Postulate hidden causes • Focus rule-learning based on utility

  11. What is Being Learned? • Learning to formulate small planning problem, from a huge state space and competing goals • what are useful subgoals? • when is it appropriate to ignore certain aspects of the domain? [Diagram labels: learning, inference, planning, perception, action]

  12. How is it Being Learned? • Learning parameters in abstract models • partial observability makes it hard • gradient descent works, but may be weak • take advantage of Russell’s methods? • Compare speed and utility of resulting action-selection system • Learning is automatic • Amount of data needed depends on the frequency and reliability of phenomenon being modeled

  13. How is the Knowledge Represented? • Parameters in strategies for building abstractions • Currently most of the abstraction structure is hand-coded • The knowledge depends on the distribution of problems an agent has to solve, but not on particular low-level tasks • Uncertainty isn’t represented explicitly, but is handled implicitly in statistical learning • We are learning at multiple levels of abstraction

  14. What is the Domain? • Nethack • Would like more complex simulated domain

  15. What are the Technical Milestones? • Meta-learning • Learn parameters in hand-built abstractions for MDPs • Learn new abstractions for MDPs • Learn to compose abstractions • Do it all for POMDPs
