
Mind is About Predictions

Rich Sutton, AT&T Labs, with special thanks to Michael Littman, Doina Precup, Satinder Singh, and David McAllester. Hypothesis: Knowledge is predictive, about what-leads-to-what under what ways of behaving.


Presentation Transcript


  1. Mind is About Predictions • Rich Sutton, AT&T Labs • with special thanks to Michael Littman, Doina Precup, Satinder Singh, David McAllester

  2. Mind is About Predictions • Hypothesis: Knowledge is predictive • About what-leads-to-what, under what ways of behaving • What will I see if I go around the corner? • Objects: What will I see if I turn this over? • Active vision: What will I see if I look at my hand? • Value functions: What is the most reward I know how to get? • Such knowledge is learnable, chainable • Hypothesis: Mental activity is working with predictions • Learning them • Combining them to produce new predictions (reasoning) • Converting them to action (planning, reinforcement learning) • Figuring out which are most useful

  3. Philosophical and Psychological Roots • Like classical British empiricism (1650–1800) • Knowledge is about experience • Experience is central • But not anti-nativist (evolutionary experience) • Emphasizing sequential rather than simultaneous events • Replace association/contiguity with prediction/contingency • Close to Tolman’s “Expectancy Theory” (1932–1950) • Cognitive maps, vicarious trial and error • Psychology struggled to make it a science (1890–1950) • Introspection • Behaviorism, operational definitions • Objectivity

  4. Modern Computational View of Mind • OK to talk about insides of minds • OK to talk about the function and purpose of a design • We talk about Why • Why a system works • Why it should compute X and in manner Y • Why such a system should achieve purpose Z • This is new, and resolves classical struggles • Servo-mechanisms, state-transition probabilities • Utility and decision theory • Information as signal – subjective (private) yet clear • Purpose defines and constrains mental constructs

  5. Informational View of Mind • Mind does information processing • Mind exchanges information with the world • Only experience is known for sure • Anything more public or “objective” is suspect • World is an I-O entity, a black box • Although we often seem to talk about what is inside, all we can sensibly talk about is I-O behavior • This “interactionist stance” seems to follow from IVoM • (Diagram: Mind and World, linked by experience)

  6. Is Mind about Predictions? OR Is Mind about Action (or Policies)? • Of course it is ultimately about action • But action generation methods are relatively clear • Value functions and decision theory • Pick action that maximizes expected cumulative reward • OR • Policy gradient RL methods • Execution-time search • Reflexes and behavior-based robotics • Learning-extended reflexes and conditioning • Flexible cognition requires more than action generation • Most mental activity is working with predictions

  7. An old, simple, appealing idea • Mind as prediction engine! • Predictions are learnable, combinable • They represent cause and effect, and can be pieced together to yield plans • Perhaps this old idea is essentially correct. • Just needs • Development, revitalization in modern forms • Greater precision, formalization, mathematics • The computational perspective to make it respectable • Imagination, determination, patience • Not rushing to performance • Not building in ungrounded world knowledge

  8. Topics • Super-Predictions • Combining Predictions (reasoning and planning) • Predictions and State

  9. The Simplest Predictions • Experience: a sequence of states and actions • 1-step prediction: X → Y under action a • k-step prediction: X → Y under policy π • In general, predictions depend on actions, on policies • And there is a huge space of policies… can be closed loop

  10. Simple Mixture Predictions • Where will I be in 10–20 steps? • Where will I be in roughly 10 steps? • Arbitrary termination profiles are possible (short term, medium term, long term) • Closed-loop termination: • Terminate depending on what happens • Where will I be when X happens? • (Figure: termination profiles over 10 and 20 steps)

  11. Closed-loop termination loosens the time-specificity of predictions • Instead of “what will I see at t+100?” • Can say “what will I see when I open the box?” • Will we elect a black or a woman president first? • Where will the tennis ball be when it reaches me? • What time will it be when the talk starts? or “when John arrives?” “when the bus comes?” “when I get to the store?” A substantial increase in expressiveness

  12. Super-Predictions • Closed-loop terminations and closed-loop policies correspond to arbitrary experiments and the results of those experiments • What will I see if I go into the next room? • What time will it be when the talk is over? • Is there a dollar in the wallet in my pocket? • Where is my car parked? • Can I throw the ball into the basket? • Is this a chair situation? • What will I see if I turn this object around?

  13. Anatomy of a Super-Prediction • 1 Predictor: recognizes the conditions, makes the prediction • 2 Experiment: policy, termination condition, measurement function(s) • 3 Goal: a function of the anticipated measurement, to be maximized by choice of policy and termination

  14. Example: Open-the-door • Predictor • Use visual input to estimate • Probabilities of succeeding in opening the door, and of other outcomes (door locked, no handle, no real door) • Expected cumulative cost (sub-par reward) in trying • Experiment • Policy for walking up to the door, shaping grasp of handle, turning, pulling, and opening the door • Terminate on successful opening or various failure conditions • Measure outcome and cumulative cost • Goal • Sum of expected cost and expected value of outcome • Can be used to define experiment’s policy and termination
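
The three parts of a super-prediction (predictor, experiment, goal) can be sketched as a small data structure, here filled in for the open-the-door example. This is only an illustrative sketch: the class and field names, the outcome labels, and every number are assumptions made for this example, not values from the talk. Cost is represented as a negative number so that the goal is the plain sum the slide describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Prediction:
    outcome_probs: Dict[str, float]  # probability of each terminal outcome
    expected_cost: float             # expected cumulative cost, as negative reward


@dataclass
class SuperPrediction:
    predictor: Callable[[dict], Prediction]  # observation -> prediction
    policy: Callable[[dict], str]            # observation -> action (closed loop)
    terminate: Callable[[dict], bool]        # observation -> stop the experiment?
    outcome_values: Dict[str, float]         # goal: value assigned to each outcome

    def goal_value(self, observation: dict) -> float:
        """Expected value of the outcome plus expected (negative) cost."""
        pred = self.predictor(observation)
        expected_outcome = sum(
            p * self.outcome_values[o] for o, p in pred.outcome_probs.items()
        )
        return expected_outcome + pred.expected_cost


# Hypothetical open-the-door instance; all numbers are made up for illustration.
open_door = SuperPrediction(
    predictor=lambda obs: Prediction(
        {"opened": 0.5, "locked": 0.25, "no_handle": 0.125, "not_a_door": 0.125},
        expected_cost=-2.0,
    ),
    policy=lambda obs: "walk_to_door",
    terminate=lambda obs: obs.get("door_open", False) or obs.get("failed", False),
    outcome_values={"opened": 10.0, "locked": 0.0, "no_handle": 0.0, "not_a_door": 0.0},
)

value = open_door.goal_value({"sees_door": True})
```

Keeping the policy and termination condition inside the same object as the predictor reflects the slide's point that the goal can be used to define the experiment itself.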

  15. RoboCup-Soccer Example • Safe to pass? Predict the outcome of choosing to pass • The pass will take several steps to set up • – choosing to pass involves a whole action policy • You may choose not to pass halfway through • Terminations and outcomes: • – pass is aborted • – opponents touch the ball before teammate • – teammate touches first, appears to control ball • – ball goes out of bounds

  16. Example: Pass-to-Teammate • Predictor uses perceived positions of ball, opponents, etc. to estimate probabilities of • Successful pass, openness of receiver • Interception • Reception failure • Aborted pass, in trouble • Aborted pass, something better to do • Loss of time • Experiment • Policy for maneuvering ball, or around ball, to set up and pass • Termination strategy for aborting, recognizing completion • Measurement of outcome, time • Goal • Some combination of outcome values, time, openness of receiver

  17. Topics • Super-Predictions • Combining Predictions (reasoning and planning) • Predictions and State

  18. Combining Predictions I: Composition • If the mind is about predictions, then thinking is combining predictions to produce new ones • X → Y and Y → Z compose to X → Z • Here each prediction is assumed to predict • A transient measurement (e.g., elapsed time, cumulative reward) • A final measurement (e.g., partial distribution of outcome states) • The new prediction does not necessarily have a goal

  19. Combining Predictions I: Composition • If the mind is about predictions, then thinking is combining predictions to produce new ones • If X → Y (.8), Y’ (.1), Y’’ (.1) under π1, β1 with transient measurement T1, and Y → Z under π2, β2 with transient measurement T2 • then X → Z (.8), Y’ (.1), Y’’ (.1) with transient measurement T1 + .8 T2 • Here each prediction is assumed to predict • A transient measurement (e.g., elapsed time, cumulative reward) • A final measurement (e.g., partial distribution of outcome states) • The new prediction does not necessarily have a goal
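
Composition can be sketched in a few lines. The sketch below assumes a prediction is a pair (expected transient measurement, outcome distribution) and that the second prediction is only executed when the first ends in the intermediate outcome; the function name `compose`, the parameter `via`, and the durations 3.0 and 5.0 are assumptions for illustration. The outcome probabilities .8, .1, .1 are the ones shown on the slide.

```python
from typing import Dict, Tuple

# (expected transient measurement, probabilities of final outcomes)
Prediction = Tuple[float, Dict[str, float]]


def compose(first: Prediction, second: Prediction, via: str) -> Prediction:
    """Chain two predictions: run `second` only if `first` ends in `via`."""
    t1, outcomes1 = first
    t2, outcomes2 = second
    p_via = outcomes1.get(via, 0.0)
    # Outcomes of the first prediction other than `via` remain final outcomes.
    combined = {o: p for o, p in outcomes1.items() if o != via}
    # Outcomes of the second prediction are reached with probability p_via.
    for o, p in outcomes2.items():
        combined[o] = combined.get(o, 0.0) + p_via * p
    # The second transient measurement is only incurred with probability p_via.
    return t1 + p_via * t2, combined


x_to_y = (3.0, {"Y": 0.8, "Y'": 0.1, "Y''": 0.1})  # hypothetical T1 = 3
y_to_z = (5.0, {"Z": 1.0})                          # hypothetical T2 = 5
x_to_z = compose(x_to_y, y_to_z, via="Y")
```

The composed prediction carries no goal of its own, which matches the slide's remark that the new prediction does not necessarily have one.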

  20. Combining Predictions II: Choice • A predictor plus a goal compose to form a value function, so we can do all the usual planning backups • In X, for g, the alternative shown in the diagram is a better choice than π, β. Store it with g. • (Diagram labels: X → Y, g = 5; X → Y’, g = 6)
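
A choice backup of this kind might look like the following sketch, which assumes the goal value of a prediction is its expected (negative) cost plus the expected value of its outcomes. The function names, candidate names, costs, and outcome values are all hypothetical; they are chosen only so that the two candidates score 5 and 6, echoing the labels on the slide.

```python
from typing import Dict, Tuple

# (expected cumulative cost as negative reward, probabilities of final outcomes)
Prediction = Tuple[float, Dict[str, float]]


def goal_value(prediction: Prediction, outcome_value: Dict[str, float]) -> float:
    """Value of a prediction under a goal: cost plus expected outcome value."""
    cost, outcomes = prediction
    return cost + sum(p * outcome_value[o] for o, p in outcomes.items())


def best_choice(candidates: Dict[str, Prediction],
                outcome_value: Dict[str, float]) -> Tuple[str, float]:
    """Pick the candidate prediction with the highest goal value."""
    scored = {name: goal_value(pred, outcome_value)
              for name, pred in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


candidates = {
    "current": (-1.0, {"Y": 1.0}),       # scores 6 - 1 = 5
    "alternative": (-2.0, {"Y'": 1.0}),  # scores 8 - 2 = 6
}
choice, value = best_choice(candidates, {"Y": 6.0, "Y'": 8.0})
```

Storing the winning candidate alongside the goal, as the slide suggests, would amount to caching `choice` keyed by the state and the goal.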

  21. Room-to-Room Super-Predictions (Sutton, Precup, & Singh, 1999; “Options”, Precup 2000) • 4 stochastic primitive actions (up, down, left, right); fail 33% of the time • 8 multi-step super-predictions (to each room's 2 hallways) • Figure labels: target (goal) hallway, policy, termination hallways • Predict: Probability of reaching each terminal hallway • Goal: minimize # steps + values for target and other outcome hallway

  22. Planning with Super-Predictions (super-predictions)

  23. Topics • Super-Predictions • Combining Predictions (reasoning and planning) • Predictions and State

  24. Predictive State Representations • Hypothesis: What we normally think of as state is a set of predictions about outcomes of experiments • Wallet’s contents, John’s location, presence of objects… • Problem: So far we have assumed states, but really the world just gives information, “observations” • There are several ways to formalize this problem • Learning deterministic Finite State Automata • Rivest & Schapire, 1987 • Adding stochasticity: An alternative to Hidden Markov Models • Herbert Jaeger, 1999 • Adding action: An alternative to Partially Observable Markov Decision Processes • Littman, Sutton, & Singh 2001

  25. PSR Formalism 1 • Experience: a sequence of random variables, actions from Mind to World and observations from World to Mind • A test is a subsequence, a simple case of an experiment: if the actions are done, will the observations occur? • The world is defined by the probabilities of each test from the beginning of time, and after a finite history sequence h (formally another test)

  26. PSR Formalism 2 • A Predictive State Representation (PSR) is a set of tests whose vector of predictions is sufficient information to predict all tests • i.e., whose predictions are a sufficient statistic, a state • A linear PSR is a PSR where each f_t is linear
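
The two PSR Formalism slides can be written out as follows. This is a sketch in the notation of Littman, Sutton, & Singh (2001), cited on the Predictive State Representations slide; the symbols Q, q_i, n, and m_t are assumptions introduced here and do not appear in the transcript.

```latex
\begin{align*}
% A test t = a_1 o_1 \cdots a_k o_k, predicted from the beginning of time:
p(t) &= \Pr(o_1, \ldots, o_k \mid a_1, \ldots, a_k) \\
% The same test after a history h (h is itself a test, so p(h) is defined):
p(t \mid h) &= \frac{p(ht)}{p(h)} \\
% Prediction vector of a set of tests Q = \{q_1, \ldots, q_n\}:
p(Q \mid h) &= \bigl[\, p(q_1 \mid h), \ldots, p(q_n \mid h) \,\bigr] \\
% Q is a PSR if, for every test t, some function f_t satisfies:
p(t \mid h) &= f_t\bigl(p(Q \mid h)\bigr) \\
% The PSR is linear if each f_t is a dot product with a weight vector m_t:
f_t\bigl(p(Q \mid h)\bigr) &= p(Q \mid h)^{\top} m_t
\end{align*}
```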

  27. Walk/Reset Example 1 • Actions: • Walk: Take a random step left or right, see 0 • Reset: Jump to rightmost state, see 1 if already there • Need to remember the number of Walks since last Reset • Probabilities of being rightmost are: 1 .5 .5 .375 .375 .3125 .3125… • PSR tests: Reset1, Walk0Reset1

  28. Walk/Reset Example 1 • Start on Right... • Walk: Take a random step left or right, see 0 • Reset: Jump to rightmost state, see 1 if already there • Need to remember the number of Walks since last Reset • Probabilities of being rightmost are: 1 .5 .5 .375 .375 .3125 .3125… • PSR tests: Reset1, Walk0Reset1

  29. Walk/Reset Example 1 • Start on Right... • After one Walk (state probabilities: .5 .5) • Walk: Take a random step left or right, see 0 • Reset: Jump to rightmost state, see 1 if already there • Need to remember the number of Walks since last Reset • Probabilities of being rightmost are: 1 .5 .5 .375 .375 .3125 .3125… • PSR tests: Reset1, Walk0Reset1

  30. Walk/Reset Example 1 • Start on Right... • After one Walk • After two Walks (state probabilities: .25 .25 .5) • Walk: Take a random step left or right, see 0 • Reset: Jump to rightmost state, see 1 if already there • Need to remember the number of Walks since last Reset • Probabilities of being rightmost are: 1 .5 .5 .375 .375 .3125 .3125… • PSR tests: Reset1, Walk0Reset1
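
The sequence of rightmost-state probabilities on these slides can be reproduced by pushing the state distribution through repeated Walks. The sketch below makes two assumptions the slides do not state: the world is a line of 5 states, and a step into a wall leaves the state unchanged. The function name and parameters are hypothetical.

```python
from fractions import Fraction
from typing import List


def rightmost_probs(n_states: int = 5, n_walks: int = 6) -> List[Fraction]:
    """Probability of being in the rightmost state after 0..n_walks Walks,
    starting in the rightmost state (i.e. just after a Reset)."""
    right = n_states - 1
    dist = [Fraction(0)] * n_states
    dist[right] = Fraction(1)
    probs = [dist[right]]
    for _ in range(n_walks):
        nxt = [Fraction(0)] * n_states
        for s, p in enumerate(dist):
            nxt[max(s - 1, 0)] += p / 2      # step left, blocked at the left wall
            nxt[min(s + 1, right)] += p / 2  # step right, blocked at the right wall
        dist = nxt
        probs.append(dist[right])
    return probs


probs = [float(p) for p in rightmost_probs()]
print(probs)  # [1.0, 0.5, 0.5, 0.375, 0.375, 0.3125, 0.3125]
```

Each value is the prediction for the test Reset1 after the corresponding number of Walks, which is why the count of Walks since the last Reset is the information that has to be tracked.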

  31. PSR Results • Exist compact, linear PSRs • # tests ≤ # states in minimal POMDP • # tests ≤ Rivest & Schapire’s Diversity • # tests can be exponentially fewer than diversity and POMDP • Compact simulation/update process • Construction algorithm from POMDP • Learning/discovery algorithms of Rivest and Schapire, and of Jaeger, do not immediately extend to PSRs • There are natural EM-like algorithms (current work)

  32. Constructing Linear PSRs from POMDPs • Outcome vector u(t): the predictions for test t from all POMDP states • A test t is said to be independent of a set of tests T if its outcome vector is linearly independent of T’s outcome vectors • Accumulate tests whose outcome vectors are independent • Search: Start with T = {} • While some extension aot of t ∈ T is independent, add it to T • Else terminate, return T
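
The search can be sketched in pure Python on the Walk/Reset world. Several things here are assumptions rather than content from the slides: the world has 5 states with blocking walls, the search is seeded with the empty test (outcome vector of all ones) so that one-step tests can be generated, and all function names are hypothetical. Exact fractions are used so that the linear-independence check is not affected by rounding.

```python
from fractions import Fraction

N = 5                      # assumed number of hidden states
RIGHT = N - 1
ACTIONS = ("Walk", "Reset")
OBSERVATIONS = (0, 1)
HALF = Fraction(1, 2)


def joint(action, obs):
    """m[s][s2] = P(next state s2 and observation obs | state s, action)."""
    m = [[Fraction(0)] * N for _ in range(N)]
    for s in range(N):
        if action == "Walk" and obs == 0:
            m[s][max(s - 1, 0)] += HALF
            m[s][min(s + 1, RIGHT)] += HALF
        elif action == "Reset" and obs == (1 if s == RIGHT else 0):
            m[s][RIGHT] = Fraction(1)
    return m


def extend(action, obs, u):
    """Outcome vector of the extended test 'action obs t', given u = u(t)."""
    m = joint(action, obs)
    return [sum(m[s][s2] * u[s2] for s2 in range(N)) for s in range(N)]


def residual(basis, v):
    """Eliminate v against an echelon basis; all zeros iff v is dependent."""
    v = list(v)
    for pivot, b in basis:
        if v[pivot] != 0:
            factor = v[pivot] / b[pivot]
            v = [x - factor * y for x, y in zip(v, b)]
    return v


def find_tests():
    basis, found = [], []
    frontier = [((), [Fraction(1)] * N)]  # empty test as the seed
    while frontier:
        next_frontier = []
        for name, u in frontier:
            for a in ACTIONS:
                for o in OBSERVATIONS:
                    candidate = extend(a, o, u)
                    r = residual(basis, candidate)
                    if any(r):  # linearly independent of the tests found so far
                        pivot = next(i for i, x in enumerate(r) if x != 0)
                        basis.append((pivot, r))
                        new_name = ((a, o),) + name
                        found.append(new_name)
                        next_frontier.append((new_name, candidate))
        frontier = next_frontier
    return found


tests = find_tests()
for t in tests:
    print("".join(f"{a}{o}" for a, o in t))
```

Under these assumptions the search returns five tests, equal to the assumed number of states. Which tests it returns depends on the order in which extensions are tried, so the set need not contain the two tests named on the Walk/Reset slides.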

  33. PSR Conclusions • A path to exorcizing the assumption of state • Toward the goal of totally data- (experience-) oriented AI • The predictive view of state is competitive • Even better (more compact) in some ways • States have data interpretations! • And are thus potentially more learnable, refinable • Naturally leads to constructive discovery ideas • Searching for the right tests to understand the world • “Tests” generalize naturally to super-predictions

  34. Empiricism • Experience is central • Knowledge is about experience • (Diagram: Mind sends actions to World; World returns observations) • Experience is the data; it is all we really know • Experience should be the focus of AI • But by and large it is not… even in robotics, Alife, etc.

  35. Mind is About Predictions • Hypothesis: Knowledge is predictive • About what-leads-to-what, under what ways of behaving • Such knowledge is learnable, chainable • Hypothesis: Mental activity is working with predictions • Learning them • Combining them to produce new predictions (reasoning) • Converting them to action (planning, reinforcement learning) • Figuring out which are most useful • Hypothesis: These ideas are newly viable • Unfamiliar flexibility & expressiveness of “super”-predictions • New engineering planning methods DP/RL/Values • New state-representation ideas • Hypothesis: Predictions are the Coin of the Mental Realm

  36. It’s Hard to Build Large AI Systems • Brittleness • Unforeseen interactions • Scaling • Requires too much manual complexity management • people must understand, intervene, patch and tune • like programming • Need more autonomy • learning, verification • internal coherence of knowledge and experience

  37. AI Implications of Predictive View • An alternative theory of knowledge and thought • Alternative to conventional, symbolic “language of thought” • Alternative to “database” view of knowledge • Requires experiments to be in the machine, not just the designer — true grounding • Automated complexity management • Should help with brittleness and scaling • Could permit AI systems of much greater complexity

  38. Both Predictors and Experiments must be in the Machine • “Classical” AI systems omit both! • e.g., “Tweety is a bird”, “John loves Mary” • sometimes called the “symbol grounding problem” • Modern AI systems tend to skimp on the experiments • supervised learning, Bayes nets, robotics… • It is not OK to leave the experimental definitions to external, human observers • the information is just not in the machine • we don’t understand it; we haven’t done our job! • Yet this is such an appealing shortcut that we have almost always done it

  39. More Predictive Knowledge • John is in the coffee room • My car is in the South parking lot • What we know about geography, navigation • What we know about how an object looks, rotates • What we know about how objects can be used • Recognition strategies for objects and letters • The portrait of Washington on the dollar in the wallet in my other pants in the laundry has a mustache on it • Composing experiments creates a productive representation language

  40. Relational, Propositional, and Deictic • ∀ objects X: If I drop X, then X will be on the floor • Holding object X means predicting certain sensations if, for example, one directs one’s eyes toward one’s hand • Thus, on dropping, the predicted sensations are merely transferred from the looking-at-hand prediction to the looking-at-floor prediction • Such transfer of existing predictions should be a common part of visual knowledge - updated every time the eyes move • ∃ X, Y such that Red(X), Blue(Y), and Above(X,Y) • There is some place I can foveate and see Red • There is some place I can foveate and see Blue • If I foveate first the Red place, “mark” it, then the Blue place, the mark will be Above the fovea (may need to search) • These are typical ideas of modern, active, deictic vision

  41. Should All Knowledge be Experiential? Allowing only Predictions in terms of Data? • Loses: • Expressiveness • can’t talk about objects, space, people; no “is-a” or “part-of” • External (human) coherence • verbal labels, interpretability, explainability, calibration • the “shortcut” of entering knowledge directly into the agent • Gains: • The knowledge will have meaning to the machine • It can be mechanically learned/verified/extended • It will be suited for a general reasoning process • composition and backup of predictions to yield new predictions

  42. There is value in forcing world knowledge into prediction form • We will finally have all the knowledge in the machine • all will be mechanically interpretable • we will finally really understand the knowledge’s meaning • anything else is just an empty shell • Agent will be able to learn/verify/extend knowledge • provides an internal coherence for the knowledge • enables building it up from a firm foundation • The knowledge will flow immediately into a general reasoning engine • the concatenation of predictions yields new predictions

  43. Conclusions • World knowledge must be expressed in terms of the data • Such posterior grounding is challenging, • lose expressiveness in the short term • lose external (human) coherence, explainability • But can be done step by step, • And brings palpable benefits • autonomous learning/verification/extension of knowledge • autonomous complexity management due to internal coherence • knowledge suited to general reasoning process • We must provide this grounding!
