CS 416 Artificial Intelligence

CS 416Artificial Intelligence Lecture 2 Agents

Chess Article • Deep Blue (IBM) • 418 processors, 200 million positions per second • Deep Junior (Israeli Co.) • 8 processors, 3 million positions per second • Kasparov • 100 billion neurons in brain, 2 moves per second • But there are 85 billion ways to play the first four moves

Chess Article • 1997 - Kasparov Lost to Deep Blue • 2002 - Kramnik tied Deep Junior (current World Champion) • 2003 - Kasparov (current number 1) plays Deep Junior • Jan 26 – Feb 7

Chess Article • Cognitive psychologists report chess is a game of pattern matching for humans • But what patterns do we see? • What rules do we use to evaluate perceived patterns?

What is an agent? • Perception • Sensors receive input from environment • Keyboard clicks • Camera data • Bump sensor • Action • Actuators impact the environment • Move a robotic arm • Generate output for computer display

Perception • Percept • Perceptual inputs at an instant • May include perception of internal state • Percept Sequence • Complete history of all prior percepts • Do you need a percept sequence to play Chess?

An agent as a function • Agent maps percept sequence to action • Agent: • Set of all inputs known as state space • Agent Function • If inputs are finite, a table can store mapping • Scalable? • Reverse Engineering?

Evaluating agent programs • We agree on what an agent must do • Can we evaluate its quality? • Performance Metrics • Very Important • Frequently the hardest part of the research problem • Design these to suit what you really want to happen

Rational Agent • For each percept sequence, a rational agent should select an action that maximizes its performance measure • Example: autonomous vacuum cleaner • What is the performance measure? • Penalty for eating the cat? How much? • Penalty for missing a spot? • Reward for speed? • Reward for conserving power?

Learning and Autonomy • Learning • To update the agent function in light of observed performance of percept-sequence to action pairs • Explore new parts of state space • Learn from trial and error • Change internal variables that influence action selection

Adding intelligence to agent function • At design time • Some agents are designed with clear procedure to improve performance over time. Really the engineer’s intelligence. • Camera-based user identification • At run-time • Agent executes complicated equation to map input to output • Between trials • With experience, agent changes its program (parameters)

How big is your percept? • Dung Beetle • Largely feed forward • Sphex Wasp • Reacts to environment (feedback) but not learning • A Dog • Reacts to environment and can significantly alter behavior

Qualities of a task environment • Fully Observable • Agent need not store any aspects of state • The Brady Bunch as intelligent agents • Volume of observables may be overwhelming • Partially Observable • Some data is unavailable • Maze • Noisy sensors

Qualities of a task environment • Deterministic • Always the same outcome for state/action pair • Stochastic • Not always predictable – random • Partially Observable vs. Stochastic • My cats think the world is stochastic • Physicists think the world is deterministic

Qualities of a task environment • Markovian • Future state only depends on current state • Episodic • Percept sequence can be segmented into independent temporal categories • Behavior at traffic light independent of previous traffic • Sequential • Current decision could affect all future decisions • Which is easiest to program?

Qualities of a task environment • Static • Environment doesn’t change over time • Crossword puzzle • Dynamic • Environment changes over time • Driving a car • Semi-dynamic • Environment is static, but performance metrics are dynamic • Drag racing

Qualities of a task environment • Discrete • Values of a state space feature (dimension) are constrained to distinct values from a finite set • Blackjack: f(your cards, exposed cards) = action • Continuous • Variable has infinite variation • Antilock brakes: f (vehicle speed, wheel velocity) = unlock • Are computers really continuous?

Qualities of a task environment • Towards a terse description of problem domains • State space: features, dimensionality, degrees of freedom • Observable? • Predictable? • Dynamic? • Continuous? • Performance metric

Building Agent Programs • The table approach • Build a table mapping states to actions • Chess has 10150 entries (1080 atoms in the universe) • I’ve said memory is free, but keep it within the confines of the boundable universe • Still, tables have their place • Discuss four agent program principles

Simple Reflex Agents • Sense environment • Match sensations with rules in database • Rule prescribes an action • Reflexes can be bad • Don’t put your hands down when falling backwards! • Inaccurate information • Misperception can trigger reflex when inappropriate • But rules databases can be made large and complex

Simple Reflex Agents • Randomization • The vacuum cleaner problem Dirty Left Right

Model-based Reflex Agents • So when you can’t see something, you model it! • Create an internal variable to store your expectation of variables you can’t observe • If I throw a ball to you and it falls short, do I know why? • Aerodynamics, mass, my energy levels… • I do have a model • Ball falls short, throw harder

Model-based Reflex Agents • Admit it, you can’t see and understand everything • Models are very important! • We all use models to get through our lives • Psychologists have many names for these context-sensitive models • Agents need models too

Goal-based Agents • Lacking moment-to-moment performance measure • Overall goal is known • How to get from A to B? • Current actions have future consequences • Search and Planning are used to explore paths through state space from A to B

Utility-based Agents • Goal-directed agents that have a utility function • Function that maps internal and external states into a scalar • A scalar is a number

Learning Agents • Learning Element • Making improvements • Performance Element • Selecting actions • Critic • Provides learning element with feedback about progress • Problem Generator • Provides suggestions for new tasks to explore state space

A taxi driver • Performance Element • Knowledge of how to drive in traffic • Critic • Observes tips from customers and horn honking from other cars • Learning Element • Relates low tips to actions that may be the cause • Problem Generator • Proposes new routes to try and improved driving skills

Review • Outlined families of AI problems and solutions • Next class we study search problems

CS 416 Artificial Intelligence