Introduction to Intelligent Agents

Jacques Robin

Outline • What are intelligent agents? • Characteristics of artificial intelligence • Applications and sub-fields of artificial intelligence • Characteristics of agents • Characteristics of agents’ environments • Agent architectures

What are Intelligent Agents? [Figure: diagram relating Agents to Software Engineering, Artificial Intelligence and Distributed Systems] • Q: What are Software Agents? • A: Software whose architecture is based on the following abstractions: • Immersion in a distributed environment, continuous thread, encapsulation, sensor, perception, actuator, action, own goal, autonomous decision making • Q: What is Artificial Intelligence? • A: Field of study dedicated to: • Reducing the range of tasks that humans carry out better than current software or robots • Emulating humans’ capability to solve approximately but efficiently most instances of problems proven (or suspected) to be hard to solve algorithmically in the worst case (NP-hard, undecidable, etc.), using innovative, often human-inspired, alternative computational metaphors and techniques • Emulating humans’ capability to solve vaguely specified problems using partial, uncertain information

Artificial Intelligence: Characteristics • Highly multidisciplinary inside and outside computer science • Runaway field: by definition at the forefront of computation, tackling ever more innovative and challenging problems as the ones it has solved become mainstream computing • Most research in any other field of computation also involves AI problems, techniques, metaphors • Q: What conclusions can be derived from these characteristics? • A: Hard to avoid; very, very hard to do well • “Well” as in: • Well-founded (rigorously defined theoretical basis, explicit simplifying assumptions and limitations) • Easy to use (seamlessly integrated, easy to understand) • Easy to reuse (general, extendable techniques) • Scalable (at run time, at development time)

What is an Agent? General Minimal Definition • Any entity (human, animal, robot, software) that: • Is situated in an environment (physical, virtual or simulated) • Perceives the environment through sensors (eyes, camera, socket) • Acts upon the environment through effectors (hands, wheels, socket) • Possesses its own goals, i.e., preferred states of the environment (explicit or implicit) • Autonomously chooses its actions to alter the environment towards its goals, based on its perceptions and prior encapsulated information about the environment • Processing cycle: • Uses sensors to perceive P • Interprets the percept: I = f(P) • Chooses the next action A = g(I,G) to perform to reach its goal G • Uses actuators to execute A
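
The processing cycle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thermostat scenario, the function names `run_agent`, `interpret` and `choose`, and all numeric values are hypothetical and do not come from the slides; "executing" an action is reduced to recording it.

```python
def run_agent(percepts, interpret, choose, goal):
    """Run one perceive-interpret-choose-act cycle per percept; return the actions taken."""
    actions = []
    for p in percepts:          # use sensor to perceive P
        i = interpret(p)        # I = f(P)
        a = choose(i, goal)     # A = g(I, G)
        actions.append(a)       # use actuator to execute A (here: only recorded)
    return actions


# Hypothetical thermostat agent: raw readings are in tenths of a degree.
def interpret(raw_reading):
    return raw_reading / 10.0


def choose(temperature, goal_temperature):
    # Hypothetical 1-degree tolerance band around the goal temperature.
    if temperature < goal_temperature - 1.0:
        return "heat"
    if temperature > goal_temperature + 1.0:
        return "cool"
    return "idle"


actions = run_agent([180, 205, 231], interpret, choose, goal=20.5)
print(actions)  # ['heat', 'idle', 'cool']
```

The goal G is passed in as data rather than hard-coded in `choose`, which mirrors the signature A = g(I,G).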

What is an Agent? [Figure: an agent within its environment. Sensors receive environment percepts, self-percepts and communicative percepts (P). Autonomous reasoning comprises percept interpretation I = f(P), goals G and action choice A = g(I,G). Effectors carry out environment-altering actions, perceptive actions and communicative actions (A).]

Agent x Object • Agent: • Intentionality: encapsulates its own goals (even if implicitly) in addition to data and behavior • Decision autonomy: pro-actively executes behaviors to satisfy its goals; can refuse a request from another agent to execute a behavior • More complex input/output: percepts and actions • Temporal continuity: encapsulates an endless thread that constantly monitors the environment • Coarser granularity: encapsulates code of size comparable to a package or component; composed of various objects when implemented using an OO language • Object: • No goal • No decision autonomy: executes behaviors only reactively, whenever invoked by other objects; always executes the behavior invoked by other objects • Simpler input/output: mere method parameters and return values • Temporally discontinuous: active only during the execution of its methods

Intelligent Agent x Simple Software Agent [Figure: the agent diagram (Environment, Sensors, Percept Interpretation I = f(P), Goals, Action Choice A = g(I,G), Effectors) with its processing boxes labeled either "AI" or "Conventional Processing".]

Intelligent Agent x Classical AI System [Figure: a situated intelligent agent (Environment, Sensors, AI Percept Interpretation, Goals, AI Action Choice, Effectors) next to a disembodied classical AI system (Input Data, Goal, AI Reasoning, Output Data).]

What is an Agent? Other Optional Properties • Reasoning Autonomy: • Requires AI, inference engine and knowledge base • Key for: embedded expert systems, intelligent controllers, robots, games, internet agents ... • Adaptability: • Requires AI, machine learning • Key for: internet agents, intelligent interfaces, ... • Sociability: • Requires AI + advanced distributed systems techniques: • Standard protocols for communication, cooperation, negotiation • Automated reasoning about other agents’ beliefs, goals, plans and trustworthiness • Social interaction architectures • Key for: multi-agent simulations, e-commerce, ...

What is an Agent? Other Optional Properties • Personality: • Requires AI, attitude and emotional modeling • Key for: digital entertainment, virtual reality avatars, user-friendly interfaces ... • Temporal continuity and persistence: • Requires interface with operating system, DBMS • Key for: information filtering, monitoring, intelligent control, ... • Mobility: • Requires: • Network interface • Secure protocols • Mobile code support • Key for: information gathering agents, ... • Security concerns have prevented its adoption in practice

Welcome to the Wumpus World! Agent-Oriented Formulation: • Agents: gold digger • Environment objects: • caverns, walls, pits, wumpus, gold, bow, arrow • Environment’s initial state • Agents’ goals: • be alive in cavern (1,1) with the gold • Perceptions: • Touch sensor: breeze, bump • Smell sensor: stench • Light sensor: glitter • Sound sensor: scream • Actions: • Legs effector: forward, rotate 90° • Hands effector: shoot, climb out

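Wumpus World: Abbreviations [Figure: 4 x 4 grid of caverns with rows and columns numbered 1 to 4, showing the start cavern and the positions of the agent, wumpus, pits, gold, breezes and stenches.] • A - Agent • W - Wumpus • P - Pit • G - Gold • X? - Possibly X • X! - Confirmed X • V - Visited Cavern • B - Breeze • S - Stench • G - Glitter • OK - Safe Cavern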

Perceiving, Reasoning and Acting in the Wumpus World • Percept sequence: nothing, then breeze • Wumpus world model maintained by agent: [Figure: two 4 x 4 grids showing the agent's model at t=0 and t=2, with caverns marked ok, V and P? and the agent's position A.]

Perceiving, Reasoning and Acting in the Wumpus World • Percept sequence: stench, then {stench, breeze, glitter} • Wumpus World Model: [Figure: two 4 x 4 grids showing the agent's model after each percept, with caverns marked ok, V, P?, P! and W! and the agent's position A.] • Action Sequence: • t=7: Go to (2,1), sole safe unvisited cavern • t=11: Go to (2,3) to find gold
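
The model maintenance shown on these two slides can be sketched as follows. The sketch assumes the usual Wumpus World rule, not spelled out on the slides, that a breeze is perceived exactly in caverns adjacent to a pit. The function names, the dictionary layout, and the choice of (1,2) as the cavern where the breeze is perceived are assumptions made for illustration; stench and wumpus tracking are left out.

```python
def neighbors(cell, size=4):
    """Caverns adjacent to `cell` inside a size x size grid (1-based coordinates)."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return {(cx, cy) for cx, cy in candidates if 1 <= cx <= size and 1 <= cy <= size}


def update_model(model, cell, percepts):
    """Update the agent's model after perceiving `percepts` in `cell`."""
    model["visited"].add(cell)
    model["safe"].add(cell)              # the agent is alive here, so no pit
    model["maybe_pit"].discard(cell)
    adjacent = neighbors(cell)
    if "breeze" in percepts:
        # Any adjacent cavern not already known to be safe may hold a pit (P?).
        model["maybe_pit"] |= adjacent - model["safe"]
    else:
        # No breeze: no adjacent cavern holds a pit.
        model["safe"] |= adjacent
        model["maybe_pit"] -= adjacent
    return model


model = {"visited": set(), "safe": set(), "maybe_pit": set()}
update_model(model, (1, 1), set())         # first percept: nothing
update_model(model, (1, 2), {"breeze"})    # second percept: breeze (assumed cavern)

print(sorted(model["maybe_pit"]))              # [(1, 3), (2, 2)]
print(sorted(model["safe"] - model["visited"]))  # [(2, 1)]
```

Under these assumptions the only safe unvisited cavern after the breeze is (2,1), which agrees with the action chosen at t=7.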

Classification Dimensions of Agent Environments • Agent environments can be classified as points in a multi-dimensional space • The dimensions are: • Observability • Determinism • Dynamicity • Mathematical domains of the variables • Episodic or not • Multi-agency • Size • Diversity

Observability • Fully observable (or accessible): • At each instant, the agent’s sensors perceive all the aspects of the environment that are relevant to choosing the best action to reach its goal • Partially observable (or inaccessible, or with hidden variables) • Sources of partial observability: • Realm inaccessible to any available sensor • Limited sensor scope • Limited sensor sensitivity • Noisy sensors

Determinism • Deterministic: executing a given action in a given situation always yields the same result • Non-deterministic (or stochastic): action consequences are partially unpredictable • Sources of non-determinism: • Inherent to the environment: quantum granularity, games with randomness • Other agents with unknown or non-deterministic goals or action policies • Noisy effectors • Limited granularity of the effectors or of the representation used to choose the actions to execute

Dynamicity: Static and Sequential Environments • Static: Single perception-reasoning-action cycle during which the environment is static • Sequential: Sequence of perception-reasoning-action cycles during each of which the environment changes only as a result of the agent’s actions [Figure: Static Environment: a single percept, reasoning, action cycle leading from State 1 to State 2. Sequential Environment: repeated percept, reasoning, action cycles leading from State 1 through State 2, State 3, ... to State N.]

Dynamicity: Concurrent Synchronous and Asynchronous • Synchronous: Environment can change on its own between one action and the next perception of an agent, but not during its reasoning • Asynchronous: Environment can change on its own at any time, including during the agent’s reasoning [Figure: timelines of environment states and of the agent's percept, reasoning, action cycles for a Synchronous Concurrent Environment (States 1 to 5) and an Asynchronous Concurrent Environment (States 1 to 6).]

Dynamicity: Stationary and Non-Stationary • Stationary: The underlying laws or rules that govern state changes in the environment are fixed and immutable; they remain the same during the entire lifetime of the agent • e.g., a soccer game is asynchronous, yet stationary • Non-Stationary: The underlying laws or rules that govern state changes in the environment are themselves subject to dynamic changes (meta-level changes) during the lifetime of the agent • e.g., an accounting agent acts in a non-stationary environment, since the tax laws are subject to changes from one year to the next

Multi-Agency • Sophistication of agent society: • Number of agent roles and agent instances • Multiplicity and dynamicity of agent roles • Communication, cooperation and negotiation protocols • Main classes: • Mono-agent • Multi-agent cooperative • Multi-agent competitive • Multi-agent cooperative and competitive • With static or dynamic coalitions

Mathematical Domain of Variables • MAS variables: • Parameters of agent percepts, actions and goals • Attributes of environment objects • Arguments of environment relations, states, events and locations [Figure: taxonomy of variable domains with nodes Discrete, Binary (Boolean, Dichotomic), Qualitative (Nominal, Ordinal), Quantitative (Interval, Fractional) and Continuous (R, [0,1]).]

Mathematical Domain of Variables • Binary: • Boolean, e.g., Male ∈ {True, False} • Dichotomic, e.g., Sex ∈ {Male, Female} • Nominal (or categorical): • Finite partition of a set without order or measure • Relations: only = or ≠ • e.g., Brazilian, French, British • Ordinal (or enumerated): • Finite partition of a (partially or totally) ordered set without measure • Relations: only equality and order comparisons (=, ≠, <, >) • e.g., poor, medium, good, excellent • Interval: • Finite partition of an ordered set with a measure m defining a distance d: ∀X,Y, d(X,Y) = |m(X) - m(Y)| • No inherent zero • e.g., Celsius temperature • Fractional (or proportional): • Partition with distance and inherent zero • Relations: any • e.g., Kelvin temperature • Continuous (or real): • Infinite set of values
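
The difference between an interval domain (no inherent zero) and a fractional domain (inherent zero) can be checked numerically. The short sketch below uses the two temperature examples from the slide; the helper name `celsius_to_kelvin` and the sample values are chosen here for illustration.

```python
import math


def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15


a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Distances are meaningful on both scales and agree with each other.
diff_c = b_c - a_c
diff_k = b_k - a_k
print(math.isclose(diff_c, diff_k))  # True

# Ratios are only meaningful with an inherent zero: "20 °C is twice 10 °C"
# is an artifact of where the Celsius zero happens to sit.
ratio_c = b_c / a_c
ratio_k = b_k / a_k
print(ratio_c, round(ratio_k, 4))
```

The Celsius ratio is 2, while the Kelvin ratio of the same two temperatures is only about 1.035, so a rule that compares ratios is valid on fractional variables but not on interval ones.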

Other Characteristics • Episodic: • Agent experience is divided into separate episodes • Results of actions in each episode are independent of previous episodes • e.g., an image classifier is episodic, chess is not • e.g., a soccer tournament is episodic, a soccer game is not • Open environment: • Partially observable, non-deterministic, non-episodic, continuous variables, concurrent asynchronous, multi-agent • e.g., RoboCup, Internet, stock market
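
Treating an environment as a point in the multi-dimensional classification space can be sketched with a small record type. The class name, field names and the `is_open` helper are hypothetical. RoboCup is classified as open on the slide; the Wumpus World values are an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentProfile:
    """One point in the classification space (hypothetical field names)."""
    fully_observable: bool
    deterministic: bool
    episodic: bool
    continuous_variables: bool
    dynamicity: str          # "static", "sequential", "synchronous" or "asynchronous"
    multi_agent: bool

    def is_open(self) -> bool:
        """Open environment: the hardest value on every dimension listed above."""
        return (
            not self.fully_observable
            and not self.deterministic
            and not self.episodic
            and self.continuous_variables
            and self.dynamicity == "asynchronous"
            and self.multi_agent
        )


robocup = EnvironmentProfile(
    fully_observable=False, deterministic=False, episodic=False,
    continuous_variables=True, dynamicity="asynchronous", multi_agent=True,
)
# Assumed classification of the Wumpus World, for illustration only.
wumpus_world = EnvironmentProfile(
    fully_observable=False, deterministic=True, episodic=False,
    continuous_variables=False, dynamicity="sequential", multi_agent=False,
)

print(robocup.is_open(), wumpus_world.is_open())  # True False
```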

Size and Diversity • Size, i.e., number of instances of: • Agent percepts, actions and goals • Environment agents, objects, relations, states, events and locations • Dramatically affects the scalability of agent reasoning execution • Diversity, i.e., number of classes of: • Agent percepts, actions and goals • Environment agents, objects, relations, states, events and locations • Dramatically affects the scalability of the agent knowledge acquisition process

Agents’ Internal Architectures • Reflex agent (purely reactive) • Automata agent (reactive with state) • Goal-based agent • Planning agent • Hybrid, reflex-planning agent • Utility-based agent (decision-theoretic) • Layered agent • Adaptive agent (learning agent) • Cognitive agent • Deliberative agent

Reflex Agent [Figure: Environment, Sensors, Rules: Percepts ⇒ Action, A(t) = h(P(t)), Effectors.]

Remember … [Figure: the general agent diagram. Environment, Sensors, percept P, Percept Interpretation I = f(P), Goals, Action Choice A = g(I,G), action A, Effectors.]

So? [Figure: the general agent diagram (Percept Interpretation I = f(P), Goals, Action Choice A = g(I,G)) shown together with the reflex agent's rules, Percepts ⇒ Action, A(t) = h(P(t)).]

Reflex Agent • Principle: • Use rules (or functions, procedures) that directly associate percepts to actions • e.g., IF speed > 60 THEN fine • e.g., IF front car’s stop light switches on THEN brake • Execute the first rule whose left-hand side matches the current percepts • Wumpus World example • IF visualPerception = glitter THEN action = pick • see(glitter) ⇒ do(pick) (logical representation) • Pros: • Condition-action rules are a clear, modular, efficient representation • Cons: • Lack of memory prevents use in partially observable, sequential, or non-episodic environments • e.g., in the Wumpus World a reflex agent can’t remember which path it has followed, when to go out of the cavern, where exactly the dangerous caverns are located, etc.
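
A reflex agent of this kind reduces to an ordered rule list and a lookup. In the sketch below only the glitter rule comes from the slide; the other two rules, the action names and the function name `reflex_action` are hypothetical additions so that the "first matching rule" behavior is visible.

```python
# Ordered condition-action rules: (condition on current percepts, action).
RULES = [
    (lambda percepts: "glitter" in percepts, "pick"),    # rule from the slide
    (lambda percepts: "bump" in percepts, "rotate"),     # hypothetical rule
    (lambda percepts: True, "forward"),                  # hypothetical default rule
]


def reflex_action(percepts):
    """A(t) = h(P(t)): return the action of the first rule whose condition matches."""
    for condition, action in RULES:
        if condition(percepts):
            return action
    return None


print(reflex_action({"glitter", "bump"}))  # pick
print(reflex_action({"bump"}))             # rotate
print(reflex_action(set()))                # forward
```

The function keeps no state between calls, which is exactly the lack of memory listed under "Cons".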

Automata Agent [Figure: Environment, Sensors, Percept Interpretation, (Past and) Current Environment Model, Model Update, Goals, Action Choice, Effectors.] • Percept Interpretation Rules: percepts(t) ∧ model(t) ⇒ model'(t) • Model Update Rules: model(t-1) ⇒ model(t), model'(t) ⇒ model''(t) • Action Choice Rules: model''(t) ⇒ action(t), action(t) ∧ model''(t) ⇒ model(t+1)

Automata Agent • Rules associate actions to percepts indirectly, through the incremental construction of an environment model (internal state of the agent) • Action choice based on: • current percepts + previous percepts + previous actions + encapsulated knowledge of initial environment state • Overcomes reflex agent limitations with partially observable, sequential and non-episodic environments • Can integrate past and present perception to build a rich representation from partial observations • Can distinguish between distinct environment states that are indistinguishable by instantaneous sensor signals • Limitations: • No explicit representation of the agent’s preferred environment states • For agents that must change goals many times to perform well, the automata architecture is not scalable (combinatorial explosion of rules)

Automata Agent Rule Examples • Rules percept(t) ∧ model(t) ⇒ model'(t) • IF visualPercept at time T is glitter AND location of agent at time T is (X,Y) THEN location of gold at time T is (X,Y) • ∀X,Y,T see(glitter,T) ∧ loc(agent,X,Y,T) ⇒ loc(gold,X,Y,T). • Rules model'(t) ⇒ model''(t) • IF agent is with gold at time T AND location of agent at time T is (X,Y) THEN location of gold at time T is (X,Y) • ∀X,Y,T withGold(T) ∧ loc(agent,X,Y,T) ⇒ loc(gold,X,Y,T).

Automata Agent Rule Examples • Rules model(t) ⇒ action(t) • IF location of agent at time T = (X,Y) AND location of gold at time T = (X,Y) THEN choose action pick at time T • ∀X,Y,T loc(agent,X,Y,T) ∧ loc(gold,X,Y,T) ⇒ do(pick,T) • Rules action(t) ∧ model(t) ⇒ model(t+1) • IF chosen action at time T was pick THEN agent is with gold at time T+1 • ∀T done(pick,T) ⇒ withGold(T+1).
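
The four example rules from these two slides can be wired into a small stateful agent. This is a sketch under assumptions: the class name, the `step` signature (the agent's location is passed in as a self-percept), the `no_op` fallback action, and the extra "not already holding the gold" guard on the pick rule are all hypothetical additions.

```python
class AutomataAgent:
    """Reflex agent with state: the model persists between cycles."""

    def __init__(self):
        self.loc = None          # loc(agent, X, Y, T)
        self.gold_loc = None     # loc(gold, X, Y, T), unknown at first
        self.with_gold = False   # withGold(T)

    def step(self, location, percepts):
        self.loc = location
        # percept(t) AND model(t) => model'(t): glitter reveals the gold's location.
        if "glitter" in percepts:
            self.gold_loc = self.loc
        # model'(t) => model''(t): carried gold moves with the agent.
        if self.with_gold:
            self.gold_loc = self.loc
        # model(t) => action(t): pick when standing on the gold.
        # (The with_gold guard is an added assumption to avoid picking twice.)
        if self.gold_loc == self.loc and not self.with_gold:
            action = "pick"
        else:
            action = "no_op"     # hypothetical placeholder for all other rules
        # action(t) AND model(t) => model(t+1): picking means holding the gold.
        if action == "pick":
            self.with_gold = True
        return action


agent = AutomataAgent()
first = agent.step((2, 3), {"glitter"})
second = agent.step((2, 2), set())
print(first, second, agent.gold_loc)  # pick no_op (2, 2)
```

Unlike the reflex agent, the second call behaves differently from a fresh agent receiving the same percepts, because `with_gold` was remembered.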

(Explicit) Goal-Based Agent [Figure: Environment, Sensors, Percept Interpretation, (Past and) Current Environment Model, Model Update, Goal Update, Goals, Action Choice, Effectors.] • Percept Interpretation Rules: percept(t) ∧ model(t) ⇒ model'(t) • Model Update Rules: model(t-1) ⇒ model(t), model'(t) ⇒ model''(t) • Goal Update Rules: model''(t) ∧ goals(t-1) ⇒ goals'(t) • Action Choice Rules: model''(t) ∧ goals'(t) ⇒ action(t), action(t) ∧ model''(t) ⇒ model(t+1)

(Explicit) Goal-Based Agent • Principle: explicit and dynamically alterable goals • Pros: • More flexible and autonomous than automata agent • Adapts its strategy to situation patterns summarized in its goals • Limitations: • When the current goal cannot be reached as the effect of a single action, it is unable to plan a sequence of actions • Does not make long-term plans • Does not handle multiple, potentially conflicting active goals

Goal-Based Agent Rule Examples • Rule model(t)  goal(t) action(t) • IF goal of agent at time T is to return to (1,1) AND agent is in (X,Y) at time T AND orientation of agent is 90o at time T AND (X,Y+1) is safe at time T AND (X,Y+1) has not being visited until time T AND (X-1,Y) is safe at time T AND (X-1,Y) was visited before time T THEN choose action turn left at time T • X,Y,T, (N,M,K goal(T,loc(agent,1,1,T+N)) loc(agent,X,Y,T)  orientation(agent,90,T)  safe(loc(X,Y+1),T) loc(agent,X,Y+1,T-M)  safe(loc(X-1,Y),T)  loc(agent,X,Y+1,T-K)) do(turn(left),T)

Goal-Based Agent Rule Examples • Rule model(t)  goal(t)  action(t) • IF goal of agent at time T is to find gold AND agent is in (X,Y) at time T AND orientation of agent is 90o at time T AND (X,Y+1) is safe at time T AND (X,Y+1) has not being visited until time T AND (X-1,Y) is safe at time T AND (X-1,Y) was visited before time T THEN choose action forward at time T • X,Y,T, (N,M,K goal(T,withGold(T+N)) loc(agent,X,Y,T) orientation(agent,90,T)  safe(loc(X,Y+1),T)  loc(agent,X,Y+1,T-M)  safe(loc(X-1,Y),T)  loc(agent,X,Y+1,T-K)) do(forward,T)

Goal-Based Agent Rule Examples • Rule model(t)  goal(t) goal’(t) //If the agent reached it goal to hold the gold, //then its new goal shall be to go back to (1,1) • IF goal of agent at time T-1 was to find gold AND agent is with gold at time T THEN goal of agent at time T+1 is to be in location (1,1) • T, (N goal(agent,T-1,withGold(T+N))  withGold(T)M goal(agent,T,loc(agent,1,1,T+M))).

Planning Agent [Figure: Environment, Sensors, Percept Interpretation, (Past and) Current Environment Model, Model Update, Goal Update, Goals, Prediction of Future Environments, Hypothetical Future Environment Models, Action Choice, Effectors.] • Percept Interpretation Rules: percept(t) ∧ model(t) ⇒ model'(t) • Model Update Rules: model(t-1) ⇒ model(t), model'(t) ⇒ model''(t) • Goal Update Rules: model''(t) ∧ goals(t-1) ⇒ goals'(t) • Prediction of Future Environments Rules: model''(t) ⇒ model(t+n), model''(t) ∧ action(t) ⇒ model(t+1) • Action Choice Rules: model(t+n) = result([action1(t),...,actionN(t+n)]), model(t+n) ∧ goal(t) ⇒ do(action1(t))

Planning Agent • Percepts and actions are associated very indirectly through: • Past and current environment model • Past and current explicit goals • Prediction of future environments resulting from different possible action sequences to execute • Rule chaining is needed to build an action sequence from rules that capture the immediate consequences of a single action • Pros: • Foresight allows choosing more relevant and safer actions in sequential environments • Cons: there is little point in building elaborate long-term plans in: • Highly non-deterministic environments (too many possibilities to consider) • Largely non-observable environments (not enough knowledge available before acting) • Asynchronous concurrent environments (only cheap reasoning can reach a conclusion under time pressure)
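
Building an action sequence by chaining single-action consequences can be sketched with a plain breadth-first search. Note that breadth-first search is a technique substituted here for illustration; the slides only speak of rule chaining. The move names, the `result` and `plan` functions and the set of safe caverns are hypothetical.

```python
from collections import deque

# Immediate consequence of a single action: model(t) AND action(t) => model(t+1).
MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


def result(state, action):
    """Predicted location after executing one move from `state`."""
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


def plan(start, goal, safe):
    """Breadth-first search over action sequences that stay within `safe` caverns.
    Returns the shortest list of actions reaching `goal`, or None if there is none."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action in MOVES:
            nxt = result(state, action)
            if nxt in safe and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None


# Hypothetical set of caverns the agent's model currently marks as safe.
safe = {(1, 1), (2, 1), (1, 2), (2, 2), (2, 3)}
actions = plan((1, 1), (2, 3), safe)
print(actions)  # ['north', 'east', 'north']
```

In keeping with the action choice rule on the previous slide, a planning agent would execute only the first action of the plan and then perceive and replan. The `seen` set grows with the number of reachable states, which hints at why long plans become impractical in large or highly non-deterministic environments.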

Hybrid Reflex-Planning Agent [Figure: Environment, Sensors and Effectors connected to two synchronized threads. Reflex Thread: Reflex Rules mapping Percepts to Actions. Planning Thread: Percept Interpretation, Current Model Update, Future Environments Prediction, Goal Update, Goals and Action Choice over the current, past and future environment model.]

Hybrid Reflex-Planning Agent • Pros: • Take advantage of all the time and knowledge available to choose best possible action (within the limits of its prior knowledge and percepts) • Sophisticated yet robust • Cons: • Costly to develop • Same knowledge encoded in different forms in each component • Global behavior coherence harder to guarantee • Analysis and debugging hard due to synchronization issues • Not that many environments feature large variations in available reasoning time in different perception-reasoning-action cycles

Layered Agents • Many sensors/effectors are too fine-grained to reason about goals directly with the data/commands they provide • Such cases require a layered agent that decomposes its reasoning into multiple abstraction layers • Each layer represents the percepts, environment model, goals, and actions at a different level of detail • Abstraction can consist of: • Discretizing, approximating, clustering, classifying data from prior layers along temporal, spatial, functional, social dimensions • Detail can consist of: • Decomposing higher-level actions into lower-level ones along temporal, spatial, functional, social dimensions [Figure: Perceive in Detail, Abstract, Decide Abstractly, Detail, Act in Detail.]
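
The two directions, abstraction of percepts and detailing of actions, can be illustrated with a toy two-layer example. Everything specific here is an assumption: a robot with continuous coordinates in meters, a 1-meter cavern size, and the low-level command names. Only the idea of discretizing data upward and decomposing actions downward comes from the slide.

```python
def abstract_position(x_m, y_m, cell_size=1.0):
    """Abstraction by spatial discretization: continuous position -> 1-based grid cavern."""
    return (int(x_m // cell_size) + 1, int(y_m // cell_size) + 1)


def detail_action(action, steps_per_cell=4):
    """Detail by decomposition: one abstract action -> a sequence of low-level commands."""
    if action == "forward":
        return ["step"] * steps_per_cell        # hypothetical: 4 steps cross one cavern
    if action == "rotate":
        return ["turn_30_degrees"] * 3          # 3 x 30 degrees = one 90 degree rotation
    raise ValueError(f"unknown abstract action: {action}")


print(abstract_position(2.7, 0.4))   # (3, 1)
print(detail_action("rotate"))       # ['turn_30_degrees', 'turn_30_degrees', 'turn_30_degrees']
```

The abstract layer can then reason in terms of caverns and 90° rotations, as in the Wumpus World rules, without ever seeing raw coordinates.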

Layered Automata Agent [Figure: Environment, Sensors, Percept Interpretation, Environment Model, Environment Model Update, Action Choice and Execution Control, Effectors; each processing component is split into Layer 2, Layer 1 and Layer 0.]

Abstraction Layer Examples [Figure: abstraction layers illustrated on X and Y axes.]

Utility-Based Agent • Principle: • Goals only express boolean agent preferences among environment states • A utility function u allows expressing finer-grained agent preferences • u can be defined on a variety of domains and ranges: • actions, i.e., u: action → R (or [0,1]) • action sequences, i.e., u: [action1, ..., actionN] → R (or [0,1]) • environment states, i.e., u: environmentStateModel → R (or [0,1]) • environment state sequences, i.e., u: [state1, ..., stateN] → R (or [0,1]) • environment state, action pairs, i.e., u: environmentStateModel x action → R (or [0,1]) • environment state, action pair sequences, i.e., u: [(action1-state1), ..., (actionN-stateN)] → R (or [0,1]) • Pros: • Allows solving optimization problems aiming to find the best solution • Allows trading off among multiple conflicting goals with distinct probabilities of being reached • Cons: • Currently available methods to compute (even approximately) argmax(u) do not scale up to large or diverse environments
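
For the simplest case, u: action → [0,1], choosing the action is an argmax over a small table. The sketch below assumes that uncertain outcomes are combined by expected utility, which is one common way to trade off goals with distinct probabilities of being reached; the slides do not prescribe it. The action names, probabilities and utility values are invented for illustration.

```python
def expected_utility(outcomes):
    """Probability-weighted utility of an action, given (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def best_action(outcome_model):
    """argmax(u) over actions, with u(action) taken as the expected utility."""
    return max(outcome_model, key=lambda action: expected_utility(outcome_model[action]))


# Hypothetical outcome model: action -> list of (probability, utility in [0,1]).
outcome_model = {
    "enter_unknown_cavern": [(0.8, 0.6), (0.2, 0.0)],   # likely progress, some risk of a pit
    "go_back": [(1.0, 0.3)],                            # safe but little progress
}

choice = best_action(outcome_model)
print(choice)  # enter_unknown_cavern
```

Enumerating every action is only feasible for small tables like this one, which is the scalability limit named under "Cons".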

Utility-Based Reflex Agent [Figure: Environment, Sensors, Percept Interpretation (Rules: percept ⇒ actions), Goals, Action Choice (Utility Function u: actions → R), Effectors.]
