
Research on the Use of Intelligent Agents in Training Systems at Texas A&M


Presentation Transcript


  1. Research on the Use of Intelligent Agents in Training Systems at Texas A&M Thomas R. Ioerger Associate Professor Department of Computer Science Texas A&M University

  2. Outline: Our Approach; Historical Context of Projects; TRL, an agent architecture; Modeling Teamwork; CAST, a multi-agent architecture; Advanced Team Behaviors; User Modeling in a Team Context; Cognitive Modeling of Command and Control.

  3. Our Approach Develop programmable agents that can be hooked up with simulators. Embed algorithms for interpreting collaborative activity to automatically produce appropriate interactions: agents should be able to infer when to act or communicate, like humans. Simulating human behavior is useful for training...

  4. Historical Context University XXI, DoD funding (1999-2000): developed TRL for modeling info flow in Bn TOCs. MURI, AFOSR funding (2001-2005): worked with cognitive scientists to develop theories of how to use agents in training, e.g. for AWACS. Army Research Lab, Aberdeen (2001-2002): HBR modeling of teams in sims like OneSAF, JVB. NASA (current): SATS, future ATC with aircraft self-separation.

  5. TRL Agent Architecture • Written in Java • Declarative and procedural knowledge bases • TRL knowledge representation language: for capturing procedural knowledge (tasks & methods) • APTE method-selection algorithm: responsible for building, maintaining, and repairing task-decomposition trees • Inference engine JARE (Java Automated Reasoning Engine): knowledge base with facts and Horn clauses; back-chaining (like Prolog); updating the world with facts; now open source at http://jare.sourceforge.net

  6. TRL Agent Architecture Diagram [Diagram: a TaskableAgent runs the APTE algorithm over the TRL KB (tasks & methods) to build a TRL task-decomposition hierarchy and process nets; conditions are evaluated against the JARE KB (facts & Horn clauses) via assert, query, and retract; sensing and operators connect the agent to the OTB simulation, and messages flow to and from other agents.]

  7. JARE Knowledge Base First-order Horn clauses (rules with variables); similar to Prolog; inferences are made by back-chaining. In each clause, the first literal is the consequent and the rest are the antecedents:
  ((threat ?a ?b)
   (enemy ?a) (friendly ?b)
   (in-contact ?a ?b) (larger ?a ?b)
   (intent ?a aggression))
  > (query (threat ?x task-force-122))
  solution 1: ?x = regiment-52
  solution 2: ?x = regiment-54
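
To make the back-chaining concrete, here is a minimal, self-contained sketch in Java (the deck's implementation language). It is illustrative only, not JARE's actual API, and assumes a single rule whose variables are named apart from the query's:

  import java.util.*;

  // Minimal back-chaining over first-order Horn clauses, in the spirit of
  // the threat example above. Atoms are lists: predicate then arguments;
  // a token starting with "?" is a variable. Not JARE's real API.
  public class BackChainDemo {
      static List<List<String>> facts = new ArrayList<>();
      // One Horn clause: consequent (threat ?a ?b) and its antecedents.
      static List<String> head = List.of("threat", "?a", "?b");
      static List<List<String>> body = List.of(
              List.of("enemy", "?a"), List.of("friendly", "?b"),
              List.of("in-contact", "?a", "?b"), List.of("larger", "?a", "?b"),
              List.of("intent", "?a", "aggression"));

      static boolean isVar(String t) { return t.startsWith("?"); }

      // Follow bindings in substitution s until a constant or free variable.
      static String walk(String t, Map<String, String> s) {
          while (isVar(t) && s.containsKey(t)) t = s.get(t);
          return t;
      }

      // Unify two atoms under s; returns the extended substitution or null.
      static Map<String, String> unify(List<String> a, List<String> b, Map<String, String> s) {
          if (a.size() != b.size()) return null;
          Map<String, String> out = new HashMap<>(s);
          for (int i = 0; i < a.size(); i++) {
              String x = walk(a.get(i), out), y = walk(b.get(i), out);
              if (x.equals(y)) continue;
              if (isVar(x)) out.put(x, y);
              else if (isVar(y)) out.put(y, x);
              else return null;
          }
          return out;
      }

      // Prove each goal against the facts, or reduce it to the rule's
      // antecedents, collecting every substitution that solves them all.
      static void solve(List<List<String>> goals, Map<String, String> s,
                        List<Map<String, String>> solutions) {
          if (goals.isEmpty()) { solutions.add(s); return; }
          List<List<String>> rest = goals.subList(1, goals.size());
          for (List<String> fact : facts) {
              Map<String, String> s2 = unify(goals.get(0), fact, s);
              if (s2 != null) solve(rest, s2, solutions);
          }
          Map<String, String> s2 = unify(goals.get(0), head, s);
          if (s2 != null) {
              List<List<String>> expanded = new ArrayList<>(body);
              expanded.addAll(rest);
              solve(expanded, s2, solutions);
          }
      }

      public static void main(String[] args) {
          for (String r : List.of("regiment-52", "regiment-54")) {
              facts.add(List.of("enemy", r));
              facts.add(List.of("in-contact", r, "task-force-122"));
              facts.add(List.of("larger", r, "task-force-122"));
              facts.add(List.of("intent", r, "aggression"));
          }
          facts.add(List.of("friendly", "task-force-122"));
          List<Map<String, String>> solutions = new ArrayList<>();
          solve(List.of(List.of("threat", "?x", "task-force-122")), new HashMap<>(), solutions);
          for (Map<String, String> s : solutions)
              System.out.println("?x = " + walk("?x", s));  // regiment-52, regiment-54
      }
  }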

  8. Task Representation Language (TRL) • Provides descriptors for: goals, tasks, methods, and operators • Tasks: “what to do” • Can associate alternative methods, with priorities or preference conditions • Can have termination conditions • Methods: “how to do it” • Can define preference conditions for alternatives • Process Net • - Procedural language for specifying how to do things • - While loops, if conditionals, sequential, parallel constructs • - Can invoke sub-tasks or operators • Operators: lowest-level actions that can be directly executed in the simulation environment, e.g. move unit, send message, fire on enemy • Each descriptor is a schema with arguments and variables • Conditions are evaluated as queries to JARE

  9. Example TRL Knowledge
  (:Task Monitor (?unit)
    (:Term-cond (destroyed ?unit))
    (:Method (Track-with-UAV ?unit)
      (:Pref-cond (not (weather cloudy))))
    (:Method (Follow-with-scouts ?unit)
      (:Pref-cond (ground-cover dense))))

  (:Method Track-with-UAV (?unit)
    (:Pre-cond (have-assets UAV))
    (:Process
      (:seq (:if (:cond (not (launched UAV))) (launch UAV))
            (:let ((?x ?y) (loc ?unit ?x ?y)) (fly UAV ?x ?y))
            (circle UAV ?x ?y))))

  10. Task-Decomposition Hierarchy [Diagram: a tree alternating task nodes (Tx) and method nodes (Mx) across five levels, with conditions (C) governing method choices; leaf methods expand into process nets.]

  11. TOC Staff - Agent Decomposition
  CDR: make commands/decisions, RFI to Brigade
  S2: maintain enemy situation, detect/evaluate threats, evaluate PIRs
  S3: maintain friendly situation, maneuver sub-units
  FSO: control indirect fire (artillery, close air, ATK helicopter)
  Companies: move/hold, maneuver, react to enemy/orders, move along assigned route
  Scouts: move to OP, track enemy

  12. Modeling Teamwork Team psychology research: Salas, Cannon-Bowers, Serfaty, Ilgen, Hollenbeck, Koslowski, etc. A team is "two or more individuals working together, interdependently, toward a common goal". Members often play distinct roles. Types of control: centralized (hierarchical) vs. distributed (consensus-oriented). Process measures vs. outcome measures. Communication, adaptiveness, shared mental models.

  13. Computational Models of Teamwork Commitment to shared goals: Joint Intentions (Cohen & Levesque; Tambe). Cooperation, non-interference. Backup roles, helping behavior. Mutual awareness: goals of teammates, achievement status, information needs. Coordination, synchronization. Distributed decision making: consensus formation (voting), conflict resolution.

  14. CAST: Collaborative Agent Architecture for Simulating Teamwork Developed at Texas A&M; part of a MURI grant from DoD/AFOSR. A multi-agent system implemented in Java. Components: MALLET, a high-level language for describing team structure and processes; JARE, logical inference and knowledge base; Petri net representation of team plans; special algorithms for belief reasoning, situation assessment, information exchange, etc.

  15. CAST Architecture [Diagram: each agent expands team tasks from the MALLET knowledge base (definitions of roles, tasks, etc.) into Petri nets and keeps track of who is doing each step; conditions are evaluated by queries to the JARE knowledge base (domain rules), with information asserted/retracted as it changes; each agent maintains models of other agents' beliefs and exchanges messages with agent and human teammates, while events, actions, and state data flow to and from the simulation.]

  16. MALLET Descriptions of team structure (evaluated by queries to the JARE knowledge base):
  (role sam scout) (role bill S2) (role joe FSO)
  (responsibility S2 monitor-threats)
  (capability UAV-operator maneuver-UAV)
  Description of team process:
  (team-plan indirect-fire (?target)
    (select-role (scout ?s) (in-visibility-range ?s ?target))
    (process
      (do S3 (verify-no-friendly-units-in-area ?target))
      (while (not (destroyed ?target))
        (do FSO (enter-CFF ?target))
        (do ?s (perform-BDA ?target))
        (if (not (hit ?target))
          (do ?s (report-accuracy-of-aim FSO))
          (do FSO (adjust-coordinates ?target))))))

  17. Dynamic Role Selection
  (role al holder) (role dan holder) ...
  (team-plan kick-field-goal ()
    (select-role (?c (center ?c))
                 (?h (holder ?h) (not (injured ?h)))
                 (?k (kicker ?k)))
    (process
      (seq (hike-ball ?c) (catch-ball ?h) (hold-ball ?h) (kick-ball ?k))))
  When there is ambiguity, agents automatically communicate (send messages) to decide who will do what. Key points: coordination does not have to be explicit in the plan; defer task assignments to see who is best.

  18. Proactive Information Exchange Information sharing is a key to efficient teamwork. We want to capture information flow in the team, including proactive distribution of information. Agent A should send message I to agent B iff: A believes I is true; A believes B does not already believe I (non-redundant); and I is relevant to one of B's goals, i.e. a pre-condition of a current goal that B is responsible for in the plan. DIARG algorithm (built into CAST): 1. check for transitions that other agents are responsible for that can fire (pre-conditions satisfied); 2. infer whether the other agent might not believe the pre-conditions are true (currently, beliefs are based on post-conditions of executed steps, i.e. tokens in output places); 3. send a proactive message with the information. A sketch of the send test follows.
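
The send test above reduces to a simple filter. Below is a minimal Java sketch of that test; the field names (myBeliefs, beliefModel, preconds) are hypothetical stand-ins for CAST's belief models and Petri-net pre-conditions, not its real API:

  import java.util.*;

  // Sketch of the DIARG-style send test: agent A sends fact I to teammate B
  // iff A believes I, A believes B does not already believe I, and I matches
  // a pre-condition of a step that B is responsible for. Illustrative only.
  class ProactiveExchange {
      Set<String> myBeliefs = new HashSet<>();                 // what A believes
      Map<String, Set<String>> beliefModel = new HashMap<>();  // A's model of each teammate's beliefs
      Map<String, Set<String>> preconds = new HashMap<>();     // pre-conditions of steps each teammate owns

      // Facts A should proactively send: believed by A, (believed to be)
      // unknown to the teammate, and relevant to one of their pending steps.
      List<String> factsToSend(String teammate) {
          Set<String> theirBeliefs = beliefModel.getOrDefault(teammate, Set.of());
          Set<String> relevant = preconds.getOrDefault(teammate, Set.of());
          List<String> toSend = new ArrayList<>();
          for (String fact : myBeliefs)
              if (!theirBeliefs.contains(fact) && relevant.contains(fact))
                  toSend.add(fact);
          return toSend;
      }

      public static void main(String[] args) {
          ProactiveExchange a = new ProactiveExchange();
          a.myBeliefs.add("(destroyed bridge-7)");
          a.preconds.put("FSO", Set.of("(destroyed bridge-7)"));
          System.out.println(a.factsToSend("FSO"));  // [(destroyed bridge-7)]
      }
  }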

  19. AWACS - DDD (Aptima, Inc.)

  20. Approaches to Team Training How to use agents in training? How to improve team performance? Classic approach: shared mental models. Impact of individual cognition on teamwork; collaboration with Wayne Shebilske (Wright State): attention management, workload, automaticity, reserve capacity to help/share. Many possible roles for agents: user modeling, coaching, feedback, AAR, dynamic scenarios, role players, partners, enemies, low-cost highly-available practice...

  21. Complex Tasks, and the Need for New Training Methods Complex tasks (e.g. operating machinery) have multiple cognitive components (memory, perceptual, motor, reasoning/inference...); novices feel overwhelmed; limitations of part-task training; automaticity vs. attention management. Role for intelligent agents? Agents can be placed in simulation environments, but guiding principles are needed to promote learning.

  22. Previous Work: Partner-Based Training AIM (Active Interlocked Modeling; Shebilske, 1992): trainees work in pairs (AIM-Dyad); each trainee does part of the task; importance of context (integration of responses); can produce equal training, a 100% efficiency gain. Co-presence/social variables not required: trainees placed in separate rooms. Correlation with intelligence of partner. Bandura, 1986: "modeling".

  23. Automating the Partner with an Intelligent Agent Hypothesis: would the training be as effective if the partner were played by an intelligent agent? Important pre-requisite: a CTA (cognitive task analysis); a hierarchical task decomposition allows functions to be divided in a "natural" way between human and agent partners.

  24. Space Fortress: Laboratory Task Representative of complex tasks: has perceptual, motor, attention, memory, and decision-making demands similar to flying a fighter jet. Continuous control: navigation with joystick, 2nd-order thrust control. Discrete events: firing missiles, making bonus selections with the mouse; must learn rules for when to fire, boundaries... Large body of previous studies/data; Multiple Emphasis on Components (MEC) protocol; transfers to the operational setting (attention management).

  25. [Screenshot: the Space Fortress game display, showing the ship, the fortress, a mine, a missile, the bonus-available indicator ($), mouse buttons (P/M/I), the joystick, and status panels: PNTS, CNTRL, VLCTY, VLNER, IFF, INTRVL, SPEED, SHOTS.]

  26. Implementation of a Partner Agent Implemented decision-making procedures for automating the mouse and joystick; added if-then-else rules to the C source code to emulate decision-making. The agent is simple, but satisfies the criteria: situated, goal-oriented, autonomous. The first version of the agent played too "perfectly"; we made it play "realistically" by adding some delays and imprecision (e.g. in aiming), as sketched below.
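
A minimal Java sketch of the realism adjustment follows. The original was if-then-else rules patched into the game's C source; all constants and method names here are hypothetical, not the values used in the study:

  import java.util.Random;

  // Illustrative sketch: perturb the agent's aim and reaction time so it
  // no longer plays perfectly. Constants are hypothetical.
  class PartnerRealism {
      static final Random rng = new Random();

      // True bearing to the target plus Gaussian aiming error, in degrees.
      static double noisyBearing(double trueBearingDeg, double sigmaDeg) {
          return trueBearingDeg + rng.nextGaussian() * sigmaDeg;
      }

      // Fire only when roughly on target and a human-like delay has elapsed.
      static boolean shouldFire(double aimErrorDeg, long msSinceLastShot) {
          double toleranceDeg = 5.0;                 // hypothetical aim tolerance
          long reactionMs = 250 + rng.nextInt(150);  // hypothetical ~250-400 ms reaction
          return Math.abs(aimErrorDeg) < toleranceDeg && msSinceLastShot > reactionMs;
      }

      public static void main(String[] args) {
          double aim = noisyBearing(90.0, 3.0);      // aim near 90 degrees, 3-degree error
          System.out.println("aim = " + aim + ", fire = " + shouldFire(aim - 90.0, 400));
      }
  }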

  27. Experiment 1 Hypothesis: training with an agent improves final scores. Protocol: 10 sessions of 10 3-minute trials each (over 4 days); each session 1/2 hour: 8 practice trials, 2 test trials. Groups: Control (standard instructions + practice); Partner Agent (instructions + practice, alternating mouse and joystick between trainee and agent). Participants: 40 male undergraduates at WSU, <20 hrs/wk playing video games.

  28. Results of Expt 1 *The difference in final scores was significant at the p<0.05 level by paired t-test (df=38): t=2.33 > 2.04.

  29. Effect of Level of Simulated Expertise of Agent? The results of Expt 1 raise a follow-up question: what is the effect of the level of expertise simulated by the agent? We can make the agent more or less accurate. Recall: correlation with the partner's intelligence. Is it better to train with an expert, or perhaps with a partner of matching skill level? Novices might have trouble comprehending experts' strategies while struggling to keep up.

  30. Results of Expt 2 Conclusion: Training with an expert partner agent is best.

  31. Lessons Learned for Future Applications A principled approach to using agents in training systems as partners: cognitive benefits. Works best if there is a high degree of decoupling among sub-tasks; with greater interaction, the agent might have to "cooperate" with the human by interpreting and responding to apparent strategies. Desiderata for partner agents: 1. correctness; 2. consistency (necessary for modeling); 3. realism (how to simulate human "errors"?); 4. exploration (errors lead to unusual situations).

  32. Application to Team Training Should also consider the effects of workload, skill, and attention (user modeling). Working hypothesis: effective teamwork requires sufficient reserve capacity and attention management to be able to monitor the activities of teammates and to offer help or information. Design of a team-training protocol: look at the impact of attention training on the frequency of interactions and helping behaviors within the team.

  33. The More Traditional Approach: Agent-Based Coaching Agents can track trainees' actions using the team plan and offer hints (either online or via AAR). Standard approach: plan recognition. The team context increases the complexity of explaining actions and mistakes: did the trainee fail for lack of domain knowledge, lack of situational information, or because "it's not my responsibility"?

  34. Modeling Command and Control What's missing from teamwork simulations? We have roles and proactive information sharing. Special teams: Tactical Decision Making (TDM). C2 is what many teams are "doing" in many application areas (civilian as well as military): distributed actions; distributed sensors, uncertainty; adversarial environment, ambiguity of enemy intent. How to "practice" doing C2 better as a team? It's all about gathering and fusing information...

  35. Cognitive Aspects of C2 Many field studies of TDM teams: Naturalistic Decision Making (Klein); Situation Awareness (Endsley); Recognition-Primed Decision Making (RPD):
  while (situation not clear)
    choose an unknown feature
    initiate find-out procedure
  trigger response action

  36. Basic Activities to Integrate Mission objectives; information gathering, situation assessment; tactical decision making; implicit goals: maintain security, maintain communications, maintain supplies; emergency procedures, handling threats.

  37. Overview of Approach Implement the RPD loop in TRL; represent situations, features, and weights in JARE; find-out procedures, e.g. use radar, UAV, scouts, RFI to Bde, phone, email, web site, lab test... Challenges: information management (selection, tracking, uncertainty, timeouts); priority management among activities.

  38. C2/CAST: declarative and procedural KBs (rules and plans)

  39. Model of Situation Assessment Situations S1...Sn, e.g. being flanked, ambushed, bypassed, diverted, enveloped, suppressed, directly assaulted. Features associated with each situation: Fi1...Fim; RPD predicts that the decision maker looks for these features. Weights are based on the relevance of each feature (+/-): evidence(S_i) = Σ_{j=1..m} w_ij · F_ij > θ_i. Unknowns: assume the most probable value, F = true if P[F = true] > 0.5, else F = false.
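
A minimal sketch of this evidence computation in Java, assuming an array layout of weights, observed/unknown feature values, and priors (all names illustrative, not from the paper):

  // Sketch of the evidence rule above: evidence(S_i) = sum_j w_ij * F_ij,
  // compared against a threshold theta_i, with unknown features (null)
  // defaulted to their most probable value.
  class EvidenceModel {
      // Weights may be positive or negative, per the relevance of each feature.
      static double evidence(double[] w, Boolean[] f, double[] priorTrue) {
          double sum = 0.0;
          for (int j = 0; j < w.length; j++) {
              boolean value = (f[j] != null) ? f[j]               // observed value
                                             : priorTrue[j] > 0.5; // most probable value
              sum += w[j] * (value ? 1.0 : 0.0);
          }
          return sum;
      }

      public static void main(String[] args) {
          double[] w = { 2.0, 1.0, -1.5 };      // feature weights for one situation S_i
          Boolean[] f = { true, null, false };  // second feature is unknown
          double[] p = { 0.5, 0.8, 0.5 };       // P[F = true] for each feature
          double theta = 2.5;                   // threshold theta_i
          System.out.println(evidence(w, f, p) > theta);  // 2.0 + 1.0 + 0 = 3.0 > 2.5 -> true
      }
  }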

  40. Situation Awareness Algorithm (see the ICCRTS'03 paper for details) Basic loop: while the situation is not determined (i.e. no situation has evidence > threshold), pick a relevant feature whose value is unknown; select a find-out procedure and initiate it. Information management issues: ask the most informative question first (cost? time?); asynchronous, remember pending answers; some information may go stale over time (revert to unknown, re-invoke find-out).
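
A minimal Java sketch of this loop, with the find-out procedure and the determination test stubbed out (all names are placeholders, not the paper's):

  import java.util.*;

  // Sketch of the basic loop above: while no situation clears its evidence
  // threshold, pick a relevant feature that is still unknown and run a
  // find-out procedure for it. Illustrative only.
  class SAQueryLoop {
      public static void main(String[] args) {
          Map<String, Boolean> features = new HashMap<>();   // absent = unknown
          List<String> relevant = List.of("enemy-on-flank", "bridge-intact", "arty-heard");

          while (!situationDetermined(features)) {
              // pick a relevant feature whose value is still unknown
              String unknown = relevant.stream()
                      .filter(x -> !features.containsKey(x))
                      .findFirst().orElse(null);
              if (unknown == null) break;                    // nothing left to find out
              Boolean answer = findOut(unknown);             // e.g. radar, UAV, scouts, RFI
              if (answer != null) features.put(unknown, answer);  // remember the answer
              // Real version: asynchronous answers, cost/time-based question
              // selection, and stale answers reverting to unknown over time.
          }
          System.out.println("features = " + features);
      }

      // Stub: the situation counts as "determined" once this key feature is known true.
      static boolean situationDetermined(Map<String, Boolean> f) {
          return f.getOrDefault("enemy-on-flank", false);
      }

      // Stub for a find-out procedure; a real one would task a sensor or ask someone.
      static Boolean findOut(String feature) {
          System.out.println("initiating find-out for " + feature);
          return true;
      }
  }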

  41. Priorities Model: the current "alert" level suspends lower-level activities: 5 - handling high-level threats; 4 - situation awareness; 3 - handling low-level threats; 2 - maintenance tasks for implicit goals; 1 - pursuing targets of opportunity; 0 - executing the mission. When a high-level threat occurs, suspend the mission; resume the mission when the threat is handled.
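
A minimal sketch of this suspend/resume discipline as a stack of alert levels (illustrative only; not the deck's implementation):

  import java.util.ArrayDeque;
  import java.util.Deque;

  // Raising the alert level suspends everything at lower levels; handling
  // the threat pops back down and the suspended activity resumes.
  class AlertLevels {
      static final String[] LEVELS = {
          "executing the mission",            // 0
          "pursuing targets of opportunity",  // 1
          "maintenance tasks",                // 2
          "handling low-level threats",       // 3
          "situation awareness",              // 4
          "handling high-level threats"       // 5
      };
      private final Deque<Integer> stack = new ArrayDeque<>();

      AlertLevels() { stack.push(0); }        // start by executing the mission

      void raise(int level) {                 // e.g. raise(5) when a high-level threat occurs
          if (level > stack.peek()) stack.push(level);  // suspend lower-level activity
      }

      void handled() {                        // threat handled: resume what was suspended
          if (stack.size() > 1) stack.pop();
      }

      public static void main(String[] args) {
          AlertLevels a = new AlertLevels();
          a.raise(5);                                   // high-level threat: mission suspended
          System.out.println(LEVELS[a.stack.peek()]);   // handling high-level threats
          a.handled();                                  // threat handled
          System.out.println(LEVELS[a.stack.peek()]);   // executing the mission
      }
  }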

  42. Current Work on C2 Extending the RPD model to the team as a "shared plan": agents have a shared model of common situations and relevant information; they work together to disambiguate and derive consensus on the identity of the situation; they infer what local information is relevant to the group and synchronize views (resolve conflicts). Knowledge acquisition of air-combat situations for modeling AWACS WDs.

  43. Collaborators Wayne Shebilske (Wright State, Psych); Richard Volz (Texas A&M, Comp. Sci.); John Yen (Penn State, IST).
