A Biologically Motivated Software Architecture for an Intelligent Humanoid Robot Richard Alan Peters II, D. Mitchell Wilkes, Daniel M. Gaines, and Kazuhiko Kawamura Center for Intelligent Systems Vanderbilt University Nashville, Tennessee, USA
Intelligence The ability of an individual to learn from experience, to reason well, to remember important information, and to cope with the demands of daily living. (R. Sternberg, 1994). • Intelligence has emerged through evolution and is manifest in mammals. The processes of designing an intelligent robot might be facilitated by mimicking the natural structures and functions of mammalian brains.
Topics • Mammalian Brains • ISAC, the Vanderbilt Humanoid Robot • A Control System Architecture for ISAC • A Partial Implementation • Research Issues
The Structure of Mammalian Brains Krubitzer, Kaas, Allman, Squire • The evolution of structure • Common features of neocortical organization • Species differences • Memory and association • Attention • Implications for robot control architectures
The Evolution of Structure Figure: Leah Krubitzer
Common features of Neocortical Organization • Somatosensory Cortex (SI, SII) • Motor Cortex (M) • Visual Cortex (VI, VII) • Auditory Cortex (AI) • Association Cortex • Size differences in cortical modules are disproportionate to size differences in cortex
Common features of Neocortical Organization Figure: Leah Krubitzer
Species Differences • Sizes and shapes of a specific cortical field • Internal organization of a cortical field • Amount of cortex devoted to a particular sensory or cognitive function • Number of cortical fields • Addition of modules to cortical fields • Connections between cortical fields
Memory: a Functional Taxonomy Squire • Immediate memory: data buffers for current sensory input; holds information for about 0.1 s • Working memory: scratch-pads, e.g. phonological loop, visuospatial sketch pad; the representation of sensory information in its absence • Short-term memory (IM & WM) is a collection of memory systems that operate in parallel • Long-term memory: can be recalled for years; different physically from STM
Memory: Biological Mechanisms • Immediate memory — chemicals in synapse • Working memory — increase in presynaptic vesicles; intra-neuron and inter-neuron protein release and transmitter activity • Long-term memory — growth of new synapses; requires transcription of genes in neurons.
Association • The simultaneous activation of more than one sensory processing area for a given set of external stimuli • A memory that links multiple events or stimuli • Much of the neocortex not devoted to sensory processing appears to be involved in association
Memory and Sensory Data Bandwidth • Bandwidth of signals out of sensory cortical fields is much smaller than input bandwidth • Sensory cortical fields all project to areas within association cortex • Suggests: the environment is scanned for certain salient information and much is missed; previous memories linked by association fill in the gaps in information.
Attention: a Definition • An apparent sequence of spatio-temporal events, to which a computational system or subsystem allocates a hierarchy of resources. • In that sense, the dynamic activation of structures in the brain is attentional.
Attention: Some Types • Visual: where to look next • Auditory: sudden onset or end of sound • Haptic: bumped into something • Proprioceptive: entering unstable position • Memory: triggered by sensory input • Task: action selection • Conscious: recallable event sequence
Attention: Executive Control Figure: Posner & Raichle
Mammalian Brains • Have sensory processing modules that work continually in parallel • Selectively filter incoming sensory data and supplement that information from memory through context and association • Exhibit dynamic patterns of activity through local changes in cellular metabolism — shifts in activation
Physical Structure of ISAC • Arms: two 6 DOF actuated by pneumatic McKibben artificial muscles • Hands: anthropomorphic, pneumatic with proximity sensors and 6-axis FT sensors at wrists • Vision: stereo color PTV head • Audition: user microphone • Infrared motion sensor array
ISAC Hardware under Construction • Hybrid pneumatic / electric anthropomorphic hand • Head mounted binaural microphone system • Finger tip touch sensors
Computational Structure of ISAC • Network of standard PCs • Windows NT 4.0 OS • Special hardware limited to device controllers • Designed under Vanderbilt’s Intelligent Machine Architecture (IMA)
Low-Level Software Architecture: IMA • Software agent (SA) design model and tools • SA = 1 element of a domain-level system descr. • SA tightly encapsulates all aspects of an element • SAs communicate through message passing • Enables concurrent SA execution on separate machines on a network • Facilitates inter-agent communication w/ DCOM • Can implement almost any logical architecture
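The message-passing idea on this slide can be illustrated with a minimal sketch. It is not IMA and does not use DCOM: it assumes in-process threads and `queue.Queue` inboxes as a stand-in for agents on separate machines, and the `Agent` class, the agent names, and the message payloads are all hypothetical.

```python
import queue
import threading


class Agent:
    """Hypothetical stand-in for a software agent: private state plus an inbox."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()
        self.received = []  # state is only touched by the agent's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def send(self, other, payload):
        # Agents never call each other's methods directly; they only post messages.
        other.inbox.put((self.name, payload))

    def stop(self):
        self.inbox.put(None)  # sentinel: finish queued messages, then exit
        self._thread.join()

    def _run(self):
        while True:
            message = self.inbox.get()
            if message is None:
                break
            self.received.append(message)


camera = Agent("camera")
tracker = Agent("tracker")
tracker.start()
camera.send(tracker, {"target": (120, 80)})
camera.send(tracker, {"target": (124, 82)})
tracker.stop()
print(tracker.received)
```

Because each agent owns its inbox and its state, the same pattern carries over when the queue is replaced by a network transport.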
Primitive Agent Types • Hardware: provide abstractions of sensors and actuators, and low-level processing and control (e.g., noise filtering or servo-control loops). • Behavior: encapsulate tightly coupled sensing-actuation loops. May or may not have run-time parameters. • Environment: process sensor data to update an abstraction of the environment. Can support behaviors such as “move-to” or “fixate” which require run-time parameters. • Task: encapsulate decision-making capabilities, and sequencing mechanisms for hardware, behavior, and environment agents.
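The four primitive agent types can be sketched as small classes. This is an illustration under stated assumptions rather than the IMA object model: the class names, the moving-average filter, the proportional control step, and all numeric values are hypothetical choices made to keep the example short.

```python
class HardwareAgent:
    """Abstracts one sensor/actuator pair; smooths readings with a moving average."""

    def __init__(self, window=3):
        self.window = window
        self.samples = []
        self.command = 0.0

    def read(self, raw):
        # Keep only the most recent `window` samples and return their mean.
        self.samples = (self.samples + [raw])[-self.window:]
        return sum(self.samples) / len(self.samples)

    def write(self, command):
        self.command = command


class BehaviorAgent:
    """Tightly coupled sensing-actuation loop: one proportional step per call."""

    def __init__(self, hardware, gain=0.5):
        self.hardware = hardware
        self.gain = gain

    def step(self, raw, setpoint):
        error = setpoint - self.hardware.read(raw)
        self.hardware.write(self.gain * error)
        return error


class EnvironmentAgent:
    """Keeps an abstraction of something in the environment (here, a position)."""

    def __init__(self, name):
        self.name = name
        self.position = None

    def update(self, position):
        self.position = position


class TaskAgent:
    """Sequences other agents; each step is a zero-argument callable."""

    def __init__(self, steps):
        self.steps = steps

    def run(self):
        return [step() for step in self.steps]


joint = HardwareAgent(window=2)
servo = BehaviorAgent(joint, gain=0.5)
box = EnvironmentAgent("box")
task = TaskAgent([
    lambda: box.update(10.0),                             # environment model updated
    lambda: servo.step(raw=4.0, setpoint=box.position),   # run-time parameter: setpoint
    lambda: servo.step(raw=6.0, setpoint=box.position),
])
results = task.run()
print(results, joint.command)
```

The setpoint passed to `servo.step` plays the role of a run-time parameter supplied by an environment agent, as in a "move-to" behavior.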
Agent Object Model • The agent object model describes how an agent network, defined by the robot-environment model, is constructed from a collection of component objects
IMA Component Objects • Agent Comp. — agent interfaces to manager and to persistent streams • Policy Comp. — encapsulates an OS thread • Representation Comp. — a DCOM object that communicates an agent’s state to other agents • Mechanism Comp. — configurable objects that can be invoked to perform one of a set of computations • Agent Links — interfaces defined by representations • Relationship Comp. — manage a set of agent links to selectively update and / or use each participant link
Properties of IMA • Granularity — multiple logical levels • Composition — agents can incorporate agents • Reusable — can be combined for new functionalities • Inherently Parallel — asynchronous, concurrent op. • Explicit Representation — sensor info is ubiquitous • Flat Connectivity — all agents are logically equal w.r.t. sensory info access and actuator commands • Incremental — All modules that produce commands for the hardware work in an incremental mode
A Bio-Inspired Control Architecture • IMA can be used to implement almost any control architecture. • Individual agents can tightly couple sensing to actuation, and incorporate each other a la subsumption (Brooks). • IMA Inter-agent communications facilitate motor schemas (Arkin). • Composition of agents which have flat connectivity enables hybrid architectures
ISAC Control System Architecture • Primary Software Agents • Sensory EgoSphere • Attentional Networks • Database Associative Memory • Attentional Control via Activation • Learning • System Status Self Evaluation
Example Primary Software Agents: • Vision: visual attention, color segmentation, object recognition, face recognition, gesture recognition • Audition: aural attention, sound segmentation, speech recognition, speaker identification, sonic localization
Example Primary Software Agents • Motor: L & R arm control, L & R hand control, PTV motion • Others: infrared motion det., peer agents, object agents, attention agent, sensory data recd’s.
Higher Level Software Agents: • Robot self agent • Human agent • Object agents (various) • Visually guided grasping • Converse • Reflex control • Memory association • Visual tracking • Visual servoing • Move EF to FP • Dual arm control • Person Id • Interpret V-Com • Reflex control
Agents and Cortical Fields • Agents can be designed to be functionally analogous to the common field structure of the neocortex. • Visual, auditory, haptic, proprioceptive, attentional, and memory association agents remain on constantly and always transform the current sensory inputs from the environment
Atlantis: a Three Layer Architecture • Deliberator: world model, planner • Sequencer: task queue, executor, monitors • Controller: sensor / actuator coupled behaviors • Erann Gat
Atlantis: General Schematic • Figure: Erann Gat
Three-Layer Control with IMA • Elements of control layer: agents. • Sequencing: through links depending on activation vectors. (Due to flat connectivity.) • Deliberation: Various agents modify the links and activation levels of others. (Due to composability.)
Virtual Three-Layer Architecture • Figure labels: Deliberative Agent; IMA Agents S1, S2, S3, ..., Sn; activation link; max activation
3-Layer Control through Schemas • Agents compete with each other for control of other agents and the use of hardware resources. • They do this by modifying activation vectors associated with agent relationships. • Thus, the sequencer is equivalent to a motor schema
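The competition described here can be sketched as a relationship that holds one activation value per agent link and selects the link with the highest value. This is a minimal sketch under assumptions: the `Relationship` class, the link names, the numeric deltas, and the tie-breaking rule are hypothetical and are not taken from IMA.

```python
class Relationship:
    """Hypothetical relationship component: a set of agent links, one activation each."""

    def __init__(self, links):
        self.activation = {link: 0.0 for link in links}

    def adjust(self, link, delta):
        # Any competing agent may raise or lower the activation of a link.
        self.activation[link] += delta

    def selected(self):
        # The link with maximum activation wins; ties go to the first-listed link.
        return max(self.activation, key=self.activation.get)


arm_control = Relationship(["visual_servoing", "reflex_control"])

arm_control.adjust("visual_servoing", 0.5)    # e.g. a task agent requests reaching
first_choice = arm_control.selected()

arm_control.adjust("reflex_control", 0.75)    # e.g. a reflex agent detects contact
second_choice = arm_control.selected()

print(first_choice, second_choice)
```

No central sequencer appears in the sketch: the ordering of actions emerges from which agents adjust which activations, and when.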
Handoff Task • Simple Task Agent Operation: 1. Invoke Human Hand Env. Agent 2. Close Robot Gripper Res. Agent 3. Invoke Box Env. Agent 4. Open Robot Gripper • Figure labels: Human Hand Env. Agent; Box Env. Agent; Robot Gripper Res. Agent; Skin Color Tracking Beh. Agent; Visual Servoing Beh. Agent; relationships Move-to (two), Open/Close, Activate (two); legend: Primitive Agent Type, Relationship
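The four handoff steps can be written as data that a simple task agent walks through in order. The sketch below is illustrative only: `StubAgent` and `run_task` are hypothetical, the stubs merely log what they are asked to do, stopping at the first failed step is an added assumption, and step 4 is assumed to address the same Robot Gripper Res. Agent as step 2.

```python
HANDOFF_STEPS = [
    ("invoke", "Human Hand Env. Agent"),
    ("close", "Robot Gripper Res. Agent"),
    ("invoke", "Box Env. Agent"),
    ("open", "Robot Gripper Res. Agent"),  # assumption: same gripper agent as step 2
]


class StubAgent:
    """Hypothetical placeholder for a primitive agent; it only logs requests."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def handle(self, operation):
        self.log.append(f"{operation} -> {self.name}")
        return True  # a real agent would report success or failure


def run_task(steps, agents):
    """Run the steps in order; stop at the first step whose agent reports failure."""
    completed = 0
    for operation, name in steps:
        if not agents[name].handle(operation):
            break
        completed += 1
    return completed


log = []
agents = {name: StubAgent(name, log) for _, name in HANDOFF_STEPS}
completed = run_task(HANDOFF_STEPS, agents)
print(completed, log)
```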
Current Implementation • ISAC is being designed primarily as a human service robot, to interact smoothly and naturally with people. • Several high-level agents mediate the interaction: robot self agent, human agent, object agents.
Two High-Level Agents • Human Agent: encapsulates what the robot knows about the human • Robot Self Agent: maintains knowledge about the internal state of the robot and can communicate this with the human
Human-Robot Interaction Desiderata • The robot is “user-friendly;” it has a humanoid appearance and can converse with a person. • A person can discover the robot’s abilities through conversation. • The person can evoke those abilities from the robot. • The robot can detect its own failures and report them to the person. • The person and the robot can converse on the robot’s internal state for failure diagnosis.
Human-Robot Interaction • Figure labels: Human; Robot; Software System; Human Agent; Robot Self Agent; Human Interaction; Hardware Interface; DBAM; legend: A = IMA Primitive Agent
Robot Self Agent • Human Interaction: maps the person's goals into robot action • Action Selection: activates primitive agents based on information from the human interaction component • Robot Status Evaluation: detects failure in primitive agents and maintains information about the overall state of the robot
Figure labels: Human Agent (Name, Time of Last Interact., Last Phrase); Self Agent (Emotion Module, Conversation Module, Activator Module, Interrogation Module, Pronoun Module, Description Module); Human Hand Env. Agent; Env. Agent 1, ..., Env. Agent M; Task Agent 1, Task Agent 2, ..., Task Agent N
Human Interaction Agents • Conversation Module • Interprets the human's input and generates responses to the human • Based upon the Maas-Neotek robots developed by Mauldin • Increases or decreases the activation level of a primitive agent • Pronoun Module • Resolves human phrases to environment primitive agents • Acts as a pointer or reference for use by task primitive agents • Points to other modules of the Robot Self Agent or to primitive agents for purposes of failure description • Interrogation Module • Handles discussions about the robot's status involving questions from the human
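The pointer role of the Pronoun Module can be sketched as a small binding table from words to agent names. This is a hypothetical illustration, not the module's actual design: the `PronounModule` class, its methods, the word-matching rule, and the example bindings are all assumptions.

```python
class PronounModule:
    """Hypothetical binding table: maps a spoken word to the agent it refers to."""

    def __init__(self):
        self.bindings = {}

    def bind(self, word, agent_name):
        self.bindings[word.lower()] = agent_name

    def resolve(self, phrase):
        """Return the agents referred to in the phrase, in word order."""
        words = [w.strip(".,!?").lower() for w in phrase.split()]
        return [self.bindings[w] for w in words if w in self.bindings]


pronouns = PronounModule()
pronouns.bind("it", "Box Env. Agent")          # set when the box was last mentioned
pronouns.bind("me", "Human Hand Env. Agent")   # the speaker's hand

referents = pronouns.resolve("Give it to me.")
print(referents)
```

A task agent could then use the returned names as targets without knowing how the human phrased the request.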
Status Evaluation Agents • Emotion Module • artificial emotional model used to describe the current state of the robot • provide internal feedback to the robot's software agents • fuzzy evaluator is used to provide a textual description for each emotion • Description Module • contains names and a description of the purpose of primitive agents • information about the status of primitive agents: active or inactive, and successful or unsuccessful
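A fuzzy evaluator that turns a numeric emotion level into text might look like the sketch below. The slide does not specify the emotions, the membership functions, or the wording, so everything here is an assumption: triangular memberships over a level in [0, 1], three hypothetical linguistic terms, and "frustrated" as an example emotion name.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at or beyond left/right, rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms, each given as (left, peak, right).
TERMS = {
    "slightly": (-0.5, 0.0, 0.5),
    "moderately": (0.0, 0.5, 1.0),
    "very": (0.5, 1.0, 1.5),
}


def describe(emotion, level):
    """Pick the term with the highest membership and attach it to the emotion name."""
    memberships = {term: triangle(level, *shape) for term, shape in TERMS.items()}
    best = max(memberships, key=memberships.get)
    return f"{best} {emotion}"


low = describe("frustrated", 0.1)
mid = describe("frustrated", 0.5)
high = describe("frustrated", 0.9)
print(low, "|", mid, "|", high)
```

The resulting strings are the kind of textual description the robot could speak when asked about its internal state.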
Action Selection Agents • Activator Module is a “clearing-house” for contributing to the activation state of primitive agents • Other Robot Self Agent modules can adjust the activation level of primitive agents through the Activator Module
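The clearing-house role can be sketched as a single object through which every activation change must pass. This is a minimal sketch under assumptions: the `ActivatorModule` class, clamping levels to [0, 1], the 0.5 threshold for "active", and the contribution history are hypothetical details added for illustration.

```python
class ActivatorModule:
    """Hypothetical clearing-house: the only place activation levels are changed."""

    def __init__(self, agent_names):
        self.levels = {name: 0.0 for name in agent_names}
        self.history = []  # (source module, primitive agent, delta)

    def contribute(self, source, agent, delta):
        if agent not in self.levels:
            raise KeyError(f"unknown primitive agent: {agent}")
        # Assumption: activation is clamped to the range [0, 1].
        self.levels[agent] = min(1.0, max(0.0, self.levels[agent] + delta))
        self.history.append((source, agent, delta))
        return self.levels[agent]

    def active(self, threshold=0.5):
        # Assumption: an agent counts as active at or above the threshold.
        return sorted(a for a, level in self.levels.items() if level >= threshold)


activator = ActivatorModule(["HandOff Task", "HandShake Task"])
activator.contribute("Conversation Module", "HandOff Task", 0.75)
activator.contribute("Emotion Module", "HandOff Task", 0.5)         # clamped at 1.0
activator.contribute("Conversation Module", "HandShake Task", 0.25)
active_agents = activator.active()
print(active_agents, activator.levels)
```

Routing all changes through one module also leaves a record of which module asked for what, which could support later questions about why an agent was activated.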
Details of Agents: IMA Component Level • Figure: component-level diagram of the Emotion, Conversation, Description, Activator, Pronoun, and Interrogation Agents. Component labels: Fuzzy Text; Text In; Text Out; Config; Interpreter; PA Names; PA Abilities; Description Relationship; Rep 1, Rep 2, ..., Rep N; Rel 1, Rel 2, ..., Rel N; Activator; Binding Mechanism; Pointer; Link1, Link2, ..., LinkN; Why; What; Where
Human Agent • Figure labels: inputs From Person, From Camera Head, From IR Sensor Array; Speech Recognition; Speaker Identification; Pointing Finger Detection; Skintone Detection; Face Detection; Infrared Motion Detection; Central Sequencing Agents
Testbed System • Figure labels: Human Agent; Robot Self Agency; Text-To-Speech; Human ID; Human Detect.; HandShake Task; Game Task; HandOff Task; Voice ID; Face Detect; Skin-Tone Tracking; Hand; Arm; Speech Rec.; Pan/Tilt Head; Color Img