## Learning Interaction Protocols through Imitation: A Data Mining Approach


### Learning Interaction Protocols through Imitation: A Data Mining Approach

Artificial Intelligence, Adv (E) (2013)

Do not distribute beyond this class

Yasser Mohammad

Nishida Lab.

Situated Modules

- Used in many systems to date, mainly with Robovie
- Situated modules are executed serially

[Ishiguro et al. 1999]

Route Guidance Listener (2006)

[Figure: design loop. Stages: Analyze Human-Human Interactions, Implement Model, Evaluate Model, producing a Model and a Controller. Feedback paths: Structure Adjustment (Redesign) and Parameter Adjustment (Tune/Adapt, supervised).]

2 WOZ experiments using motion captured data

[Kanda et al. 2007]

Engineering vs. Learning Approaches

Standard Engineering Approach

[Figure: design loop. Stages: Analyze Human-Human Interactions, Implement Model, Evaluate Model, producing a Model and a Controller. Feedback paths: Structure Adjustment (Redesign) and Parameter Adjustment (Tune/Adapt, supervised).]

Learning/Imitation Approach

[Figure: learning loop. Stages: Collect Human-Human Interactions (Training Data), Develop, Interact. Outputs: Controller, Communication Protocol, Model of Commands. Feedback path: Parameter & Structure Adjustment (Adapt, unsupervised).]

Bird’s Eye View

[Figure: Learner Robot timeline with stages Watch, Mimic, Interact, Adapt. Shared ground: primordial knowledge (model), then learned models and protocol, then adapted models and protocol. Co-action: learned action, then adapted actions.]

Our Long Term Model

[Figure: Operator, Actor, and Learner. The Interaction between Operator and Actor consists of Commands, Actions, and Feedback. Learner stages: Watch External Behavior, Learn Actions’ Model, Learn Commands’ Model, Learn Communication Protocol.]

- Main Insights
  - Learning by watching is ubiquitous in humans
  - Learning actions and learning commands are related
  - Change in behavior is what matters

Design Procedure

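[Figure: standard design loop. Stages: Analyze Human-Human Interactions, Implement Model, Evaluate Model, producing a Model and a Controller. Feedback paths: Structure Adjustment (Redesign) and Parameter Adjustment (Tune/Adapt, supervised).]

[Figure: proposed design procedure. Stages: Analyze Task & Required Basic Actions, Decide Required Behavior (H-H Interactions), Learn Parameters (FPGA), Evaluate. Feedback path: Structure Adjustment (Redesign). Other labels: Intentions, Processes.]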

Floating Point Genetic Algorithm

Crossover

- Select 2 individuals and generate 4:
- Calculate probability of passing:

Mutation

1. Calculate probabilities over 1~m:
2. Calculate P(mutation @ k) as:
3. Select mutation site according to P(mutation @ k)
4. Mutate parameter using:

Generation pipeline: Elitism, Tournament, Crossover, Mutation

FPGA – Preliminary Evaluation

- Fitness function:
- 100 generations
- 100 individuals
- Two comparison algorithms (A1, A2)
  - Proposed > A1 (p = 0.0133)
  - Proposed > A2 (p = 0.0032)

[Mohammad & Nishida 2010d]

Applications – Gaze Control

- Fixed Structure Gaze Controller (18 parameters)

- Dynamic Structure Gaze Controller (7 parameters)

[Mohammad & Nishida 2010d]

Applications – Gaze Control

- Fixed vs. Dynamic Structure GC
- Six novel sessions
- Four control GCs
- Follow
- Stare
- Random

[Mohammad & Nishida 2010d]

Learning by watching/imitation/mimicry

[Figure: three-stage pipeline.

1. Watch: the Learner observes the command stream and the action stream exchanged between Operator and Actor (Commands, Actions, Feedback).
2. Learn (offline): the Discovery Phase (Constrained Motif Discovery) converts the streams into discrete commands and discrete actions (example symbol strings: 320000310 and 23340003204402); the Association Phase (Bayesian Network Induction) produces the learned Interaction Protocol; Controller Generation (Piecewise Linear Controller Generation) produces the Behavior Generation Model.
3. Act (online): the Learned Actor runs the Robot/Agent Controller (a feedback controller) and exchanges Commands, Actions, and Feedback with the Operator.]

Building Blocks

- Behavior Discovery
  - Motif Discovery
  - Change Point Detection
- Behavior Association
  - Bayesian Network Induction
  - Causality Analysis
- Behavior Generation
  - Piecewise Linear Controller Generation
- Behavior Adaptation
  - Bayesian Network Combination

Gaze Control: Data Collection Experiment

- 44 participants
  - Ages 19-37 (27% female)
  - Untrained to interact with robots
- Two objects (chair/stepper)
  - Easily assembled (7 steps each)
  - Not so easy (2 ordering steps each)
- Two roles:
  - Instructor: explains a single object three times, to a:
    - Good listener
    - Bad listener
    - Robot
  - Listener: listens to two explanations about two objects, as a:
    - Good listener
    - Bad listener

Gaze Control: Evaluation Experiment

- Internet poll
- 35 subjects
- Watch 2 videos:
  - ISL learned controller
  - Carefully designed controller
- Age: 24 to 43 years (mean 31.16).
- Gender: 8 females and 30 males.
- Experience with robots: ranged from "I never saw one before" to "I program robots routinely".
- Expected robot attention, on a scale of 1 to 7: 4 ± 1.376.
- Expected naturalness of the robot's behavior, on a scale of 1 to 7: 3.2 ± 1.255.
- Expected human-likeness of the robot's behavior, on a scale of 1 to 7: 3.526 ± 1.52.


(1) Behavior Discovery

Proposed

Advantages

- Utilizes relation between actions and commands
- removes irrelevant dimensions
- No need for separate clustering step
- No predefined model

[Figure: proposed pipeline, applied to both the command stream and the action stream. Blocks: Robust Singular Spectrum Transform (discover change points), Granger-causality maximization (natural delay discovery), constrained motif discovery (discover motifs), remove irrelevant dimensions.]

Motif Discovery

- Given a time series (an ordered list of real numbers), find approximately recurring subsequences

[Chiu 2003]

Motif Discovery

- Given a time series X(t) find recurring patterns of length L using distance function D
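
The definition above can be made concrete with a brute-force search. The following is a minimal sketch, not one of the algorithms evaluated in these slides: it assumes D is the Euclidean distance and that a motif is simply the closest pair of non-overlapping windows of length L. The names `euclidean` and `find_motif_pair` are hypothetical.

```python
import numpy as np

def euclidean(a, b):
    """Distance function D: plain Euclidean distance between two windows."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def find_motif_pair(x, L, dist=euclidean):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-L windows x[i:i+L] and x[j:j+L] (exhaustive O(n^2) search)."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) - L + 1
    best = (np.inf, -1, -1)
    for i in range(n_windows):
        # Start j at i + L so the two windows never overlap.
        for j in range(i + L, n_windows):
            d = dist(x[i:i + L], x[j:j + L])
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy data (an assumption for illustration): Gaussian noise with the
# same 4-sample pattern planted at positions 5 and 20.
rng = np.random.default_rng(0)
series = rng.normal(0.0, 1.0, 40)
pattern = np.array([1.0, 3.0, 2.0, 4.0])
series[5:9] = pattern
series[20:24] = pattern

distance, first, second = find_motif_pair(series, L=4)
print(distance, first, second)
```

The quadratic cost of this exhaustive comparison is the reason practical motif discovery algorithms restrict or sample the candidate windows.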

Constrained Motif Discovery

- Given a time series X(t), find recurring patterns of length between L1 and L2 using distance function D, subject to the constraint P(t), where P(t) is an estimate of the probability that a motif occurrence exists near time step t.

[Figure annotation: a peak in P(t) marks where a motif is likely.]
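
One way to read the constraint is as a filter on where candidate windows may start. The sketch below illustrates that reading only; it is not MCFull, MCInc, or DGCMD. It assumes a hard threshold on P(t), a length-normalised Euclidean distance for D, and a preference for longer motifs on ties. The function name `constrained_motif_pair` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def constrained_motif_pair(x, p, L1, L2, threshold=0.5):
    """Return (distance, i, j, L) for the closest pair of non-overlapping
    windows whose start positions both satisfy p[t] >= threshold."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    # Candidate starts: time steps where the constraint says a motif is likely.
    candidates = [int(t) for t in np.flatnonzero(p >= threshold)]
    best = (np.inf, -1, -1, 0)
    # Try longer windows first so that ties favour the longer motif.
    for L in range(L2, L1 - 1, -1):
        starts = [t for t in candidates if t + L <= len(x)]
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                if j - i < L:
                    continue  # skip overlapping windows
                # Normalise by sqrt(L) so different lengths are comparable.
                d = float(np.linalg.norm(x[i:i + L] - x[j:j + L]) / np.sqrt(L))
                if d < best[0]:
                    best = (d, i, j, L)
    return best

# Toy data (an assumption for illustration): a 5-sample pattern planted
# at 10 and 30, and a constraint with one deliberate false alarm at 20.
rng = np.random.default_rng(1)
series = rng.normal(0.0, 1.0, 50)
pattern = np.array([2.0, -1.0, 3.0, 0.5, -2.0])
series[10:15] = pattern
series[30:35] = pattern
constraint = np.zeros(50)
constraint[[10, 20, 30]] = 1.0

result = constrained_motif_pair(series, constraint, L1=3, L2=5)
print(result)
```

With three candidate starts, only three window pairs are compared per length, instead of every pair in the series.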

DGCMD

- Advantages:
  - Controlled exhaustiveness (# candidates).
  - Controlled sensitivity (Tc).
  - No random subwindow as needed by some MD algorithms.
  - No upper bound on motif size as needed by most MD algorithms.
- Disadvantages:
  - Can become quadratic if # candidates is large.
  - Sensitive to outlier segments (long subwindows of outliers).

DGCMD – Evaluation

- 50440 time series
  - Variable length (10^2 to 10^6)
  - Variable noise level (0 to 20% PP)
  - Variable motif types
  - Variable # of occurrences
- Motif Discovery Algorithms:
  - Projections (most accurate)
  - Catalano et al. (fastest)
- Constrained Motif Discovery Algorithms:
  - MCFull
  - MCInc
  - DGCMD

[Mohammad & Nishida 2009]

How good is the constraint?

[Equation not preserved: the probability of discovering a motif, without and with the constraint. Quantities involved: the relative entropy between the constraint and the motif locations, the entropy of the constraint, the number of motif occurrences, the window length, the average motif length, and the time series length.]

[Mohammad & Nishida 2010a]

How to get the constraint?

- Main insight
  - The generating dynamics change near the beginning and end of motifs.
  - We need to find points in the time series where the generating dynamics change.


Change Point Discovery

- Given a time series X(t), find for every time step the probability that X(t) is changing form, i.e. that the underlying dynamics are changing.

Available Techniques

- CUSUM
  - Detects only mean change
- Inflection Point Detection
  - Assumes any variation is a change
- Autoregressive Modeling
  - Assumes a specific generating model
- Mixtures of Gaussians
  - Assumes a specific generating model
- Discrete Cosine Transform
  - Finds only global changes
- Wavelet Analysis
  - Many parameters
- Singular Spectrum Transform (SST) [Ide et al. 2005]
  - Most general, no ad-hoc adjustment
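
To illustrate the first limitation in the list, here is a minimal sketch of a two-sided CUSUM detector. It is a textbook formulation, not taken from these slides. The function name `cusum_alarm` and the `drift` and `threshold` values are assumptions chosen for the toy data.

```python
def cusum_alarm(x, target_mean, drift=0.5, threshold=5.0):
    """Two-sided CUSUM: return the first index at which either cumulative
    sum exceeds `threshold`, or None if no alarm is raised."""
    pos = neg = 0.0
    for t, value in enumerate(x):
        # Accumulate deviations above / below the target, minus a drift allowance.
        pos = max(0.0, pos + (value - target_mean - drift))
        neg = max(0.0, neg + (target_mean - value - drift))
        if pos > threshold or neg > threshold:
            return t
    return None

# A shift in the mean (0 -> 2 at index 20) is detected a few steps later.
mean_shift = [0.0] * 20 + [2.0] * 20
alarm = cusum_alarm(mean_shift, target_mean=0.0)

# A change in dynamics with the mean unchanged (the oscillation period
# doubles at index 20) never raises an alarm.
dynamics_change = [1.0, -1.0] * 10 + [1.0, 1.0, -1.0, -1.0] * 5
no_alarm = cusum_alarm(dynamics_change, target_mean=0.0)

print(alarm, no_alarm)
```

The second series changes its generating dynamics without moving its mean, which is the kind of change a subspace-based method such as SST targets.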

Main idea

- At every point:
  - Use a few values before it to represent the past: H
  - Use a few values after it to represent the future: G
  - Compare the past with the future. The more dissimilar they are, the higher the score.

[Figure: the past window yields H, a hyperplane; the future window yields G, a set of eigenvectors.]
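
The past/future comparison can be sketched with NumPy's SVD. This is a simplified reading of SST under stated assumptions, not the exact formulation of [Ide et al. 2005] or of RSST: the past is summarised by the top-l left singular vectors of a matrix of n overlapping windows of length w, the future by the single dominant left singular vector of m windows ending up to g + m - 1 steps ahead, and the score is one minus the length of the projection of the future direction onto the past subspace. Indexing is 0-based, and the names `window_matrix` and `sst_score` are hypothetical.

```python
import numpy as np

def window_matrix(x, last_end, w, count):
    """Stack `count` length-w windows as columns. The windows end
    (exclusive) at last_end - count + 1, ..., last_end."""
    ends = range(last_end - count + 1, last_end + 1)
    return np.column_stack([x[e - w:e] for e in ends])

def sst_score(x, t, w=4, n=2, m=2, g=4, l=1):
    """Change score in [0, 1] at index t: 0 when the future direction lies
    in the past subspace, 1 when it is orthogonal to it."""
    x = np.asarray(x, dtype=float)
    if t - n + 1 - w < 0 or t + g + m - 1 > len(x):
        raise ValueError("t is too close to the edge of the series")
    past = window_matrix(x, t, w, n)                 # values before index t
    future = window_matrix(x, t + g + m - 1, w, m)   # values from index t + g - w on
    U, _, _ = np.linalg.svd(past, full_matrices=False)
    V, _, _ = np.linalg.svd(future, full_matrices=False)
    H = U[:, :l]       # past: subspace of the top-l left singular vectors
    beta = V[:, 0]     # future: dominant direction
    cos = min(float(np.linalg.norm(H.T @ beta)), 1.0)
    return 1.0 - cos

# Synthetic check: an oscillation that turns into a constant at index 12.
demo = np.array([-1.0, 1.0] * 6 + [1.0] * 12)
stationary = sst_score(demo, 6)    # past and future both oscillate
at_change = sst_score(demo, 12)    # oscillating past, constant future

# The series and parameters used in the numeric example of these slides.
slide_x = [-4, -3, -2, -1, 0, 1, 2, 3, 4, -1, 1, -1, 1, -1, 1, -1, 1]
slide_scores = {t: sst_score(slide_x, t) for t in range(5, 13)}
print(stationary, at_change, slide_scores)
```

In the synthetic check the oscillating and constant windows are orthogonal, so the score goes from about 0 inside the stationary segment to about 1 at the change.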

Numeric Example

- X(t) = {-4, -3, -2, -1, 0, 1, 2, 3, 4, -1, 1, -1, 1, -1, 1, -1, 1}
- Parameters:
  - w = g = 4, n = m = 2, l = 1
- Evaluated at t = 6 and at t = 10

[Figures: for each t, the future matrix, its SVD, and the resulting change angle.]

Singular Spectrum Transform

- Advantages:
  - No predefined generation model.
  - Comparatively few parameters (5).
  - PCA using SVD works for any matrix, so no ad-hoc preprocessing is needed.
  - Linear in the length of the time series.
- Disadvantages:
  - The 5 parameters are still hard to select.
  - Specificity degrades very fast with increasing noise level.
  - Inadequate for time series with no background signal.

Robust Singular Spectrum Transform

[Figure: RSST block diagram. Parameters: w, n. The past (H) and future (G) windows are compared to produce change angles.]

[Mohammad & Nishida 2009b]

RSST vs. SST – Real world data

- Explanation Scenario
  - 22 participants
  - 3 conditions:
    - Natural listening
    - Unnatural listening
    - Robot
- Physiological Sensors:
  - Respiration
  - Skin Conductance
  - Pulse

RSST vs. SST – Physio-psychological data analysis

[Mohammad & Nishida 2009d]


Behavior Association

After discovering the basic motifs in both actions and commands, and detecting their occurrences in all time series:

[Figure: occurrences of Command 1 and Action 1 over time.]

1. Use the natural delay between commands and actions calculated during the discovery phase.
2. For every command-action pair, calculate their joint activation: the number of occurrences of the action within the natural delay interval of the command.
3. Use the joint-activation values to induce a Bayesian network describing the relation between actions and commands.
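
The joint-activation count can be sketched as follows. This is a minimal illustration under assumptions: occurrences are represented by their start times, the natural delay interval is a fixed `(lo, hi)` offset after each command, and the network induction step itself is not shown. The names `joint_activation` and `joint_activation_table` and the toy timings are hypothetical.

```python
import numpy as np

def joint_activation(command_times, action_times, delay_window):
    """Count action occurrences that start within the natural delay
    interval [c + lo, c + hi] after any command occurrence c."""
    lo, hi = delay_window
    actions = np.asarray(action_times, dtype=float)
    count = 0
    for c in command_times:
        count += int(np.sum((actions >= c + lo) & (actions <= c + hi)))
    return count

def joint_activation_table(commands, actions, delay_window):
    """Joint activation for every (command, action) pair."""
    return {
        (c_name, a_name): joint_activation(c_times, a_times, delay_window)
        for c_name, c_times in commands.items()
        for a_name, a_times in actions.items()
    }

# Toy occurrence times (time steps), invented for illustration only.
commands = {"command 1": [10, 50, 90], "command 2": [30, 70]}
actions = {"action 1": [12, 53, 91], "action 2": [33, 72, 95]}

table = joint_activation_table(commands, actions, delay_window=(1, 5))
for pair, count in table.items():
    print(pair, count)
```

Pairs with high counts relative to the number of command occurrences are the natural candidates for links in the induced network.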

Mohammad & Nishida 2009

Causality Based Delay Estimation

To find the delay between a gesture stream and an action stream:

1. Regress actions using past actions and gestures.
2. Regress actions using past actions only.
3. Compare the residuals.
4. Calculate the g-causality statistic.
5. Find the delay that maximizes g-causality.
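
The steps above can be sketched with ordinary least squares. This is a minimal sketch under assumptions, not the authors' implementation: the restricted model is an autoregression of fixed order on past actions, the full model adds a single gesture sample delayed by d steps, and the g-causality statistic is taken to be the log ratio of the two residual sums of squares. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def residual_sum_of_squares(design, target):
    """Fit target ~ design by least squares and return the RSS."""
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coeffs
    return float(residuals @ residuals)

def g_causality(actions, gestures, delay, order=2, start=None):
    """log(RSS_restricted / RSS_full) for a gesture term delayed by `delay`."""
    a = np.asarray(actions, dtype=float)
    g = np.asarray(gestures, dtype=float)
    start = max(order, delay) if start is None else start
    target = a[start:]
    # Restricted model: intercept plus the action's own past values.
    own_past = np.column_stack([a[start - k:len(a) - k] for k in range(1, order + 1)])
    restricted = np.hstack([np.ones((len(target), 1)), own_past])
    # Full model: the same regressors plus the delayed gesture sample.
    delayed = g[start - delay:len(g) - delay].reshape(-1, 1)
    full = np.hstack([restricted, delayed])
    return float(np.log(residual_sum_of_squares(restricted, target)
                        / residual_sum_of_squares(full, target)))

def best_delay(actions, gestures, max_delay, order=2):
    """Return the delay in 1..max_delay that maximizes the statistic."""
    start = max(order, max_delay)  # same target rows for every delay
    scores = {d: g_causality(actions, gestures, d, order, start)
              for d in range(1, max_delay + 1)}
    return max(scores, key=scores.get), scores

# Synthetic data (an assumption): actions follow gestures after 3 steps.
rng = np.random.default_rng(0)
gestures = rng.normal(size=300)
actions = np.zeros(300)
for t in range(3, 300):
    actions[t] = 0.3 * actions[t - 1] + gestures[t - 3] + 0.1 * rng.normal()

delay, scores = best_delay(actions, gestures, max_delay=6)
print(delay)
```

Using the same target rows for every candidate delay keeps the residual sums comparable across delays.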

[Mohammad & Nishida 2009c]

Example: Associating Actions and Gestures

- Guided Navigation Scenario

Correct prediction 95.2%.

[Mohammad & Nishida 2009c]


Behavior Controller Generation

Convert the learned Bayesian network into an L0EICA controller

Command Process & Action Process & Link Effect Channel

Motor Babbling

PLGC

[Mohammad & Nishida 2010c]

Motor Babbling

Generate a straight line in one dimension while minimizing disturbance to all others.

Mohammad & Nishida 2010c

PLGC

[Mohammad & Nishida 2010c]


ABN Combination

- Main assumption
  - Action nodes are more compatible than gesture nodes
- Algorithm
  - Associate action nodes with similar stored pattern
    - Set of action node association links
  - Associate gesture nodes with similar stored pattern
    - Set of gesture node association links
  - Calculate Link Competence Index (LCI) for association links
    - Set of LCIs for gestures and actions
  - Resolve association link conflicts using LCIs
    - Final ABN

Associating action/gesture nodes

- Compile AN1 and AN2 lists {every action node}
- Calculate
- Calculate for all nodes and order them
- Create a link iff

for any

- Set

Gesture association links are calculated the same way

[Mohammad & Nishida 2010c]

LCI Calculation

[Mohammad & Nishida 2010c]

Guided Navigation

- Roles: Actor & Operator
- Protocol: explicit
- Nonverbal Behaviors:
  - Operator’s gesture
  - Actor’s motion
- Sensors
  - Accelerometers (BPACK)
  - Motion Capture (PhaseSpace)
- Procedure
  - Offline experiment
  - Online experiment

Guided Navigation

- Task Oriented
- Explicit Protocol
- One-way interaction

[Mohammad and Nishida 2009c]

Guided Navigation – Online experiment

- 18 subjects (6 days)
- Task: operator in GN scenario
- Procedure
  - WOZ session (training & familiarization)
  - 3 sessions on these conditions:
    - WOZ
    - Per-participant learner
    - Accumulating learner

[Mohammad and Nishida 2010c]

Experimental Setup

[Mohammad and Nishida 2009c]

Guided Navigation – Online Examples

- Number of failures:
  - Per-participant: 1/18
  - Accumulating: 4/17

Conclusions

- Unsupervised learning of interaction protocols is possible using three main data mining technologies:
  - Motif discovery <The Heart>
  - Change point discovery <The speeding engine>
  - Causality analysis <Natural delays>
- Several algorithms for solving these three problems were introduced and are available (among others) in source code as a MATLAB toolbox called CPMD.
- We have shown that it is possible to learn both implicit interaction protocols (gaze control) and explicit interaction protocols (guided navigation) without explicit modeling.
- By manipulating learned BNs, it is possible to improve the interactive behavior of agents over time based on interactions with multiple people.

References

[Ishiguro et al. 1999] Ishiguro, H.; Kanda, T.; Kimoto, K.; Ishida, T., "A robot architecture based on situated modules," Proceedings of the 1999 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '99), vol. 3, pp. 1617-1624, 1999.

[Catalano 2006] Joe Catalano, Tom Armstrong, and Tim Oates. Discovering patterns in real-valued time series. In Knowledge Discovery in Databases: PKDD 2006, pages 462–469, 2006.

[Chiu 2003] B. Chiu, E. Keogh, and S. Lonardi, “Probabilistic discovery of time series motifs,” in KDD ’03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM, 2003, pp. 493–498.

[Ide 2005] T. Ide and K. Inoue, “Knowledge discovery from heterogeneous dynamic systems using change-point correlations,” in Proc. SIAM Intl. Conf. Data Mining, 2005.

[Kanda et al. 2007] Kanda, T., Kamasima, M., Imai, M. et al. “A Humanoid Robot That Pretends to Listen to Route Guidance from a Human”, Auton. Robots, Vol. 22, Number 1, pages 87-100, 2007.

[Mohammad and Nishida 2009a] Yasser Mohammad, Toyoaki Nishida, Shogo Okada, “Unsupervised Simultaneous Learning of Gestures, Actions and their Associations for Human-Robot Interaction,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), pp. 2537-2544, 11-15 Oct. 2009.

[Mohammad and Nishida 2009b] Yasser Mohammad and Toyoaki Nishida, Robust Singular Spectrum Transform, The Twenty Second International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems (IEA/AIE 2009), June 2009, Taiwan, pp 123-132.

[Mohammad and Nishida 2009c] Yasser Mohammad and Toyoaki Nishida, Measuring Naturalness During Close Encounters Using Physiological Signal Processing, The Twenty Second International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems (IEA/AIE 2009), June 2009, Taiwan, pp. 281-290

[Mohammad and Nishida 2009d] Yasser Mohammad and Toyoaki Nishida, Using Physiological Signals to Detect Natural Interactive Behavior, Applied Intelligence, 13(1) 79-92

[Mohammad 2009 PhDThesis] Yasser Mohammad, Autonomous Development of Natural Interactive Behavior for Robots and Embodied Agents, PhD Thesis, Kyoto University, September 2009

[Mohammad and Nishida 2010a] Yasser Mohammad, Toyoaki Nishida, Learning Interaction Protocols using Augmented Baysian Networks Applied to Guided Navigation, Taipei, Taiwan, IROS 2010.

[Mohammad and Nishida 2010c] Yasser Mohammad, Toyoaki Nishida, Learning Interaction Protocols using Augmented Baysian Networks Applied to Guided Navigation, Taipei, Taiwan, IROS 2010.

[Mohammad and Nishida 2010d] Yasser Mohammad and Toyoaki Nishida, Controlling Gaze with an Embodied Interactive Control Architecture, Applied Intelligence, Vol. 32, No. 2, 2010, pp 148-163
