Explores how collaboration develops through interactive learning between a human and a robot, using recurrent neural networks and consolidation learning for improved performance.
Collaboration Development through Interactive Learning between Human and Robot • Tetsuya OGATA, Noritaka MASAGO, Shigeki SUGANO, Jun TANI.
Introduction • “Recent” studies about welfare robots or robots as pets attracted lots of attention • Such robots must work flexibly and cooperatively with humans • They would also have to establish relations with people in daily life
The Aims • To demonstrate interactive learning between a human operator and a robot system • Both human and robot are in the role of the learner • But these sorts of systems are usually difficult to stabilize over long operation times
Previous Work • Most similar studies focus on short operations • Exploring collapse and modification of relationships between people
The Robot • Robovie • 2 arms • 4 DOF • 'Human-like' head • Audiovisual sensors • Many tactile sensors attached to its body
The Environment • A 4x4m course • The outside walls marked alternately red and blue
The Experiment • The human and robot join arms and attempt to travel clockwise through the maze without hitting obstacles • Try to do it in the shortest time • The movement is a combination of the human's influence and the robot's neural network
Limited Senses • Both the human and robot have very limited sensory information • Robot has poor vision, and only local information such as ultrasonic sensors. • It has no global position information • The human has a blindfold on • But can see the space before the experiment begins • Both sides are anticipating future sensory input and generating the next motor commands
The Model • A Recurrent Neural Network (RNN) • The input consists of: • Current sensory input • Current motor values • The output is predictions of: • Next sensory input • Next motor values
Their model can run in one of two modes • It can work in Open Loop Mode which directly maps inputs to outputs • Closed Loop Mode takes the output and puts it straight into the input • Can generate predictions of arbitrary length • Similar to mental rehearsal
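The two modes can be illustrated with a minimal sketch. It assumes a tiny Elman-style network with random, untrained weights; the class `TinyRNN`, its layer sizes, and the helpers `open_loop` and `closed_loop` are hypothetical names for illustration, not the authors' implementation.

```python
import math
import random


class TinyRNN:
    """Toy recurrent predictor: maps the current (sensor, motor) vector
    to a prediction of the next (sensor, motor) vector.
    Weights are random and untrained (illustrative assumption)."""

    def __init__(self, n_io, n_hidden, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = mat(n_hidden, n_io)       # input -> hidden
        self.w_ctx = mat(n_hidden, n_hidden)  # previous hidden -> hidden
        self.w_out = mat(n_io, n_hidden)      # hidden -> output
        self.context = [0.0] * n_hidden       # recurrent state

    def step(self, x):
        hidden = [
            math.tanh(
                sum(w * v for w, v in zip(self.w_in[i], x))
                + sum(w * c for w, c in zip(self.w_ctx[i], self.context))
            )
            for i in range(len(self.context))
        ]
        self.context = hidden
        return [math.tanh(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]


def open_loop(net, observed_inputs):
    """Open loop: every step is driven by a real observation."""
    return [net.step(x) for x in observed_inputs]


def closed_loop(net, first_input, n_steps):
    """Closed loop: each prediction is fed back as the next input,
    so a sequence of any length can be generated from one observation."""
    x = first_input
    predictions = []
    for _ in range(n_steps):
        x = net.step(x)
        predictions.append(x)
    return predictions
```

Calling `closed_loop(TinyRNN(4, 6), [0.1, 0.2, 0.0, 0.0], 10)` produces ten predicted vectors without any further sensor readings, which is the "mental rehearsal" behaviour described above.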
Consolidation Learning • When an RNN tries to learn something new, it severely damages everything it already knows. • One way to avoid this: • Save all past teaching data in a database • Add new data • Use all of the data to retrain • Learning time increases with data
Consolidation Learning • Analogous to biology • Temporary memory stored in hippocampus • Consolidated into long-term memory during sleep • New data is stored in a database • The RNN corresponds to the long-term memory • The RNN is trained using both the rehearsed patterns and the sequence of the new experience • This enables incremental learning without damaging the structure of the RNN
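The rehearse-then-retrain idea can be sketched on a deliberately tiny model. Everything here is an assumption made for illustration: a one-dimensional linear next-step predictor stands in for the RNN, and `LinearPredictor`, `rehearse`, `consolidate`, the learning rate, and the epoch count are hypothetical choices, not the authors' settings.

```python
class LinearPredictor:
    """Stand-in for the RNN: predicts the next value as a * x + b."""

    def __init__(self, a=0.0, b=0.0):
        self.a = a
        self.b = b

    def predict(self, x):
        return self.a * x + self.b

    def train(self, sequences, epochs=200, lr=0.05):
        # Plain stochastic gradient descent on squared next-step error.
        for _ in range(epochs):
            for seq in sequences:
                for x, target in zip(seq, seq[1:]):
                    err = self.predict(x) - target
                    self.a -= lr * err * x
                    self.b -= lr * err


def rehearse(model, seeds, n_steps):
    """Generate 'remembered' sequences by running the current model
    in closed loop from each seed value."""
    sequences = []
    for seed in seeds:
        seq = [seed]
        for _ in range(n_steps):
            seq.append(model.predict(seq[-1]))
        sequences.append(seq)
    return sequences


def consolidate(model, new_sequence, seeds, n_steps=5):
    """Train on rehearsed patterns together with the new experience,
    rather than on the new experience alone."""
    rehearsed = rehearse(model, seeds, n_steps)
    model.train(rehearsed + [new_sequence])


# Old behaviour: predict(0.0) is 0.0. New experience: 1.0 should map to 1.0.
naive = LinearPredictor(a=0.5, b=0.0)
naive.train([[1.0, 1.0]])                      # new data only

consolidated = LinearPredictor(a=0.5, b=0.0)
consolidate(consolidated, [1.0, 1.0], seeds=[0.0])
```

In this toy setup, training on the new sequence alone shifts the prediction at 0.0 away from its old value, while the consolidated model learns the new mapping and keeps its old prediction at 0.0 nearly unchanged.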
Navigation • In initial stages, performs very badly • Has a collision avoidance system to help with the training • Simplified reinforcement learning for initial training • Robot and human go around workspace • Time measured • If performance is better, train RNN to incorporate new trial
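The last bullet can be read as a simple acceptance rule. The sketch below assumes "better" means a shorter lap time than the best recorded so far; that interpretation, and the names `maybe_incorporate`, `best_time`, and `training_set`, are hypothetical.

```python
def maybe_incorporate(best_time, trial_time, trial_data, training_set):
    """Add a trial to the training data only if it beat the best time.

    Returns the (possibly updated) best time and whether the trial
    was accepted. Retraining the network would follow an accepted trial.
    """
    if best_time is None or trial_time < best_time:
        training_set.append(trial_data)
        return trial_time, True
    return best_time, False


training_set = []
best = None
for lap_time, data in [(62.0, "trial-1"), (70.5, "trial-2"), (58.3, "trial-3")]:
    best, accepted = maybe_incorporate(best, lap_time, data, training_set)
```

With these made-up lap times, the first and third trials are kept and the slower second trial is discarded.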
Experiments • A feed forward neural network (FFNN) • An RNN • An RNN with consolidation learning • Trials interlaced with questionnaires meant to judge workloads • Effort, workload, complexity, performance, concentration...
Results • FFNN ultimately deteriorates • RNN ultimately stagnates • RNN with consolidation learning continued to improve
Robustness • To analyze the effect of consolidation learning, they compared the conventional RNN to the consolidation-learning RNN when subject to noise • Used the closed loop mode and introduced different amounts of noise to the inputs • Consolidation learning proved far more robust • Linked to 'operability' – Robots which don't cope well with noise can seem unwieldy
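One way such a noise test could be set up is sketched below. The slides do not say how robustness was scored, so the metric here (mean absolute deviation between a noisy and a noise-free closed-loop rollout) is an assumption, as are the names `closed_loop_rollout` and `noise_sensitivity` and the scalar predictor used in place of an RNN.

```python
import random


def closed_loop_rollout(predict, x0, n_steps, noise_std=0.0, rng=None):
    """Feed each prediction back as the next input, adding Gaussian
    noise to the input before every step."""
    rng = rng or random.Random(0)
    x = x0
    outputs = []
    for _ in range(n_steps):
        noisy_input = x + rng.gauss(0.0, noise_std)
        x = predict(noisy_input)
        outputs.append(x)
    return outputs


def noise_sensitivity(predict, x0, n_steps, noise_std, seed=0):
    """Mean absolute deviation of a noisy rollout from the clean one
    (hypothetical robustness score: lower means more robust)."""
    clean = closed_loop_rollout(predict, x0, n_steps)
    noisy = closed_loop_rollout(predict, x0, n_steps, noise_std,
                                random.Random(seed))
    return sum(abs(c - n) for c, n in zip(clean, noisy)) / n_steps


def toy_predictor(x):
    # Stand-in for a trained network's one-step prediction.
    return 0.5 * x


scores = {std: noise_sensitivity(toy_predictor, 1.0, 20, std)
          for std in (0.0, 0.1, 0.2)}
```

Running the same sweep over noise levels for two trained predictors would give one score per predictor per noise level, which can then be compared directly.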
Collaboration • Miwa showed that human collaboration was developed through repeated phases (Miwa et al., 2001) • The consolidation-learning method arguably demonstrates these phases • The RNN with consolidation learning might have similarity with human learning