keran
Uploaded by
16 SLIDES
315 VIEWS
160LIKES

Learning from Demonstrations: Kalman Filtering, Smoothing, and Dynamic Time Warping in Robotics

DESCRIPTION

This work explores learning ideal trajectories from demonstrations using Kalman filtering and smoothing techniques. By leveraging the Expectation-Maximization (EM) algorithm, the study addresses the challenges posed by imperfect human demonstrations while optimizing system parameters. Applications include autonomous helicopter aerobatics and surgical tasks such as knot-tying. The integration of dynamic time warping within the EM algorithm allows for effective estimation of hidden trajectories, leading to enhanced performance in robotic systems. Notable achievements include award-winning research in medical robotics.

1 / 16

Download Presentation

Learning from Demonstrations: Kalman Filtering, Smoothing, and Dynamic Time Warping in Robotics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript

Playing audio...

  1. Learning from Demonstrations Jur van den Berg

  2. Kalman Filtering and Smoothing • Dynamics and Observation model • Kalman Filter: • Compute • Real-time, given data so far • Kalman Smoother: • Compute • Post-processing, given all data

  3. EM Algorithm • Kalman smoother: • Compute distributions X0, …, Xt given parameters A, C, Q, R, and data y0, …, yt. • EM Algorithm: • Simultaneously optimize X0, …, Xtand A, C, Q, Rgiven data y0, …, yt.

  4. Learning from Demonstrations • Application of EM-algorithm • Example: • Autonomous helicopter aerobatics • Autonomous surgical tasks (knot-tying)

  5. Motivation • Learning an ideal “trajectory” of system • Human provides demonstrations of ideal trajectory • Human demonstrations imperfect • Multiple demonstrations implicitly encode ideal trajectory • Task: infer ideal trajectory from demonstrations

  6. Acquiring Demonstrations • Known system dynamics (A, B, Q) • Observations with known sensors (C, R) • Inertial measurement unit • GPS • Cameras • Use Kalman smoother to optimally estimate states x along demonstration trajectory

  7. Multiple Demonstrations • D demonstration trajectories of duration Tj • Hidden ideal trajectory z of duration T*

  8. Model of Ideal Trajectory • Main idea: use demonstrations as noisy observations of hidden ideal trajectory • Dynamics of hidden trajectory • Observation of hidden trajectory

  9. Inferring Ideal Trajectory • Dynamics model: Parameter N controls smoothness; A, B, Q known • Observation model: Parameters S encode relative quality of demonstrations • Use EM-algorithm with Kalman smoother to simultaneously optimize z and S (and N). • Initialize S with identity matrices

  10. Time Warping • But, this assumes demonstrations are of equal length and uniformly paced • Include Dynamic Time Warping into EM-algorithm • Such that demonstrations map temporally

  11. Time Warping • For each demonstration j, we have function tj(t) • Maps time t along zto time tj(t) along dj • Adapted observation model:

  12. Learning Time Warping • tj(t) is (initially) unknown • Assume (initially): • T* = (T1 + … + TD) / D • tj(t) = (Tj / T*) t • Adapted EM-algorithm: • Run Kalman smoother with current S and t • Optimize S by maximizing likelihood • Optimize t by maximizing likelihood (Dynamic Time Warping)

  13. Dynamic Time Warping • Match demonstration j with z • Assume that demonstration moves locally • twice as slow as z • same pace as z • twice as fast as z • Dynamic Programmingto find optimal “path” • Cost function: likelihood of

  14. Example: Helicopter Airshow • Thesis work of Pieter Abbeel • Unaligned demonstrations: • Movie • Time-aligned demonstrations: • Movie • Execution of learnt trajectory • Movie

  15. Example Surgical Knot-tie • ICRA 2010 Best Medical Robotics Paper Award • Video of knot-tie

  16. Conclusion • Learning from demonstrations • Includes Dynamic Time Warping into EM-algorithm

More Related