
Using OpenRDK to learn walk parameters for the Humanoid Robot NAO


Presentation Transcript


  1. Using OpenRDK to learn walk parameters for the Humanoid Robot NAO. F. Giannone, A. Cherubini, L. Iocchi, M. Lombardo, G. Oriolo

  2. Overview: environment. Robotic agent: the humanoid robot NAO, produced by Aldebaran. Application: robotic soccer. Supporting tools: SDK and simulator.

  3. Overview: (sub)tasks. Vision Module: processes raw data from the environment. Modelling Module: elaborates the raw data to obtain more reliable information. Behaviour Control Module: decides the best behaviour to accomplish the agent's goal. Motion Control Module: actuates the robot motors accordingly.
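
As a rough illustration of this pipeline, the sketch below chains the four modules in a single perception-decision-action pass; all class and type names are hypothetical placeholders, not the actual OpenRDK module interfaces.

```cpp
// Illustrative chain of the four modules above (hypothetical names, not OpenRDK API).
struct RawImage {};                               // raw data from the environment
struct WorldModel {};                             // filtered, more reliable information
struct MotorCommand { double v, omega; };         // command sent to the motors

struct VisionModule     { WorldModel process(const RawImage&)     { return WorldModel{}; } };
struct ModellingModule  { WorldModel refine(const WorldModel& wm) { return wm; } };
struct BehaviourControl { MotorCommand decide(const WorldModel&)  { return {0.05, 0.0}; } };
struct MotionControl    { void actuate(const MotorCommand&)       { /* drive the joints */ } };

int main() {
    VisionModule vision; ModellingModule model; BehaviourControl behaviour; MotionControl motion;
    RawImage frame;                               // would come from the sensor loop
    WorldModel wm = model.refine(vision.process(frame));
    motion.actuate(behaviour.decide(wm));
}
```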

  4. Make NAO walk... how? NAO is equipped with a set of motion utilities, including a walk implementation that can be called through an interface (the NaoQi Motion Proxy) and partially customized by tuning some parameters. Main advantage: it is ready to use (...to be tuned). Drawback: it is based on an unknown walk model, so there is no flexibility at all. For these reasons we decided to develop our own walk model and to tune it using machine learning techniques.

  5. SPQR Walking Library development workflow: develop the walk model using Matlab; test the walk model on the Webots simulator; design and implement a C++ library (the SPQR Walking Library) for our RDK soccer agent on the Webots simulator; test our walking RDK agent on the real NAO robot; finally, tune the walk parameters (on the Webots simulator and on NAO).

  6. A simple walking RAgent for NAO. A Simple Behaviour Module switches between two states (walk and stand) and drives the Motion Control Module, which uses the SPQR Walking Library. The library talks either to the real NAO (NaoQi) through a NaoQi Adaptor or to the WEBOTS simulator through a Webots Client over a TCP channel.
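
A minimal sketch of the two-state behaviour described above; the class and type names are illustrative, and the real RAgent is built on the OpenRDK module infrastructure rather than on a stand-alone class like this.

```cpp
// Two-state behaviour sketch (hypothetical names, not the actual RAgent code).
enum class BehaviourState { STAND, WALK };

struct VelocityCommand { double v; double omega; };   // (v, w) passed to motion control

class SimpleBehaviourModule {
public:
    // Decide the next command from a single condition (e.g., "should the robot walk?").
    VelocityCommand step(bool shouldWalk) {
        state_ = shouldWalk ? BehaviourState::WALK : BehaviourState::STAND;
        return state_ == BehaviourState::WALK ? VelocityCommand{0.04, 0.0}   // walk forward
                                              : VelocityCommand{0.0, 0.0};   // stand still
    }
private:
    BehaviourState state_ = BehaviourState::STAND;
};
```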

  7. SPQR Walking Engine model. NAO model characteristics: 21 degrees of freedom, no actuated trunk, no dynamic model available. We follow a "static walking pattern": the desired trajectories are defined a priori by choosing a set of output variables (the 3D coordinates of selected points of the robot) and by choosing and parametrizing the desired trajectories for these variables at each phase of the gait. The gait is driven by velocity commands (v, ω), where v is the linear velocity and ω is the angular velocity.
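
To make the idea of an a-priori trajectory definition concrete, here is a generic example of a parametrized swing-foot trajectory in the xz plane; the functional form (linear advance, sinusoidal lift) is an assumption for illustration only, not the SPQR trajectory.

```cpp
// Generic parametrized swing-foot trajectory in the xz plane (illustrative form).
#include <cmath>

struct FootPosition { double x, z; };

// s in [0, 1] is the normalized time within the swing phase.
FootPosition swingFootTrajectory(double s, double stepLength, double stepHeight) {
    const double pi = 3.14159265358979;
    FootPosition p;
    p.x = stepLength * s;                    // advance linearly along x
    p.z = stepHeight * std::sin(pi * s);     // lift and lower the foot along z
    return p;
}
```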

  8. SPQR velocity commands. The Behavior Control Module sends velocity commands (v, ω) to the Motion Control Module, which produces the joints matrix. From the Stand Position an Initial Half Step starts the gait; a (v, 0) command drives the Rectilinear Walk Swing, (v, ω) the Curvilinear Walk Swing and (0, ω) the Turn Step, while (0, 0) triggers the Final Half Step back to the Stand Position.
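
A sketch of how a velocity command could select the gait phases listed above; the mapping and the function name are illustrative, and the initial/final half steps are handled by the walking engine's internal state machine rather than by this simple dispatch.

```cpp
// Map a velocity command (v, w) to one of the gait phases (illustrative only).
#include <cmath>

enum class GaitPhase { STAND, INITIAL_HALF_STEP, RECTILINEAR_SWING,
                       CURVILINEAR_SWING, TURN_STEP, FINAL_HALF_STEP };

GaitPhase phaseForCommand(double v, double omega) {
    const double eps = 1e-6;
    bool moving  = std::fabs(v)     > eps;
    bool turning = std::fabs(omega) > eps;
    if (!moving && !turning) return GaitPhase::STAND;              // (0, 0)
    if ( moving && !turning) return GaitPhase::RECTILINEAR_SWING;  // (v, 0)
    if ( moving &&  turning) return GaitPhase::CURVILINEAR_SWING;  // (v, w)
    return GaitPhase::TURN_STEP;                                   // (0, w)
}
```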

  9. SPQR walking subtasks and parameters. Biped walking alternates a swing phase and a double support phase, balanced by the parameter SS%. The SPQR walk subtasks and their parameters are: foot trajectories in the xz plane (Xtot, Xsw0, Xds, Zst, Zsw); arm control (Ks); hip yaw/pitch control for turning (Hyp); center-of-mass trajectory in the lateral direction (Yft, Yss, Yds, Kr).
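
A hypothetical container for these parameters, grouped by subtask, makes the naming explicit; the field names follow the slide, while the grouping comments are interpretations rather than the actual library definitions.

```cpp
// Hypothetical container for the SPQR walk parameters listed above.
struct SpqrWalkParameters {
    // Gait timing
    double ssPercent;          // SS% : swing / double-support balance of the gait cycle
    // Foot trajectories in the xz plane
    double xTot, xSw0, xDs;    // step length components
    double zSt, zSw;           // stance / swing foot heights
    // Arm control
    double kS;                 // Ks  : arm swing gain
    // Hip yaw/pitch control (turning)
    double hYP;                // Hyp : hip yaw-pitch amplitude
    // Center-of-mass trajectory in the lateral (y) direction
    double yFt, ySs, yDs, kR;
};
```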

  10. Walk tuning: main issues • Possible choices • By hand • By using machine learning techniques • Machine learning seems the best solution • Less human interaction • Explores the search space in a more systematic way • ...but take care of some aspects • You need to define an effective fitness function • You need to choose the right algorithm to explore the parameter space • Only a limited number of experiments can be done on a real robot

  11. SPQR learning system architecture. The learning library uses a Learner, which produces the iteration experiments; the RAgent (running either on Webots or on the real NAO, and using the walking library) executes them and returns the data needed to evaluate the fitness (e.g., GPS data).

  12. SPQR Learner. The Learner supports interchangeable strategies: Policy Gradient (e.g., PGPR), the Nelder-Mead Simplex Method and a Genetic Algorithm. On the first iteration it returns the initial iteration and its iteration information; on subsequent iterations it applies the chosen algorithm (strategy) and returns the next iteration and its iteration information.
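
One way to realize this flow is a strategy interface like the sketch below; the type and method names are assumptions, not the SPQR learning library API.

```cpp
// Learner built on interchangeable strategies (illustrative interface).
#include <vector>

using Policy = std::vector<double>;            // one point in parameter space

struct Iteration {
    std::vector<Policy> experiments;           // policies to evaluate on the robot
};

class LearningStrategy {                       // policy gradient, Nelder-Mead, GA, ...
public:
    virtual ~LearningStrategy() = default;
    virtual Iteration initialIteration() = 0;
    virtual Iteration nextIteration(const std::vector<double>& fitnesses) = 0;
};

class Learner {
public:
    explicit Learner(LearningStrategy& s) : strategy_(s) {}
    Iteration step(const std::vector<double>& fitnesses) {
        if (first_) { first_ = false; return strategy_.initialIteration(); }  // first iteration
        return strategy_.nextIteration(fitnesses);                            // apply the strategy
    }
private:
    LearningStrategy& strategy_;
    bool first_ = true;
};
```

Separating the learner from the strategy lets the same experiment loop drive any of the three algorithms without changes.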

  13. Policy Gradient (PG) iteration. Given a point p in the parameter space Θ ⊆ R^K: generate n (n = mK) policies from p (each component pi is perturbed to one of pi, pi+ε, pi−ε); evaluate the policies; for each k ∈ {1, ..., K}, compute the average fitnesses Fk+, Fk0, Fk−; for each k ∈ {1, ..., K}, if Fk0 > Fk+ and Fk0 > Fk− then Δk = 0, else Δk = Fk+ − Fk−; finally, Δ* = η · normalized(Δ) and p' = p + Δ*.
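
The sketch below implements one such iteration under the definitions above; the perturbation size epsilon, the step size eta and the random perturbation scheme are generic choices, not necessarily those used in the SPQR experiments.

```cpp
// One policy-gradient iteration (assumed implementation of the steps above).
#include <cmath>
#include <cstdlib>
#include <vector>

using Policy = std::vector<double>;

Policy policyGradientIteration(const Policy& p, int m,
                               double epsilon, double eta,
                               double (*evaluate)(const Policy&)) {
    const int K = static_cast<int>(p.size());
    const int n = m * K;                                    // number of test policies

    // 1) Generate n perturbed policies, remembering the sign of each perturbation.
    std::vector<Policy> policies(n, p);
    std::vector<std::vector<int>> signs(n, std::vector<int>(K));
    for (int i = 0; i < n; ++i)
        for (int k = 0; k < K; ++k) {
            signs[i][k] = (std::rand() % 3) - 1;            // -1, 0 or +1
            policies[i][k] += signs[i][k] * epsilon;
        }

    // 2) Evaluate the policies and 3) accumulate fitness per component and sign.
    std::vector<double> fPlus(K, 0), fZero(K, 0), fMinus(K, 0);
    std::vector<int> nPlus(K, 0), nZero(K, 0), nMinus(K, 0);
    for (int i = 0; i < n; ++i) {
        double f = evaluate(policies[i]);
        for (int k = 0; k < K; ++k) {
            if (signs[i][k] > 0)      { fPlus[k]  += f; ++nPlus[k]; }
            else if (signs[i][k] < 0) { fMinus[k] += f; ++nMinus[k]; }
            else                      { fZero[k]  += f; ++nZero[k]; }
        }
    }

    // 4) Gradient estimate per component.
    std::vector<double> delta(K, 0.0);
    double norm = 0.0;
    for (int k = 0; k < K; ++k) {
        double Fp = nPlus[k]  ? fPlus[k]  / nPlus[k]  : 0.0;
        double F0 = nZero[k]  ? fZero[k]  / nZero[k]  : 0.0;
        double Fm = nMinus[k] ? fMinus[k] / nMinus[k] : 0.0;
        delta[k] = (F0 > Fp && F0 > Fm) ? 0.0 : Fp - Fm;
        norm += delta[k] * delta[k];
    }

    // 5) Normalize, scale by the step size eta, and move the policy.
    Policy next = p;
    norm = std::sqrt(norm);
    if (norm > 0)
        for (int k = 0; k < K; ++k) next[k] += eta * delta[k] / norm;
    return next;
}
```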

  14. Enhancing PG: PGPR. At each iteration i, the gradient estimate Δ(i) can be used to obtain a metric for measuring the relevance of the parameters, accumulated over past iterations through a forgetting factor. Given the relevance and a threshold T, PGPR prunes the less relevant parameters in the next iterations.
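
A hedged sketch of this pruning idea follows; the relevance metric used here (exponentially weighted magnitude of the gradient estimate, forgetting factor lambda) is an assumed form, not necessarily the exact definition used in PGPR.

```cpp
// Assumed PGPR-style relevance tracking and pruning (not the paper's exact metric).
#include <cmath>
#include <vector>

class RelevancePruner {
public:
    RelevancePruner(int K, double lambda, double threshold)
        : relevance_(K, 0.0), lambda_(lambda), threshold_(threshold) {}

    // Update the relevance of each parameter from the latest gradient estimate Delta(i).
    void update(const std::vector<double>& delta) {
        for (size_t k = 0; k < relevance_.size(); ++k)
            relevance_[k] = lambda_ * relevance_[k] + (1.0 - lambda_) * std::fabs(delta[k]);
    }

    // Parameters whose relevance falls below the threshold T are frozen
    // (no longer perturbed) in the following iterations.
    bool isPruned(size_t k) const { return relevance_[k] < threshold_; }

private:
    std::vector<double> relevance_;
    double lambda_, threshold_;
};
```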

  15. Curvilinear biped walking experiment • The robot moves along a curve with radius R for a time t. The fitness function combines the path length covered by the robot with the radial error with respect to the desired curve of radius R.
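
As an illustration only, the sketch below evaluates a fitness of this kind from a recorded path, rewarding path length and penalizing the mean radial error with respect to a circle of radius R centered at the origin; the weighting and functional form are assumptions, not the formula used in the experiments.

```cpp
// Hypothetical curvilinear-walk fitness: reward path length, penalize radial error.
#include <cmath>
#include <vector>

struct Pose2D { double x, y; };

double curvilinearFitness(const std::vector<Pose2D>& path, double R, double kappa) {
    double length = 0.0, radialError = 0.0;
    for (size_t i = 1; i < path.size(); ++i) {
        double dx = path[i].x - path[i - 1].x;
        double dy = path[i].y - path[i - 1].y;
        length += std::sqrt(dx * dx + dy * dy);                      // path length covered
        double r = std::sqrt(path[i].x * path[i].x + path[i].y * path[i].y);
        radialError += std::fabs(r - R);                             // distance from the desired circle
    }
    if (path.size() > 1) radialError /= (path.size() - 1);           // mean radial error
    return length - kappa * radialError;                             // maximize length, minimize error
}
```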

  16. Simulators in learning tasks • Advantages • You can test the gait model and the learning algorithm without being biased by noise • Limits • The results of experiments on the simulator can be ported to the real robot, but solutions specialized for the simulated model may not be as effective on the real robot (e.g., the simulation does not take asymmetries into account and the models are not very accurate)

  17. Results (1) • Five sessions of PG, 20 iterations each, all starting from the same initial configuration • SS%, Ks, Yft have been set to hand-tuned values • 16 policies for each iteration • Fitness increases in a regular way • Low variance among the five simulations

  18. Results (2). [Charts: final parameter sets (Zsw, Xsw0, Xs, Kr) for the five PG runs; five runs of PGPR.]

  19. Bibliography • A. Cherubini, F. Giannone, L. Iocchi, M. Lombardo, G. Oriolo. "Policy Gradient Learning for a Humanoid Soccer Robot". Accepted for the journal Robotics and Autonomous Systems. • A. Cherubini, F. Giannone, L. Iocchi, and P. F. Palamara, "An extended policy gradient algorithm for robot task learning", Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007. • A. Cherubini, F. Giannone, and L. Iocchi, "Layered learning for a soccer legged robot helped with a 3D simulator", Proc. of the 11th International RoboCup Symposium, 2007. • http://openrdk.sourceforge.net • http://www.aldebaran-robotics.com/ • http://spqr.dis.uniroma1.it

  20. Any questions?
