
Chapter 5 Multi-Cue 3D Model-Based Object Tracking



  1. Visual Perception and Robotic Manipulation Springer Tracts in Advanced Robotics Chapter 5 Multi-Cue 3D Model-Based Object Tracking Geoffrey Taylor Lindsay Kleeman Intelligent Robotics Research Centre (IRRC) Department of Electrical and Computer Systems Engineering Monash University, Australia

  2. Contents • Motivation and Background • Overview of proposed framework • Kalman filter • Colour tracking • Edge tracking • Texture tracking • Experimental results

  3. Introduction • Research aim: • Enable a humanoid robot to manipulate a priori unknown objects in an unstructured office or domestic environment. • Previous results: • Visual servoing • Robust 3D stripe scanning • 3D segmentation, object modelling (image: Metalman)

  4. Why Object Tracking? • Metalman uses visual servoing to execute manipulations: control signals are calculated from observed relative pose of gripper and object. • Object tracking allows Metalman to: • Handle dynamic scenes • Detect unstable grasps • Detect motion from accidental collisions • Compensate for calibration errors in kinematic and camera models

  5. Why Multi-Cue? • Individual cues only provide robust tracking under limited conditions: • Edges fail in low contrast, distracted by texture • Textures not always available, distracted by reflections • Colour gives only partial pose • Fusion of multiple cues provides robust tracking in unpredictable conditions.

  6. Multi-Cue Tracking • Mainly applied to 2D feature-based tracking. • Sequential cue tracking: • Selector (focus of attention) followed by tracker • Can be extended to a multi-level selector/tracker framework (Toyama and Hager 1999). • Cue integration: • Voting, fuzzy logic (Kragić and Christensen 2001) • Bayesian fusion, probabilistic models • ICondensation (Isard and Blake 1998)

  7. Proposed framework • 3D model-based tracking: models extracted using segmentation of range data from stripe scanner. • Colour (selector), edges and texture (trackers) optimally fused in a Kalman filter framework. Colour + range scan; Textured polygonal models

  8. Kalman filter • Optimally estimate object state x_k given previous state x_{k-1} and new measurements y_k. • System state comprises pose and velocity screw: x_k = [p_k, v_k]^T • State prediction (constant-velocity dynamics): p*_k = p_{k-1} + v_{k-1}·Δt, v*_k = v_{k-1} • State update: x_k = x*_k + K_k [ y_k - y*(x*_k) ] • Need a measurement function y*(x*_k) for each cue
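
A minimal sketch of this predict/update cycle, assuming the pose p is parameterised as a plain 6-vector so the state stacks to 12 elements; Q, R, h and H (process noise, measurement noise, a cue's measurement function and its Jacobian) are illustrative inputs, not quantities specified by the book:

```python
import numpy as np

def kf_predict(x, P, dt, Q):
    """Constant-velocity prediction for the state x = [p (pose), v (velocity screw)]."""
    F = np.eye(12)
    F[:6, 6:] = dt * np.eye(6)           # p*_k = p_{k-1} + v_{k-1}·dt,  v*_k = v_{k-1}
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q             # propagate the state covariance
    return x_pred, P_pred

def kf_update(x_pred, P_pred, y, h, H, R):
    """Fuse one cue's measurement y, given its measurement function h and Jacobian H."""
    nu = y - h(x_pred)                   # innovation: y_k - y*(x*_k)
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain K_k
    x_new = x_pred + K @ nu
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new
```

Because each cue supplies its own measurement function, colour, edge and texture measurements can be fused by applying the update step once per cue; for projective measurement functions H is the Jacobian evaluated at the predicted state (an EKF-style linearisation).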

  9. Measurements • For each new frame, predict the object pose and project the model onto the image to define a region of interest (ROI): • only process within the ROI to eliminate distractions and reduce computational expense Captured frame, predicted pose & ROI
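
A sketch of the ROI step, assuming a pinhole camera with intrinsics K and model vertices in object coordinates; all names are illustrative:

```python
import numpy as np

def predict_roi(vertices, R, t, K, margin=10):
    """Project model vertices at the predicted pose (R, t) and return a
    bounding-box ROI on the image plane, padded by a fixed margin."""
    cam = (R @ vertices.T).T + t          # object frame -> camera frame
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]           # perspective divide
    x0, y0 = uv.min(axis=0) - margin
    x1, y1 = uv.max(axis=0) + margin
    return int(x0), int(y0), int(x1), int(y1)
```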

  10. Colour Tracking • Colour filter created from RGB histogram of texture • Image processing: • Apply filter to ROI • Calculate centroid of the largest connected blob • Measurement prediction: • Project centroid of model vertices at predicted pose onto the image plane
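
A sketch of the blob-centroid measurement. The slide builds the filter from an RGB histogram of the object's texture; this version substitutes OpenCV's hue-saturation back-projection as a stand-in, with `hist` assumed precomputed from the texture (e.g. cv2.calcHist over the hue and saturation channels, normalised):

```python
import cv2
import numpy as np

def colour_centroid(roi_bgr, hist):
    """Back-project a colour histogram over the ROI and return the centroid
    of the largest connected blob, or None if nothing passes the filter."""
    hsv = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2HSV)
    back = cv2.calcBackProject([hsv], [0, 1], hist, [0, 180, 0, 256], 1)
    _, mask = cv2.threshold(back, 50, 255, cv2.THRESH_BINARY)
    n, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
    if n < 2:
        return None                                       # only background found
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # label 0 is background
    return centroids[largest]                             # (x, y) blob centroid
```

The predicted measurement is then simply the projection of the model-vertex centroid at the predicted pose, as on the slide.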

  11. Edge Tracking • To avoid texture, only consider silhouette edges • Image processing: • Extract directional edge pixels (Sobel masks) • Combine colour data to extract silhouette edges • Match pixels to projected model edge segments
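
A sketch of the directional edge extraction with Sobel masks (the colour-based silhouette test and the matching to projected model segments are not shown):

```python
import cv2
import numpy as np

def directional_edges(gray_roi, mag_thresh=50.0):
    """Return edge pixel coordinates and gradient directions within the ROI."""
    gx = cv2.Sobel(gray_roi, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray_roi, cv2.CV_32F, 0, 1, ksize=3)
    mag = np.hypot(gx, gy)
    ys, xs = np.nonzero(mag > mag_thresh)
    angles = np.arctan2(gy[ys, xs], gx[ys, xs])   # per-pixel gradient direction
    return xs, ys, angles
```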

  12. Edge Tracking • Fit line to matched points for each segment and extract angle and mean position • Measurement prediction: • Project model vertices to image plane • For each model edge, calculate angle and distance to measured mean point
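
A sketch of the per-segment fit, assuming `points` holds the (x, y) edge pixels already matched to one projected model edge; a total-least-squares fit via SVD yields the segment's angle and mean position:

```python
import numpy as np

def fit_edge_segment(points):
    """Fit a line to matched edge pixels; return (angle, mean position)."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    # The principal direction of the centred points gives the line orientation
    _, _, vt = np.linalg.svd(points - mean)
    direction = vt[0]
    angle = np.arctan2(direction[1], direction[0])
    return angle, mean
```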

  13. Texture Tracking • Textures represented as 8×8 pixel templates with high spatial variation of intensity • Image processing: • Render textured object in predicted pose • Apply feature detector (Shi & Tomasi 1994) • Extract templates, match to captured frame by SSD
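
A sketch of the detection-and-matching step, assuming `rendered` is a greyscale rendering of the textured model at the predicted pose and `frame` is the greyscale captured image; parameter values are illustrative:

```python
import cv2

def match_templates(rendered, frame, patch=8, search=16):
    """Detect Shi-Tomasi features on the rendered view and match their
    8x8 templates into the frame by minimum SSD over a search window."""
    corners = cv2.goodFeaturesToTrack(rendered, maxCorners=50,
                                      qualityLevel=0.01, minDistance=8)
    if corners is None:
        return []
    h = patch // 2
    rows, cols = frame.shape[:2]
    matches = []
    for x, y in corners.reshape(-1, 2).astype(int):
        if not (h + search <= x < cols - h - search and
                h + search <= y < rows - h - search):
            continue                                 # too close to the border
        tmpl = rendered[y - h:y + h, x - h:x + h]    # 8x8 intensity template
        win = frame[y - h - search:y + h + search,
                    x - h - search:x + h + search]   # local search window
        # TM_SQDIFF is the SSD score; its minimum locates the best match
        res = cv2.matchTemplate(win, tmpl, cv2.TM_SQDIFF)
        _, _, min_loc, _ = cv2.minMaxLoc(res)
        matches.append(((x, y), (x - search + min_loc[0], y - search + min_loc[1])))
    return matches
```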

  14. Texture Tracking • Apply outlier rejection: • Consistent motion vectors • Invertible matching • Calculate the 3D position of texture features on the surface of the model • Measurement prediction: • Project 3D surface features in current pose onto image plane
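
A sketch of the consistent-motion test, given the (feature, match) pairs from the previous step. The invertible-matching check would additionally re-match each feature from the frame back to the rendering and require round-trip agreement; the back-projection of features onto the model surface is not shown:

```python
import numpy as np

def reject_outliers(matches, tol=2.0):
    """Keep matches whose motion vector lies within tol pixels of the median motion."""
    if not matches:
        return []
    vecs = np.array([(x1 - x0, y1 - y0)
                     for (x0, y0), (x1, y1) in matches], dtype=float)
    median = np.median(vecs, axis=0)              # robust estimate of common motion
    keep = np.linalg.norm(vecs - median, axis=1) < tol
    return [m for m, k in zip(matches, keep) if k]
```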

  15. Experimental Results • Three tracking scenarios: • Poor visual conditions • Occluding obstacles • Rotation about axis of symmetry • Off-line processing of captured video sequences: • Direct comparison of tracking performance using edges only, texture only, and multimodal fusion. • Actual processing rate is about 15 frames/sec

  16. Poor Visual Conditions Colour, texture and edge tracking

  17. Poor Visual Conditions Edges only / Texture only

  18. Occlusions Colour, texture and edge tracking

  19. Occlusions Edges only / Texture only

  20. Occlusions Tracking precision

  21. Symmetrical Objects Colour, texture and edge tracking

  22. Symmetrical Objects Object orientation

  23. Conclusions • Fusion of multimodal visual features overcomes weaknesses in individual cues, and provides robust tracking where single cue tracking fails. • The proposed framework is extensible; additional modalities can be fused provided a suitable measurement model is devised.

  24. Open Issues • Include additional modalities: • optical flow (motion) • depth from stereo • Calculate measurement errors as part of feature extraction for measurement covariance matrix. • Modulate size of ROI to reflect current state covariance, so ROI automatically increases as visual conditions degrade, and decreases under good conditions to increase processing speed.
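
As a rough illustration of the last point (a hypothetical sketch, not from the book), the ROI padding could be tied to the predicted positional covariance:

```python
import numpy as np

def roi_margin(P_pred, base=10.0, scale=3.0):
    """Hypothetical margin that grows with predicted pose uncertainty,
    using the positional block of the state covariance as a proxy."""
    sigma = np.sqrt(np.trace(P_pred[:3, :3]))   # combined positional std. dev.
    return base + scale * sigma
```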
