Multi-camera Tracking of Articulated Human Motion using Motion and Shape Cues
Aravind Sundaresan and Rama Chellappa
Center for Automation Research
University of Maryland, College Park MD USA
What is motion capture?
Motion capture (Mocap) is the process of analysing and expressing human motion in mathematical terms.
Stages: initialisation, pose estimation and tracking.
Applications: motion analysis for clinical studies, human-computer interaction and computer animation.
Marker-based systems have shortcomings:
They are cumbersome, introduce artefacts and are time-consuming.
A marker-less system is desirable.
Use multiple cameras (8) in our capture setup.
640x480 grey-scale images at 30 fps.
Calibrated using the algorithm of Svoboda.
Use articulated human body model.
Super-quadrics for body segments.
Model described by joint locations and super-quadrics.
Pose is described by joint angles.
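A minimal sketch of how such a body model could be represented. The slides do not give the exact super-quadric form, so this assumes the standard superellipsoid inside-outside function; the class names, field names, joint names and sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Superquadric:
    """Superellipsoid in its own local frame (assumed form, not taken from the slides)."""
    a1: float          # semi-axis along x
    a2: float          # semi-axis along y
    a3: float          # semi-axis along z
    e1: float = 1.0    # shape exponent along the z direction
    e2: float = 1.0    # shape exponent in the x-y plane

    def inside_outside(self, x, y, z):
        """Return F(x, y, z): F < 1 inside, F == 1 on the surface, F > 1 outside."""
        xy = abs(x / self.a1) ** (2.0 / self.e2) + abs(y / self.a2) ** (2.0 / self.e2)
        return xy ** (self.e2 / self.e1) + abs(z / self.a3) ** (2.0 / self.e1)

    def contains(self, x, y, z):
        return self.inside_outside(x, y, z) <= 1.0


@dataclass
class BodyModel:
    """Model = joint locations plus one super-quadric per segment (hypothetical layout)."""
    joints: dict     # joint name -> (x, y, z) location
    segments: dict   # segment name -> Superquadric


# With e1 = e2 = 1 the shape is an ordinary ellipsoid.
shape = Superquadric(a1=1.0, a2=1.0, a3=2.0)
model = BodyModel(joints={"knee": (0.0, 0.0, 0.0)}, segments={"lower_leg": shape})

# A pose is then a set of joint angles, here three rotation angles per joint.
pose = {"knee": (0.0, 0.4, 0.0)}
```

Smaller exponents give box-like segments and larger ones give pinched shapes, which is why a single parametric family can cover most body parts.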
Use images from multiple cameras.
Compute 2-D pixel displacement between t and t+1.
Predict 3-D pose at t+1 using pixel displacement.
Compute spatial energy function as function of pose.
Minimise energy function to obtain pose at t+1.
Use motion and spatial cues for tracking.
Motion cues use texture.
Error accumulation: motion cues estimate only the change in pose, so errors add up over time.
Spatial cues are obtained from silhouettes, edges, etc.
Instability: solutions are stable only “locally”.
Compute motion(t) from pixel displacement.
Predict pose(t+1) from pose(t) and motion(t).
Assimilate spatial cues into single energy function.
Correct pose(t+1) by minimising energy function.
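The predict-and-correct steps above can be sketched as follows. This is only an illustration: the pose is a plain list of joint angles, the energy is a toy quadratic standing in for the real spatial energy, and plain gradient descent with a numerical gradient is an assumed optimiser, not necessarily the one used in this work. All function names are hypothetical.

```python
def predict(pose, motion):
    """Predict pose(t+1) by adding the estimated change in pose to pose(t)."""
    return [p + m for p, m in zip(pose, motion)]


def correct(pose, energy, step=0.1, iters=200, eps=1e-4):
    """Refine a predicted pose by gradient descent on the energy function."""
    pose = list(pose)
    for _ in range(iters):
        grad = []
        for i in range(len(pose)):
            hi = pose[:i] + [pose[i] + eps] + pose[i + 1:]
            lo = pose[:i] + [pose[i] - eps] + pose[i + 1:]
            # Central-difference estimate of the i-th partial derivative.
            grad.append((energy(hi) - energy(lo)) / (2.0 * eps))
        pose = [p - step * g for p, g in zip(pose, grad)]
    return pose


def track_step(pose_t, motion_t, energy):
    """One tracking step: motion-based prediction, then spatial correction."""
    return correct(predict(pose_t, motion_t), energy)


# Toy stand-in for the spatial energy: squared distance to a "true" pose.
true_pose = [0.12, 0.38]
toy_energy = lambda pose: sum((p - t) ** 2 for p, t in zip(pose, true_pose))

pose_t = [0.0, 0.5]
motion_t = [0.1, -0.1]          # slightly wrong motion estimate
pose_next = track_step(pose_t, motion_t, toy_energy)
```

The prediction lands near the minimum, so the correction only has to be stable locally, and the correction removes the drift that the motion estimate alone would accumulate.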
Project the model onto the image to obtain:
Body part label for each pixel.
3-D location of each pixel.
Mask for each body part.
Find dense pixel correspondence using a parametric optical-flow-based algorithm for each segment.
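As an illustration of per-segment parametric flow, the sketch below fits the simplest parametric model, a single translation (u, v), to all pixels inside a segment mask by least squares on the brightness-constancy equation. The slides do not say which parametric model is used, so the translation model, the function names and the synthetic test image are assumptions.

```python
def estimate_translation(img0, img1, mask):
    """Least-squares (u, v) with img1(x + u, y + v) ~ img0(x, y) over masked pixels."""
    sxx = sxy = syy = sxt = syt = 0.0
    h, w = len(img0), len(img0[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if not mask[y][x]:
                continue
            ix = (img0[y][x + 1] - img0[y][x - 1]) / 2.0   # spatial gradient in x
            iy = (img0[y + 1][x] - img0[y - 1][x]) / 2.0   # spatial gradient in y
            it = img1[y][x] - img0[y][x]                   # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("not enough texture in the segment to estimate flow")
    # Solve the 2x2 normal equations [sxx sxy; sxy syy] [u v]^T = -[sxt syt]^T.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def make_image(shift_x, shift_y, size=11):
    """Synthetic smooth texture, displaced by (shift_x, shift_y) pixels."""
    c = size // 2
    def f(x, y):
        dx, dy = x - c - shift_x, y - c - shift_y
        return dx * dx + 2.0 * dy * dy + dx * dy
    return [[f(x, y) for x in range(size)] for y in range(size)]


img0 = make_image(0.0, 0.0)
img1 = make_image(0.3, -0.2)
mask = [[True] * 11 for _ in range(11)]    # one segment covering the whole image
u, v = estimate_translation(img0, img1, mask)
```

The synthetic texture is quadratic and centred in the window, so one linearised step recovers the shift; real images would normally need an iterative or coarse-to-fine scheme and a richer motion model per segment.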
Combine multiple spatial cues into a single “spatial energy function”.
Compute pose energy as function of dx, dy and Φ.
Given multiple views and a 3-D pose:
Compute the 2-D pose for the i-th image.
Compute Ei for the i-th camera using the 2-D pose.
3-D pose energy: E = E1 + E2 + ... + En.
Compute minimum energy pose using optimisation.
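A toy sketch of the summed multi-view energy and its minimisation. Here the "pose" is a single 3-D point, each camera is an orthographic projection onto a coordinate plane, and each per-view energy is the squared distance to an observed 2-D point. These are stand-ins chosen for brevity, and the derivative-free coordinate search is an assumed optimiser; none of this is the actual energy or solver from the slides.

```python
def project_xy(p):
    return (p[0], p[1])


def project_xz(p):
    return (p[0], p[2])


def project_yz(p):
    return (p[1], p[2])


def view_energy(project, observed, pose3d):
    """E_i: project the 3-D pose into view i and compare with the 2-D observation."""
    u, v = project(pose3d)
    return (u - observed[0]) ** 2 + (v - observed[1]) ** 2


def total_energy(pose3d, views):
    """E = E_1 + E_2 + ... + E_n over all camera views."""
    return sum(view_energy(proj, obs, pose3d) for proj, obs in views)


def coordinate_search(energy, start, step=0.5, tol=1e-6):
    """Derivative-free minimisation: try +/- step on each coordinate, halve when stuck."""
    pose = list(start)
    best = energy(pose)
    while step > tol:
        improved = False
        for i in range(len(pose)):
            for delta in (step, -step):
                trial = list(pose)
                trial[i] += delta
                e = energy(trial)
                if e < best:
                    pose, best, improved = trial, e, True
        if not improved:
            step /= 2.0
    return pose, best


# Observations generated from a hypothetical true pose (1, 2, 3).
views = [(project_xy, (1.0, 2.0)), (project_xz, (1.0, 3.0)), (project_yz, (2.0, 3.0))]
pose_min, e_min = coordinate_search(lambda p: total_energy(p, views), [0.0, 0.0, 0.0])
```

Because each view only constrains two of the three coordinates, no single Ei determines the pose; summing the energies lets the views constrain each other.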