
Learning Image Statistics for Bayesian Tracking



Presentation Transcript


Hedvig Sidenbladh, KTH, Sweden, hedvig@nada.kth.se
Michael Black, Brown University, RI, USA, black@cs.brown.edu

1. Introduction
• Goal: tracking and reconstruction of people in 3D from a monocular video sequence, using an articulated human model.
• Approach: learn models of the appearance of humans in image sequences.
• What are the image statistics of human limbs? How do they differ from those of general scenes?

2. Why is it Hard?
• 2D-3D projection ambiguities and self occlusion.
• People have different appearance due to clothing, shape, etc.
• Different limbs look similar to each other and may appear in unexpected positions.
• Need to discriminate between "people" and "non-people", yet allow for variation in human appearance.

3. Previous Approaches
• Image flow, templates, color, stereo, ...
• Problems: under-/over-segmentation, thresholds, and no probabilistic model.
• Instead, model the appearance of people in terms of filter responses.

4. Key Idea #1
• Use the 3D model to predict the location of limb boundaries in the scene.
• Compute various filter responses steered to the predicted orientation of the limb.
• Compute the likelihood of the filter responses given the 3D model, using a statistical model learned from examples.
• This yields a generic, learned model of appearance.

5. Learning Distributions
• Pon = probability of a filter response on limbs.
• Poff = probability of a filter response on general background.
• Similar to Konishi et al. 00, but object specific: responses may be present or not.
• Both distributions are learned from training data.

6. Normalized Derivatives
• Image derivatives fx, fy, fxx, fxy, fyy are normalized with a non-linear transfer function: high responses map to 1, low responses to 0.
• Following Konishi et al. 00, the parameters of the transfer function are optimized to maximize the Bhattacharyya distance between Pon and Poff.

7. Edge
• Steered edge response: first derivatives steered to the predicted limb orientation.
• The learned Pon and Poff differ statistically; the ratio Pon/Poff can be used to discriminate foreground from background.

8. Ridge
• Steered ridge response: compares |2nd derivative ⊥ arm| with |2nd derivative ∥ arm|.
• Scale specific!
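The steered derivative responses of sections 6-8 can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation: derivatives are approximated with finite differences via np.gradient, the ridge response follows the |⊥| minus |∥| form read off the poster, and the function name steered_responses is my own.

```python
import numpy as np

def steered_responses(image, theta):
    """Steer first- and second-derivative responses to limb angle theta.

    Edge response: first derivative steered to theta.
    Ridge response: |2nd derivative perpendicular to theta| minus
    |2nd derivative parallel to theta| (large on elongated, limb-like
    structures oriented along theta).
    """
    # First derivatives; np.gradient returns (d/d_row, d/d_col) = (fy, fx).
    fy, fx = np.gradient(image.astype(float))
    # Second derivatives; average the two mixed-derivative estimates.
    fyy, fyx = np.gradient(fy)
    fxy, fxx = np.gradient(fx)
    fxy = 0.5 * (fxy + fyx)

    c, s = np.cos(theta), np.sin(theta)
    # Steered first derivative (edge response).
    edge = c * fx + s * fy
    # Second derivatives along theta and along theta + 90 degrees.
    d2_par = c * c * fxx + 2 * c * s * fxy + s * s * fyy
    d2_perp = s * s * fxx - 2 * c * s * fxy + c * c * fyy
    # Ridge response.
    ridge = np.abs(d2_perp) - np.abs(d2_par)
    return edge, ridge
```

For a synthetic vertical bar, steering to theta = pi/2 (limb axis vertical) makes the ridge response positive on the bar, since the second derivative across the bar dominates the one along it.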
9. Motion
• Motion response = I(x, t+1) - I(x+u, t), where u is the image motion predicted by the model.
• Training data: consecutive images from the sequence.

10. Key Idea #2
• Model both foreground and background, so that the entire image is "explained": p(image | foreground, background).
• p(image | foreground, background) = const · p(fgr part of image | foreground) / p(fgr part of image | background).
• Exploiting the ratio between foreground and background likelihood improves tracking.

11. Bayesian Formulation
• Posterior ∝ likelihood × temporal prior.
• The posterior is propagated over time with Condensation (Isard & Blake 96).
• Likelihood independence assumption: edge, ridge and motion cues are combined by multiplication.

12. Results
• Tracking is improved by using multiple cues (see video): edge only, ridge only, motion only, and all cues combined.
• Tracking with self occlusion (see video).

13. Conclusions
• Generic, learned model of appearance: combines multiple cues and exploits work on image statistics.
• The 3D model predicts the position and orientation of features.
• Modeling both foreground and background exploits the likelihood ratio between them and improves tracking.

14. Related Work
Statistics of natural images:
• S. Konishi, A. Yuille, J. Coughlan, and S. Zhu. Fundamental bounds on edge detection: An information theoretic evaluation of different edge cues. Submitted to PAMI.
• T. Lindeberg. Edge detection and ridge detection with automatic scale selection. IJCV, 30(2):117-156, 1998.
• J. Sullivan, A. Blake, and J. Rittscher. Statistical foreground modelling for object localization. In ECCV, pp. 307-323, 2000.
• S. Zhu and D. Mumford. Prior learning and Gibbs reaction-diffusion. PAMI, 19(11), 1997.
Tracking:
• J. Deutscher, A. Blake, and I. Reid. Articulated motion capture by annealed particle filtering. In CVPR, vol. 2, pp. 126-133, 2000.
• M. Isard and J. MacCormick. BraMBLe: A Bayesian multi-blob tracker. In ICCV, 2001.
• M. Isard and A. Blake. Contour tracking by stochastic propagation of conditional density. In ECCV, pp. 343-356, 1996.
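The Condensation propagation of section 11 can be sketched as a single select-predict-measure step. This is a sketch under stated assumptions, not the authors' tracker: a 1-D state stands in for the articulated pose, the temporal prior is plain Gaussian diffusion, the caller supplies the log-likelihood (in the poster's model this would be a sum of log(Pon/Poff) terms over predicted limb pixels), and the name condensation_step is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def condensation_step(particles, log_weights, log_likelihood, motion_std=0.05):
    """One Condensation update (Isard & Blake 96) for a 1-D state."""
    n = len(particles)
    # Select: normalize weights and resample in proportion to them.
    w = np.exp(log_weights - log_weights.max())
    w /= w.sum()
    idx = rng.choice(n, size=n, p=w)
    # Predict: diffuse with the temporal prior (Gaussian noise here).
    new_particles = particles[idx] + rng.normal(0.0, motion_std, size=n)
    # Measure: reweight each particle by the image likelihood.
    new_log_weights = np.array([log_likelihood(x) for x in new_particles])
    return new_particles, new_log_weights
```

With a likelihood peaked at some true state, repeated steps concentrate the particle set around it; the posterior mean is then the weight-averaged particle position.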
