
CRF-based Activity Recognition on Manifolds



  1. CRF-based Activity Recognition on Manifolds Presented by Arshad Jamal and Prakhar Banga

  2. Introduction
  • Objective: explore STIP-based activity analysis by learning a manifold structure for the HoG/HoF descriptors to reduce the dimensionality of the data, then training an HCRF-based discriminative classifier to classify actions
  • Challenges: huge diversity in the data (viewpoints, appearance, motion, lighting, etc.)
  • Applications: video surveillance, video indexing and retrieval, and human-computer interaction

  3. Related Works
  • Various variants of spatio-temporal interest point (STIP) based approaches exist
  • Generative models, e.g. Bayesian networks that model key pose changes
  • Discriminative models, e.g. a CRF network that models the temporal dynamics of silhouette-based features (Wang and Suter, Learning and Matching of Dynamic Shape Manifolds for Human Action Recognition, IEEE TIP 2007)

  4. Proposed Approach
  Training: Labeled Training Dataset → STIP Detector & Descriptor → Manifold Learning (LPP) → Learn CRF Classifier
  Testing: Test Video → STIP Detector & Descriptor → Dimensionality Reduction → Classifier → Action Class
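
  A minimal sketch of this pipeline in Python, assuming hypothetical helpers detect_stips, fit_lpp, fit_hcrf and predict_hcrf (these names are illustrative, not the authors' code):

    import numpy as np

    def train(videos, labels, dim=10):
        # 1. STIP detection + 162-dim HoG/HoF description per video
        # (detect_stips is a hypothetical helper)
        sequences = [detect_stips(v) for v in videos]        # list of (n_i, 162) arrays
        # 2. Learn the LPP projection from the pooled descriptors
        projection = fit_lpp(np.vstack(sequences), dim=dim)  # (162, dim) matrix
        # 3. Train the HCRF on the projected sequences
        low_dim = [s @ projection for s in sequences]
        return projection, fit_hcrf(low_dim, labels)

    def classify(video, projection, model):
        # Test time follows the same feature path: STIPs -> LPP -> HCRF
        return predict_hcrf(model, detect_stips(video) @ projection)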

  5. Algorithm Details: STIP Detector
  • Looks for distinctive neighborhoods in the video: high image variation in both space and time
  • Describes them using distributions of gradient and optical flow
  • A spatio-temporal second-moment matrix μ is built from the products L_i L_j of the Gaussian-smoothed derivatives of the video (i, j ∈ {x, y, t}), averaged at an integration scale
  • Any (x, y, t) location in the video is a STIP if it is a positive local maximum of the response H = det(μ) − k·trace³(μ)
  I. Laptev. On Space-Time Interest Points. IJCV, 2005
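
  A NumPy/SciPy sketch of the detector response described above; the scale parameters are illustrative defaults, with k = 0.005 as in Laptev's paper:

    import numpy as np
    from scipy import ndimage

    def stip_response(video, sigma=2.0, tau=1.5, s=2.0, k=0.005):
        # video: (T, H, W) array; smooth at spatial scale sigma, temporal scale tau
        L = ndimage.gaussian_filter(video.astype(float), sigma=(tau, sigma, sigma))
        Lt, Ly, Lx = np.gradient(L)            # derivatives along t, y and x
        g = lambda a: ndimage.gaussian_filter(a, sigma=(s*tau, s*sigma, s*sigma))
        # entries of the smoothed second-moment matrix mu
        mxx, myy, mtt = g(Lx*Lx), g(Ly*Ly), g(Lt*Lt)
        mxy, mxt, myt = g(Lx*Ly), g(Lx*Lt), g(Ly*Lt)
        trace = mxx + myy + mtt
        det = (mxx*(myy*mtt - myt**2)
               - mxy*(mxy*mtt - myt*mxt)
               + mxt*(mxy*myt - myy*mxt))
        return det - k * trace**3              # STIPs: positive local maxima of this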

  6. Algorithm Details: STIP Descriptor
  • A small spatio-temporal neighborhood is extracted around each detected STIP
  • Divided into a 3x3x2 grid of cells
  • Per-cell histograms of oriented gradients (HoG) and of optical flow (HoF) are concatenated into the final descriptor
  I. Laptev. On Space-Time Interest Points. IJCV, 2005
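
  A sketch of the HoG half of the descriptor under the tiling above, assuming 4 orientation bins per cell (3x3x2 cells x 4 bins = 72 dims, matching the 72+90 split quoted later); the HoF half is analogous but bins optical-flow directions:

    import numpy as np

    def hog_descriptor(patch, n_bins=4):
        # patch: (T, H, W) spatio-temporal neighborhood around a STIP
        T, H, W = patch.shape
        gt, gy, gx = np.gradient(patch.astype(float))
        mag = np.hypot(gx, gy)
        ang = np.arctan2(gy, gx) % np.pi       # orientation in [0, pi)
        bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
        feat = []
        for ti in range(2):                    # 2 temporal cells
            for yi in range(3):                # 3x3 spatial cells
                for xi in range(3):
                    sl = (slice(ti*T//2, (ti+1)*T//2),
                          slice(yi*H//3, (yi+1)*H//3),
                          slice(xi*W//3, (xi+1)*W//3))
                    h = np.bincount(bins[sl].ravel(), weights=mag[sl].ravel(),
                                    minlength=n_bins)
                    feat.append(h / (h.sum() + 1e-8))   # per-cell normalization
        return np.concatenate(feat)            # 3*3*2*4 = 72 dimensions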

  7. Dimensionality Reduction
  A broad classification:
  • Linear methods: cannot capture the inherent non-linearity of the manifold
  • Non-linear methods: may not be defined everywhere (no natural out-of-sample extension) and do not preserve neighborhoods

  8. Locality Preserving Projections (LPP)
  • Concentrates on preserving locality rather than minimizing a least-squares error
  • Although a linear projection, and hence defined at every point of the input space, it is capable of learning the non-linear manifold structure as optimally as possible
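
  A compact sketch of LPP as the standard generalized eigenproblem (He and Niyogi); the kNN graph size and heat-kernel width are illustrative choices:

    import numpy as np
    from scipy.linalg import eigh
    from scipy.spatial.distance import cdist

    def fit_lpp(X, dim=10, n_neighbors=5, t=1.0):
        # X: (n, d) descriptors. Minimize sum_ij ||y_i - y_j||^2 W_ij with y = A^T x,
        # i.e. solve X^T L X a = lambda X^T D X a for the smallest eigenvalues.
        n = X.shape[0]
        d2 = cdist(X, X, 'sqeuclidean')
        idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]    # kNN (skip self)
        W = np.zeros((n, n))
        rows = np.repeat(np.arange(n), n_neighbors)
        W[rows, idx.ravel()] = np.exp(-d2[rows, idx.ravel()] / t)
        W = np.maximum(W, W.T)                 # symmetrize the adjacency graph
        D = np.diag(W.sum(axis=1))
        L = D - W                              # graph Laplacian
        A = X.T @ L @ X
        B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])   # regularized for stability
        vals, vecs = eigh(A, B)                # ascending generalized eigenvalues
        return vecs[:, :dim]                   # (d, dim) projection matrix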

  9. Algorithm Details: HCRF-based Classifier
  • HCRF is a discriminative classifier conditioned globally on all the observations
  • Model parameters are found by maximizing the conditional log-likelihood on the labelled training data
  • Flexible configuration and connectivity of the hidden variables
  [Figure: graphical model with output y, hidden states h1, h2, …, hm, and observations x1, x2, …, xm]
  Wang, Mori NIPS 2008

  10. Algorithm Details: HCRF-based Classifier
  Given a sequence of STIPs x and its class label y, we wish to model p(y|x). The HCRF obtains it by marginalizing over the hidden states h:
  p(y|x; θ) = Σ_h exp(Ψ(y, h, x; θ)) / Σ_{y′,h′} exp(Ψ(y′, h′, x; θ))
  Wang, Mori NIPS 2008
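
  A sketch of the inference step for a chain-structured HCRF: p(y|x) is computed by running the forward algorithm over the hidden states once per class label. The simple per-class emission/transition parameterization is an assumption for illustration, not necessarily the exact potentials used in the slides:

    import numpy as np
    from scipy.special import logsumexp

    def hcrf_posterior(X, emit, trans):
        # X: (T, d) projected STIP sequence
        # emit: (Y, S, d) per-class emission weights over S hidden states (assumed form)
        # trans: (Y, S, S) per-class transition weights between hidden states
        Y = emit.shape[0]
        log_Z = np.empty(Y)
        for y in range(Y):
            scores = X @ emit[y].T             # (T, S) log emission potentials
            alpha = scores[0]                  # forward messages over h_1
            for t in range(1, len(X)):
                alpha = scores[t] + logsumexp(alpha[:, None] + trans[y], axis=0)
            log_Z[y] = logsumexp(alpha)        # marginalize over all hidden paths
        return np.exp(log_Z - logsumexp(log_Z))   # posterior over class labels

  Training would then adjust emit and trans to maximize log p(y|x) over the labelled sequences, as stated on the previous slide.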

  11. Current Status
  • Datasets:
    • KTH: 600 videos of 6 actions performed by 25 actors in 4 scenarios
    • UCF-50 dataset
  • Features: 3D Harris corners as STIPs, with 162-dim (72-dim HoG + 90-dim HoF) descriptors computed for both datasets
  • Script-based approach to add new datasets
  • Dimensionality reduction using LPP:
    • Learn a low-dimensional manifold using all the STIPs obtained from the full dataset
    • Project the data onto the learned manifold
  • HCRF model is learned using the training dataset
  • Initial results obtained

  12. Results: STIPs

  13. Result: LPP output
  • ~120,000 (1.2 lakh) STIPs collected from all action classes in the KTH dataset
  • Projected to 3 dimensions and plotted for visualization
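
  A small matplotlib sketch of such a visualization, assuming X_low holds the LPP-projected STIPs and labels their action classes (both hypothetical variables here):

    import matplotlib.pyplot as plt

    def plot_lpp_embedding(X_low, labels):
        # scatter the first three LPP dimensions, colored by action class
        ax = plt.figure().add_subplot(projection='3d')
        ax.scatter(X_low[:, 0], X_low[:, 1], X_low[:, 2], c=labels, s=1)
        ax.set_xlabel('dim 1'); ax.set_ylabel('dim 2'); ax.set_zlabel('dim 3')
        plt.show()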

  14. To be done…
  • Testing the classifier on the different datasets
  • Compiling the action classification results for the different datasets
  • Code debugging

  15. References
  • I. Laptev. On Space-Time Interest Points. IJCV, 2005.
  • I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions from Movies. In CVPR, 2008.
  • F. Lv and R. Nevatia. Single View Human Action Recognition Using Key Pose Matching and Viterbi Path Searching. In CVPR, 2007.
  • S. Wang, A. Quattoni, L.-P. Morency, D. Demirdjian, and T. Darrell. Hidden Conditional Random Fields for Gesture Recognition. In CVPR, 2006.
  • L.-P. Morency, A. Quattoni, and T. Darrell. Latent-Dynamic Discriminative Models for Continuous Gesture Recognition. In CVPR, 2007.
  • A. Quattoni, S. Wang, L.-P. Morency, M. Collins, and T. Darrell. Hidden Conditional Random Fields. PAMI, 2007.
  • C. Sminchisescu. Selection and Context for Action Recognition. In ICCV, 2009.
  • J. Sun, X. Wu, S. Yan, L. Cheong, T. Chua, and J. Li. Hierarchical Spatio-Temporal Context Modeling for Action Recognition. In CVPR, 2009.
  • L. Wang and D. Suter. Learning and Matching of Dynamic Shape Manifolds for Human Action Recognition. IEEE TIP, 2007.
  Thank You
