Beyond Actions: Discriminative Models for Contextual Group Activities

Presentation Transcript

  1. M.Sc. Thesis Defense. Beyond Actions: Discriminative Models for Contextual Group Activities. Tian Lan, School of Computing Science, Simon Fraser University. August 12, 2010

  2. Outline • Introduction • Group Activity Recognition with Context: structure-level (latent structures) and feature-level (Action Context descriptor) • Experiments

  3. Activity Recognition • Goal: enable computers to analyze and understand human behavior. [Example images: answering a phone; kissing]

  4. Action vs. Activity • Activity: a group of people forming a queue • Action: standing in a queue and facing left

  5. Activity Recognition • Activity recognition is important: HCI, surveillance, sport, entertainment • Activity recognition is difficult: intra-class variation, background clutter, partial occlusion, etc.

  6. Group Activity Recognition • Motivation: human actions are rarely performed in isolation; the actions of individuals in a group can serve as context for each other. • Goal: explore the benefit of contextual information for group activity recognition in challenging real-world applications.

  7. Group Activity Recognition Context

  8. Group Activity Recognition • Two types of context: group-person interaction and person-person interaction [Figure: example scene labeled "Talk"]

  9. Latent Structured Model [Diagram: activity class y (Activity) at the top; hidden layer h of action classes h1, h2, … (Action); person features x1, x2, …, xn and image x0 (Feature)]

  10. Latent Structured Model [Diagram: activity class y; action classes h1, h2, …, hn; person features x1, x2, …, xn; image x0. Labels: group-person interaction, person-person interaction, structure-level, feature-level]

  11. Difference from Previous Work • Group Activity Recognition • Our work: group activity recognition in realistic videos; two new types of contextual information; a unified framework • Previous work: single-person action recognition (Schuldt et al., ICPR 04); relatively simple activity recognition (Vaswani et al., CVPR 03); datasets in controlled conditions

  12. Difference from Previous Work • Latent Structured Models • Previous work: a pre-defined structure for the hidden layer, e.g. a tree (HCRF) (Quattoni et al., PAMI 07; Felzenszwalb et al., CVPR 08) • Our work: a latent structure for the hidden layer, inferred automatically during learning and inference.

  13. Outline • Introduction • Group Activity Recognition with Context: structure-level (latent structures) and feature-level (Action Context descriptor) • Experiments

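  14. Structure-level Approach [Diagram: the latent structured model (activity class y; action classes h1, h2, …, hn; person features x1, x2, …, xn; image x0). Labels: person-person interaction, structure-level, feature-level]

  15. Structure-level Approach • Latent Structure [Figure: example scene with labels Queue, ?, Talk, Talk]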

  16. Model Formulation • Input: image-label pair (x, h, y) • Potentials: image-action, action-activity, image-activity, action-action [Diagram: activity class y; action classes h1, h2, …, hn; person features x1, x2, …, xn; image x0]
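
The four potentials named on this slide can be read as terms of a single linear score. The following is a minimal sketch under that assumption; the function name `score`, the weight layout in `w`, and the toy sizes are hypothetical and are not taken from the thesis.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def score(x0, xs, hs, y, edges, w):
    """Score of (image, actions, activity) under a hypothetical weight layout.

    x0:    image-level feature vector
    xs:    one feature vector per person
    hs:    one action label (int) per person
    y:     activity label (int)
    edges: list of (j, k) person pairs connected in the hidden layer
    w:     dict of weight tables with keys
           'img_actv', 'img_act', 'act_actv', 'act_act' (assumed names)
    """
    s = dot(w['img_actv'][y], x0)                 # image-activity term
    for x_j, h_j in zip(xs, hs):
        s += dot(w['img_act'][h_j], x_j)          # image-action term
        s += w['act_actv'][h_j][y]                # action-activity term
    for j, k in edges:
        s += w['act_act'][y][hs[j]][hs[k]]        # action-action term
    return s
```

Only the pairs listed in `edges` contribute an action-action term, which is why the choice of graph over the hidden layer changes the score.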

  17. Inference • Score an image x with activity label y • Infer the latent variables: NP-hard!

  18. Inference • Holding G_y fixed, infer h_y with loopy BP • Holding h_y fixed, infer G_y with an ILP
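
The alternation on this slide is a coordinate ascent over the action labels h and the graph G for one fixed activity label. The sketch below shows only that alternation on a toy problem, with two stand-ins that are stated assumptions: exhaustive enumeration replaces loopy BP in the h-step, and, with no constraints on the graph, the G-step simply keeps every pair with positive pairwise weight instead of solving an ILP. The names `infer`, `unary`, and `pair_w` are hypothetical.

```python
from itertools import product


def total_score(hs, edges, unary, pair_w):
    """Unary action scores plus pairwise scores on the chosen edges."""
    s = sum(unary[j][h] for j, h in enumerate(hs))
    s += sum(pair_w[hs[j]][hs[k]] for j, k in edges)
    return s


def infer(unary, pair_w, n_iters=10):
    """Alternate between choosing the graph and choosing the labels."""
    n, n_actions = len(unary), len(unary[0])
    # Start from the best label per person and an empty graph.
    hs = tuple(max(range(n_actions), key=lambda a: unary[j][a])
               for j in range(n))
    edges = []
    for _ in range(n_iters):
        # G-step: with h fixed, keep each pair whose weight is positive
        # (exact when the graph is unconstrained; an ILP would be needed
        # once constraints such as degree limits are added).
        new_edges = [(j, k) for j in range(n) for k in range(j + 1, n)
                     if pair_w[hs[j]][hs[k]] > 0]
        # h-step: with G fixed, enumerate all labelings
        # (toy stand-in for loopy BP; exponential in n).
        new_hs = max(product(range(n_actions), repeat=n),
                     key=lambda c: total_score(c, new_edges, unary, pair_w))
        if new_edges == edges and new_hs == hs:
            break  # neither step changed anything
        hs, edges = new_hs, new_edges
    return hs, edges
```

Each step maximizes the score with the other variable held fixed, so the score never decreases, but the result is only a local optimum and depends on the starting labels.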

  19. Learning with Latent SVM • Optimization: non-convex bundle method (Do & Artieres, ICML 09)
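
The slide's formula is not preserved in this transcript. For orientation, a generic latent SVM objective has the form below; this is the standard formulation, not necessarily the exact one used in the thesis. The symbols are assumptions: $z$ stands for whatever is treated as latent, $\Phi$ is the joint feature map, $\Delta$ is the loss between a predicted and a true activity label, and $C$ is the regularization trade-off.

```latex
\min_{w,\,\xi}\;\; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i} \xi_{i}
\quad \text{s.t.} \quad
\max_{z} \, w^{\top} \Phi(x_{i}, z, y_{i})
\;-\; \max_{z'} \, w^{\top} \Phi(x_{i}, z', y)
\;\ge\; \Delta(y, y_{i}) - \xi_{i},
\qquad \xi_{i} \ge 0, \;\; \forall i, \; \forall y
```

The first maximization enters the constraint with a positive sign, which makes the problem non-convex and is the reason a non-convex solver is needed.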

  20. Feature-level Approach [Diagram: the latent structured model (activity class y; action classes h1, h2, …, hn; person features x1, x2, …, xn; image x0). Labels: person-person interaction, structure-level, feature-level]

  21. Feature-level Approach • Model [Diagram: activity class y; action classes h1, h2, …; Action Context descriptors x1, x2, …, xn; image x0]

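  22. Action Context Descriptor [Figure with panels (a), (b), (c): a focal person and its context; labels τ, z, action; the focal person's action part and the context part are combined (+)]

  23. Action Context Descriptor • Feature descriptor (e.g. HOG by Dalal & Triggs) → multi-class SVM → one score per action class • max over the scores [Figure: per-person score vectors, one entry per action class]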
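
One way to read slides 22 and 23 is that the descriptor concatenates the focal person's per-action SVM scores with a max over the scores of the people in each context region. The sketch below follows that reading; the function name, the region layout, and the zero fill for empty regions are assumptions, not details confirmed by the slides.

```python
def action_context_descriptor(focal_scores, context_regions, n_actions):
    """Concatenate focal-person scores with max-pooled context scores.

    focal_scores:    list of n_actions SVM scores for the focal person
    context_regions: list of regions; each region is a list of score
                     vectors, one per person found in that region
    """
    descriptor = list(focal_scores)               # action part
    for region in context_regions:                # context part
        if region:
            # Per action class, keep the highest score among the
            # people in this region.
            pooled = [max(person[a] for person in region)
                      for a in range(n_actions)]
        else:
            # Assumption: a region with no people contributes zeros.
            pooled = [0.0] * n_actions
        descriptor.extend(pooled)
    return descriptor
```

For example, with two action classes, a focal person scored `[0.9, 0.1]`, one region containing people scored `[0.2, 0.7]` and `[0.5, 0.3]`, and one empty region, the descriptor has six entries: two for the focal person and two per region.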

  24. Outline • Introduction • Group Activity Recognition with Context: structure-level (latent structures) and feature-level (Action Context descriptor) • Experiments

  25. Dataset • Collective Activity Dataset (Choi et al. VS 09) • 5 action categories (per person): crossing, waiting, queuing, walking, talking • 44 video clips

  26. Collective Activity Dataset

  27. Dataset • Nursing Home Dataset • 2 activity categories (per image): fall, non-fall • 5 action categories (per person): walking, standing, sitting, bending and falling • 22 video clips in total (2990 frames): 8 clips for testing, the rest for training. 1/3 are labeled as fall.

  28. Nursing Home Dataset

  29. Baselines • root (x0) + SVM (no structure) • No connection • Min-spanning tree • Complete graph within r [Diagram: the hidden-layer structure over h1, h2, h3, h4 for each baseline, next to the structure-level approach]

  30. System Overview • Pipeline: Video → Person Detector → Person Descriptor → Model • Person detector: pedestrian detection by Felzenszwalb et al.; background subtraction • Person descriptor: HOG by Dalal & Triggs; LST by Loy et al. at CVPR 09 [Figure labels: u, v]
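
The fixed hidden-layer structures used as baselines can be built directly from person positions. The sketch below assumes each person is a 2-D point and that distances are Euclidean, neither of which the slide states; the function names are hypothetical. The spanning tree uses Prim's algorithm.

```python
from math import dist


def no_connection(points):
    """Baseline with no edges between people."""
    return []


def radius_graph(points, r):
    """Connect every pair of people at distance at most r."""
    n = len(points)
    return [(j, k) for j in range(n) for k in range(j + 1, n)
            if dist(points[j], points[k]) <= r]


def min_spanning_tree(points):
    """Minimum spanning tree over the people (Prim's algorithm)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge from the current tree to a person outside it.
        j, k = min(((a, b) for a in in_tree
                    for b in range(n) if b not in in_tree),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
        edges.append((min(j, k), max(j, k)))
        in_tree.add(k)
    return edges
```

All three return an edge list over person indices, so any of them can stand in for the inferred graph when comparing against the latent structure.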

  31. Results – Collective Activity Dataset

  32. Results – Correct Examples

  33. Results – Incorrect Examples [Figure labels: Crossing, Waiting]

  34. [Figure labels: Walking, Talking, Queuing]

  35. Results – Nursing Home Dataset

  36. Results – Correct Examples

  37. Results – Incorrect Examples

  38. Conclusion • A discriminative model for group activity recognition with context. • Two new types of contextual information: group-person interaction and person-person interaction • Structure-level: latent structure • Feature-level: Action Context descriptor • Experimental results demonstrate the effectiveness of the proposed model

  39. Future Work • Modeling complex structures: temporal dependencies among actions • Contextual feature descriptors: how to encode discriminative context? • Weakly supervised learning: e.g. multiple instance learning for fall detection

  40. Thank you!

  41. Pairwise Weight [Diagram: activity y with actions hj and hk]

  42. Pairwise Weight

  43. Pairwise Weight

  44. Infer the graph structures

  45. Results – Nursing Home Dataset: 0/1 loss (optimizes overall accuracy)

  46. Results – Nursing Home Dataset: new loss (optimizes mean per-class accuracy)
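
The transcript does not define the new loss. One choice with the stated effect, assumed here, scales each mistake by the inverse of the number of training examples in the true class, so that the total loss equals the sum of per-class error rates. The function names are hypothetical.

```python
from collections import Counter


def zero_one_loss(y_pred, y_true):
    """Every mistake costs 1, so frequent classes dominate the total."""
    return 0.0 if y_pred == y_true else 1.0


def make_balanced_loss(train_labels):
    """Build a loss where a mistake on class p costs 1 / (count of p)."""
    counts = Counter(train_labels)

    def loss(y_pred, y_true):
        return 0.0 if y_pred == y_true else 1.0 / counts[y_true]

    return loss
```

With one fall example and three non-fall examples, a missed fall costs 1 while a false alarm costs 1/3. That keeps a classifier that always predicts non-fall from looking good merely because falls are rare.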

  47. Person Detectors • Collective Activity Dataset: pedestrian detector (Felzenszwalb et al., CVPR 08) • Nursing Home Dataset: background subtraction (Video → Background Subtraction → Moving Regions)

  48. Person Descriptors • Collective Activity Dataset: HOG • Nursing Home Dataset: Local Spatial Temporal (LST) descriptor (Loy et al., ICCV 09) [Figure labels: u, v]