
Our Framework 2.1. Enforcing Temporal Consistency by Post Processing

Human Detection in Videos using Spatio-Temporal Pictorial Structures. Amir Roshan Zamir, Afshin Dehghan, Ruben Villegas, University of Central Florida.



Presentation Transcript


  1. Human Detection in Videos using Spatio-Temporal Pictorial Structures
     Amir Roshan Zamir, Afshin Dehghan, Ruben Villegas
     University of Central Florida

     1. Problem
     • Human Detection in Videos:
       • Making Human Detection in Videos More Accurate.
       • Possibility of Numerous False Detections.
     • Applications: Video Surveillance, Human Tracking, Action Recognition, etc.

     1.1. Our Approach
     • Using Temporal Information (Transition of Human Parts in Pictorial Structures).
     • False Detections Should Be Temporally Less Consistent than True Detections.
     • Human Part Transitions Should Convey Information Which is Ignored in the Frame-By-Frame Scenario.

     Our Contribution
     • Extending Spatial Pictorial Structures to Spatio-Temporal Pictorial Structures.

     2. Our Framework

     2.1. Enforcing Temporal Consistency by Post Processing
     • Human Detection from Yang and Ramanan [1], Articulated Pose Estimation with Flexible Mixtures-of-Parts.
     [Figure panels: Input Frame; Human Detection Output; Human Detection Output with Temporal Consistency; Human Detection Output with Temporal Consistency and Part Adjustment]

     2.2. Enforcing Temporal Consistency by Embedding It into the Detection Process
     • More Elegant Approach than Post Processing (2.1).
     • Best Detections Are Determined During the Optimization Process.
     • Configuration of Parts is Limited to Transitions in Time (Temporal Deformation).
     [Figure labels: Configuration of Parts; Appearance; Spatial Deformation Cost; Temporal Deformation Cost; part index i across Frame Numbers 1, 2, 3]

     3. Learning Transition of Parts
     • Human Body Parts Have a Set Range of Motion that Can Be Approximated.
     • These Movements (Trajectories) Can Be Learned by Training on the Annotated Dataset.
     • We Will Use the HumanEva Dataset [2] for Our Training.
     • These Transitions Will Be Learned and Embedded in Our Optimization Process to Restrict the Detections.
     [Figure: Part Trajectories of Annotations; Annotated Parts in Each Frame]

     Results
     • Videos Taken From TRECVID MED11 Dataset.
     [Figure panels: Input Frame; Immediate Output from Human Detection; Temporally Consistent Detection without Part Adjustment; Temporally Consistent Detection with Part Adjustment]
     [Figure: Head Trajectories Comparison; Head Trajectory Before Temporal Adjustment; Head Trajectory After Temporal Adjustment]
     [Figure: Part Trajectories on Video; Part Trajectories Before Temporal Adjustment; Part Trajectories After Temporal Adjustment]

     Conclusion
     • Temporal Part Deformation Improves Human Detection in Videos Based on Our Experiments.
     • Fewer False Detections and More True Detections.
     • Part Trajectories Are More Precise.

     Next Steps
     • Applying the Temporal Deformation Cost in the Optimization Process.
     • Training a Model that Considers Usual Human Part Transitions in Time.

     References
     • [1] Y. Yang and D. Ramanan. "Articulated Pose Estimation with Flexible Mixtures-of-Parts." Computer Vision and Pattern Recognition (CVPR), Colorado Springs, Colorado, June 2011.
     • [2] L. Sigal, A. O. Balan, and M. J. Black. "HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion." International Journal of Computer Vision (IJCV), Volume 87, Number 1-2, pp. 4-27, March 2010.
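
The post-processing idea in 2.1 (false detections should be temporally less consistent than true ones) can be illustrated with a minimal sketch. The poster does not state the actual consistency criterion, so everything here is an assumption made for illustration: the function name `filter_temporally_consistent`, the use of one (x, y) centre per detection, and the parameters `max_shift`, `min_support`, and `window`. A detection is kept only if enough neighbouring frames contain a detection close to it.

```python
from math import hypot


def filter_temporally_consistent(frames, max_shift=20.0, min_support=2, window=2):
    """Keep detections that have nearby detections in neighbouring frames.

    frames: one list per frame, each holding (x, y) detection centres.
    All parameter values are illustrative assumptions, not from the poster.
    """
    kept = []
    for t, detections in enumerate(frames):
        kept_t = []
        for (x, y) in detections:
            support = 0
            # Look at up to `window` frames on each side of frame t.
            for s in range(max(0, t - window), min(len(frames), t + window + 1)):
                if s == t:
                    continue
                # Allow a larger displacement for frames further away in time.
                limit = max_shift * abs(s - t)
                if any(hypot(x - u, y - v) <= limit for (u, v) in frames[s]):
                    support += 1
            if support >= min_support:
                kept_t.append((x, y))
        kept.append(kept_t)
    return kept


# Hypothetical toy input: one smoothly moving detection plus one isolated detection.
frames = [
    [(100.0, 50.0)],
    [(104.0, 52.0), (400.0, 300.0)],  # the second detection appears in one frame only
    [(109.0, 55.0)],
]
result = filter_temporally_consistent(frames)
```

In this toy input the isolated detection at (400, 300) has no support in the adjacent frames and is dropped, while the smoothly moving detection is kept in all three frames. The embedded approach of 2.2 differs in that the temporal term enters the optimization itself instead of filtering its output afterwards.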
