Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words

  1. Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words Analysis and Recognition of Video Data Tamir Nuriel

  2. Flowchart of the approach

  3. Interest Point Detector • Gaussian smoothing in the space dimensions. • Gabor filters in the time dimension. • Extract a spatial-temporal cube around each detected interest point.
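
The detector described above can be sketched in NumPy/SciPy. This follows the separable linear filter of Dollár et al. used in the paper: a 2-D Gaussian in space combined with a quadrature pair of 1-D Gabor filters in time; interest points are local maxima of the resulting response. The function names and the parameter defaults (`sigma`, `tau`, and the `omega = 4/tau` coupling) are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve1d

def response_function(video, sigma=2.0, tau=2.5, omega=None):
    """Detector response: spatial Gaussian smoothing followed by a
    quadrature pair of 1-D temporal Gabor filters.
    `video` is a (T, H, W) float array."""
    if omega is None:
        omega = 4.0 / tau  # frequency tied to tau (assumed coupling)
    # Gaussian smoothing in the space dimensions only.
    smoothed = gaussian_filter(video, sigma=(0, sigma, sigma))
    # Even/odd 1-D Gabor filters applied along the time dimension.
    t = np.arange(-2 * int(tau), 2 * int(tau) + 1)
    h_ev = -np.cos(2 * np.pi * t * omega) * np.exp(-t**2 / tau**2)
    h_od = -np.sin(2 * np.pi * t * omega) * np.exp(-t**2 / tau**2)
    r_ev = convolve1d(smoothed, h_ev, axis=0)
    r_od = convolve1d(smoothed, h_od, axis=0)
    return r_ev**2 + r_od**2  # interest points = local maxima of this

def extract_cubes(video, points, half=4):
    """Cut a spatial-temporal cube around each (t, y, x) interest point."""
    return [video[t - half:t + half + 1,
                  y - half:y + half + 1,
                  x - half:x + half + 1] for t, y, x in points]
```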

  4. Descriptor • Brightness gradients in the x, y and t directions. • The computed gradients are concatenated to form a single vector, which is then projected to a lower-dimensional space using principal component analysis (PCA). • As an alternative to PCA dimensionality reduction, a histogram of gradients in each direction can be used.
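
The PCA variant of the descriptor can be sketched as below: concatenate the x/y/t brightness gradients of each cube into one vector, then reduce the dimension with PCA. The function names and the `n_components` default are illustrative assumptions; scikit-learn's `PCA` stands in for whatever implementation the authors used.

```python
import numpy as np
from sklearn.decomposition import PCA

def cube_descriptor(cube):
    """Concatenate brightness gradients along t, y and x into one vector."""
    gt, gy, gx = np.gradient(cube.astype(float))
    return np.concatenate([gx.ravel(), gy.ravel(), gt.ravel()])

def reduce_descriptors(descriptors, n_components=100):
    """Project the concatenated-gradient descriptors to a
    lower-dimensional space with PCA."""
    pca = PCA(n_components=n_components)
    return pca.fit_transform(np.stack(descriptors))
```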

  5. Codebook Formation • The codebook is constructed by clustering using the k-means algorithm and Euclidean distance as the clustering metric. • The center of each resulting cluster is defined to be a spatial-temporal codeword.
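
Codebook formation as described above maps directly onto scikit-learn's `KMeans` (which uses Euclidean distance): the fitted cluster centers are the spatial-temporal codewords, and new descriptors are quantized by their nearest center. The codebook size here is an arbitrary placeholder, not the paper's setting.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(descriptors, n_codewords=250):
    """Cluster descriptors with k-means (Euclidean metric); each
    resulting cluster center is a spatial-temporal codeword."""
    km = KMeans(n_clusters=n_codewords, n_init=10, random_state=0)
    km.fit(descriptors)
    return km

def quantize(km, descriptors):
    """Map each descriptor to the index of its nearest codeword."""
    return km.predict(descriptors)
```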

  6. Learning the Action Models by pLSA • Maximizing the log-likelihood L = Σ_d Σ_w n(w,d) log Σ_z P(w|z)P(z|d). • E-step: P(z|w,d) = P(w|z)P(z|d) / Σ_z' P(w|z')P(z'|d). • M-step: P(w|z) ∝ Σ_d n(w,d)P(z|w,d) and P(z|d) ∝ Σ_w n(w,d)P(z|w,d).
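
The EM iteration for pLSA can be written compactly in NumPy. This is a minimal sketch of the standard pLSA updates (E-step: posterior over topics; M-step: renormalized weighted counts), with random initialization, a fixed iteration count, and small epsilons added for numerical safety; none of these choices come from the paper.

```python
import numpy as np

def plsa(n_wd, n_topics, n_iter=50, seed=0):
    """EM for pLSA on a word-document count matrix n_wd of shape (W, D).
    Returns P(w|z) with shape (W, Z) and P(z|d) with shape (Z, D)."""
    rng = np.random.default_rng(seed)
    W, D = n_wd.shape
    p_w_z = rng.random((W, n_topics)); p_w_z /= p_w_z.sum(0)
    p_z_d = rng.random((n_topics, D)); p_z_d /= p_z_d.sum(0)
    for _ in range(n_iter):
        # E-step: posterior P(z|w,d), shape (Z, W, D).
        joint = p_w_z.T[:, :, None] * p_z_d[:, None, :]
        post = joint / np.maximum(joint.sum(0, keepdims=True), 1e-12)
        # M-step: re-estimate from counts weighted by the posterior.
        weighted = post * n_wd[None, :, :]       # (Z, W, D)
        p_w_z = weighted.sum(2).T                # (W, Z)
        p_w_z /= np.maximum(p_w_z.sum(0), 1e-12)
        p_z_d = weighted.sum(1)                  # (Z, D)
        p_z_d /= np.maximum(p_z_d.sum(0), 1e-12)
    return p_w_z, p_z_d
```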

  7. Experimental results • Patches from different actions from the KTH dataset:

  8. Experimental results • Marking patches in video

  9. Experimental results • Confusion Matrix
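
A confusion matrix like the one on this slide tabulates, for each true action class, how often each class was predicted; the diagonal holds the correct classifications. A small hypothetical helper (not from the paper) to build one:

```python
import numpy as np

def confusion_matrix(true_labels, pred_labels, n_classes):
    """Rows = true action class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    return cm
```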

  10. References • J. C. Niebles, H. Wang and L. Fei-Fei, “Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words”, International Journal of Computer Vision, 2008. • C. Schuldt, I. Laptev and B. Caputo, “Recognizing Human Actions: A Local SVM Approach”, In Proc. ICPR, 2004. • L. Zelnik-Manor and M. Irani, “Event-Based Analysis of Video”, In Proc. CVPR, 2001.