
Recognizing Human Actions by Attributes

Recognizing Human Actions by Attributes. CVPR 2011. Jingen Liu, Benjamin Kuipers, Silvio Savarese. Dept. of Electrical Engineering and Computer Science, University of Michigan. Outline: Introduction, Our Contributions, Attribute-Based Action Representation, Learning Data-Driven Attributes.


Presentation Transcript


  1. Recognizing Human Actions by Attributes CVPR 2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University of Michigan

  2. Outline • Introduction • Our Contributions • Attribute-Based Action Representation • Learning Data-Driven Attributes • Knowledge Transfer Across Classes • Experiments and Discussion

  3. Introduction • the traditional approaches for human action recognition • the action golf-swinging • human actions are better described by action attributes

  4. Manually specified attributes • Subjective • 2 – problem • Complete • Data-driven • Intra-class variability • Latent variable, SVM

  5. Our Contributions • action attributes can be used to improve human action recognition • manually-specified attributes • latent variables • integrates manually-specified and data-driven attributes

  6. useful for recognizing novel action classes without training examples • significantly boost traditional action classification

  7. Attribute-Based Action Representation • previous works represent actions with low-level features • define an action attribute space

  8. Example • five attributes • “translation of torso”, “updown torso motion”, “arm motion”, “arm over shoulder motion”, “leg motion” • action class “walking” • represented by a binary vector {1, 0, 1, 0, 1}
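The "walking" example on this slide can be written out as a small lookup. This is a minimal sketch in Python: the dictionary layout and the helper name `active_attributes` are illustrative assumptions, and only the five attribute names and the vector for "walking" come from the slide.

```python
# Attribute names, in the order listed on the slide.
ATTRIBUTES = [
    "translation of torso",
    "updown torso motion",
    "arm motion",
    "arm over shoulder motion",
    "leg motion",
]

# Binary attribute vector per action class.
# Only "walking" is given in the slides; other classes would be added the same way.
CLASS_ATTRIBUTES = {"walking": [1, 0, 1, 0, 1]}


def active_attributes(action):
    """Return the names of the attributes set to 1 for the given action class."""
    vector = CLASS_ATTRIBUTES[action]
    return [name for name, bit in zip(ATTRIBUTES, vector) if bit == 1]


print(active_attributes("walking"))
```

Reading the vector this way makes the point of the attribute space explicit: a class is described by which attributes are present, not by raw features.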

  9. By introducing the attribute layer between the low-level features and action class labels, classifier f which maps x to a class label

  10. Attributes as Latent Variables • want to learn a classification model for recognizing an unknown action x • Treating attributes as latent variables • consider each attribute in the space as a latent variable • ai ∈ [0, 1]

  11. Goal: learn a classifier fw to predict the class label of a new video x

  12. Raw feature : x • Class label : y • Attributes : a • Weight for each feature : w

  13. provides the score measuring how well the raw feature matches the action class

  14. provides the score of an individual attribute, and is used to indicate the presence of an attribute in the video x

  15. captures the co-occurrence of a pair of attributes aj and ak

  16. parameter vector w is learned from a training dataset
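The formulas for slides 13 to 15 did not survive in this transcript, so the exact form of each term is not known here. The sketch below only illustrates the structure the slides describe: a score built from a class term, a per-attribute term, and a pairwise co-occurrence term, all weighted by learned parameters. The linear form of each term and the names `score`, `w_class`, `w_attr`, and `w_pair` are assumptions made for illustration, not the authors' definitions.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def score(x, y, a, w_class, w_attr, w_pair):
    """Sum of three terms for raw feature x, class label y, and attributes a.

    w_class[y] : weights matching x against class y        (assumed linear)
    w_attr[j]  : weights scoring attribute j on x          (assumed linear)
    w_pair[j][k]: weight for attributes j and k co-occurring (used for j < k)
    """
    m = len(a)
    class_term = dot(w_class[y], x)
    attr_term = sum(a[j] * dot(w_attr[j], x) for j in range(m))
    pair_term = sum(
        w_pair[j][k] * a[j] * a[k] for j in range(m) for k in range(j + 1, m)
    )
    return class_term + attr_term + pair_term


# Toy numbers, chosen only to exercise the function.
x = [1.0, 2.0]
w_class = [[1.0, 0.0], [0.0, 1.0]]
w_attr = [[1.0, 1.0], [0.0, 0.0]]
w_pair = [[0.0, 0.5], [0.0, 0.0]]
print(score(x, 0, [1, 1], w_class, w_attr, w_pair))
```

In a latent-variable setting, prediction would additionally search over attribute assignments a for each candidate class; that search is omitted here.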

  17. Learning Data-Driven Attributes • manual specification of attributes is subjective • data-driven attributes

  18. The Mutual Information (MI) • a good measurement to evaluate the quality of grouping • Given two random variables • X ∈ X = {x1, x2, ..., xn} • Y ∈ Y = {y1, y2, ..., ym} • where X represents a set of visual-words, and Y is a set of action videos • MI(X; Y )

  19. Given a set of features • Wish to obtain a set of clusters • The quality of clustering is measured by the loss of MI
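The quantities on slides 18 and 19 can be illustrated with a short sketch: MI(X; Y) estimated from a word-by-video co-occurrence table, and the loss of MI incurred when two visual words are merged into one cluster. The table layout, the function names, and the toy counts are assumptions for illustration; the slides do not specify the clustering procedure itself.

```python
import math


def mutual_information(counts):
    """MI(X; Y) in bits, where counts[i][j] is how often word x_i occurs in video y_j."""
    total = sum(sum(row) for row in counts)
    p_x = [sum(row) / total for row in counts]
    p_y = [sum(col) / total for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:  # zero cells contribute nothing
                p_xy = c / total
                mi += p_xy * math.log2(p_xy / (p_x[i] * p_y[j]))
    return mi


def merge_rows(counts, i, j):
    """Merge words i and j into one cluster by adding their count rows."""
    merged = [a + b for a, b in zip(counts[i], counts[j])]
    rest = [row for k, row in enumerate(counts) if k not in (i, j)]
    return rest + [merged]


def mi_loss(counts, i, j):
    """Loss of MI caused by merging words i and j."""
    return mutual_information(counts) - mutual_information(merge_rows(counts, i, j))


# Toy table: 3 visual words (rows) by 2 videos (columns).
counts = [[2, 0], [2, 0], [0, 4]]
print(mi_loss(counts, 0, 1))  # words with identical video distributions
print(mi_loss(counts, 1, 2))  # words with different video distributions
```

Merging words that are distributed identically over videos loses no information, while merging words with different distributions does; a clustering that keeps the loss small preserves what the features say about the videos.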

  20. integrate the discovery of data-driven attributes into the framework of latent SVM • h ∈ H • H is the data-driven attribute space

  21. Knowledge Transfer Across Classes • transferring knowledge from known classes (with training examples) to a novel class (without training examples) • using this knowledge to recognize instances of the novel class
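One simple way to picture this transfer is a nearest-signature rule: attribute predictors trained on the known classes score a new video, and the video is assigned to the novel class whose attribute description is closest. This is a generic sketch under that assumption, not necessarily the procedure used by the authors; the class names, signatures, and scores below are hypothetical.

```python
def recognize_novel(attr_scores, novel_signatures):
    """Pick the novel class whose binary attribute signature is closest
    (squared Euclidean distance) to the predicted attribute scores."""

    def distance(signature):
        return sum((s - b) ** 2 for s, b in zip(attr_scores, signature))

    return min(novel_signatures, key=lambda name: distance(novel_signatures[name]))


# Hypothetical novel classes described only by attributes (no training videos).
novel_signatures = {
    "class_A": [1, 0, 1, 0, 1],
    "class_B": [0, 1, 0, 1, 0],
}

# Hypothetical attribute scores for one video, produced by predictors
# that were trained on the known classes.
attr_scores = [0.9, 0.2, 0.7, 0.1, 0.8]
print(recognize_novel(attr_scores, novel_signatures))
```

The novel classes never contribute training examples; they are recognized only through attributes shared with the known classes.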

  22. Experiments and Discussion • Datasets and Action Attributes • Experimental Results • Experiments on Olympic Sports Dataset

  23. Datasets and Action Attributes • UIUC Dataset • 532 videos of 14 actions • such as walk, hand-clap, jump-forward … • Combining existing datasets into a larger one • KTH dataset • six classes and about 2,300 videos • Weizmann dataset • 10 classes and about 100 videos • UIUC • Olympic Sports dataset • collected from YouTube and contains realistic human actions

  24. Experimental Results • Recognizing novel action classes • Attributes boosting traditional action recognition

  25. Recognizing novel action classes • use the leave-two-classes-out cross-validation strategy in experiments on the UIUC dataset • each run leaves two classes out as novel classes (|Z| = 2)
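The leave-two-classes-out protocol can be sketched as a generator of train/novel splits. Enumerating every pair of classes is an assumption here (the slide says only that each run leaves two classes out), and the placeholder class names are hypothetical.

```python
from itertools import combinations


def leave_two_out_splits(classes):
    """Yield (training classes, novel classes) with two classes held out per run."""
    for novel in combinations(classes, 2):
        train = [c for c in classes if c not in novel]
        yield train, list(novel)


# 14 placeholder names standing in for the 14 UIUC action classes.
classes = [f"class_{i}" for i in range(14)]
splits = list(leave_two_out_splits(classes))
print(len(splits))  # 91 possible pairs out of 14 classes
```

Each split trains on 12 classes and tests on the 2 held-out classes, so the held-out classes are recognized without any training examples of their own.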

  26. The average accuracy of leave-two-classes-out cross-validation on the UIUC dataset for recognizing novel action classes.

  27. Divide the UIUC dataset into two disjoint sets • Y : training set • contains 10 action classes • Z : testing set • contains four classes • the testing and training classes share some common attributes

  28. Example (a)

  29. Attributes boosting traditional action recognition • using our proposed framework to prove that action attributes do improve performance of traditional action recognition • Our results demonstrate that a significant improvement occurs with the use of manually-specified attributes.

  30. To further demonstrate the correlation between manually-specified attributes and data-driven attributes • This map is constructed from the training data

  31. Dissimilarity between 100 data-driven attributes (rows) and 34 manually-specified attributes (columns) • Colder colors indicate lower values

  32. The effect of removing a set of human-specified attributes • some specified attributes (e.g., the human-specified attribute set a = {1, 8, 9, 10, 11}, columns) are more correlated with data-driven attributes.

  33. “Specified attributes” means only using this type of attributes for recognition • “B” indicates the performance before attributes removal • “A” indicates the performance after removing the attributes. • “Mixed Attributes” means using both manually-specified and data-driven attributes for recognition

  34. Using manually-specified attributes only • Remove human-specified attribute set a = {1, 8, 9, 10, 11} • the performance drops from 72% to 64%

  35. Using both manually-specified and data-driven attributes • Remove human-specified attribute set a = {1, 8, 9, 10, 11} • doesn’t cause an obvious performance decrease

  36. Experiments on Olympic Sports Dataset • using the Olympic Sports dataset, which contains 16 action classes and about 781 videos, for recognizing novel action classes and traditional training-based recognition

  37. The performance of recognizing novel testing classes • Five cases • 4 classes are used for testing • 12 classes used for training

  38. THANK YOU !
