Learning an Attribute Dictionary for Human Action Classification

Learning an Attribute Dictionary for Human Action Classification Qiang Qiu Qiang Qiu, Zhuolin Jiang, and Rama Chellappa, ”Sparse Dictionary-based Representation and Recognition of Action Attributes”, ICCV 2011

Action Feature Representation Shape Motion HOG

Action Sparse Representation Sparse code Action Dictionary = 0.43 0.63 0 0 -0.33 0 -0.36 0 0 0 0 0.64 0.53 -0.40 0.35 0 0 0 = + 0.63 × - 0.33 × - 0.36 × 0.43 × = + 0.53× - 0.40 × +0.35 × 0.64×

K-SVD Sparse codes x1 x2 • K-SVD • Input: signals Y, dictionary size, sparisty T • Output: dictionary D, sparse codes X Input signals Dictionary d1 d2 d3 … y2 y1 = D Y 0.43 0.63 0 0 -0.33 0 -0.36 0 0 0 0 0.64 0.53 -0.40 0.35 0 0 0 X arg min |Y- DX|2 s.t. i , |xi|0 ≤ T D,X [1] M. Aharon and M. Elad and A. Bruckstein, K-SVD: An Algorithm for Designing Overcomplete Dictionries for Sparse Representation, IEEE Trans. on Signal Process, 2006

Objective Learn a Compact Discriminative and Dictionary.

Probabilistic Model for Sparse Representation • A Gaussian Process • Dictionary Class Distribution

More Views of Sparse Representation y1 y2 y3 y4 x1 x2 x3 x4 = xd1 l1 l2 l1 l2 0.43 0.63 0 0 -0.33 0 -0.36 0 0 0 0 0 0 0 0 0.64 0.53 -0.40 0.35 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -0.28 0.698 0.37 0.25 0 0 0 0 0 -0.42 0 0 0 0 0.42 0.47 0 0.32 d1 d2 d3 … l1 l2 l1 l2

x1 x2 x3 x4 xd1 d1 A Gaussian Process 0 0.53 0 0 0 0.64 0 0 0.63 0 0 0 0.43 0 0 0 -0.33 -0.40 0 -0.42 xd2 d2 d3 xd3 d4 xd4 y2 y4 y1 y3 xd5 d5 … … l1 l2 l1 l2 l1 l1 l2 l2 • A Gaussian Process • Covariance function entry: K(i,j) = cov(xdi, xdj) • P(Xd*|XD*) is a Gaussian with a closed-form conditional variance

x1 x2 x3 x4 xd1 d1 Dictionary Class Distribution 0 0.53 0 0 0 0.64 0 0 0.63 0 0 0 0.43 0 0 0 -0.33 -0.40 0 -0.42 xd2 d2 d3 xd3 d4 xd4 y2 y4 y1 y3 xd5 d5 … … l1 l2 l1 l2 l1 l1 l2 l2 • Dictionary Class Distribution • P(L|di), L [1, M] • aggregate |xdi| based on class labels to obtain a M sized vector • P(L=l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348 • P(L=l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.37

Dictionary Learning Approaches • Maximization of Joint Entropy (ME) • Maximization of Mutual Information (MMI) • Unsupervised Learning (MMI-1) • Supervised Learning (MMI-2)

Maximization of Joint Entropy (ME) - Initialize dictionary using k-SVD Do = - Start with D* = • Untill |D*|=k, iteratively choose d* from Do\D*, • d* = arg max H(d|D*) d • where ME dictionary • A good approximation to ME criteria • arg max H(D) D

Maximization of Mutual Information for Unsupervised Learning (MMI-1) - Initialize dictionary using k-SVD Do = - Start with D* = • Untill |D*|=k, iteratively choose d* from Do\D*, MMI dictionary Coverage Diversity • A near-optimal approximation to MMI • arg max I(D; Do\D) • within (1-1/e) of the optimum • Closed form: D • d* = arg max H(d|D*) - H(d|Do\(D* d)) d

x1 x2 x3 x4 xd1 Revisit d1 -0.33 -0.40 0 -0.42 0 0.64 0 0 0.63 0 0 0 0.43 0 0 0 0 0.53 0 0 xd2 d2 d3 xd3 d4 xd4 y2 y4 y1 y3 xd5 d5 … … l1 l2 l1 l2 l1 l1 l2 l2 • Dictionary Class Distribution • P(L|di), L [1, M] • aggregate|xdi|based on class labels to obtain a M sized vector • P(l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348 • P(l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.37 • P(Ld) = P(L|d) • P(LD) = P(L|D) , where

Maximization of Mutual Information for Supervised Learning (MMI-2) - Initialize dictionary using k-SVD Do = - Start with D* = • Untill |D*|=k, iteratively choose d* from Do\D*, • d* = arg max [H(d|D*) - H(d|Do\(D* d))] • + λ[H(Ld|LD*) – H(Ld|LDo\(D* d))] d - MMI-1 is a special case of MMI-2 with λ=0.

Keck gesture dataset

Representation Consistency [1] [1] J. Liu and M. Shah, Learning Human Actions via Information Maximization, CVPR 2008

Recognition Accuracy The recognition accuracy using initial dictionary Do: (a) 0.23 (b) 0.42 (c) 0.71

Recognizing Realistic Actions • 150 broadcast sports videos. • 10 different actions. • Average recognition rate: 83:6% • Best reported result 86.6%

Learning an Attribute Dictionary for Human Action Classification

Learning an Attribute Dictionary for Human Action Classification

Presentation Transcript

Meta Learning: For Classification

Human Action

Creating an Extended Attribute

Feature learning for image classification

Classification for reporting and learning

Human Action

Human Action

HUMAN ACTION CLASSIFICATION USING 3-D CONVOLUTIONAL NEURAL NETWORKS

An Action-Oriented Human Rights Conference

Action Learning for Facilitators

Attribute Learning for Understanding Unstructured Social Activity

Building a Learning Culture: An Action Learning Perspective

Transfer Learning for Image Classification

Human Action Recognition using Spatio-Temporal Classification

THE HUMAN MEDICAL DICTIONARY

ACTIVE LEARNING FOR TEXT CLASSIFICATION

Sparse Granger Causality Graphs for Human Action Classification

An Attribute Grammar for Tiny

Learning an Attribute Dictionary for Human Action Classification

An Attribute Grammar for Tiny

An Overview of Action Learning