HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
Omar Oreifej, Zicheng Liu
CVPR 2013
Research Question
• Input:
  • Depth sequence information (only) of a segmented video (one video per activity)
• Output:
  • Feature (activity descriptor) for activity recognition: classify the activity
• Why use depth information rather than color information?
  • Depth reflects pure geometry and shape cues.
  • Depth is insensitive to changes in lighting conditions.
Two key points about an activity recognition feature
• Capture the shape cues at a specific time instance
• Capture the motion cues over time
HON4D: Histogram of Oriented 4D Normals
• Analogous to the HOG feature
• Calculated in 4D space: 2D image coordinates (x, y), depth (z), time (t)
• Surface normals capture the shape cues.
• The change in the surface normals over time captures the motion cues.
• [Figure: example of normals in 3D]
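4D surface normal
• Depth (z) is considered as a function of time (t) and space (x, y): $z = f(x, y, t)$
• This defines a surface S in 4D space: $S(x, y, t, z) = f(x, y, t) - z = 0$
• The normal to the surface S is its gradient: $n = \nabla S = \left(\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, \frac{\partial z}{\partial t}, -1\right)^T$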
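As an illustration, the normal can be approximated on a discrete depth sequence with finite differences. The following is a minimal sketch, not the authors' implementation: it assumes the normal is the gradient (∂z/∂x, ∂z/∂y, ∂z/∂t, −1) of the surface, uses central differences at interior points only, and the function name `normals_4d` and the `depth[t][y][x]` layout are hypothetical.

```python
import math


def normals_4d(depth):
    """Return unit 4D normals at the interior points of depth[t][y][x].

    Assumption: derivatives are approximated with central differences,
    so border pixels and the first/last frames are skipped.
    """
    n_frames, height, width = len(depth), len(depth[0]), len(depth[0][0])
    normals = []
    for t in range(1, n_frames - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                dz_dx = (depth[t][y][x + 1] - depth[t][y][x - 1]) / 2.0
                dz_dy = (depth[t][y + 1][x] - depth[t][y - 1][x]) / 2.0
                dz_dt = (depth[t + 1][y][x] - depth[t - 1][y][x]) / 2.0
                n = (dz_dx, dz_dy, dz_dt, -1.0)
                length = math.sqrt(sum(c * c for c in n))
                normals.append(tuple(c / length for c in n))
    return normals


# Toy sequence: the plane z = 2x + 3y, moving away by one unit per frame.
depth = [[[2 * x + 3 * y + t for x in range(3)] for y in range(3)]
         for t in range(3)]
normals = normals_4d(depth)
```

For this toy plane the single interior point has the unnormalized normal (2, 3, 1, −1): the first two components encode the shape (slope of the plane) and the third encodes the motion over time.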
Histogram of 4D normals
• How to quantize the 4D space, i.e., how to define the bins of the histogram?
• Polychoron: a regular geometric object in 4D,
  • analogous to a cube in 3D space
• Divide the 4D space uniformly using its vertices
  • use the 600-cell polychoron, which has 120 vertices
• Each vertex is referred to as a projector, i.e., one bin of the histogram.
• Histogram: project the 4D normals onto the 120 projectors
• HON4D: 120-dimensional feature
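The 120 projectors can be generated from the standard vertex coordinates of a unit-radius 600-cell. The sketch below is an illustration using those textbook coordinates (8 axis vertices, 16 half-integer vertices, and 96 even permutations involving the golden ratio); the helper names `is_even` and `cell600_vertices` are hypothetical, and the slides do not say how the authors construct the vertices.

```python
import itertools
import math


def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def cell600_vertices():
    """Return the 120 unit-length vertices of a 600-cell as 4-tuples."""
    phi = (1 + math.sqrt(5)) / 2  # golden ratio
    verts = set()
    # 8 vertices: (+-1, 0, 0, 0) and its permutations.
    for axis in range(4):
        for sign in (1.0, -1.0):
            v = [0.0, 0.0, 0.0, 0.0]
            v[axis] = sign
            verts.add(tuple(v))
    # 16 vertices: (+-1/2, +-1/2, +-1/2, +-1/2).
    for v in itertools.product((0.5, -0.5), repeat=4):
        verts.add(v)
    # 96 vertices: even permutations of (+-phi/2, +-1/2, +-1/(2*phi), 0).
    base = (phi / 2, 0.5, 1 / (2 * phi))
    for perm in itertools.permutations(range(4)):
        if not is_even(perm):
            continue
        for signs in itertools.product((1.0, -1.0), repeat=3):
            vals = (base[0] * signs[0], base[1] * signs[1],
                    base[2] * signs[2], 0.0)
            verts.add(tuple(vals[perm[k]] for k in range(4)))
    return sorted(verts)


projectors = cell600_vertices()
```

Each returned tuple is one histogram bin direction; because the vertices are spread evenly over the unit 3-sphere, they give a uniform quantization of 4D orientations.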
Histogram of 4D normals
• Projection of 4D normals
  • the set of 120 projectors
  • the set of unit normals computed over all depth sequences
• Projection with the inner product
• 120-dimensional HON4D descriptor obtained by normalization
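The projection and normalization steps can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `hon4d_histogram` is hypothetical, negative inner products are clipped to zero (an assumption, so that a normal only votes for projectors on its own side), and 8 axis-aligned unit vectors stand in for the 120 projectors to keep the example small.

```python
def hon4d_histogram(normals, projectors):
    """Accumulate inner products of unit normals with each projector,
    then normalize so that the histogram sums to 1."""
    hist = [0.0] * len(projectors)
    for n in normals:
        for i, p in enumerate(projectors):
            dot = sum(a * b for a, b in zip(n, p))
            # Assumption: negative projections contribute nothing.
            hist[i] += max(0.0, dot)
    total = sum(hist)
    if total == 0.0:
        return hist
    return [h / total for h in hist]


# Stand-in projectors: the 8 axis-aligned unit vectors in 4D.
toy_projectors = []
for axis in range(4):
    for sign in (1.0, -1.0):
        v = [0.0, 0.0, 0.0, 0.0]
        v[axis] = sign
        toy_projectors.append(tuple(v))

# Two unit normals: a flat, static surface and a surface tilted along x.
toy_normals = [(0.0, 0.0, 0.0, -1.0), (0.6, 0.0, 0.0, -0.8)]
descriptor = hon4d_histogram(toy_normals, toy_projectors)
```

With the real 120 projectors the returned list would be the 120-dimensional descriptor; here it has 8 entries, with most of the mass on the (0, 0, 0, −1) bin.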
Non-Uniform Quantization == Projectors Refinement
• Uniform space quantization is not always optimal.
• A better, non-uniform quantization could lead to better classification: refine the projectors to better capture the distribution of the normals.
• How to evaluate the importance of each projector?
Non-Uniform Quantization == Projectors Refinement
• First intuition: calculate the projector density in the training video set.
• However, high density does not necessarily mean a high contribution to classification.
Non-Uniform Quantization == Projectors Refinement
• Consider an SVM classifier trained on the training set,
• and take the set of support vectors in the training set.
• Discriminative projector density: high accumulation of normal vectors and high contribution to the final classification
Projectors Refinement
• Sort the projectors based on their discriminative density
• Induce random perturbations of each of the highest-ranked projectors.
• Augment the original projectors with the density-learned projectors
• In experiments: 120 → 300 projectors
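The refinement loop can be sketched as below, under stated assumptions. The slides do not specify the perturbation model, so this sketch uses Gaussian noise followed by renormalization to unit length; the function name `refine_projectors` and the parameters `top_k`, `n_perturb`, and `sigma` are hypothetical, and the density scores are taken as a given input.

```python
import math
import random


def refine_projectors(projectors, density, top_k, n_perturb, sigma, seed=0):
    """Augment the projector set with perturbed copies of the top_k
    projectors ranked by density (highest first)."""
    rng = random.Random(seed)
    order = sorted(range(len(projectors)),
                   key=lambda i: density[i], reverse=True)
    augmented = list(projectors)  # keep all original projectors
    for i in order[:top_k]:
        for _ in range(n_perturb):
            # Assumption: Gaussian perturbation, then renormalize.
            q = [c + rng.gauss(0.0, sigma) for c in projectors[i]]
            length = math.sqrt(sum(c * c for c in q))
            augmented.append(tuple(c / length for c in q))
    return augmented


# Toy example: 4 projectors; the two densest each get 3 perturbed copies.
toy = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0),
       (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, -1.0)]
toy_density = [0.1, 0.5, 0.05, 0.9]
refined = refine_projectors(toy, toy_density, top_k=2, n_perturb=3, sigma=0.05)
```

The result places more bins near the directions where the density is highest, which is the non-uniform quantization the slides describe; the descriptor dimension grows with the number of added projectors.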
Experiments
• Classifier: SVM with a polynomial kernel
• Three datasets
  • MSR Action 3D:
    • 20 actions (arm wave, hammer, hand catch, high throw, …)
    • HON4D+: 88.89%
    • HON4D: 85.85%
  • MSR Gesture 3D (American Sign Language):
    • 12 gestures (bathroom, blue, finish, green, hungry, …)
    • HON4D+: 92.45%
    • HON4D: 87.29%
  • 3D Action Pairs Dataset:
    • HON4D+: 96.97%
    • HON4D: 93.33%