HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences

Omar Oreifej, Zicheng Liu. CVPR 2013.

Research Question • Input: • Depth sequence (depth information only) of a segmented video (one video per activity) • Output: • A feature (activity descriptor) for activity recognition, used to classify the activity • Why use depth information rather than color information? • Depth reflects pure geometry and shape cues. • Depth is insensitive to changes in lighting conditions.

Two key requirements for an activity recognition feature • Capture the shape cues at a specific time instant • Capture the motion cues over time

HON4D: Histogram of Oriented 4D Normals • Analogous to the HOG feature • Computed in 4D space: image coordinates (x, y), depth (z), time (t) • Surface normals capture the shape cues • Changes in the surface normals over time capture the motion cues • Normals in 3D example

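4D surface normal • Depth (z) is treated as a function of space (x, y) and time (t): $z = f(x, y, t)$ • This defines a surface S in 4D space: $S(x, y, t, z) = f(x, y, t) - z = 0$ • The normal to the surface S is $n = \nabla S = (\partial z/\partial x,\ \partial z/\partial y,\ \partial z/\partial t,\ -1)^T$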
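The normal computation above can be sketched with central finite differences. This is a minimal illustration, not the authors' implementation: the function name `normals_4d`, the `depth[t][y][x]` layout, and the choice to skip border points are all assumptions made for the example.

```python
import math

def normals_4d(depth):
    """Return unit 4D normals (dz/dx, dz/dy, dz/dt, -1) at interior points.

    `depth[t][y][x]` is a depth value; this layout is an assumption of the sketch.
    """
    T, H, W = len(depth), len(depth[0]), len(depth[0][0])
    normals = []
    for t in range(1, T - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                # Central differences approximate the partial derivatives of z.
                dzdx = (depth[t][y][x + 1] - depth[t][y][x - 1]) / 2.0
                dzdy = (depth[t][y + 1][x] - depth[t][y - 1][x]) / 2.0
                dzdt = (depth[t + 1][y][x] - depth[t - 1][y][x]) / 2.0
                n = (dzdx, dzdy, dzdt, -1.0)
                norm = math.sqrt(sum(c * c for c in n))
                normals.append(tuple(c / norm for c in n))
    return normals

# Synthetic planar sequence z = 2x + 3y + t, so every normal is (2, 3, 1, -1) / sqrt(15).
depth = [[[2.0 * x + 3.0 * y + 1.0 * t for x in range(4)]
          for y in range(4)] for t in range(3)]
unit_normals = normals_4d(depth)
```

Because the last component is always -1 before normalization, the vector is never zero, so the division is safe.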

Histogram of 4D normals • How to quantize the 4D space, i.e., how to define the histogram bins? • Polychoron: a regular 4D geometric object, analogous to a regular polyhedron (such as a cube) in 3D space • Divide the 4D space uniformly using its vertices • Use the 600-cell polychoron, which has 120 vertices • Each vertex is referred to as a projector, i.e., one bin of the histogram • Histogram: project the 4D normals onto the 120 projectors • HON4D: 120-dimensional feature
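As a sketch of where the 120 projectors come from, the vertices of a unit-radius 600-cell centered at the origin can be enumerated from the standard coordinate description: 8 permutations of (±1, 0, 0, 0), 16 points (±1/2, ±1/2, ±1/2, ±1/2), and 96 even permutations of (±φ/2, ±1/2, ±1/(2φ), 0), where φ is the golden ratio. The function names below are hypothetical.

```python
import itertools
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio

def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return inversions % 2 == 0

def cell600_vertices():
    """Enumerate the 120 vertices of a unit-radius 600-cell."""
    verts = set()
    # 8 vertices: (+-1, 0, 0, 0) and its permutations.
    for i in range(4):
        for s in (1.0, -1.0):
            v = [0.0] * 4
            v[i] = s
            verts.add(tuple(v))
    # 16 vertices: (+-1/2, +-1/2, +-1/2, +-1/2).
    for v in itertools.product((0.5, -0.5), repeat=4):
        verts.add(v)
    # 96 vertices: even permutations of (+-phi/2, +-1/2, +-1/(2 phi), 0).
    base = (PHI / 2.0, 0.5, 1.0 / (2.0 * PHI))
    for perm in itertools.permutations(range(4)):
        if not is_even(perm):
            continue
        for signs in itertools.product((1.0, -1.0), repeat=3):
            vals = (base[0] * signs[0], base[1] * signs[1], base[2] * signs[2], 0.0)
            verts.add(tuple(vals[perm[k]] for k in range(4)))
    return sorted(verts)

projectors = cell600_vertices()
```

All three families lie on the unit sphere, so the projectors can be compared with unit normals directly through inner products.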

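Histogram of 4D normals • Projection of 4D normals • $P = \{p_i\}$: set of 120 projectors • $N = \{\hat{n}_j\}$: set of unit normals computed over the whole depth sequence • Projection with inner product: $c(\hat{n}_j, p_i) = \max(0,\ \hat{n}_j^T p_i)$ • 120-dimensional HON4D descriptor obtained by normalization: $\Pr(p_i \mid N) = \sum_j c(\hat{n}_j, p_i) \,/\, \sum_{p_v \in P} \sum_j c(\hat{n}_j, p_v)$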
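The projection and normalization steps can be sketched as follows. Each unit normal contributes its non-negative inner product to every projector, and the accumulated bins are divided by their total. The function name `hon4d_histogram` is hypothetical, and the eight axis-aligned projectors in the example are a small stand-in chosen for readability, not the 120 vertices of the 600-cell.

```python
def hon4d_histogram(unit_normals, projectors):
    """Accumulate clipped inner products per projector, then normalize to sum to 1."""
    hist = [0.0] * len(projectors)
    for n in unit_normals:
        for i, p in enumerate(projectors):
            # Negative projections are clipped to zero.
            hist[i] += max(0.0, sum(a * b for a, b in zip(n, p)))
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

# Stand-in projectors: +/- unit vectors along each of the four axes.
axis_projectors = [tuple(s if k == i else 0.0 for k in range(4))
                   for i in range(4) for s in (1.0, -1.0)]

example_normals = [(0.0, 0.0, 0.0, -1.0), (0.6, 0.0, 0.0, -0.8)]
hist = hon4d_histogram(example_normals, axis_projectors)
```

In this toy case only the +x bin (index 0) and the last bin, (0, 0, 0, -1), receive any mass.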

Non-Uniform Quantization = Projectors Refinement • Uniform space quantization is not always optimal. • A better non-uniform quantization could lead to better classification, i.e., refine the projectors to better capture the distribution of the normals. • How to evaluate the importance of each projector?

Non-Uniform Quantization = Projectors Refinement • First intuition: calculate the projector density in the training video set. • However, high density does not necessarily mean a high contribution to classification.

Recall SVM

Non-Uniform Quantization = Projectors Refinement • Consider an SVM classifier trained on the training set, and the set of support vectors in the training set. • Discriminative projector density: a projector scores high when it has both a high accumulation of normal vectors and a high contribution to the final classification.
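One way to read the discriminative density is as a weighted average of the per-video histograms of the support vectors, so that a projector scores high only if it collects many normals in videos that matter to the classifier. The slide does not give the formula, so the weighting below (one non-negative weight per support vector, averaged over the support set) is an assumption of this sketch, as are the names `discriminative_density`, `support_histograms`, and `support_weights`.

```python
def discriminative_density(support_histograms, support_weights):
    """Weighted mean of support-vector histograms, one score per projector.

    Assumption: each support vector k has a histogram and a scalar weight,
    and the score of projector i is the mean over k of weight_k * hist_k[i].
    """
    num_bins = len(support_histograms[0])
    count = len(support_histograms)
    density = [0.0] * num_bins
    for hist, weight in zip(support_histograms, support_weights):
        for i in range(num_bins):
            density[i] += weight * hist[i]
    return [d / count for d in density]

# Two hypothetical support videos with two-bin histograms.
density = discriminative_density([[0.5, 0.5], [0.9, 0.1]], [1.0, 0.5])
```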

Projectors Refinement • Sort the projectors by their discriminative density • Induce random perturbations of each of the highest-ranked projectors • Augment the projector set with the density-learned projectors • In experiments: 120 → 300 projectors
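The refinement step can be sketched as: rank the projectors by a density score, add a few randomly perturbed copies of the top-ranked ones, and append them to the original set. The slide does not specify the perturbation model, so the Gaussian noise, the re-normalization to unit length, and the parameters `top_m`, `per_projector`, and `sigma` are assumptions of this illustration.

```python
import math
import random

def refine_projectors(projectors, density, top_m, per_projector, sigma=0.1, seed=0):
    """Append perturbed copies of the top_m highest-density projectors."""
    rng = random.Random(seed)
    # Indices of projectors sorted by decreasing density score.
    ranked = sorted(range(len(projectors)), key=lambda i: density[i], reverse=True)
    augmented = list(projectors)
    for i in ranked[:top_m]:
        for _ in range(per_projector):
            # Assumed perturbation: Gaussian noise, then rescale to unit length.
            v = [c + rng.gauss(0.0, sigma) for c in projectors[i]]
            norm = math.sqrt(sum(c * c for c in v))
            augmented.append(tuple(c / norm for c in v))
    return augmented

base = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0),
        (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]
scores = [0.1, 0.4, 0.3, 0.2]
refined = refine_projectors(base, scores, top_m=2, per_projector=3)
```

With 4 starting projectors, 2 selected, and 3 perturbations each, the refined set has 10 projectors; the same scheme with other parameter choices would grow 120 projectors to 300.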

Flow chart

Experiments • Use SVM (polynomial kernel) classifier • Three datasets • MSR Action 3D: • 20 actions (arm wave, hammer, hand catch, high throw,…) • HON4D+: 88.89% • HON4D: 85.85% • MSR Gesture 3D (American Sign Language) • 12 gestures (bathroom, blue, finish, green, hungry, …) • HON4D+: 92.45% • HON4D: 87.29% • 3D Action Pairs Dataset • HON4D+: 96.97% • HON4D: 93.33%
