HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences

Omar Oreifej and Zicheng Liu
University of Central Florida and Microsoft Research
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013

Outline
• Introduction
• Histogram of 4D Normal (HON4D)
• Non-Uniform Quantization
• Experiments
• Conclusion

Introduction
• Low-cost depth sensors such as Kinect have attracted significant attention in
  • Object detection
  • Activity recognition
• Depth images offer several advantages over color images
  • Pure geometry and shape cues
  • Insensitive to changes in lighting conditions
• This paper proposes a novel activity descriptor for depth sequences, denoted HON4D, defined in the 4D space of depth, time, and spatial coordinates

Introduction
• Figure: the steps for computing the HON4D descriptor (figure labels: Section 2, Section 3)

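Histogram of 4D Normal
• The 4D surface normal
  • A depth image sequence is a function $z = f(x, y, t)$
  • $z$ is the depth value, where $t$ denotes the time and $(x, y)$ denotes the spatial coordinates
• In the 4D space $(x, y, t, z)$
  • $S(x, y, t, z) = f(x, y, t) - z = 0$ is a surface
  • $n = \nabla S = (\partial z/\partial x,\ \partial z/\partial y,\ \partial z/\partial t,\ -1)^T$ is the normal to the surface
• This paper uses the histogram of the 4D normals as the feature for recognition
  • Only the orientation of the normal is used, so each normal is scaled to unit length, $\hat{n} = n / \lVert n \rVert$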
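The normal computation on this slide can be illustrated with a short sketch. It assumes the depth sequence is stored as nested lists `depth[t][y][x]` and that the partial derivatives of the depth $z = f(x, y, t)$ are approximated with central differences; the function name `normal_4d`, the storage layout, and the finite-difference scheme are illustrative assumptions, not details taken from the paper.

```python
import math

def normal_4d(depth, t, y, x):
    """Unit 4D normal at an interior voxel of depth[t][y][x].

    Assumption: derivatives are estimated with central differences.
    The normal is (dz/dx, dz/dy, dz/dt, -1), scaled to unit length.
    """
    dz_dx = (depth[t][y][x + 1] - depth[t][y][x - 1]) / 2.0
    dz_dy = (depth[t][y + 1][x] - depth[t][y - 1][x]) / 2.0
    dz_dt = (depth[t + 1][y][x] - depth[t - 1][y][x]) / 2.0
    n = (dz_dx, dz_dy, dz_dt, -1.0)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

# Toy sequence (hypothetical data): a plane z = 2x + t, constant in y.
depth = [[[2.0 * x + 1.0 * t for x in range(3)] for y in range(3)]
         for t in range(3)]
n_hat = normal_4d(depth, 1, 1, 1)
```

For the toy plane the unnormalized normal is (2, 0, 1, -1), so `n_hat` is that vector divided by its length. Only this direction enters the histogram.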

Histogram of 4D Normal
• Comparing HOG3D [9] with HON4D
  • Surfaces 1 and 2 have similar gradient orientations, but surface 1 has a higher inclination
  • HON4D differentiates these surfaces while HOG3D cannot
[9] A. Klaser, M. Marszalek, and C. Schmid. A spatio-temporal descriptor based on 3D-gradients. In BMVC, 2008.

Histogram of 4D Normal
• Quantize the corresponding space into specific bins
• HOG
  • The gradient is two-dimensional
  • Quantize a circle to obtain the bins
• HOG3D
  • The gradient is three-dimensional
  • Use a polyhedron to quantize the orientation
  • Use the vertices of the polyhedron as predefined orientations
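As a point of comparison, the circle quantization used for two-dimensional gradients can be sketched in a few lines. This is a minimal illustration that assumes signed orientations over the full circle and eight equal bins; the function name `orientation_bin` and the bin count are hypothetical choices, and HOG implementations often use unsigned orientations instead.

```python
import math

def orientation_bin(gx, gy, num_bins=8):
    """Index of the equal-width angular bin containing the 2D gradient (gx, gy).

    Assumption: signed orientation in [0, 2*pi), bin 0 starting at angle 0.
    """
    angle = math.atan2(gy, gx) % (2.0 * math.pi)
    bin_width = 2.0 * math.pi / num_bins
    return int(angle / bin_width) % num_bins

# Hypothetical gradients pointing roughly right, left, and down.
bins = [orientation_bin(1.0, 0.5), orientation_bin(-1.0, 0.1), orientation_bin(0.1, -1.0)]
```

In 3D and 4D the same idea needs a set of evenly spread directions, which is why regular polyhedra and polychora are used in place of the circle.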

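Histogram of 4D Normal
• Quantize the 4D space uniformly
• Use a polychoron [7] with 120 vertices (the 600-cell)
  • 8 vertices obtained by permutations of $(\pm 1, 0, 0, 0)$
  • 16 vertices obtained by permutations of $(\pm\tfrac{1}{2}, \pm\tfrac{1}{2}, \pm\tfrac{1}{2}, \pm\tfrac{1}{2})$
  • 96 vertices obtained by even permutations of $\tfrac{1}{2}(\pm\varphi, \pm 1, \pm 1/\varphi, 0)$, where $\varphi = (1 + \sqrt{5})/2$ is the golden ratio
• By the above definition we have 120 projectors $P = \{p_i\}$
• The set of unit normals $N = \{\hat{n}_j\}$
• Compute the component of each normal in each direction by $c(\hat{n}_j, p_i) = \max(0, \hat{n}_j^T p_i)$
[7] B. Grünbaum, V. Kaibel, V. Klee, and G. M. Ziegler. Convex Polytopes (2nd ed.). New York and London: Springer-Verlag, ISBN 0-387-00424-6, 2003.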
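The uniform quantization can be sketched as follows. The vertex construction uses the standard coordinates of the 600-cell (the regular polychoron with 120 vertices), and each normal contributes the positive part of its dot product with every projector. The function names and the final normalization of the histogram to sum to one are assumptions made for this illustration.

```python
import itertools
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio

def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return inv % 2 == 0

def polychoron_vertices():
    """The 120 unit-length vertices of the 600-cell, used as projectors."""
    verts = []
    # 8 vertices: permutations of (+-1, 0, 0, 0)
    for i in range(4):
        for s in (1.0, -1.0):
            v = [0.0, 0.0, 0.0, 0.0]
            v[i] = s
            verts.append(tuple(v))
    # 16 vertices: (+-1/2, +-1/2, +-1/2, +-1/2)
    verts.extend(itertools.product((0.5, -0.5), repeat=4))
    # 96 vertices: even permutations of 1/2 * (+-phi, +-1, +-1/phi, 0)
    base = (PHI / 2.0, 0.5, 1.0 / (2.0 * PHI))
    for perm in itertools.permutations(range(4)):
        if not is_even(perm):
            continue
        for signs in itertools.product((1.0, -1.0), repeat=3):
            vals = (signs[0] * base[0], signs[1] * base[1],
                    signs[2] * base[2], 0.0)
            verts.append(tuple(vals[perm[k]] for k in range(4)))
    return verts

def hon4d_histogram(normals, projectors):
    """Accumulate max(0, n . p) per projector; normalize to sum to one (assumed)."""
    hist = [0.0] * len(projectors)
    for n in normals:
        for i, p in enumerate(projectors):
            hist[i] += max(0.0, sum(a * b for a, b in zip(n, p)))
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

projectors = polychoron_vertices()
# Hypothetical input: a single normal pointing along the negative depth axis.
hist = hon4d_histogram([(0.0, 0.0, 0.0, -1.0)], projectors)
```

Taking only the positive part of the dot product means a normal votes for the projectors on its own side of the sphere, with the largest vote going to the closest projector.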

Non-Uniform Quantization
• Uniform space quantization is not optimal
• Example: two different classes of activities can be very similar in most of the bins
• Idea: amplify the information that can be used to distinguish the classes

Non-Uniform Quantization
• Find projectors that are close to optimal
• Let where k is obtained for a video k
• Project on the projectors where are the normal vectors for a video k
• The density of the projectors is computed by estimating how many unit normals fall into each bin
• A higher density does not necessarily contribute more to classification
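One simple way to estimate how many unit normals fall into each bin is sketched below. It assumes a hard assignment of every normal to its closest projector (largest dot product) and reports the fraction of normals per projector; the function name `projector_density` and the hard-assignment rule are assumptions for illustration and may differ from the density formula used in the paper.

```python
def projector_density(normals, projectors):
    """Fraction of unit normals whose closest projector is each projector.

    Assumption: closeness is measured by the dot product, with ties going
    to the first projector in the list.
    """
    counts = [0] * len(projectors)
    for n in normals:
        dots = [sum(a * b for a, b in zip(n, p)) for p in projectors]
        counts[dots.index(max(dots))] += 1
    total = len(normals)
    return [c / total for c in counts] if total else [0.0] * len(projectors)

# Hypothetical projectors: the eight signed coordinate axes in 4D.
axes = []
for i in range(4):
    for s in (1.0, -1.0):
        v = [0.0, 0.0, 0.0, 0.0]
        v[i] = s
        axes.append(tuple(v))

# Hypothetical (unnormalized) normals for one video.
normals = [(0.9, 0.1, 0.0, 0.0), (0.8, 0.0, 0.2, 0.0),
           (0.0, 0.0, 0.0, -1.0), (0.1, 0.0, 0.0, -0.9)]
density = projector_density(normals, axes)
```

In this toy case half of the normals land on the +x projector and half on the -z projector. A dense bin is only useful if it also separates the classes, which is the motivation for weighting the density by how discriminative each projector is.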

Non-Uniform Quantization
• An SVM classifier scores a sample using score where w is a support vector
• We adjust the density function as
• We induce m random perturbations for each projector according to its discriminative density

Experiments
• Three standard 3D activity datasets and one dataset collected by the authors are used in the experiments
  • MSR Action 3D
  • MSR Gesture 3D
  • MSR Daily Activity 3D
  • 3D Action Pairs
• A 120-vertex polychoron is used for the initial HON4D
• The non-uniform HON4D typically ends up with ~300 projectors

Experiments
• MSR Action 3D Dataset
• Twenty actions
  • high arm wave, horizontal arm wave, hammer, hand catch, forward punch, high throw, draw x, draw tick, draw circle, hand clap, two hand wave, side boxing, bend, forward kick, side kick, jogging, tennis swing, tennis serve, golf swing, pick up & throw

Experiments
• MSR Hand Gesture Dataset
• 12 gestures
  • bathroom, blue, finish, green, hungry, milk, past, pig, store, where, j, z

Experiments
• 3D Action Pairs Dataset
• Six pairs of actions, each performed three times by ten actors
  • Pick up a box/Put down a box, Lift a box/Place a box, Push a chair/Pull a chair, Wear a hat/Take off a hat, Put on a backpack/Take off a backpack, Stick a poster/Remove a poster


Conclusion
• This paper presented a novel, simple, and easily implementable descriptor for activity recognition from depth sequences
• The paper argues that uniform quantization is not the best choice for action recognition and supports this with the experimental results
• The HON4D descriptor captures both shape cues and motion information, which makes it highly discriminative for activity and action recognition
