Rolling Rotations for Recognizing Human Actions from 3D Skeletal Data

Abstract: Recently, skeleton-based human action recognition has been receiving significant attention from various research communities due to the availability of depth sensors and real-time depth-based 3D skeleton estimation algorithms. In this work, we use rolling maps for recognizing human actions from 3D skeletal data. The rolling map is a well-defined mathematical concept that has not been explored much by the vision community. First, we represent each skeleton using the relative 3D rotations between various body parts. Since 3D rotations are members of the special orthogonal group SO(3), our skeletal representation becomes a point in the Lie group SO(3) × ... × SO(3), which is also a Riemannian manifold. Then, using this representation, we model human actions as curves in this Lie group. Since classification of curves in this non-Euclidean space is a difficult task, we unwrap the action curves onto the Lie algebra by combining the logarithm map with rolling maps, and perform classification in the Lie algebra. Experimental results on three action datasets show that the proposed approach performs as well as or better than state-of-the-art approaches.

Contributions

We combine the logarithm and rolling maps to flatten the special orthogonal group SO(3) for recognizing human actions from 3D skeletal data. To the best of our knowledge, rolling maps have not previously been used in the context of human action recognition.
Most existing works on rolling maps use a geodesic curve as the rolling curve. In contrast, we propose to use mean action curves, which are non-geodesic, as rolling curves.
Existing literature does not provide closed-form expressions for the rolling map in the case of a non-geodesic rolling curve. In this work, we show how to compute a piecewise smooth rolling map corresponding to a given (discrete) non-geodesic rolling curve in SO(3).
We introduce a scale-invariant skeletal representation by using only 3D rotations (instead of full rigid body transformations) to describe the relative geometry between various body parts. Using only the rotations reduces the feature dimensionality by half compared to our earlier SE(3)-based representation.
We show that the proposed scale-invariant rotation-based representation performs on par with our earlier full rigid body transformation-based representation by evaluating it on three action datasets: the Florence3D-Action dataset, the MSR-Action Pairs dataset and the G3D-Gaming dataset.
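
The rotation-only representation described above can be illustrated with a small sketch. It assumes, for illustration only, that the relative rotation between two body parts is the minimal rotation aligning one bone direction with the other, and that it is stored as an axis-angle vector (an element of the Lie algebra so(3), identified with R^3). The helper names `relative_rotation_vector` and `rotation_matrix` are hypothetical; this is not the released MATLAB code, and the paper's exact construction may differ.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def relative_rotation_vector(bone_a, bone_b):
    """Axis-angle vector of the minimal rotation taking the direction of
    bone_a to the direction of bone_b (illustrative assumption)."""
    a, b = unit(bone_a), unit(bone_b)   # normalizing discards bone lengths
    c = cross(a, b)
    s = math.sqrt(dot(c, c))            # sin(angle)
    d = dot(a, b)                       # cos(angle)
    if s < 1e-12:
        if d > 0:
            return [0.0, 0.0, 0.0]      # parallel bones: identity rotation
        raise ValueError("anti-parallel bones: rotation axis is not unique")
    angle = math.atan2(s, d)
    return [angle * x / s for x in c]

def rotation_matrix(w):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    theta = math.sqrt(dot(w, w))
    if theta < 1e-12:
        return identity
    k = [x / theta for x in w]
    K = [[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), math.cos(theta)
    return [[identity[i][j] + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]
```

Because only bone directions enter the computation, scaling the whole skeleton leaves the vector unchanged, and each pair of body parts contributes 3 numbers rather than the 6 needed for a full rigid body transformation.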

Experimental results

Comparison between using the logarithm map at a point and unwrapping while rolling (recognition accuracy, %)
Approach	Florence3D	MSRPairs	G3D
Logarithm map at a point (Standard)	86.83	92.96	87.82
Unwrapping while rolling (Proposed)	89.82	94.09	87.95
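
For context, the "Logarithm map at a point" baseline in the table above flattens a whole action curve using a single base point. The sketch below shows that idea for one SO(3) factor, assuming the base point is a fixed rotation `P` and the curve is left-translated by the transpose of `P` before taking the logarithm. The function names and this convention are assumptions made for illustration, and the sketch does not implement the rolling map itself.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_z(theta):
    """Rotation by theta radians about the z-axis (used to build a toy curve)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def so3_log(R):
    """Logarithm map SO(3) -> so(3), returned as an axis-angle vector in R^3.
    Numerically reliable only for rotation angles well below pi."""
    cos_theta = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    # The skew-symmetric part of R equals 2 * sin(theta) * axis.
    v = [R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1]]
    if theta < 1e-8:
        return [0.5 * x for x in v]     # first-order approximation near identity
    scale = theta / (2.0 * math.sin(theta))
    return [scale * x for x in v]

def log_at_point(P, curve):
    """Flatten a discrete curve in SO(3) into R^3 using one fixed base point P."""
    Pt = transpose(P)
    return [so3_log(matmul(Pt, R)) for R in curve]

# Toy action curve: a steady rotation about the z-axis.
curve = [rot_z(0.1 * t) for t in range(5)]
flat = log_at_point(curve[0], curve)
```

With a single base point, curve samples far from `P` are distorted more than nearby ones, which is the motivation for the moving unwrapping compared in the second row of the table.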

Comparison with state-of-the-art skeleton-based approaches on Florence3D dataset
Approach	Accuracy (%)
Multi-Part Bag-of-Poses	82.00
Motion Trajectories	87.04
Elastic Functional Coding	89.67
SE(3)-based representation	90.71
Proposed (concatenated representation)	89.82
Proposed (FTP representation)	91.40

Comparison with state-of-the-art skeleton-based approaches on MSRPairs dataset
Approach	Accuracy (%)
SE(3)-based representation	93.65
Proposed (concatenated representation)	94.09
Proposed (FTP representation)	94.67

Comparison with state-of-the-art skeleton-based approaches on G3D dataset
Approach	Accuracy (%)
RBM + HMM	86.40
SE(3)-based representation	91.09
Proposed (concatenated representation)	87.95
Proposed (FTP representation)	90.94

Code and data
Use the link below to download the MATLAB code and data used in our experiments.

Rolling rotations code

Please cite the paper below if you use this code in your research.

Publications

Raviteja Vemulapalli and Rama Chellappa, "Rolling Rotations for Recognizing Human Actions from 3D Skeletal Data", CVPR, 2016. [PDF]