Abstract

This letter proposes a new feature for action recognition, the trajectory of surface patches (ToSP), which describes the trajectories of surface patches on the human body through a novel scheme that combines RGB and depth videos. The RGB data provide appearance information, which we use to track specific patches on the body surface, while the depth data provide spatial information, which we use to describe those surface patches. Specifically, we take spatio-temporal interest points as initial points and track them in two directions. A ToSP is extracted by keeping, for each point on the trajectory, its neighborhood in the point cloud. Using a temporal pyramid, ToSPs are matched at several levels based on surface features extracted from ToSP segments. The proposed feature captures both the shape and position variations of surface patches, and thus combines the advantages of trajectories and local spatio-temporal features. Experimental results show that the proposed feature outperforms existing trajectory-based features and depth features.
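The temporal pyramid step can be illustrated with a minimal sketch. It is not the letter's implementation: the function names `temporal_pyramid_segments` and `segment_means`, the dyadic split (1, 2, 4, ... segments per level), and the use of a plain mean as the per-segment descriptor are all assumptions made for illustration, since the abstract does not specify the surface feature or the matching rule.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # half-open frame range [start, end)


def temporal_pyramid_segments(length: int, levels: int) -> List[List[Segment]]:
    """Split frames [0, length) into 2**level segments at each pyramid level.

    Assumption: a dyadic pyramid; the letter may use a different split.
    """
    if length < 2 ** (levels - 1):
        raise ValueError("trajectory too short for the requested number of levels")
    pyramid = []
    for level in range(levels):
        parts = 2 ** level
        bounds = [i * length // parts for i in range(parts + 1)]
        pyramid.append([(bounds[i], bounds[i + 1]) for i in range(parts)])
    return pyramid


def segment_means(
    features: Sequence[Sequence[float]], pyramid: List[List[Segment]]
) -> List[List[List[float]]]:
    """Average the per-frame patch features inside every segment.

    Assumption: the mean stands in for the (unspecified) surface feature
    extracted from each ToSP segment.
    """
    dim = len(features[0])
    descriptors = []
    for level in pyramid:
        level_desc = []
        for start, end in level:
            chunk = features[start:end]
            level_desc.append(
                [sum(f[d] for f in chunk) / len(chunk) for d in range(dim)]
            )
        descriptors.append(level_desc)
    return descriptors


# Hypothetical trajectory of 8 frames with a 1-D feature per frame.
pyramid = temporal_pyramid_segments(8, 3)
features = [[float(i)] for i in range(8)]
descriptors = segment_means(features, pyramid)
```

Two trajectories could then be compared level by level, for example by summing distances between corresponding segment descriptors, with the coarse level capturing the overall motion and the finer levels capturing its temporal order.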