Multi-Domain and Multi-Task Learning for Human Action Recognition
IEEE Transactions on Image Processing, 28(2), pp 853-867, 2019-2
Domain-invariant (view-invariant and modality-invariant) feature representation is essential for human action recognition. Moreover, given a discriminative visual representation, it is critical to discover the latent correlations among multiple actions in order to facilitate action modeling. To address these problems, we propose a multi-domain and multi-task learning (MDMTL) method to: 1) extract domain-invariant information for multi-view and multi-modal action representation and 2) explore the relatedness among multiple action categories. Specifically, we present a sparse transfer learning-based method to co-embed multi-domain (multi-view and multi-modality) data into a single common space for discriminative feature learning. Additionally, visual feature learning is incorporated into the multi-task learning framework, with the Frobenius-norm regularization term and the sparse constraint term, for joint task modeling and task relatedness-induced feature learning. To the best of our knowledge, MDMTL is the first supervised framework to jointly realize domain-invariant feature learning and task modeling for multi-domain action recognition. Experiments conducted on the INRIA Xmas Motion Acquisition Sequences data set, the MSR Daily Activity 3D (DailyActivity3D) data set, and the Multi-modal & Multi-view & Interactive data set, which is the most recent and largest multi-view and multi-model action recognition data set, demonstrate the superiority of MDMTL over the state-of-the-art approaches.
Domain-invariant Learning; multi-task learning; human action recognition
tianjin univ; chinese acad sci