Abstract

Isolated sign language recognition (SLR) is a long-standing research problem. Existing methods represent a sign with the entire, often ambiguous, input sequence and overlook the fact that only a small amount of key information is needed to represent the sign effectively, since most of the input is redundant. Moreover, including redundant information can make SLR inefficient and complicate the modeling of long-term dependencies. This letter presents a novel sequence-to-sequence learning method based on keyframe-centered clips (KCCs) for Chinese SLR. Unlike conventional methods, only key information is used to represent a sign. The frames-to-word task is transformed into a KCCs-to-subwords task, which allows different parts of the input to receive different degrees of attention. The proposed method significantly outperforms state-of-the-art SLR systems on our dataset of 310 Chinese sign language words.