Abstract

This paper presents an effective deep attention network for joint hand gesture localization and recognition from static RGB-D images. Our method trains a CNN equipped with a soft attention mechanism in an end-to-end manner, automatically localizing hands and classifying gestures with a single network rather than the conventional stage-wise pipeline of hand segmentation/detection followed by classification. More precisely, the attention network first computes a weight for each region proposal generated from the entire image, reflecting the probability that a hand appears in that region. It then aggregates all proposals through a global sum weighted by these values to obtain a representation of the entire image. We demonstrate the feasibility and effectiveness of our method through extensive experiments on the NTU Hand Digits (NTU-HD) benchmark and the challenging HUST American Sign Language (HUST-ASL) dataset. Moreover, the proposed attention network is simple to train, requiring neither bounding-box nor segmentation-mask annotations, which makes it easy to deploy in hand gesture recognition systems. Using the proposed attention network with RGB-D images as input, we achieve state-of-the-art hand gesture recognition performance on the challenging HUST-ASL dataset.
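For concreteness, the following is a minimal sketch of the weighted global-sum pooling described above, assuming softmax-normalized attention weights over proposals; the function name soft_attention_pool and the NumPy formulation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft_attention_pool(proposal_features: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Weighted global-sum pooling over region proposals.

    proposal_features: (N, D) array, one CNN feature vector per proposal.
    scores: (N,) raw attention logits estimating hand presence per proposal.
    Returns a single (D,) image-level representation.
    (Softmax normalization is an assumption; the paper only states that the
    sum is influenced by the per-proposal weights.)
    """
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()                  # weights now sum to 1 over proposals
    return weights @ proposal_features        # (N,) @ (N, D) -> (D,)

# Example: pool 5 hypothetical proposals with 8-dim features
feats = np.random.randn(5, 8)
logits = np.random.randn(5)
image_repr = soft_attention_pool(feats, logits)  # shape (8,)
```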