Abstract

This paper presents a neural network-based method for dual-microphone voice activity detection in handset applications. The method exploits two features of the signals picked up by the two microphones: the subband signed power difference and the inter-microphone cross-correlation. The former provides accurate power difference information in individual frequency bands, and the latter provides detailed spatial information derived from the two microphones. Compared with existing methods, the proposed method has an advantage that makes it well suited to handset applications: no threshold or parameter needs to be tuned for different noise environments, because the method adapts itself to them. Its performance is evaluated extensively under various noise environments, including directional speech interference. Compared with the existing method based on the power level difference ratio [1], the proposed method shows significant improvements in accuracy and robustness.