Comparative study of singing voice detection methods

Authors: You Shingchern D; Wu Yi Chung; Peng Shih Hsien
Source: Multimedia Tools and Applications, 2016, 75(23): 15509-15524.
DOI: 10.1007/s11042-015-2894-9

Abstract

Detecting singing segments in a soundtrack is an important and useful technique in musical signal processing and retrieval. In this paper, we study the accuracy of detecting singing segments using the HMM (Hidden Markov Model) classifier with various features, including MFCC (Mel Frequency Cepstral Coefficients), LPCC (Linear Predictive Cepstral Coefficients), and LPC (Linear Prediction Coefficients). Simulation results show that detecting singing segments in a soundtrack is more difficult than detecting them among pure-instrument segments. In addition, combining MFCC and LPCC yields higher accuracy. The bootstrapping technique provides only a limited accuracy improvement when detecting all singing segments in a soundtrack. For completeness, we also conduct an experiment showing that the time to perform music identification can be reduced by more than 40% if the singing-voice detection mechanism is incorporated into the identification process.
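
As an illustration of how two of the features named in the abstract relate, the following is a minimal sketch of the standard recursion that derives LPCC values from LPC values, followed by a per-frame concatenation with MFCC values. It is not taken from the paper. The function names `lpc_to_lpcc` and `combine_features` are hypothetical, and the sign convention (predictor x[n] ≈ a_1·x[n-1] + ... + a_p·x[n-p], gain term omitted) is an assumption; the paper may use a different convention or coefficient count.

```python
def lpc_to_lpcc(a, n_ceps):
    """Convert LPC predictor coefficients to cepstral coefficients.

    a      : list [a_1, ..., a_p], assuming x[n] ~ sum_k a_k * x[n-k]
    n_ceps : number of cepstral coefficients c_1..c_n_ceps to return
    The gain-dependent term c_0 is not computed.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n <= p.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_{n-k},
        # restricted to indices where a_{n-k} is defined (1 <= n-k <= p).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def combine_features(mfcc_frame, lpcc_frame):
    """Concatenate two per-frame feature vectors (hypothetical helper)."""
    return list(mfcc_frame) + list(lpcc_frame)


if __name__ == "__main__":
    # For a first-order predictor with a_1 = 0.5, c_n equals 0.5**n / n.
    print(lpc_to_lpcc([0.5], 3))
```

The concatenated vector would then serve as the observation for a classifier such as an HMM; how the paper actually forms and normalizes the combined feature is not stated in the abstract.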

  • Publication date: 2016-12