Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization

Kameoka Hirokazu<sup>*</sup>; Higuchi Takuya; Tanaka Mikihiro; Li Li

doi:10.1109/TASLP.2018.2795746

Abstract

One successful approach to audio source separation involves applying nonnegative matrix factorization (NMF) to a magnitude spectrogram regarded as a nonnegative matrix. This can be interpreted as approximating the observed spectrum at each time frame as a linear sum of basis spectra scaled by time-varying amplitudes. This paper deals with the problem of unsupervised instrument-wise source separation of polyphonic signals based on an extension of the NMF approach. We focus on the fact that each piece of music is typically played on a handful of musical instruments, which allows us to assume that the spectra of the underlying audio events in a polyphonic signal can be grouped into a reasonably small number of clusters in the mel-frequency cepstral coefficient (MFCC) domain. Based on this assumption, we propose formulating factorization of a magnitude spectrogram and clustering of the basis spectra in the MFCC domain as a joint optimization problem and derive a novel optimization algorithm based on the majorization-minimization principle. Experimental results revealed that our method was superior to a two-stage algorithm that consists of performing factorization followed by clustering the basis spectra, thus showing the advantage of the joint optimization approach.
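To make the baseline model concrete, the following is a minimal sketch of plain NMF only, not the joint factorization and clustering method proposed in the paper. It assumes the generalized Kullback-Leibler divergence with the standard multiplicative updates, and it runs on a synthetic nonnegative matrix standing in for a magnitude spectrogram. The names `nmf_kl`, `kl_divergence`, `Y`, `H`, and `U` are illustrative and are not taken from the paper.

```python
import numpy as np


def kl_divergence(Y, X, eps=1e-12):
    """Generalized KL divergence between nonnegative matrices Y and X."""
    return float(np.sum(Y * np.log((Y + eps) / (X + eps)) - Y + X))


def nmf_kl(Y, n_bases, n_iter=200, eps=1e-12, seed=0):
    """Plain NMF (assumed baseline): Y ~= H @ U with H, U >= 0.

    Y: (n_freq, n_frames) nonnegative matrix, e.g. a magnitude spectrogram.
    H: (n_freq, n_bases) basis spectra, one per column.
    U: (n_bases, n_frames) time-varying amplitudes, one row per basis.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    H = rng.random((n_freq, n_bases)) + eps
    U = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative update for the basis spectra.
        X = H @ U + eps
        H *= ((Y / X) @ U.T) / (U.sum(axis=1) + eps)
        # Multiplicative update for the amplitudes.
        X = H @ U + eps
        U *= (H.T @ (Y / X)) / (H.sum(axis=0)[:, None] + eps)
    return H, U


# Synthetic stand-in for a spectrogram: 64 frequency bins, 100 frames, 4 bases.
rng = np.random.default_rng(1)
Y = rng.random((64, 4)) @ rng.random((4, 100))

H, U = nmf_kl(Y, n_bases=4, n_iter=200)
H1, U1 = nmf_kl(Y, n_bases=4, n_iter=1)
d_final = kl_divergence(Y, H @ U)
d_first = kl_divergence(Y, H1 @ U1)
```

Each column of `H` plays the role of one basis spectrum. A two-stage baseline of the kind the abstract compares against would cluster these columns after factorization, whereas the paper couples the two steps in a single objective.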

Publication date: 2018-06
