Abstract

Methods for identifying cover songs typically compare the similarity of chroma features between the query song and each song in the data set. However, such pairwise comparisons require considerable time. In addition, to save disk space, most songs in the data set are stored in a compressed format. Therefore, to eliminate some decoding procedures, this study extracted music information directly from the modified discrete cosine transform (MDCT) coefficients of advanced audio coding (AAC) and then mapped these coefficients to 12-dimensional chroma features. The chroma features were segmented to preserve the melodies. Each chroma feature segment was then encoded by a sparse autoencoder, a deep learning architecture of artificial neural networks, which transformed the chroma features into a compact intermediate representation for dimension reduction. Experimental results on the covers80 data set showed that the mean reciprocal rank increased to 0.5 and the matching time was reduced by more than 94% compared with traditional approaches.
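
As a concrete illustration of the chroma-mapping step summarized above, the sketch below folds the magnitudes of one frame of MDCT coefficients into a 12-dimensional chroma vector. The frame length, sample rate, frequency range, and equal-temperament folding used here are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

def mdct_to_chroma(mdct_frame, sample_rate=44100, n_bins=1024):
    """Fold one frame of MDCT coefficients into a 12-dimensional
    chroma vector (one energy bin per pitch class).

    Assumptions (illustrative, not from the paper): the frame holds
    `n_bins` coefficients spanning 0..sample_rate/2 linearly, and
    pitch classes follow 12-tone equal temperament.
    """
    chroma = np.zeros(12)
    # Approximate center frequency of MDCT bin k: (k + 0.5) * (sr/2) / n_bins.
    freqs = (np.arange(n_bins) + 0.5) * (sample_rate / 2) / n_bins
    mags = np.abs(np.asarray(mdct_frame, dtype=float))
    # Keep only bins in a musically relevant range (~C1 to ~C8).
    mask = (freqs >= 32.7) & (freqs <= 4186.0)
    # MIDI pitch number from frequency, then fold to a pitch class (0..11).
    midi = 69 + 12 * np.log2(freqs[mask] / 440.0)
    pitch_class = np.round(midi).astype(int) % 12
    # Accumulate bin magnitudes into their pitch classes.
    np.add.at(chroma, pitch_class, mags[mask])
    # Normalize so matching is insensitive to overall loudness.
    norm = np.linalg.norm(chroma)
    return chroma / norm if norm > 0 else chroma
```

Applied frame by frame, this yields the chroma sequence that is then segmented and passed to the sparse autoencoder for dimension reduction.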