Abstract

A non-intrusive speech quality assessment method for complex environments is proposed. At its core is a new sparse representation-based speech reconstruction algorithm that recovers quasi-clean speech from the noisy degraded signal. First, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. Then, in the sparse representation stage, the stopping residual error was set adaptively from the estimated cross-correlation and the noise spectrum, which was adjusted by an a posteriori SNR-weighted factor, and orthogonal matching pursuit (OMP) was applied to reconstruct the clean speech spectrum from the noisy speech. The quasi-clean speech served as the reference input to a modified PESQ perceptual model, and the mean opinion score (MOS) of the noisy degraded speech was obtained by estimating the distortion between the quasi-clean speech and the degraded speech. Experimental results show that the proposed approach achieves a correlation coefficient of 0.925 on the NOIZEUS complex-environment database, which is within 1% of the performance of the intrusive standard ITU-T PESQ and outperforms the non-intrusive standard ITU-T P.563 by 7.1%.
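The sparse representation stage can be illustrated with a minimal sketch of orthogonal matching pursuit that stops once the residual norm falls below a threshold. This is not the authors' implementation: the function name `omp_residual_stop`, the random stand-in dictionary, and the way `eps` is chosen are all assumptions made for illustration. In particular, the abstract does not give the formula that derives the stopping error from the cross-correlation and the SNR-weighted noise spectrum, so the sketch simply uses the norm of a known noise vector in its place.

```python
import numpy as np


def omp_residual_stop(D, y, eps, max_atoms=None):
    """Greedy OMP: add atoms until ||y - D x||_2 <= eps (hypothetical helper).

    D   : (n_features, n_atoms) dictionary, columns assumed unit-norm
    y   : (n_features,) observed noisy spectrum frame
    eps : stopping threshold on the residual norm
    """
    n_atoms = D.shape[1]
    if max_atoms is None:
        max_atoms = min(D.shape)
    x = np.zeros(n_atoms)
    support = []
    coef = np.zeros(0)
    residual = np.asarray(y, dtype=float).copy()
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x, residual


# Illustrative use with a random stand-in for a learned K-SVD dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 64))
D /= np.linalg.norm(D, axis=0)            # normalize atoms
x_true = np.zeros(64)
x_true[[3, 17, 40]] = [2.0, -1.5, 1.0]    # sparse "clean" code
noise = 0.05 * rng.standard_normal(32)
y = D @ x_true + noise
eps = np.linalg.norm(noise)               # assumption: stand-in for the adaptive threshold
x_hat, r = omp_residual_stop(D, y, eps)
clean_estimate = D @ x_hat                # reconstructed quasi-clean frame
```

Tying the stopping threshold to the noise level is what keeps the pursuit from fitting noise: a larger `eps` ends the loop earlier and leaves more of the observation unexplained, while a smaller one admits more atoms.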