A new pitch-range based feature set for a speaker's age and gender classification

Barkana Buket D*; Zhou Jingcheng

doi:10.1016/j.apacoust.2015.04.013

Abstract

This paper presents a pitch-range (PR) based feature set for age and gender classification. PR features represent the pitch changes over time. The performance of the proposed feature set is compared with that of MFCCs, energy, relative spectral transform-perceptual linear prediction (RASTA-PLP), and fundamental frequency (F0). Voice activity detection (VAD) is performed to extract speech utterances before feature extraction. Two classifiers, k-Nearest Neighbors (kNN) and Support Vector Machines (SVM), are used to evaluate the effectiveness of the feature sets. Experimental results are reported for the aGender database. Both the kNN and SVM classifiers achieved their highest accuracy rates with the proposed PR feature set in age + gender and age classification. In age + gender classification using 3PR features with the SVM classifier, middle-aged female speakers are recognized with an accuracy of 92.86%, followed by senior female speakers with 83.61%, children with 83.02%, middle-aged male speakers with 73.58%, young female speakers with 67.35%, and senior male speakers with 34.33%. Low classification accuracies are observed for young male speakers.

Publication date: 2015-11
