Audio Pattern Recognition of Baby Crying Sound Events

Ntalampiras Stavros<sup>*</sup>

doi:10.17743/jaes.2015.0025

Abstract

This article addresses a problem arising within the paralinguistic audio signal processing domain: classifying the state of an infant based on the patterns exhibited by its crying sound events. More specifically, we propose a methodology able to distinguish among the following five states: (a) hungry, (b) uncomfortable (needs changing), (c) needs to burp, (d) in pain, and (e) needs to sleep. A wide variety of audio parameters (Perceptual Linear Prediction, Mel Frequency Cepstral Coefficients, Perceptual Wavelet Packets, Teager Energy Operator, Temporal Modulation) related to the task at hand, along with a series of classification techniques (Multilayer Perceptron, Support Vector Machine, Random Forest, Reservoir Network, Gaussian Mixture Model, Hidden Markov Model), were customized to address the problem reliably. The final implementation exploits a representation of the audio structure comprising a set of descriptors that capture heterogeneous aspects of the signal. We then introduce Reservoir Networks to this problem, where they demonstrated encouraging performance. The final goal of the method is to provide an automatic and non-invasive framework for monitoring infants and for helping inexperienced or trainee pediatricians, parents, and babysitters to diagnose the infants' pathological status.
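Of the descriptors listed above, the Teager Energy Operator has a compact standard discrete-time form, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below only illustrates that textbook definition. The abstract does not say how the operator is applied in the proposed method (framing, sub-bands, or post-processing), so the function names `teager_energy` and `make_tone` and the test tone are hypothetical choices made for this illustration.

```python
import math


def teager_energy(signal):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output has two fewer samples than the input, because the first
    and last samples lack one of their neighbours.
    """
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


def make_tone(amplitude, omega, length):
    """Hypothetical helper: a pure cosine with digital frequency omega (rad/sample)."""
    return [amplitude * math.cos(omega * n) for n in range(length)]


# Illustrative input, not data from the paper.
tone = make_tone(amplitude=2.0, omega=0.3, length=64)
energy = teager_energy(tone)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = 2.0 ** 2 * math.sin(0.3) ** 2
```

For a pure tone the operator returns the same value at every sample, which depends on both amplitude and frequency. That joint sensitivity is the usual reason for using it as an audio descriptor.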

Publication date: 2015-05

