摘要

Recognition errors of proper nouns and foreign words significantly decrease the performance of ASR-based speech applications such as voice dialing systems, speech summarization, spoken document retrieval, and spoken query-based information retrieval (IR). The reason is that proper nouns and words that come from other languages are usually the most important key words. The loss of such words due to misrecognition in turn leads to a loss of significant information from the speech source. This paper focuses on how to improve the performance of Indonesian ASR by alleviating the problem of pronunciation variation of proper nouns and foreign words (English words in particular). To improve the proper noun recognition accuracy, proper-noun specific acoustic models are created by supervised adaptation using maximum likelihood linear regression (MLLR). To improve English word recognition, the pronunciation of English words contained in the lexicon is fixed by using rule-based English-to-Indonesian phoneme mapping. The effectiveness of the proposed method was confirmed through spoken query based Indonesian IR. We used Inference Network-based (IN-based) IR and compared its results with those of the classical Vector Space Model (VSM) IR, both using a tf-idf weighting schema. Experimental results show that IN-based IR outperforms VSM IR.

  • 出版日期2010-9