
In Chinese question-answering systems, synonymy causes correct answers to be missed during answer extraction, and polysemy causes wrong answers to be extracted. To address these problems, this paper proposes a method for computing the similarity between a question and a sentence based on Latent Semantic Analysis (LSA). The method represents questions and sentences in the vector space model (VSM), statistically analyzes a large corpus of question-answer sentence pairs using latent semantic analysis, and constructs a latent word-sentence semantic space that removes the correlation between words. The similarity between question and sentence is then computed in this semantic space, which effectively addresses the problems of synonymy and polysemy. Finally, question type and question-sentence similarity are combined in an experiment on extracting answer sentences for Chinese factoid questions. The mean reciprocal rank (MRR) with LSA is 0.47, clearly better than that of VSM. The results show that the method is effective.
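The idea of comparing a question and a sentence in a latent semantic space can be illustrated with a minimal sketch. It is not the paper's implementation: the toy English corpus, the function names (`build_term_sentence_matrix`, `fit_lsa`, `project`), raw term counts as weights, and the choice `k=2` are all assumptions made for illustration. The paper's own corpus, word segmentation, weighting scheme, and dimensionality are not given in the abstract.

```python
import numpy as np

def build_term_sentence_matrix(sentences):
    """Build a word-by-sentence count matrix from tokenized sentences."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s:
            A[index[w], j] += 1.0
    return A, index

def vectorize(tokens, index):
    """Map a tokenized text to a count vector; unknown words are ignored."""
    v = np.zeros(len(index))
    for w in tokens:
        if w in index:
            v[index[w]] += 1.0
    return v

def fit_lsa(A, k):
    """Return the k leading left singular vectors of A (the latent axes)."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]

def project(vec, U_k):
    """Project a word-space vector onto the k-dimensional latent space."""
    return U_k.T @ vec

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy corpus: "car" and "automobile" never co-occur,
# but they share the context words "engine" and "wheel".
sentences = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["car", "wheel"],
    ["automobile", "wheel"],
    ["flower", "petal"],
    ["flower", "garden"],
]
A, index = build_term_sentence_matrix(sentences)
U_k = fit_lsa(A, k=2)  # k=2 is an assumption for this toy example

q = vectorize(["car"], index)  # the "question"
vsm_sim = cosine(q, A[:, 1])   # plain VSM cosine with "automobile engine"
lsa_sim = cosine(project(q, U_k), project(A[:, 1], U_k))
lsa_sim_unrelated = cosine(project(q, U_k), project(A[:, 4], U_k))

print("VSM:", vsm_sim, "LSA:", lsa_sim, "LSA (unrelated):", lsa_sim_unrelated)
```

In this toy corpus the VSM cosine between the question and the second sentence is zero because they share no word, whereas the latent-space cosine is close to 1 because both load on the same latent axis. The sentence about flowers stays near zero in both spaces.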

  • Publication date: 2006
