摘要

For the complex questions of Chinese question answering system, we propose an answer extraction method with discourse structure feature combination. This method uses the relevance of questions and answers to learn to rank the answers. Firstly, the method analyses questions to generate the query string, and then submits the query string to search engines to retrieve relevant documents. Secondly, the method makes retrieved documents segmentation and identifies the most relevant candidate answers, in addition, it uses the rhetorical relations of rhetorical structure theory to analyze the relationship to determine the inherent relationship between paragraphs or sentences and generate the answer candidate paragraphs or sentences. Thirdly, we construct the answer ranking model,, and extract five feature groups and adopt Ranking Support Vector Machine (SVM) algorithm to train ranking model. Finally, it re-ranks the answers with the training model and find the optimal answers. Experiments show that the proposed method combined with discourse structure features can effectively improve the answer extracting accuracy and the quality of non-factoid answers. The Mean Reciprocal Rank (MRR) of the answer extraction reaches 69.53%.