Abstract

Automatic text summarization is an essential tool in this era of information overload. In this paper we present an automatic extractive Arabic text summarization system in which the user can cap the size of the final summary. It is a direct system where no machine learning is involved. We use a two-pass algorithm. In the first pass, we produce a primary summary using Rhetorical Structure Theory (RST). In the second pass, we assign a score to each sentence in the primary summary; these scores guide the generation of the final summary. For the final output, sentences are selected with the objective of maximizing the overall score of the summary, whose size must not exceed the user-selected limit. We used ROUGE to evaluate our system-generated summaries of various lengths against those written by a (human) news editorial professional.
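
The final selection step described above (maximize the total score subject to a size cap) can be illustrated with a small sketch. The abstract does not say how the selection is solved, so the following assumes a standard 0/1 knapsack formulation solved by dynamic programming, with summary size measured in whitespace-separated words. The function name `select_sentences`, the input format, and the example scores are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def select_sentences(sentences: List[Tuple[str, float]], max_words: int) -> List[str]:
    """Pick (text, score) pairs that maximize total score within max_words.

    Assumption: size is counted in whitespace-separated words, and the
    selection is treated as a 0/1 knapsack problem (not confirmed by the source).
    """
    lengths = [len(text.split()) for text, _ in sentences]
    n = len(sentences)

    # best[i][w] = highest total score using the first i sentences within w words
    best = [[0.0] * (max_words + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], sentences[i - 1][1]
        for w in range(max_words + 1):
            best[i][w] = best[i - 1][w]  # option 1: skip sentence i
            if length <= w:
                candidate = best[i - 1][w - length] + score  # option 2: take it
                if candidate > best[i][w]:
                    best[i][w] = candidate

    # Walk back through the table to recover which sentences were taken.
    chosen = []
    w = max_words
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            chosen.append(i - 1)
            w -= lengths[i - 1]
    chosen.reverse()  # restore original document order
    return [sentences[i][0] for i in chosen]


# Hypothetical second-pass scores for four primary-summary sentences.
primary = [("a b c", 3.0), ("d e", 2.5), ("f g", 2.0), ("h", 0.5)]
print(select_sentences(primary, max_words=4))
```

With a cap of four words, the sketch prefers the two mid-scoring sentences (total 4.5) over the single highest-scoring sentence plus the shortest one (total 3.5), which is the behavior a greedy pick by score alone would miss.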

  • Publication date: 2012-8