Automatic indexing of electronic texts - AspectiX

Authors: Ladewig C*; Henkes M
Source: NFD Information-Wissenschaft und Praxis, 2001, 52(3): 159-164.

Abstract

AspectiX* is a method for the syntactic, content-based evaluation of electronic texts. It is built on an index whose elements are linked to a universal, aspect-centred classification system, which makes syntactic retrieval possible. Using these classification elements, which centre on the content of the respective search item, the method extracts the information in the electronic text and evaluates the results according to the corresponding aspect. These aspects make it possible to classify unknown text documents automatically on the basis of their content, regardless of language or topic, without having to rely on a sequence of symbols, as search engines on the web do. In the course of these tasks the index can be extended both intellectually (that is, manually) and automatically, and it delivers retrieval results with a precision of nearly 100% and, at the same time, a recall of almost 100%. According to the authors, this makes AspectiX 40% superior to other search engines in terms of precision and recall, as is shown in several trial runs on three databases of different sizes and subject areas.

  • Publication date: 2001-5