Abstract

Technical documents, which often have complicated structures, are frequently produced during Architecture/Engineering/Construction (A/E/C) projects and research. Applying information retrieval (IR) techniques directly to long or multi-topic documents often does not lead to satisfactory results. One way to address the problem is to partition each document into several "passages" and treat each passage as an independent document. In this research, a novel passage partitioning approach is designed. It generates passages according to domain knowledge, which is represented by a base domain ontology. Such a passage is herein defined as an OntoPassage. To demonstrate the advantage of the OntoPassage partitioning approach, this research implements a concept-based IR system that illustrates its application. The research also compares the OntoPassage partitioning approach with several conventional passage partitioning approaches to verify its IR effectiveness. The results show that, with the proposed OntoPassage approach, IR effectiveness on domain-specific technical reports is as good as that of conventional passage partitioning approaches. In addition, the OntoPassage approach makes it possible to display the concepts in each passage, so that concept-based IR can be implemented.

  • Publication date: 2012-4