An automatic document processing system for medical data extraction

Authors: Adamo Francesco; Attivissimo Filippo*; Di Nisio Attilio; Spadavecchia Maurizio
Source: Measurement, 2015, 61: 88-99.
DOI: 10.1016/j.measurement.2014.10.032

Abstract

This paper presents an automatic document processing system that extracts the data contained in medical laboratory results printed on paper. The final goal of the research is to automate the collection of medical data and to enable efficient management and dissemination of the information. The following processing steps of the system are described in detail: image preprocessing; layout analysis to identify the tables contained in the document; and extraction and classification of the laboratory results. Notable features of the system include the use of an open source OCR engine as a basis for further processing and the storage of the retrieved data in XML format for ease of sharing. The knowledge base used to guide the data extraction is also explained. The proposed approach has been tested on several document formats and its performance analyzed.
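
As a rough illustration of the last step mentioned in the abstract (storing retrieved data in XML for sharing), the sketch below serializes a list of already extracted laboratory results with Python's standard library. It is not the authors' implementation: the function name `results_to_xml`, the element names (`laboratory_report`, `result`, `test`, `value`, `unit`, `reference_range`), and the sample record are all hypothetical, since the abstract does not describe the actual XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; the paper's real XML schema is not given in the abstract.
FIELDS = ("test", "value", "unit", "reference_range")


def results_to_xml(results):
    """Serialize extracted laboratory results (a list of dicts) to an XML string."""
    root = ET.Element("laboratory_report")
    for record in results:
        item = ET.SubElement(root, "result")
        for field in FIELDS:
            # Missing fields are written as empty elements rather than skipped.
            ET.SubElement(item, field).text = str(record.get(field, ""))
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, for illustration only (not data from the paper).
sample = [
    {"test": "Glucose", "value": "92", "unit": "mg/dL", "reference_range": "70-110"},
]
xml_text = results_to_xml(sample)
print(xml_text)
```

Keeping each result as a flat record with named child elements makes the output easy to parse back with `ET.fromstring`, which is one plausible way to support the sharing goal the abstract describes.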

  • Publication date: 2015-02
