A learning framework for the optimization and automation of document binarization methods

Cheriet Mohamed; Moghaddam Reza Farrahi*; Hedjam Rachid

doi:10.1016/j.cviu.2012.11.003

Abstract

Almost all binarization methods have a few parameters that require setting. However, they do not usually achieve their upper-bound performance unless the parameters are individually set and optimized for each input document image. In this work, a learning framework for the optimization of binarization methods is introduced, which is designed to determine the optimal parameter values for a given document image. The framework, which works with any binarization method, has a standard structure and performs three main steps: (i) it extracts features, (ii) it estimates the optimal parameters, and (iii) it learns the relationship between the features and the optimal parameters. First, an approach is proposed to generate numerical feature vectors from 2D data. The statistics of various maps are extracted and then combined, in a nonlinear way, into a final feature vector. The optimal behavior is learned using support vector regression (SVR). Although the framework works with any binarization method, two methods are considered as typical examples in this work: the grid-based Sauvola method, and Lu's method, which placed first in the DIBCO'09 contest. The experiments are performed on the DIBCO'09 and H-DIBCO'10 datasets, and on combinations of these datasets, with promising results.
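
The three steps listed in the abstract can be illustrated with a small Python sketch, assuming NumPy and scikit-learn are available. It is not the authors' implementation: the function names (`sauvola_global`, `f_measure`, `features`, `best_k`, `synthetic_page`) are hypothetical, the whole-image Sauvola-style threshold is a simplification of the grid-based Sauvola method, the four summary statistics are a stand-in for the paper's map-based feature vector, and the pages are synthetic rather than DIBCO images.

```python
import numpy as np
from sklearn.svm import SVR


def sauvola_global(img, k, R=128.0):
    """Simplified Sauvola-style threshold computed over the whole image.
    Assumption: the paper's grid-based variant works on local windows instead."""
    m, s = img.mean(), img.std()
    t = m * (1.0 + k * (s / R - 1.0))
    return img < t  # True marks a text (dark) pixel


def f_measure(pred, truth):
    """Pixel-level F-measure of a binary prediction against ground truth."""
    tp = np.logical_and(pred, truth).sum()
    if tp == 0:
        return 0.0
    precision = tp / pred.sum()
    recall = tp / truth.sum()
    return 2 * precision * recall / (precision + recall)


def features(img):
    """Step (i): a stand-in feature vector built from intensity and
    gradient-magnitude statistics (not the paper's feature maps)."""
    gy, gx = np.gradient(img.astype(float))
    grad = np.hypot(gx, gy)
    return np.array([img.mean(), img.std(), grad.mean(), grad.std()]) / 255.0


def best_k(img, truth, grid):
    """Step (ii): pick the parameter value on the grid with the best F-measure."""
    scores = [f_measure(sauvola_global(img, k), truth) for k in grid]
    return float(grid[int(np.argmax(scores))])


def synthetic_page(rng):
    """Hypothetical toy 'document': random ink pixels on a noisy background."""
    truth = rng.random((64, 64)) < 0.15
    background = rng.uniform(150, 230)
    ink = rng.uniform(20, 100)
    noise = rng.normal(0, rng.uniform(5, 25), truth.shape)
    img = np.clip(np.where(truth, ink, background) + noise, 0, 255)
    return img, truth


rng = np.random.default_rng(0)
grid = np.linspace(0.05, 0.6, 12)
pages = [synthetic_page(rng) for _ in range(40)]

X = np.stack([features(img) for img, _ in pages])
y = np.array([best_k(img, truth, grid) for img, truth in pages])

# Step (iii): learn the mapping from features to the optimal parameter with SVR.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:30], y[:30])

# At test time no ground truth is needed: the parameter is predicted from features.
k_pred = model.predict(X[30:])
```

In this sketch the ground truth is used only during training, to label each page with its best parameter value; for an unseen page, `model.predict` supplies the parameter directly from the page's features, which is the automation the abstract describes.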

Publication date: 2013-3
