Abstract

An enhanced text detection technique (ETDT) is proposed to help visually impaired people overcome reading challenges. This work enhances the edge-preserving maximally stable extremal regions (eMSER) algorithm using the pyramid histogram of oriented gradients (PHOG). In the ETDT approach, histograms of oriented gradients (HOG) derived from different pyramid levels are important for detecting maximally stable extremal regions (MSER) because they give more spatial information than HOG from a single level. To group text, a new four-line text-grouping method is designed. A new text feature, the Shapeness Score, is also proposed; combined with other features based on morphology and stroke widths, it significantly helps identify text regions. Using a 10-dimensional feature vector, the J48 decision tree and AdaBoost machine learning algorithms identify the text regions in the images. The algorithm yields better results than existing benchmark algorithms on the ICDAR 2011 born-digital dataset but still needs improvement on the scene text dataset.
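
The pyramid idea described above can be illustrated with a minimal sketch: orientation histograms are computed over progressively finer grids and concatenated, so the descriptor records where gradients occur as well as which directions they take. This is a generic, simplified PHOG written with NumPy only, not the configuration used in ETDT. The function name `phog`, the parameters `levels` and `bins`, the use of every pixel rather than edge contours only, and the normalization are all assumptions made for illustration.

```python
import numpy as np


def phog(image, levels=2, bins=8):
    """Simplified PHOG: concatenate HOGs over a spatial pyramid.

    Hypothetical helper for illustration; the abstract does not
    specify the pyramid depth or bin count used in ETDT.
    """
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)

    # Unsigned gradient orientation, folded into [0, pi).
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    bin_index = np.minimum((orientation / np.pi * bins).astype(int), bins - 1)

    height, width = img.shape
    histograms = []
    for level in range(levels + 1):
        cells = 2 ** level  # level l splits the image into 2^l x 2^l cells
        row_edges = np.linspace(0, height, cells + 1).astype(int)
        col_edges = np.linspace(0, width, cells + 1).astype(int)
        for r in range(cells):
            for c in range(cells):
                rows = slice(row_edges[r], row_edges[r + 1])
                cols = slice(col_edges[c], col_edges[c + 1])
                # Magnitude-weighted orientation histogram for this cell.
                hist = np.bincount(
                    bin_index[rows, cols].ravel(),
                    weights=magnitude[rows, cols].ravel(),
                    minlength=bins,
                )
                histograms.append(hist)

    descriptor = np.concatenate(histograms)
    total = descriptor.sum()
    # Normalize to unit sum; leave an all-zero descriptor unchanged.
    return descriptor / total if total > 0 else descriptor
```

With `levels=2` and `bins=8`, the descriptor has 8 × (1 + 4 + 16) = 168 entries, whereas a single-level HOG (`levels=0`) has only 8. The extra entries are what carry the additional spatial information.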

  • Publication date: 2017-10