Abstract

This paper presents a robust text detection approach based on color-enhanced contrasting extremal regions (CERs) and neural networks. Given a color natural scene image, six component-trees are built: one each from its grayscale image, from its hue and saturation channel images in a perception-based illumination invariant color space, and from the inverted versions of these three images. From each component-tree, color-enhanced CERs are extracted as character candidates. Using a "divide-and-conquer" strategy, each candidate image patch is reliably labeled by rules as one of five types, namely Long, Thin, Fill, Square-large and Square-small, and then classified as text or non-text by a corresponding neural network trained with an ambiguity-free learning strategy. After unambiguous non-text components are pruned, repeating components in each component-tree are pruned further. The remaining components are then grouped into candidate text-lines and verified by another set of neural networks. Finally, the results from the six component-trees are combined, and a post-processing step is used to recover lost characters. Our proposed method achieves superior performance on both the ICDAR-2011 and ICDAR-2013 "Reading Text in Scene Images" test sets.
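As a rough illustration of the first step only, the six input images from which the component-trees are built can be sketched as follows. This is a minimal sketch under stated assumptions, not the method itself: standard HSV (via Python's `colorsys`) stands in for the perception-based illumination invariant color space, the grayscale conversion uses the common ITU-R BT.601 luma weights, and the function name `six_channels` is hypothetical.

```python
import colorsys


def six_channels(rgb_image):
    """Return six single-channel images (values 0..255) from an RGB image.

    rgb_image: list of rows, each row a list of (r, g, b) tuples in 0..255.
    ASSUMPTION: HSV is used here as a stand-in for the paper's
    perception-based illumination invariant color space.
    """
    gray, hue, sat = [], [], []
    for row in rgb_image:
        g_row, h_row, s_row = [], [], []
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # BT.601 luma weights (an assumption; the abstract does not say
            # how the grayscale image is computed).
            g_row.append(round(0.299 * r + 0.587 * g + 0.114 * b))
            h_row.append(round(h * 255))
            s_row.append(round(s * 255))
        gray.append(g_row)
        hue.append(h_row)
        sat.append(s_row)

    channels = {"gray": gray, "hue": hue, "saturation": sat}
    # Add the inverted version of each channel, giving six images in total.
    for name in list(channels):
        channels[name + "_inverted"] = [
            [255 - v for v in row] for row in channels[name]
        ]
    return channels


if __name__ == "__main__":
    image = [[(255, 0, 0), (255, 255, 255)]]  # one red pixel, one white pixel
    for name, channel in six_channels(image).items():
        print(name, channel)
```

Each of the six images would then feed its own component-tree; building the trees and extracting CERs is outside the scope of this sketch.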