Abstract

Text detection in natural scene images is an important and challenging task in image analysis. In this paper, we present a robust system that detects natural scene text based on the appearance of text regions. The framework consists of three parts: automatic image partition, two-stage grouping, and two-layer classification. The first part partitions an image into unconstrained sub-images using the statistical distribution of sampling points. The two-stage grouping method groups connected components (CCs) into text regions: the first stage performs grouping within each sub-image, and the second stage connects regions across different sub-images. A two-layer classification mechanism then classifies the candidate text regions. The first layer computes a similarity score over region blocks, and the second layer is a support vector machine (SVM) classifier using histogram of oriented gradients (HOG) features. Before candidate text regions are classified, a normalization step rectifies perspective distortion, which improves the accuracy and robustness of the final output. The proposed system is evaluated on four types of datasets: two ICDAR Robust Reading Competition datasets, a born-digital image dataset, a video image dataset, and a perspective distortion image dataset. The experimental results demonstrate that the proposed framework outperforms state-of-the-art localization algorithms and is robust to multiple background outliers.
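
As a rough illustration of the two-layer classification idea summarized above, the following Python sketch chains a block-similarity filter with a second-stage decision function. It is not the implementation used in this work: the vertical-strip block layout, the cosine similarity measure, the simplified single-cell orientation histogram standing in for full HOG, the threshold value, and all function names are assumptions made for illustration, and `second_layer` is a hypothetical callable standing in for the decision function of a trained SVM.

```python
import math
from typing import Callable, List

Image = List[List[float]]  # grayscale candidate region, row-major


def split_blocks(region: Image, n_blocks: int) -> List[Image]:
    """Split a region into n_blocks vertical strips of equal width (assumed layout)."""
    step = len(region[0]) // n_blocks
    return [[row[i * step:(i + 1) * step] for row in region] for i in range(n_blocks)]


def orientation_histogram(block: Image, bins: int = 9) -> List[float]:
    """Simplified HOG-style feature: one cell of unsigned gradient orientations,
    weighted by gradient magnitude and L2-normalized."""
    hist = [0.0] * bins
    for y in range(1, len(block) - 1):
        for x in range(1, len(block[0]) - 1):
            gx = block[y][x + 1] - block[y][x - 1]
            gy = block[y + 1][x] - block[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # fold to unsigned orientation
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def block_similarity(region: Image, n_blocks: int = 4) -> float:
    """Layer 1 score (assumed form): mean pairwise cosine similarity of block features."""
    feats = [orientation_histogram(b) for b in split_blocks(region, n_blocks)]
    pairs = [(i, j) for i in range(len(feats)) for j in range(i + 1, len(feats))]
    return sum(cosine(feats[i], feats[j]) for i, j in pairs) / len(pairs)


def classify_region(
    region: Image,
    second_layer: Callable[[List[float]], float],
    sim_threshold: float = 0.5,  # hypothetical threshold
    n_blocks: int = 4,
) -> bool:
    """Layer 1 rejects regions whose blocks look dissimilar; layer 2 decides the rest."""
    if block_similarity(region, n_blocks) < sim_threshold:
        return False
    features = [v for b in split_blocks(region, n_blocks) for v in orientation_histogram(b)]
    return second_layer(features) > 0.0


# Toy inputs: a striped region (repetitive, stroke-like) and a flat region.
striped = [[255.0 if x % 4 < 2 else 0.0 for x in range(16)] for _ in range(8)]
flat = [[0.0] * 16 for _ in range(8)]


def toy_decision(features: List[float]) -> float:
    """Hypothetical stand-in for a trained SVM decision function."""
    return sum(features) - 1.0
```

With these toy inputs, `classify_region(striped, toy_decision)` passes the similarity filter and is accepted by the stand-in decision function, while `classify_region(flat, toy_decision)` is rejected by the first layer because its blocks carry no gradient information. In a real system the second layer would be a classifier trained on labeled text and non-text regions.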