Abstract
In this paper, we propose a novel method for fast arbitrary-oriented text detection in scene images. The method is simple yet effective: it predicts word-level bounding boxes with a single fully convolutional network. It extracts features from the input image with a residual network and applies multi-level fusion to the extracted features. The network has two outputs: a pixel-wise classification between text and non-text, and word-level bounding boxes. Our method achieves F-measures of 83.46% and 56.39% on the ICDAR2015 Incidental Scene Text benchmark and the COCO-Text dataset, respectively, outperforming previous methods by a large margin. It also runs at over 11 FPS on 704 × 1280 images, which is much faster than previous work.
- Publication date: 2018
- Affiliation: Shanghai Jiao Tong University