Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene Text Detection in Video Images

Liang Guozhu; Shivakumara Palaiahnakote; Lu Tong<sup>*</sup>; Tan Chew Lim

doi:10.1109/TIP.2015.2465169

Abstract

Scene text detection in video and natural scene images is challenging because of variations in background, contrast, text type, font type, font size, and so on. Arbitrarily oriented text in multiple scripts adds further complexity. The proposed approach introduces the idea of convolving a Laplacian with wavelet sub-bands at different levels in the frequency domain to enhance low-resolution text pixels. The results obtained from the different sub-bands (spectra) are then fused to detect candidate text pixels. We explore maximally stable extremal regions (MSER) together with the stroke width transform (SWT) to detect candidate text regions. Text alignment is based on the distance between the nearest-neighbor clusters of candidate text regions. In addition, the approach presents a new symmetry-driven nearest-neighbor method for restoring full text lines. We evaluate the proposed method on our own collected video data and on several benchmark data sets, namely ICDAR 2011, ICDAR 2013, and MSRA-TD500. The proposed approach is compared with state-of-the-art methods to show its superiority over them.
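The enhancement and fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-level Haar wavelet, applies a 4-neighbor Laplacian in the spatial domain rather than the frequency domain, and fuses the detail sub-bands by summing absolute responses and thresholding at the mean. The function names (`haar_dwt2`, `laplacian`, `fuse_candidates`) and the threshold rule are hypothetical choices made for illustration only.

```python
def haar_dwt2(img):
    """One-level 2D Haar decomposition (averaging variant, an assumption).

    img is a list of rows with even height and width.
    Returns the four half-size sub-bands (LL, LH, HL, HH).
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 4.0)  # local average
            row_lh.append((a + b - d - e) / 4.0)  # top row minus bottom row
            row_hl.append((a - b + d - e) / 4.0)  # left column minus right column
            row_hh.append((a - b - d + e) / 4.0)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def laplacian(band):
    """4-neighbor Laplacian with replicated borders."""
    h, w = len(band), len(band[0])

    def at(r, c):
        # Clamp indices so border pixels reuse their nearest neighbor.
        return band[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    return [
        [at(r - 1, c) + at(r + 1, c) + at(r, c - 1) + at(r, c + 1) - 4 * at(r, c)
         for c in range(w)]
        for r in range(h)
    ]


def fuse_candidates(img):
    """Fuse Laplacian responses of the detail sub-bands into a candidate mask.

    Fusion rule (assumed, not from the paper): sum of absolute responses,
    thresholded at the mean of the fused map.
    """
    _, lh, hl, hh = haar_dwt2(img)
    responses = [laplacian(band) for band in (lh, hl, hh)]
    h, w = len(lh), len(lh[0])
    fused = [[sum(abs(resp[r][c]) for resp in responses) for c in range(w)]
             for r in range(h)]
    mean = sum(sum(row) for row in fused) / (h * w)
    mask = [[fused[r][c] > mean for c in range(w)] for r in range(h)]
    return fused, mask
```

For instance, calling `fuse_candidates` on an 8×8 image that is zero everywhere except for one bright column returns a 4×4 mask that is true only in the sub-band column covering that stroke. A fuller pipeline would repeat this over several decomposition levels and pass the resulting candidates to MSER and stroke width analysis.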

Publication date: 2015-11
Affiliation: Nanjing University
