Abstract

In optical character recognition, text must first be extracted from the image. Because only a complete text string conveys the meaning of a word, the extracted individual characters should be grouped into text strings before recognition. Topographic maps contain many text strings whose characters vary in color, size and orientation, and existing methods cannot group them effectively. This paper proposes a dynamic character grouping method that groups characters into text strings under four consistency constraints: color, size, spacing and direction. Characters in the same word have similar colors, sizes and inter-character distances, and they lie along a curve of limited bending, whereas characters from different words generally do not share these properties. Based on these features, the background pixels around the characters are expanded to link the characters into text strings. The color consistency constraint allows characters of different colors to be grouped correctly, and the improved direction consistency constraint handles curved character strings more accurately. Experimental results show that the method groups characters more efficiently, especially when the first or last character of a word lies close to characters of another word.
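As a rough illustration of how the four consistency constraints might be combined, the following minimal sketch tests whether a candidate character can extend a partially built string. It is not the background-pixel expansion procedure of this paper: it swaps in a simpler greedy nearest-neighbour chaining, and the `Char` record, all function names and every threshold (`max_dist`, `max_ratio`, `max_gap`, `max_turn`) are assumptions chosen only for the example.

```python
from dataclasses import dataclass
from math import atan2, degrees, hypot


@dataclass
class Char:
    """Hypothetical record for one extracted character."""
    x: float       # centre x in pixels
    y: float       # centre y in pixels
    size: float    # character height in pixels
    color: tuple   # (r, g, b)


def color_ok(a, b, max_dist=40.0):
    # Colour consistency: Euclidean RGB distance below an assumed threshold.
    return sum((p - q) ** 2 for p, q in zip(a.color, b.color)) ** 0.5 <= max_dist


def size_ok(a, b, max_ratio=1.5):
    # Size consistency: the height ratio must stay close to 1.
    return max(a.size, b.size) / min(a.size, b.size) <= max_ratio


def spacing_ok(a, b, max_gap=2.0):
    # Spacing consistency: centre distance bounded relative to character size.
    return hypot(b.x - a.x, b.y - a.y) <= max_gap * max(a.size, b.size)


def direction_ok(prev, a, b, max_turn=30.0):
    # Direction consistency: limit the turn between consecutive links,
    # so gently curved strings pass but sharp corners do not.
    if prev is None:
        return True
    h1 = atan2(a.y - prev.y, a.x - prev.x)
    h2 = atan2(b.y - a.y, b.x - a.x)
    turn = abs(degrees(h2 - h1)) % 360
    return min(turn, 360 - turn) <= max_turn


def can_link(prev, a, b):
    return (color_ok(a, b) and size_ok(a, b)
            and spacing_ok(a, b) and direction_ok(prev, a, b))


def grow_string(seed, chars):
    # Greedily extend a string from `seed` with the nearest consistent character.
    chain = [seed]
    remaining = [c for c in chars if c is not seed]
    while True:
        prev = chain[-2] if len(chain) > 1 else None
        last = chain[-1]
        candidates = [c for c in remaining if can_link(prev, last, c)]
        if not candidates:
            return chain
        nxt = min(candidates, key=lambda c: hypot(c.x - last.x, c.y - last.y))
        chain.append(nxt)
        remaining.remove(nxt)
```

In this sketch a nearby character is rejected as soon as any one constraint fails, so a red character or a much larger character next to a black string would not be appended even when it is the closest neighbour.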