Text line segmentation for gray scale historical document images Conference Paper

abstract

  • In this paper we present a new approach for text line segmentation that works directly on gray-scale document images. Our algorithm constructs a distance transform directly on the gray-scale image and uses it to compute two types of seams: medial seams and separating seams. A medial seam is a chain of pixels that crosses the text area of a text line, and a separating seam is a path that passes between two consecutive text lines. The medial seam determines a text line, and the separating seams define its upper and lower boundaries. Both kinds of seams propagate according to energy maps, which are defined based on the constructed distance transform. We have run experiments on several datasets and obtained encouraging results.
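
The abstract does not say how the energy maps are built or how a seam is traced through them. As a rough illustration only, the sketch below assumes an energy map is already available as a plain list of rows and finds a left-to-right path of minimal total energy with the standard dynamic-programming recurrence known from seam carving. The function name `min_energy_horizontal_seam`, the three-neighbour step rule, and the toy values are assumptions made for this example and are not taken from the paper.

```python
def min_energy_horizontal_seam(energy):
    """Return one row index per column for a left-to-right path of
    minimal total energy.

    `energy` is a list of rows, each a list of numbers. Between adjacent
    columns the path may stay in the same row or move one row up or down
    (an assumed connectivity rule, not one stated in the abstract).
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c]: minimal cumulative energy of any path ending at (r, c)
    cost = [[0.0] * cols for _ in range(rows)]
    # back[r][c]: row of the predecessor pixel in column c - 1
    back = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        cost[r][0] = energy[r][0]
    for c in range(1, cols):
        for r in range(rows):
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Start from the cheapest end point in the last column and walk back.
    r = min(range(rows), key=lambda i: cost[i][cols - 1])
    seam = [0] * cols
    for c in range(cols - 1, -1, -1):
        seam[c] = r
        r = back[r][c]
    return seam


# Toy energy map (made-up values): a low-energy band that dips one row.
toy_energy = [
    [9, 9, 9, 9],
    [1, 1, 9, 1],
    [9, 9, 1, 9],
    [9, 9, 9, 9],
]
print(min_energy_horizontal_seam(toy_energy))  # [1, 1, 2, 1]
```

The same routine could in principle trace either kind of seam, since only the energy map differs. How each map is derived from the gray-scale distance transform is specific to the paper and is not reproduced here.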

publication date

  • January 1, 2011