Aligning transcript of historical documents using energy minimization Conference Paper

abstract

  • A considerable ongoing effort to digitize historical manuscripts has produced images of original manuscripts, some accompanied by transcripts. Aligning the text in the input image with the text in the transcript will allow learning, training, and evaluating recognition algorithms. Here we propose a system that computes the alignment by formulating the problem as an energy minimization task, where the alignment is performed between the input line image and a synthetic one rendered from the transcript. The energy function works at the connected-component level and combines a visual similarity measure with a learned distance metric that distinguishes inter-word from intra-word connected components.
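
The abstract does not give the energy terms or the optimizer, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes each connected component in the input line is labeled with the index of a component in the synthetic line, that labels are non-decreasing from left to right, and that the energy is a sum of per-component matching costs plus a penalty for skipping synthetic components. The function name `align_components`, the parameter `skip_penalty`, and the use of component widths as a stand-in for visual similarity are all hypothetical; the learned inter-word/intra-word metric is not modeled here.

```python
import numpy as np


def align_components(unary, skip_penalty=1.0):
    """Minimize a chain energy with dynamic programming (illustrative only).

    unary[i, j] is an assumed cost of matching input component i to
    synthetic component j. The pairwise term charges skip_penalty for
    every synthetic component skipped between consecutive labels.
    Returns the label sequence and the minimum energy.
    """
    n, m = unary.shape
    cost = np.full((n, m), np.inf)      # best energy ending at (i, j)
    back = np.zeros((n, m), dtype=int)  # backpointers for label recovery
    cost[0] = unary[0]

    for i in range(1, n):
        for j in range(m):
            # Monotonicity: the previous label k must satisfy k <= j.
            k = np.arange(j + 1)
            transition = skip_penalty * np.maximum(0, j - k - 1)
            total = cost[i - 1, : j + 1] + transition
            best = int(np.argmin(total))
            cost[i, j] = total[best] + unary[i, j]
            back[i, j] = best

    # Trace back from the cheapest final label.
    labels = np.zeros(n, dtype=int)
    labels[-1] = int(np.argmin(cost[-1]))
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i, labels[i]]
    return labels, float(cost[-1].min())


# Toy example: widths stand in for a visual similarity measure (assumption).
input_widths = np.array([5.0, 5.2, 9.0, 3.0])
synthetic_widths = np.array([5.0, 9.0, 3.0])
unary = np.abs(np.subtract.outer(input_widths, synthetic_widths))
labels, energy = align_components(unary)
print(labels.tolist(), energy)
```

In this toy case the first two input components both map to the first synthetic component, which is the kind of many-to-one assignment that arises when a character in the manuscript breaks into several connected components.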

publication date

  • August 23, 2015