Piece-wise painting technique for line segmentation of unconstrained handwritten text: A specific study with Persian text documents

Alaei, A. and Nagabhushan, P. and Pal, U. (2011) Piece-wise painting technique for line segmentation of unconstrained handwritten text: A specific study with Persian text documents. PATTERN ANALYSIS AND APPLICATIONS, 14 (4). pp. 381-394. ISSN 1433-7541

Official URL: DOI 10.1007/s10044-011-0226-x

Abstract

The most important and difficult task in text document analysis is accurate line segmentation, particularly when the document is composed of unconstrained handwritten text. To accomplish this objective, a painting scheme is proposed in this work. Motivated by the fact that handwritten Persian texts pose the most critical challenges in text-line segmentation, the new method was devised by studying cursive Persian scripts extensively; nevertheless, the proposed line segmentation algorithm is in general applicable to handwritten text in any language or script. The text block is decomposed vertically into parallel pipe structures called strips. Each row in each strip is painted with a gray intensity equal to the average gray value of all pixels in that row-strip. Subsequently, the painted pipes are converted into a two-tone painting, which is then smoothed. The white/black spaces in each pipe of the smoothed image are analyzed to obtain a short line of separation, termed a Piece-wise Potential Separating Line (PPSL), between two consecutive black spaces. The PPSLs are concatenated to produce the segmentation of text lines. Some additional procedures are built to handle certain anomalies that may occur. The scheme is validated by extensive experimentation. We tested the proposed algorithm on 52 pages of Persian text documents containing 823 lines in total, and a correct line segmentation rate of 92.35% was achieved. The proposed algorithm was also tested on two other datasets of 152 and 200 handwritten text pages in different languages. Comparison with various approaches presented in recent literature demonstrated the efficiency and script independence of the proposed algorithm.
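
The painting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed threshold used for the two-tone conversion, the run-length rule used for smoothing, the choice of the gap midpoint as the separator row, and all function and parameter names (paint_strips, two_tone, smooth, ppsl_rows, segment, strip_width, threshold, min_run) are assumptions made for illustration. The concatenation of PPSLs across strips and the anomaly-handling procedures are omitted.

```python
def paint_strips(image, strip_width):
    """Paint each row of each vertical strip with its mean gray value."""
    starts = range(0, len(image[0]), strip_width)
    return [
        [sum(row[s:s + strip_width]) / len(row[s:s + strip_width]) for s in starts]
        for row in image
    ]


def two_tone(painted, threshold=200):
    """Mark a row-strip as black (1) when its mean is darker than the
    threshold, else white (0). The fixed threshold is an assumption."""
    return [[1 if value < threshold else 0 for value in row] for row in painted]


def black_runs(column):
    """Return (start, end) row indices, end exclusive, of each run of 1s."""
    runs, start = [], None
    for i, value in enumerate(column):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(column)))
    return runs


def smooth(column, min_run=2):
    """Erase black runs shorter than min_run rows.
    A crude stand-in for the smoothing step, chosen for illustration."""
    out = list(column)
    for start, end in black_runs(column):
        if end - start < min_run:
            out[start:end] = [0] * (end - start)
    return out


def ppsl_rows(column):
    """Row index at the middle of the white gap between each pair of
    consecutive black runs (one candidate separator per gap)."""
    runs = black_runs(column)
    return [
        (prev_end + next_start - 1) // 2
        for (_, prev_end), (next_start, _) in zip(runs, runs[1:])
    ]


def segment(image, strip_width, threshold=200, min_run=2):
    """Return, for every strip, the separator rows found in that strip."""
    tones = two_tone(paint_strips(image, strip_width), threshold)
    columns = [[row[s] for row in tones] for s in range(len(tones[0]))]
    return [ppsl_rows(smooth(column, min_run)) for column in columns]


# Toy page: 10 rows x 8 columns, 255 = white paper, 0 = black ink.
# Two "text lines" occupy rows 1-2 and rows 6-8.
WHITE, BLACK = 255, 0
page = [[WHITE] * 8 for _ in range(10)]
for r in (1, 2, 6, 7, 8):
    page[r] = [BLACK, WHITE] * 4

separators = segment(page, strip_width=4)
print(separators)
```

On this toy page each of the two strips yields a single separator row in the white gap between the two inked bands. A real document would need a data-driven threshold and strip width, which this sketch does not attempt to estimate.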

Item Type: Article
Uncontrolled Keywords: Text-line segmentation; Color representation; Piece-wise painting algorithm; Piece-wise potential separating line; Persian (Farsi) cursive text document
Subjects: D Physical Science > Computer Science
Divisions: Department of > Computer Science
Date Deposited: 16 Aug 2019 06:10
Last Modified: 16 Aug 2019 06:10
URI: http://eprints.uni-mysore.ac.in/id/eprint/2364
