Segmentation of offline handwritten Arabic text

Ghaleb, Hashem and Nagabhushan, P. and Pal, Umapada (2017) Segmentation of offline handwritten Arabic text. In: 2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR), 3-5 April 2017, Nancy, France.

Full text not available from this repository.
Official URL: https://dx.doi.org/10.1109/ASAR.2017.8067757

Abstract

Arabic script is cursive in both printed and handwritten forms, and this inherent cursiveness makes segmentation challenging. An Arabic word generally consists of multiple parts known as Parts of Arabic Words (PAWs), or simply sub-words. Sub-words frequently share the same vertical space, which makes vertical-projection segmentation ineffective. Several Arabic letters have annexed parts (diacritics) located above or below the main part of the character, and the relative positions of the annexed and main parts vary considerably in handwritten text. This paper takes up the task of segmenting offline handwritten Arabic text down to the character level. First, graph-theoretic modeling is used to extract the connected components of the word image. These components are analyzed in detail to segment the input image into sub-words. Diacritics are then removed. Next, a large number of candidate segmentation points are identified using two strategies that rely on stroke thickness as a heuristic. Final segmentation points are obtained by applying a set of rules to the candidate points. Finally, each sub-word is segmented and the diacritics are restored to their respective segments, taking diacritic displacement into account. Experiments are conducted on a set of handwritten Arabic text images drawn from the IFN/ENIT dataset. The results obtained are encouraging.
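
The abstract's point about overlapping sub-words can be illustrated with a small sketch. This is not the authors' method: the record does not describe their graph model, so the code below assumes a plain breadth-first traversal over 8-connected foreground pixels, and the function names and the toy image are hypothetical. It shows two strokes that occupy a common column, so the vertical projection has no empty column to cut at, while connected-component extraction still separates them.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground components of a binary image.

    `image` is a list of rows containing 0 (background) or 1 (ink).
    Pixels are treated as graph nodes, with edges between neighbouring
    ink pixels (an assumption; the paper's graph model is not given here).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first traversal from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def vertical_projection(image):
    """Count ink pixels in each column."""
    return [sum(column) for column in zip(*image)]


def projection_gaps(image):
    """Columns with no ink, i.e. where a projection-based cut is possible."""
    return [x for x, count in enumerate(vertical_projection(image)) if count == 0]


# Hypothetical toy image: two strokes that overlap in column 2 without touching.
toy = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]

components = connected_components(toy)
gaps = projection_gaps(toy)
```

In this toy case `gaps` is empty, so a projection-based splitter would treat the image as one piece, whereas `components` holds the two strokes separately. Real word images would additionally need the component analysis and diacritic handling that the abstract mentions.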

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: feature extraction;graph theory;handwriting recognition;handwritten character recognition;image segmentation;natural language processing;text analysis;Arabic script;diacritics;offline handwritten Arabic text;word image;candidate segmentation points;handwritten images;Image segmentation;Shape;Hidden Markov models;Handwriting recognition;Character recognition;Guidelines;Histograms;Arabic sub-word segmentation;Overlapping Arabic sub-words;Arabic Character Segmentation;Handwritten Arabic text recognition
Subjects: D Physical Science > Computer Science
Divisions: Department of > Computer Science
Depositing User: C Swapna Library Assistant
Date Deposited: 08 Jul 2019 09:33
Last Modified: 08 Jul 2019 09:33
URI: http://eprints.uni-mysore.ac.in/id/eprint/4882
