Foreground text segmentation in complex color document images using Gabor filters

Nirmala, S. and Nagabhushan, P. (2012) Foreground text segmentation in complex color document images using Gabor filters. Signal, Image and Video Processing, 6 (4). pp. 669-678. ISSN 1863-1711

Full text: Foreground text segmentation in complex color document images using Gabor filters.pdf (Published Version, 1MB). Restricted to registered users only.
Official URL: https://doi.org/10.1007/s11760-010-0196-2

Abstract

Extracting foreground content from document images with complex backgrounds is very difficult because the background texture and color and the foreground font, size, color, and tilt are not known in advance. In this work, we propose an RGB color model for the input of complex color document images. We also develop an algorithm that detects text regions using Gabor filters and then extracts the text using the color feature luminance. The proposed approach consists of three stages. In stage 1, candidate image segments containing text are detected based on Gabor features. Because of the complex background, some high-frequency non-text objects in the background are also detected as text objects in this stage. In stage 2, some of these false text objects are discarded through connected component analysis. In stage 3, the image segments containing textual information obtained from the previous stage are binarized to extract the foreground text. The color feature luminance is extracted from the input color document image, and the threshold value is derived automatically from this feature. The proposed approach handles both printed and handwritten color document images with foreground text in any color, font, size and orientation. For experimental evaluation, we considered a variety of document images with uniform or non-uniform textured and multicolored backgrounds. The performance of foreground text segmentation is evaluated using a commercially available OCR system. The results show better recognition accuracy for foreground characters in the processed document images than in the unprocessed ones.
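
The three stages named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract gives no filter parameters, component-filtering rules, or threshold formula. The function names, the Gabor parameters, the ITU-R BT.601 luminance weights, and the use of Otsu's method as the automatic threshold are all assumptions made here for illustration only.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Stage 1 building block: real (even) part of a 2-D Gabor filter.
    All parameter values are illustrative assumptions."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates to the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def connected_components(mask):
    """Stage 2 building block: 4-connected regions of truthy cells,
    returned as lists of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            stack, pixels = [(sy, sx)], []
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            components.append(pixels)
    return components

def luminance(rgb):
    """Stage 3 feature: luminance of an (R, G, B) pixel.
    BT.601 weights are an assumption; the abstract does not give a formula."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def otsu_threshold(values):
    """Stage 3 threshold: Otsu's method on values in 0..255, used here
    as a stand-in for the paper's unspecified automatic threshold."""
    hist = [0] * 256
    for v in values:
        hist[min(255, max(0, int(round(v))))] += 1
    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0
    sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        # Between-class variance (up to a constant factor) for a split at t.
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

In such a sketch, a bank of kernels at several orientations would be convolved with the image to flag candidate text regions, components below a hypothetical minimum size would be discarded, and pixels inside the surviving segments would be binarized by comparing `luminance(pixel)` against `otsu_threshold(...)` computed over that segment.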

Item Type: Article
Uncontrolled Keywords: Color document image, Gabor filters, Connected component analysis, Color feature, Thresholding, Segmentation of foreground text, OCR
Subjects: D Physical Science > Computer Science
Divisions: Department of > Computer Science
Depositing User: C Swapna Library Assistant
Date Deposited: 15 Jul 2019 10:19
Last Modified: 15 Jul 2019 10:19
URI: http://eprints.uni-mysore.ac.in/id/eprint/5205
