Aradhya, V. N. Manjunath and Hemantha Kumar, G. and Noushath, S. (2008) Multilingual OCR system for South Indian scripts and English documents: An approach based on Fourier transform and principal component analysis. Engineering Applications of Artificial Intelligence, 21 (4). 658 - 668. ISSN 0952-1976
Abstract
Character recognition lies at the core of the discipline of pattern recognition, where the aim is to represent a sequence of characters taken from an alphabet (Kasturi, R., O'Gorman, L., Govindaraju, V., 2002. Document image analysis: a primer. Sadhana 27 (Part 1), 3–22). Although many kinds of features have been developed and their test performance on standard databases has been reported, there is still room to improve the recognition rate by developing better features. In this paper, we present a multilingual character recognition system for printed documents in South Indian scripts (Kannada, Telugu, Tamil and Malayalam) and English. South Indian languages are among the most widely spoken languages in India and are also used around the world. The proposed multilingual character recognition system is based on the Fourier transform and principal component analysis (PCA), two classical feature extraction and data representation techniques that are widely used in image processing, pattern recognition and computer vision. Our experimental results show good performance on the data sets considered.
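The abstract names the two building blocks (Fourier transform and PCA) but does not describe how they are combined, and the full text is not available from this record. The following is therefore only a minimal sketch of one plausible arrangement, not the authors' method: Fourier magnitude spectra serve as features, PCA reduces their dimension, and a nearest-neighbour rule assigns the label. The function names, the log scaling, the nearest-neighbour classifier, and the synthetic "glyphs" (vertical versus horizontal strokes) are all assumptions made for illustration.

```python
import numpy as np

def fourier_features(images):
    """Flattened log-magnitude 2-D spectra for a stack of images (n, h, w).

    The magnitude spectrum is unchanged by circular shifts of the image,
    which is one common reason to use it as a character feature.
    """
    spectra = np.abs(np.fft.fft2(images, axes=(-2, -1)))
    return np.log1p(spectra).reshape(len(images), -1)

def fit_pca(features, n_components):
    """Return the feature mean and the top principal directions (via SVD)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(features, mean, components):
    """Project centred features onto the principal directions."""
    return (features - mean) @ components.T

def classify_nearest(train_proj, train_labels, test_proj):
    """Assumed classifier: label of the closest training sample (Euclidean)."""
    diffs = test_proj[:, None, :] - train_proj[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return train_labels[np.argmin(dists, axis=1)]

def make_glyphs(rng, n_per_class, size=16):
    """Hypothetical toy data: class 0 = vertical stroke, class 1 = horizontal."""
    images, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            img = np.zeros((size, size))
            pos = rng.integers(0, size)   # random stroke position
            if label == 0:
                img[:, pos] = 1.0
            else:
                img[pos, :] = 1.0
            img += 0.05 * rng.normal(size=img.shape)  # mild pixel noise
            images.append(img)
            labels.append(label)
    return np.stack(images), np.array(labels)

def run_demo(seed=0):
    """Train on toy glyphs, then classify a fresh toy test set."""
    rng = np.random.default_rng(seed)
    train_x, train_y = make_glyphs(rng, 20)
    test_x, test_y = make_glyphs(rng, 10)
    train_f = fourier_features(train_x)
    mean, comps = fit_pca(train_f, n_components=2)
    train_p = project(train_f, mean, comps)
    test_p = project(fourier_features(test_x), mean, comps)
    return classify_nearest(train_p, train_y, test_p), test_y

if __name__ == "__main__":
    predicted, actual = run_demo()
    print("accuracy:", np.mean(predicted == actual))
```

In this sketch the order of operations (Fourier features first, PCA second) is a design assumption; a real system for printed Kannada, Telugu, Tamil, Malayalam and English text would also need segmentation and size normalisation of character images before this stage.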
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Document analysis, Multi-lingual character recognition, South Indian languages, Fourier transform, Principal component analysis (PCA) |
Subjects: | D Physical Science > Computer Science |
Divisions: | Department of > Computer Science |
Depositing User: | Manjula P Library Assistant |
Date Deposited: | 22 Aug 2019 07:09 |
Last Modified: | 22 Aug 2019 07:09 |
URI: | http://eprints.uni-mysore.ac.in/id/eprint/6881 |