Visualizing CCITT Group 3 and Group 4 TIFF documents and transforming to run-length compressed format enabling direct processing in compressed domain

Javed, Mohammed and Krishnanand, S. H. and Nagabhushan, P. and Chaudhuri, B. B. (2016) Visualizing CCITT Group 3 and Group 4 TIFF documents and transforming to run-length compressed format enabling direct processing in compressed domain. Procedia Computer Science, 85. pp. 213-221.

Official URL: https://doi.org/10.1016/j.procs.2016.05.214

Abstract

Data compression can be seen as a way to mitigate the Big Data problem to a large extent, particularly its storage and transmission issues. For this reason, documents, images, audio and video are preferably archived and communicated in compressed form. However, any subsequent operation on the compressed data requires decompression, which demands additional computing resources. Developing novel techniques that operate on and analyze the contents of compressed data directly, without a decompression stage, is therefore a promising research problem. In this context, recent work in the Document Image Analysis (DIA) literature has reported direct processing of run-length compressed document data, specifically targeting CCITT Group 3 1-D documents. Since run-length data is the backbone of the more advanced CCITT compression schemes, namely CCITT Group 3 2-D (T.4) and CCITT Group 4 2-D (T.6), which are widely supported by the TIFF and PDF formats, this paper proposes to intelligently generate run-length data from T.4 and T.6 compressed data, and thus extend the idea of direct document processing in the Run-Length Compressed Domain (RLCD). The run-length data generated by the proposed algorithm is validated experimentally, and 100% correlation is reported on a data set of compressed documents. Finally, text segmentation and word spotting applications in RLCD are also demonstrated.
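
As an illustration of what "run-length data" means here, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a bilevel row stored as a list of 0 (white) and 1 (black) pixels, and follows the Group 3 1-D convention that every line starts with a white run, which may have length zero. The function names `row_to_runs`, `runs_to_row` and `black_pixel_count` are hypothetical and chosen only for this example.

```python
from typing import List


def row_to_runs(row: List[int]) -> List[int]:
    """Convert a bilevel row (0 = white, 1 = black) into alternating run lengths.

    The first run is always white; a row that starts with a black pixel
    therefore begins with a white run of length 0.
    """
    runs: List[int] = []
    current = 0  # assumed convention: every line starts with a white run
    count = 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current = pixel
            count = 1
    runs.append(count)
    return runs


def runs_to_row(runs: List[int]) -> List[int]:
    """Expand alternating white/black run lengths back into pixels."""
    row: List[int] = []
    color = 0  # first run is white
    for length in runs:
        row.extend([color] * length)
        color ^= 1  # toggle between white (0) and black (1)
    return row


def black_pixel_count(runs: List[int]) -> int:
    """Count black pixels from the runs alone, without expanding to pixels.

    Black runs sit at the odd positions of the alternating sequence.
    """
    return sum(runs[1::2])
```

The last function hints at why the run-length representation is attractive: a simple feature such as the number of black pixels in a row can be computed from the runs directly, without reconstructing the pixel data.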

Item Type: Article
Additional Information: International Conference on Computational Modelling and Security (CMS 2016)
Uncontrolled Keywords: Run-length compressed domain processing, Run-length data, Modified Huffman (MH), Modified Read (MR), Modified Modified Read (MMR)
Subjects: D Physical Science > Computer Science
Divisions: Department of > Computer Science
Depositing User: Manjula P Library Assistant
Date Deposited: 14 Jun 2019 09:43
Last Modified: 11 Dec 2019 09:11
URI: http://eprints.uni-mysore.ac.in/id/eprint/3096
