
About this book

This book constitutes the refereed proceedings of the 4th Workshop on Document Analysis and Recognition, DAR 2018, held in conjunction with ICVGIP 2018 in Hyderabad, India, in December 2018. The 12 revised full papers and 2 short papers presented were carefully reviewed and selected from 22 submissions. The papers are organized in topical sections: document layout analysis and understanding; handwriting recognition and symbol spotting; character and word segmentation; handwriting analysis; datasets and performance evaluation.



Document Layout Analysis and Understanding


MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System

The content of document images often has a hierarchical, multi-layered tree structure. Further, algorithms for document image applications such as line detection, paragraph detection, word recognition, and layout analysis require pixel-level annotation. In this paper, a Multi-layered Document Image Annotation System (MultiDIAS) is introduced. The proposed system provides a single platform for both hierarchical and pixel-level annotation of documents. MultiDIAS labels the document image in four hierarchical layers (layout type, entity type, line type, word type) assigned by the user. The output consists of four ground-truth images and an XML file representing the metadata. MultiDIAS is tested on a complex handwritten manuscript written by renowned film director Satyajit Ray for the movie ‘Goopi Gyne Bagha Byne’. The annotated data generated using MultiDIAS can further be used in a wide range of document image understanding and analysis applications.
Arnab Poddar, Rohan Mukherjee, Jayanta Mukhopadhyay, Prabir Kumar Biswas

Attributed Paths for Layout-Based Document Retrieval

A document is rich in its layout. The entities of interest can be scattered over the document page. Traditional layout matching has involved modeling layout structure as grids, graphs, and spatial histograms of patches. In this paper, we propose a new way of representing layout, which we call attributed paths. This representation admits a match measure based on string edit distance. Our experiments show that layout-based retrieval using attributed paths is computationally efficient and more effective. It also offers flexibility in tuning the match criterion. We demonstrate the effectiveness of attributed paths in layout-based retrieval tasks on datasets of floor plan images [14] and journal pages [1].
Divya Sharma, Gaurav Harit, Chiranjoy Chattopadhyay
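
The abstract above says that attributed paths admit a match measure based on string edit distance. The sketch below shows only the generic edit-distance recurrence over sequences of symbols. How the authors encode a layout as a path, and which costs they use, is not stated here, so the token labels, the `sub_cost` hook, and the `indel_cost` parameter are illustrative assumptions rather than the paper's method.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def edit_distance(
    a: Sequence[T],
    b: Sequence[T],
    sub_cost: Optional[Callable[[T, T], float]] = None,
    indel_cost: float = 1.0,
) -> float:
    """Weighted edit distance between two symbol sequences.

    sub_cost(x, y) is a hypothetical hook for tuning the match criterion;
    by default a substitution costs 0 for equal symbols and 1 otherwise.
    """
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0

    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + indel_cost,          # delete x
                    curr[j - 1] + indel_cost,      # insert y
                    prev[j - 1] + sub_cost(x, y),  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


# Two made-up paths through a floor plan, written as label sequences.
path_a = ["room", "door", "corridor", "door", "room"]
path_b = ["room", "door", "corridor", "window", "room"]
print(edit_distance(path_a, path_b))
```

Passing a different `sub_cost` (for example, one that treats "door" and "window" as near matches) is one way such a measure could be tuned, in the spirit of the flexibility the abstract mentions.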

Textual Content Retrieval from Filled-in Form Images

Form processing refers to the extraction of information from filled-in forms. In this work, we address three crucial challenges of a form processing system, namely touching component separation, text/non-text separation, and handwritten/printed text separation. The proposed method is evaluated on a database of 50 filled-in forms written in Bangla, collected during an essay competition in a school. The experimental results are promising.
Soulib Ghosh, Rajdeep Bhattacharya, Sandipan Majhi, Showmik Bhowmik, Samir Malakar, Ram Sarkar

Handwriting Recognition and Symbol Spotting


A Study on the Effect of CNN-Based Transfer Learning on Handwritten Indic and Mixed Numeral Recognition

Filling in forms at post offices, at railway counters, and for job applications has become routine, especially in a developing country like India. Research on automating the recognition of such handwritten forms has therefore become essential, all the more so in a multilingual country like India. In the present work, we use readily available pre-trained Convolutional Neural Network (CNN) architectures on four different Indic scripts, viz. Bangla, Devanagari, Oriya, and Telugu, to achieve a satisfactory recognition rate for handwritten Indic numerals. Furthermore, we have mixed Bangla and Oriya numerals and applied transfer learning for recognition. The main objective of this study is to understand how well a CNN model trained on an entirely different dataset (of natural images) works for small and unrelated datasets. As a practical application, we have applied the proposed approach to recognize Bangla handwritten pin codes after their extraction from postal letters.
Rahul Pramanik, Prabhat Dansena, Soumen Bag

Online Handwritten Bangla Character Recognition Using Frechet Distance and Distance Based Features

This paper examines the impact of a feature vector produced by Frechet Distance (FD), used along with conventional distance-based features, on the recognition of online handwritten Bangla characters. FD-based feature computation starts by dividing a character sample into rectangular zones. FD values are then computed from each zone to every other zone. In the distance-based feature extraction technique, a character sample is likewise divided into several segments, and distances are measured from a particular segment to all other segments. The resulting feature vectors are evaluated on 10000 online handwritten Bangla characters. An SVM (Support Vector Machine) classifier achieves a recognition accuracy of 98.98% when FD-based features are combined with distance-based features.
Shibaprasad Sen, Jewel Chakraborty, Snehanjan Chatterjee, Rohit Mitra, Ram Sarkar, Kaushik Roy
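
To make the Frechet Distance in the abstract above concrete, here is a minimal sketch of the discrete Fréchet distance between two point sequences, computed by dynamic programming. It assumes the discrete variant and plain Euclidean point distances. How pen points are grouped into zones, and which variant the authors use, is not specified in the abstract, so the function name and inputs are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def discrete_frechet(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Discrete Fréchet distance between two non-empty point sequences."""
    if not p or not q:
        raise ValueError("both point sequences must be non-empty")

    n, m = len(p), len(q)
    # ca[i][j] is the coupling distance between p[:i + 1] and q[:j + 1].
    ca: List[List[float]] = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both, whichever is cheapest.
                ca[i][j] = max(
                    min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d
                )
    return ca[n - 1][m - 1]


# Two parallel strokes one unit apart (made-up coordinates).
stroke_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
stroke_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(stroke_a, stroke_b))
```

Under this reading, a zone-to-zone feature vector would collect one such value per pair of zones, with each zone contributing the points that fall inside it.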

An Efficient Multi Lingual Optical Character Recognition System for Indian Languages Through Use of Bharati Script

Optical character recognition (OCR) plays a critical part in interpreting videos and documents. Document-specific issues such as low image quality, distortions, composite backgrounds, and noise, together with language-specific issues such as cursive connectivity among characters, make OCR challenging and error prone for Indian languages. The language-specific challenges can be overcome, and better accuracy achieved, by computing script-based features. However, computing script-based invariant features and patterns is computationally complex and error prone. Against this background, we put forward an OCR system based on Bharathi script (www.bharatiscript.com), in which the inherent drawbacks of Indian scripts, i.e. Hindi, Tamil, Telugu etc., are eliminated. The proposed OCR model has been tested on a synthetic dataset of documents in Bharathi script (in which Hindi scripts are converted to Bharathi script). Thorough experimental analysis with varied levels of noise confirms the promising character recognition accuracy of the proposed OCR model, which outperforms state-of-the-art OCR systems for Indian scripts. The proposed model achieves an accuracy of 76.70% on test documents with 50% noise and 99.98% on test documents with 0% noise.
Chandra Sekhar Vorugunti, Srinivasa Chakravarthy, Viswanath Pulabaigari

Character and Word Segmentation


Telugu Word Segmentation Using Fringe Maps

In this paper, we propose a word segmentation method for Telugu script that is based on fringe maps. Our objective is to create a dataset of word images that can be used directly for training recognizers. The standard methods employed for word segmentation in Telugu OCR systems are projection profiles and run-length smearing. However, those methods have their limitations. In this work, a different application of fringe maps is shown: segmenting lines into words. Fringes were previously applied successfully for classification and line segmentation. Telugu script has consonant modifiers that are usually placed below or below-right of the base consonants. This orthographic property leads to characters that may touch each other. One way to deal with touching characters is to use segmentation-free methods, which do not need prior segmentation of word images into characters or connected components. The novelty of our method is that we analyze fringe maps of document images to find an appropriate fringe value threshold and apply it to word segmentation of Telugu documents. Encouraging results are observed with our fringe-value-threshold-based word segmentation. We observe that choosing higher threshold fringe values leads to under-segmentation of words, whereas lower values cause over-segmentation. Our word segmentation approach compares favorably with the widely used word segmentation method based on projection profiles.
Koteswara Rao Devarapalli, Atul Negi
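
For context, the projection-profile baseline that the abstract above compares against can be sketched in a few lines. This is not the fringe-map method of the paper. It is a minimal illustration of the standard idea, assuming a binarized text line given as rows of 0/1 values and a hypothetical `min_gap` parameter (in pixels) that decides when a blank column run separates two words.

```python
from typing import List, Tuple


def segment_words(line_image: List[List[int]], min_gap: int) -> List[Tuple[int, int]]:
    """Split a binarized text line into words using a vertical projection profile.

    line_image: rows of 0/1 pixels, where 1 means ink.
    min_gap: blank runs narrower than this many columns are treated as
             gaps inside a word rather than gaps between words.
    Returns (start_col, end_col) pairs with end_col exclusive.
    """
    if not line_image:
        return []
    width = len(line_image[0])

    # Ink count per column: the vertical projection profile.
    profile = [sum(row[c] for row in line_image) for c in range(width)]

    # Collect maximal runs of columns that contain ink.
    runs: List[Tuple[int, int]] = []
    start = None
    for c, count in enumerate(profile):
        if count > 0 and start is None:
            start = c
        elif count == 0 and start is not None:
            runs.append((start, c))
            start = None
    if start is not None:
        runs.append((start, width))

    # Merge runs whose separating gap is narrower than min_gap.
    words: List[Tuple[int, int]] = []
    for run in runs:
        if words and run[0] - words[-1][1] < min_gap:
            words[-1] = (words[-1][0], run[1])
        else:
            words.append(run)
    return words


# A one-row toy line: ink at columns 0-1, 3, and 7-8.
toy_line = [[1, 1, 0, 1, 0, 0, 0, 1, 1, 0]]
for gap in (1, 2, 4):
    print(gap, segment_words(toy_line, gap))
```

In this toy example a small `min_gap` splits the line into three pieces and a large one merges everything into a single piece, which mirrors the over- and under-segmentation trade-off the abstract reports for its own threshold.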

An Efficient Character Segmentation Algorithm for Connected Handwritten Documents

This paper proposes an efficient method of character segmentation for handwritten text. The main challenges in character segmentation of handwritten text are the varied size of each letter in different documents, connected letters within a word in cursive writing, and the presence of ligatures within an open character. Hence, this paper proposes an adaptive vertical pixel count algorithm to solve the problem of over-segmentation due to the presence of open characters such as ‘w’, ‘v’ and ‘m’. The proposed algorithm works effectively on both handwritten and standard text. The proposed method is evaluated on the IAM dataset and a self-created dataset.
Vishal Rajput, N. Jayanthi, S. Indu

Handwriting Analysis


A Deep Learning Architecture Based Dimensionality Reduction and Online Signature Verification

In this paper, we propose a novel hybrid deep learning based autoencoder-CNN-Softmax architecture that aims to obtain a reduced-dimension feature set from the raw feature set. The reduced feature set forms the input to CNN layers that learn deep global features. These global features are used to train the Softmax layer for online signature classification. The ability to reduce noisy features and to discover hidden correlated features makes the proposed architecture lightweight and efficient for use in critical applications like online signature verification (OSV) and for deployment on resource-constrained mobile devices. We demonstrate the superiority of our model for feature correlation learning and signature classification by conducting experiments on the standard datasets MCYT and SUSIG. The experiments confirm that the proposed model achieves better accuracy (lower error rates) with fewer features than the current state-of-the-art models. The proposed models yield state-of-the-art performance of 0.4% EER on the MCYT-100 dataset and 3.47% on the SUSIG dataset.
Chandra Sekhar Vorugunti, Viswanath Pulabaigari
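
The EER figures quoted in the abstract above refer to the equal error rate, the operating point at which the false acceptance rate (FAR) and false rejection rate (FRR) coincide. The sketch below shows one common way to estimate it from verification scores. It assumes that a higher score means "more likely genuine" and that a signature is accepted when its score is at or above the threshold; the scores are made up, and nothing here reflects the authors' evaluation protocol.

```python
from typing import Sequence, Tuple


def far_frr(
    genuine: Sequence[float], impostor: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(
    genuine: Sequence[float], impostor: Sequence[float]
) -> Tuple[float, float]:
    """Estimate the EER by scanning every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that brings the two rates closest together.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up similarity scores for genuine and forged signatures.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finite score sets FAR and FRR rarely cross exactly, so averaging the two rates at the closest threshold is a common approximation; interpolating between neighbouring thresholds is an alternative.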

Word-Wise Handwriting Based Gender Identification Using Multi-Gabor Response Fusion

Handwriting-based gender identification at the word level is challenging due to free-style writing, the use of different scripts, and inadequate information. This paper presents a new method based on Multi-Gabor Response (MGR) fusion for gender identification at the word level. It first explores weighted-gradient features for word segmentation from text line images. For each word, the proposed method obtains eight Gabor response images. It then performs a sliding window operation over the MGR images to smooth the values. For each smoothed MGR image, we perform a fusion operation that chooses the Gabor response value which contributes to the highest peak in the histogram. This process results in a feature matrix, which is fed to a CNN for gender identification. Experimental results on our multi-script dataset, which covers scripts other than English, and on the benchmark databases IAM, KHATT, and QUWI, which contain handwritten English and Arabic text, show that the proposed method outperforms the existing methods.
Maryam Asadzadeh Kaljahi, P. V. Vidya Varshini, Palaiahnakote Shivakumara, Umapada Pal, Tong Lu, D. S. Guru

A Secure and Light Weight User Authentication System Based on Online Signature Verification for Resource Constrained Mobile Networks

Rapid advances in mobile and networking technologies have led to the use of mobile devices for critical applications like m-commerce and m-payments. Even though mobile-based services offer many benefits, authenticating the user logging into the system is a big challenge. To mitigate this concern, secure mobile applications based on online signature verification (OSV) have been proposed. Unfortunately, these models impose substantial computational overhead on thin, resource-constrained mobile devices. This creates a critical need for OSV models that are computationally efficient and achieve higher classification accuracy. Recently, several OSV models have been defined in the literature. However, these models are not computationally efficient enough for resource-constrained mobile devices, because they require not only a higher feature dimension but also heavyweight writer-specific parameter fixation logic. In this manuscript, we propose an efficient and lightweight OSV model for resource-constrained mobile devices. Our approach employs dimensionality reduction based on the DBSCAN clustering technique and user-specific parameter selection. Thorough experimental analyses are conducted on the benchmark online signature datasets MCYT-100 (DB1) and MCYT-330 (DB2), which confirm the efficiency of the proposed model compared with the latest OSV models.
Chandra Sekhar Vorugunti, D. S. Guru, Viswanath Pulabaigari

Datasets and Performance Evaluation


Benchmark Datasets for Offline Handwritten Gurmukhi Script Recognition

Handwritten character recognition is an important problem in pattern recognition and machine learning research. In recent years, several techniques for handwritten character recognition have been proposed. Due to the lack of publicly accessible benchmark datasets for Gurmukhi script, no extensive comparisons have been undertaken between those techniques for this script. Over the years, datasets and benchmarks have proven fundamentally important for character recognition research and for objective comparisons in many fields. This paper presents a collection of seven benchmark datasets (HWR-Gurmukhi_1.1, HWR-Gurmukhi_1.2, HWR-Gurmukhi_1.3, HWR-Gurmukhi_2.1, HWR-Gurmukhi_2.2, HWR-Gurmukhi_2.3, and HWR-Gurmukhi_3.1) of different sizes for offline handwritten Gurmukhi character recognition, collected from various public places. A few exploratory results based on precision, False Acceptance Rate (FAR), and False Rejection Rate (FRR) using different classification techniques, namely k-NN, RBF-SVM, MLP, Neural Network, Decision Tree, and Random Forest, are also presented in this paper.
Munish Kumar, R. K. Sharma, M. K. Jindal, Simpel Rani Jindal, Harjeet Singh

Benchmark Dataset: Offline Handwritten Gurmukhi City Names for Postal Automation

Handwriting recognition refers to a computer’s ability to convert human handwriting into text that can be processed by a machine. Postal automation plays a significant role in the field of image processing and pattern recognition, and handwritten city name recognition is a part of postal automation. A standardized dataset is useful for assessing the performance of existing techniques for handwritten city name recognition. However, due to the lack of a publicly accessible benchmark dataset in Gurmukhi script, a systematic comparison of the existing techniques for Gurmukhi city name recognition is not feasible. In this paper, we present a dataset for Gurmukhi postal automation, named HWR-Gurmukhi_Postal_1.0, which contains a total of 40,000 samples of names of various cities written in Gurmukhi script. This dataset can serve as a benchmark for comparing existing techniques for handwritten city name recognition.
Harmandeep Kaur, Munish Kumar

