Letters
Handwritten digit recognition using biologically inspired features
Introduction
Handwritten digit recognition, despite being a well-studied problem, is still an active research topic. It is relevant for tasks such as postal mail sorting and form data processing. Several works have addressed the problem from a feature extraction or classification perspective. In this text we analyze the application of the map transformation cascade (MTC) [1] to this task, where it works as a feature extractor combined with a classifier. MTC is a model for visual recognition in which simple and complex cells are arranged in a hierarchy, as proposed by Hubel and Wiesel for the visual cortex [2] and incorporated in several models such as Neocognitron [3] and HMAX [4]. In [1] the relation between MTC and Neocognitron was established and the two were compared using a nearest neighbor classifier. In this text we discuss how MTC relates to HMAX [4] and how it compares with other pattern recognition methods on two popular datasets of handwritten digits using a linear classifier. A combination of HMAX's features and a classifier has been shown to achieve good results on object recognition [5].
In the next section we give a short overview of biological vision and computational models for visual recognition. Afterwards we describe MTC and evaluate its performance on handwritten digit recognition using the USPS and MNIST datasets. We analyze how the performance of the approach is affected by the number of training samples and then measure the error rate on the entire dataset.
Section snippets
Related work
The classical hypothesis of Hubel and Wiesel [6] has been transposed into several computational models for visual recognition. The key idea is that two kinds of cells are arranged in layers: simple cells are selective for a particular stimulus and for the position of that stimulus in the visual field, whereas complex cells are also selective for a particular stimulus but less selective for its position in the visual field. These two types of cells are then arranged in a hierarchy where the cells'
Map transformation cascade
In this section we describe MTC, which was previously proposed in [1]. The model was proposed to retain the functional principles of Neocognitron in a computationally simpler way. MTC is composed of two types of cells arranged hierarchically. Simple cells are responsible for selectivity by reacting to a particular stimulus. Complex cells are responsible for invariance to the position of the stimulus. The cells are arranged in layers, each containing a single cell type. Layers are arranged in ordered
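The division of labour between the two cell types can be illustrated with a minimal sketch. It is a generic simple-cell/complex-cell stage written for illustration, not the MTC layers as defined in [1]: the exact-match template test, the OR-pooling over non-overlapping blocks, and all names (`simple_layer`, `complex_layer`, `shift_right`, the toy `image` and `template`) are assumptions.

```python
# Illustrative sketch only: a generic simple-cell / complex-cell stage,
# not the MTC layers as defined in [1]. All names are hypothetical.

def simple_layer(image, template):
    """Simple cells: respond with 1 where the local patch equals the template."""
    th, tw = len(template), len(template[0])
    out = []
    for r in range(len(image) - th + 1):
        row = []
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c:c + tw] for i in range(th)]
            row.append(1 if patch == template else 0)
        out.append(row)
    return out


def complex_layer(responses, pool):
    """Complex cells: OR (max) over non-overlapping pool x pool blocks."""
    out = []
    for r in range(0, len(responses), pool):
        row = []
        for c in range(0, len(responses[0]), pool):
            block = [v for line in responses[r:r + pool] for v in line[c:c + pool]]
            row.append(max(block))
        out.append(row)
    return out


def shift_right(image):
    """Shift the image one pixel to the right, padding with zeros."""
    return [[0] + row[:-1] for row in image]


# Toy 5x5 binary image containing one corner-shaped stimulus.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 1], [1, 0]]
shifted = shift_right(image)

simple_a = simple_layer(image, template)
simple_b = simple_layer(shifted, template)
complex_a = complex_layer(simple_a, pool=2)
complex_b = complex_layer(simple_b, pool=2)

print(simple_a == simple_b)    # simple responses depend on position
print(complex_a == complex_b)  # pooled responses tolerate the small shift
```

In this toy case the simple-cell map changes when the stimulus moves by one pixel, while the pooled complex-cell map stays the same as long as the stimulus remains inside the same pooling block.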
Experiments
In the experiments we evaluate the performance of MTC combined with a linear SVM.
An SVM, as originally proposed, solves a binary classification problem [36]. For the multi-class problem we used the ‘one-against-one’ approach [37], [38]. We therefore solve a binary classification problem for every combination of two classes, training k(k−1)/2 binary classifiers for k classes. The outputs of the binary classifiers are then combined by voting [39]. Another possible approach is the ‘one-against-all’, for a comparison
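The one-against-one scheme with voting can be sketched as follows. This is a minimal illustration, not the experimental setup of the paper: the binary learner is a nearest-class-mean stand-in rather than a linear SVM, the tie-breaking rule is an arbitrary choice, and all names (`train_centroid_pair`, `train_one_against_one`, `predict`, the toy `data`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def train_centroid_pair(samples_a, samples_b):
    """Stand-in binary classifier (NOT an SVM): nearest class mean."""
    def mean(samples):
        n = len(samples)
        return [sum(col) / n for col in zip(*samples)]

    mean_a, mean_b = mean(samples_a), mean(samples_b)

    def decide(x):
        dist_a = sum((xi - mi) ** 2 for xi, mi in zip(x, mean_a))
        dist_b = sum((xi - mi) ** 2 for xi, mi in zip(x, mean_b))
        return dist_a <= dist_b  # True means "first class of the pair"

    return decide


def train_one_against_one(data):
    """Train one binary classifier per unordered pair of class labels.

    `data` maps each class label to a list of feature vectors.
    """
    return {
        (a, b): train_centroid_pair(data[a], data[b])
        for a, b in combinations(sorted(data), 2)
    }


def predict(classifiers, x):
    """Each pairwise classifier casts one vote; the most voted label wins."""
    votes = Counter(a if decide(x) else b for (a, b), decide in classifiers.items())
    best = max(votes.values())
    # Ties are broken by the smallest label; this is an arbitrary choice.
    return min(label for label, count in votes.items() if count == best)


# Toy three-class problem with two-dimensional features.
data = {
    0: [[0.0, 0.0], [0.0, 1.0]],
    1: [[5.0, 5.0], [5.0, 6.0]],
    2: [[10.0, 0.0], [10.0, 1.0]],
}
classifiers = train_one_against_one(data)
print(len(classifiers))                  # number of pairwise classifiers
print(predict(classifiers, [4.5, 5.0]))  # label chosen by majority vote
```

With three classes the scheme trains three pairwise classifiers; for the ten digit classes it would train 10 · 9 / 2 = 45.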
Conclusion
We evaluated the combination of MTC with a linear classifier. MTC showed good generalization for a small number of training examples. The combination of MTC and a linear SVM achieved competitive error rates on both the USPS (2.64%) and MNIST (0.71%) datasets. MTC greatly improves the results relative to using a deep belief network with a linear SVM [48]. It is also interesting that in [27] quasi-binary codes are unsuitable for classification, while the MTC binary codes can be used for classification
Acknowledgments
The authors would like to thank João Sacramento for many helpful comments. This work was supported by Fundação para a Ciência e Tecnologia (INESC-ID multiannual funding) through the PIDDAC Program funds and through an individual doctoral grant awarded to the first author (Contract SFRH/BD/61513/2009).
Ângelo Cardoso received an MSc in Information Systems and Computer Engineering from the Instituto Superior Técnico (IST), Technical University of Lisbon (TU-Lisbon) in 2007. Since 2009 he has been a PhD student at IST, TU-Lisbon and INESC-ID. His PhD work is supported by an individual scholarship from Fundação para a Ciência e Tecnologia (FCT). He is currently working on biological models for object recognition and machine learning.
References (48)
- et al., Neocognitron and the map transformation cascade, Neural Networks (2010)
- et al., Shape representation in the inferior temporal cortex of monkeys, Curr. Biol. (1995)
- Neocognitron: a hierarchical neural network capable of visual pattern recognition, Neural Networks (1988)
- Increasing robustness against background noise: visual pattern recognition by a neocognitron, Neural Networks (2011)
- Neocognitron for handwritten digit recognition, Neurocomputing (2003)
- et al., Handwritten digit recognition: benchmarking of state-of-the-art techniques, Pattern Recognition (2003)
- et al., Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat, J. Neurophysiol. (1965)
- Neocognitron: a self organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biol. Cybern. (1980)
- et al., Hierarchical models of object recognition in cortex, Nat. Neurosci. (1999)
- et al., A feedforward architecture accounts for rapid categorization, Proc. Natl. Acad. Sci. U.S.A. (2007)
- Eye, Brain, and Vision
- Uniformity of monkey striate cortex: a parallel relationship between field size, scatter, and magnification factor, J. Comp. Neurol.
- Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex, J. Neurophysiol.
- Psychophysical support for a two-dimensional view interpolation theory of object recognition, Proc. Natl. Acad. Sci. U.S.A.
- View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex, Cerebral Cortex
- Sequence regularity and geometry of orientation columns in the monkey striate cortex, J. Comp. Neurol.
- Lateral inhibition between orientation detectors in the cat's visual cortex, Exp. Brain Res.
- Cognitron: a self-organizing multilayered neural network, Biol. Cybern.
- Computational Maps in the Visual Cortex
- Gradient-based learning applied to document recognition, Proc. IEEE
Andreas Wichert studied computer science at the University of Saarland, where he graduated in 1993. Afterwards, he became a PhD student at the Department of Neural Information Processing, University of Ulm. He received a PhD in computer science in 2000. He then worked in the field of fMRI as a researcher with an interdisciplinary group at the Department of Psychiatry III, Ulm, before moving to F&K Delvotec bonding machines, where he led the development of a diagnostic expert system. From 2004 to 2005 he was the scientific director of the MITI Research Group, Klinikum rechts der Isar of the Technical University Munich. Recently he joined the Faculdade de Ciências da Universidade de Lisboa Departamento de Informática and Departamento de Informática, Universidade Técnica de Lisboa (DEI-IST).