Published in: Neural Computing and Applications 7/2020

01.01.2019 | Original Article

A GA based hierarchical feature selection approach for handwritten word recognition

Authors: Samir Malakar, Manosij Ghosh, Showmik Bhowmik, Ram Sarkar, Mita Nasipuri

Abstract

Feature selection plays a key role in reducing the dimensionality of a feature vector by discarding redundant and irrelevant features. In this paper, a Genetic Algorithm (GA)-based hierarchical feature selection (HFS) model is designed to optimize the local and global features extracted from each handwritten word image under consideration. Two recently developed feature descriptors, based on the shape and texture of the word images, are taken into account. Experiments are conducted on an in-house dataset of 12,000 handwritten word samples written in Bangla script. This database comprises the names of 80 popular cities of West Bengal, a state of India. The proposed model not only reduces the feature dimension by nearly 28%, but also improves the performance of the handwritten word recognition (HWR) technique by 1.28% over the recognition performance obtained with the unreduced feature set. Moreover, the proposed HFS-based HWR system performs better than some recently developed methods on the present dataset.
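
To make the idea of GA-based hierarchical feature selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes one plausible two-level reading of "hierarchical": a GA first selects within the local and the global feature groups separately, and a second GA pass then selects over the concatenated survivors. The function names (`run_ga`, `score`, `hierarchical_select`), the GA settings, and the toy fitness (a stand-in for classifier accuracy that rewards a hypothetical set of "informative" feature indices and penalizes subset size) are all assumptions made for illustration.

```python
import random


def tournament(pop, fitness, rng):
    """Binary tournament: return the fitter of two random individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def run_ga(n_features, fitness, pop_size=20, generations=30,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Return the best binary feature mask found by a simple generational GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            if rng.random() < crossover_rate:
                # Uniform crossover: each gene comes from either parent.
                child = [g1 if rng.random() < 0.5 else g2
                         for g1, g2 in zip(p1, p2)]
            else:
                child = p1[:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


def score(selected, informative, penalty=0.1):
    """Toy fitness (assumption): reward informative features, penalize size."""
    return len(set(selected) & informative) - penalty * len(selected)


def hierarchical_select(n_local, n_global, informative, seed=0):
    """Two-level selection: per feature group first, then over the survivors."""
    groups = (list(range(n_local)),
              list(range(n_local, n_local + n_global)))
    survivors = []
    for level, group in enumerate(groups):
        # Level 1: run the GA inside one feature group only.
        def fit(mask, group=group):
            return score([f for f, bit in zip(group, mask) if bit], informative)
        mask = run_ga(len(group), fit, seed=seed + level)
        survivors += [f for f, bit in zip(group, mask) if bit]

    # Level 2: run the GA again over the concatenated survivors.
    def fit_all(mask):
        return score([f for f, bit in zip(survivors, mask) if bit], informative)
    mask = run_ga(len(survivors), fit_all, seed=seed + 2)
    return [f for f, bit in zip(survivors, mask) if bit]


if __name__ == "__main__":
    # 12 hypothetical local and 8 hypothetical global features.
    print(hierarchical_select(12, 8, informative={0, 3, 13, 17}, seed=1))
```

In a real HWR pipeline the toy `score` would be replaced by a measure such as validation accuracy of a classifier trained on the selected feature columns, which is far more expensive to evaluate than the stand-in used here.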

Metadata
Title
A GA based hierarchical feature selection approach for handwritten word recognition
Authors
Samir Malakar
Manosij Ghosh
Showmik Bhowmik
Ram Sarkar
Mita Nasipuri
Publication date
01.01.2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 7/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3937-8
