Published in: International Journal of Multimedia Information Retrieval, Issue 3/2017

29 July 2017 | Trends and Surveys

Script identification algorithms: a survey

Authors: Parul Sahare, Sanjay B. Dhok



Abstract

Script identification is a widely accepted technique for selecting the script-specific OCR (Optical Character Recognition) engine in multilingual document images. Extensive research has been done in this field, but it still suffers from low identification accuracy. This is due to faded document images and to variations in illumination and position during scanning. Noise is also a major obstacle in the script identification process; it can be minimized to some extent but cannot be removed completely. In this paper, an attempt is made to analyze and classify various script identification schemes for document images. These schemes are also compared, and their merits and demerits are discussed on a common platform. This will help researchers understand the complexity of the issue and identify possible directions for research in this field.

Literatur
3.
Zurück zum Zitat Sharma N, Pal U, Blumenstein M (2014) A study on word-level multi-script identification from video frames. In: International joint conference on neural networks, Beijing, pp 1827–1833. doi:10.1109/IJCNN.2014.6889906 Sharma N, Pal U, Blumenstein M (2014) A study on word-level multi-script identification from video frames. In: International joint conference on neural networks, Beijing, pp 1827–1833. doi:10.​1109/​IJCNN.​2014.​6889906
10.
15.
16.
Zurück zum Zitat Singh PK, Dalal SK, Sarkar R, Nasipuri M (2015) Page-level script identification from multi- script handwritten documents. In: 3rd international conference on computer, communication, control and information technology, Hooghly, pp 1–6. doi:10.1109/C3IT.2015.7060113 Singh PK, Dalal SK, Sarkar R, Nasipuri M (2015) Page-level script identification from multi- script handwritten documents. In: 3rd international conference on computer, communication, control and information technology, Hooghly, pp 1–6. doi:10.​1109/​C3IT.​2015.​7060113
17.
Zurück zum Zitat Benjelil M, Kanoun S, Mullot R, Alimi AM (2009) Arabic and Latin script identification in printed and handwritten types based on steerable pyramid features. In: 10th international conference on document analysis and recognition, Barcelona, pp 591–595. doi:10.1109/ICDAR.2009.287 Benjelil M, Kanoun S, Mullot R, Alimi AM (2009) Arabic and Latin script identification in printed and handwritten types based on steerable pyramid features. In: 10th international conference on document analysis and recognition, Barcelona, pp 591–595. doi:10.​1109/​ICDAR.​2009.​287
18.
Zurück zum Zitat Zhou L, Ping XJ, Zheng EG, Guo L (2010) Script identification based on wavelet energy histogram moment features. In: IEEE 10th international conference on signal processing, Beijing, pp 980–983. doi:10.1109/ICOSP.2010.5655843 Zhou L, Ping XJ, Zheng EG, Guo L (2010) Script identification based on wavelet energy histogram moment features. In: IEEE 10th international conference on signal processing, Beijing, pp 980–983. doi:10.​1109/​ICOSP.​2010.​5655843
19.
Zurück zum Zitat Peake GS, Tan TN (1997) Script and language identification from document images. In: Proceedings of workshop on document image analysis, Washington DC, pp 10–17, doi:10.1109/DIA.1997.627086 Peake GS, Tan TN (1997) Script and language identification from document images. In: Proceedings of workshop on document image analysis, Washington DC, pp 10–17, doi:10.​1109/​DIA.​1997.​627086
20.
Zurück zum Zitat Pan WM, Suen CY, Bui TD (2005) Script identification using steerable Gabor filters. In: Proceedings of the eight international conference on document analysis and recognition, Seoul, pp 883–887. doi:10.1109/ICDAR.2005.206 Pan WM, Suen CY, Bui TD (2005) Script identification using steerable Gabor filters. In: Proceedings of the eight international conference on document analysis and recognition, Seoul, pp 883–887. doi:10.​1109/​ICDAR.​2005.​206
21.
Zurück zum Zitat Singhal V, Navin N, Ghosh D (2003) Script-based classification of hand-written text documents in a multilingual environment. In: Proceedings of 13th international workshop on research issues in data engineering: multi-lingual information management, Hyderabad, pp 47–54. doi:10.1109/RIDE.2003.1249845 Singhal V, Navin N, Ghosh D (2003) Script-based classification of hand-written text documents in a multilingual environment. In: Proceedings of 13th international workshop on research issues in data engineering: multi-lingual information management, Hyderabad, pp 47–54. doi:10.​1109/​RIDE.​2003.​1249845
22.
Zurück zum Zitat Rajput GG, Anita HB (2010) Handwritten script recognition using dct and wavelet features at block level. IJCA, Special issue on RTIPPR 3:158–163 Rajput GG, Anita HB (2010) Handwritten script recognition using dct and wavelet features at block level. IJCA, Special issue on RTIPPR 3:158–163
23.
Zurück zum Zitat Lee WS, Kim NC, Jang IH (2010) Texture feature-based language identification using wavelet-domain bdip, bvlc, and nrma features. In: IEEE international workshop on machine learning for signal processing, Finland, pp 444–449. doi:10.1109/MLSP.2010.5588751 Lee WS, Kim NC, Jang IH (2010) Texture feature-based language identification using wavelet-domain bdip, bvlc, and nrma features. In: IEEE international workshop on machine learning for signal processing, Finland, pp 444–449. doi:10.​1109/​MLSP.​2010.​5588751
24.
Zurück zum Zitat Valkealahti K, Oja E (2007) Reduced multidimensional co-occurrence histograms in texture classification. IEEE Trans Pattern Anal Mach Intell 20:90–95. doi:10.1109/34.655653 CrossRef Valkealahti K, Oja E (2007) Reduced multidimensional co-occurrence histograms in texture classification. IEEE Trans Pattern Anal Mach Intell 20:90–95. doi:10.​1109/​34.​655653 CrossRef
26.
Zurück zum Zitat Hochberg J, Kelly P, Thomas T, Kerns LL (1997) Automatic script identification from document images using cluster-based templates. IEEE Trans Pattern Anal Mach Intell 19:176–181. doi:10.1109/34.574802 CrossRef Hochberg J, Kelly P, Thomas T, Kerns LL (1997) Automatic script identification from document images using cluster-based templates. IEEE Trans Pattern Anal Mach Intell 19:176–181. doi:10.​1109/​34.​574802 CrossRef
29.
Zurück zum Zitat Pal U, Chaudhuri BB (2001) Automatic identification of English, Chinese, Arabic, Devnagari and Bangla script line. In: Proceedings of sixth international conference on document analysis and recognition, Seattle, pp 790–794. doi:10.1109/ICDAR.2001.953896 Pal U, Chaudhuri BB (2001) Automatic identification of English, Chinese, Arabic, Devnagari and Bangla script line. In: Proceedings of sixth international conference on document analysis and recognition, Seattle, pp 790–794. doi:10.​1109/​ICDAR.​2001.​953896
30.
Zurück zum Zitat Gopakumar R, Subbareddy NV, Makkithaya K, Acharya UD (2010) Script identification from multilingual Indian documents using structural features. J Comput 2:106–111 Gopakumar R, Subbareddy NV, Makkithaya K, Acharya UD (2010) Script identification from multilingual Indian documents using structural features. J Comput 2:106–111
31.
Zurück zum Zitat Gopakumar R, Subbareddy NV, Makkithaya K, Acharya UD (2010) Zone-based structural feature extraction for script identification from Indian documents. In: 5th international conference on industrial and information systems, Mangalore, pp 420–425. doi:10.1109/ICIINFS.2010.5578668 Gopakumar R, Subbareddy NV, Makkithaya K, Acharya UD (2010) Zone-based structural feature extraction for script identification from Indian documents. In: 5th international conference on industrial and information systems, Mangalore, pp 420–425. doi:10.​1109/​ICIINFS.​2010.​5578668
32.
Zurück zum Zitat Padma MC, Vijaya PA (2010) Script identification from trilingual documents using profile based features. Int J Comput Sci Appl 7:16–33 Padma MC, Vijaya PA (2010) Script identification from trilingual documents using profile based features. Int J Comput Sci Appl 7:16–33
33.
Zurück zum Zitat Aithal PK, Rajesh G, Acharya DU, Krishnamoorthi M, Subbareddy NV (2011) Script identification for a tri-lingual document. In: 2nd international conference on advances in communication, network, and computing, pp 434–439. doi:10.1007/978-3-642-19542-6_82 Aithal PK, Rajesh G, Acharya DU, Krishnamoorthi M, Subbareddy NV (2011) Script identification for a tri-lingual document. In: 2nd international conference on advances in communication, network, and computing, pp 434–439. doi:10.​1007/​978-3-642-19542-6_​82
34.
Zurück zum Zitat Aithal PK, Rajesh G, Acharya DU, Krishnamoorthi M, Subbareddy NV (2010) Text line script identification for a tri-lingual document. In: 2nd international conference on computing, communication and networking technologies, Karur, pp 1–3. doi:10.1109/ICCCNT.2010.5592562 Aithal PK, Rajesh G, Acharya DU, Krishnamoorthi M, Subbareddy NV (2010) Text line script identification for a tri-lingual document. In: 2nd international conference on computing, communication and networking technologies, Karur, pp 1–3. doi:10.​1109/​ICCCNT.​2010.​5592562
35.
Zurück zum Zitat Prakash O, Shrivastava V, Kumar A (2013) An efficient approach for script identification. Int J Comput Trends Technol 4:1626–1631 Prakash O, Shrivastava V, Kumar A (2013) An efficient approach for script identification. Int J Comput Trends Technol 4:1626–1631
36.
Zurück zum Zitat Phan TQ, Shivakumara P, Ding Z, Lu S, Tan CL (2011) Video script identification based on text lines. In: International conference on document analysis and recognition, Beijing, pp 1240–1244. doi:10.1109/ICDAR.2011.250 Phan TQ, Shivakumara P, Ding Z, Lu S, Tan CL (2011) Video script identification based on text lines. In: International conference on document analysis and recognition, Beijing, pp 1240–1244. doi:10.​1109/​ICDAR.​2011.​250
37.
Zurück zum Zitat Tan GX, Gaudin CV, Kot AC (2009) Information retrieval model for online handwritten script identification. In: 10th international conference on document analysis and recognition, Barcelona, pp 336–340. doi:10.1109/ICDAR.2009.162 Tan GX, Gaudin CV, Kot AC (2009) Information retrieval model for online handwritten script identification. In: 10th international conference on document analysis and recognition, Barcelona, pp 336–340. doi:10.​1109/​ICDAR.​2009.​162
38.
Zurück zum Zitat Bashir R, Quadri SMK (2014) Entropy based script identification of a multilingual document image. In: International conference on computing for sustainable global development, New Delhi, pp 19–23. doi:10.1109/IndiaCom.2014.6828005 Bashir R, Quadri SMK (2014) Entropy based script identification of a multilingual document image. In: International conference on computing for sustainable global development, New Delhi, pp 19–23. doi:10.​1109/​IndiaCom.​2014.​6828005
39.
Zurück zum Zitat Bashir R, Quadri SMK (2013) Identification of Kashmiri script in a bilingual document image. In: Proceedings of the IEEE second international conference on image information processing, Waknaghat, pp 575–579. doi:10.1109/ICIIP.2013.6707658 Bashir R, Quadri SMK (2013) Identification of Kashmiri script in a bilingual document image. In: Proceedings of the IEEE second international conference on image information processing, Waknaghat, pp 575–579. doi:10.​1109/​ICIIP.​2013.​6707658
41.
Zurück zum Zitat Ghosh S, Chaudhuri BB (2011) Composite script identification and orientation detection for Indian text images. In: International conference on document analysis and recognition, Beijing, pp 294–298. doi:10.1109/ICDAR.2011.67 Ghosh S, Chaudhuri BB (2011) Composite script identification and orientation detection for Indian text images. In: International conference on document analysis and recognition, Beijing, pp 294–298. doi:10.​1109/​ICDAR.​2011.​67
42.
Zurück zum Zitat Cheng J, Ping X, Zhou G, Yang Y (2006) Script identification of document image analysis. In: Proceedings of the 1st international conference on innovative computing, information and control, Beijing, pp 178–181. doi:10.1109/ICICIC.2006.518 Cheng J, Ping X, Zhou G, Yang Y (2006) Script identification of document image analysis. In: Proceedings of the 1st international conference on innovative computing, information and control, Beijing, pp 178–181. doi:10.​1109/​ICICIC.​2006.​518
43.
Zurück zum Zitat Moussa SB, Zahour A, Benabdelhafid A, Alimi AM (2008) Fractal-based system for Arabic/Latin, printed/handwritten script identification. In: 19th international conference on pattern recognition, Florida, pp 1–4. doi:10.1109/ICPR.2008.4761838 Moussa SB, Zahour A, Benabdelhafid A, Alimi AM (2008) Fractal-based system for Arabic/Latin, printed/handwritten script identification. In: 19th international conference on pattern recognition, Florida, pp 1–4. doi:10.​1109/​ICPR.​2008.​4761838
44.
Zurück zum Zitat Padma MC, Vijaya PA (2009) Monothetic separation of Telugu, Hindi and English text lines from a multi script document. In: Proceedings of the IEEE international conference on systems, man, and cybernetics, San, Antonio, pp 4870–4875. doi:10.1109/ICSMC.2009.5346045 Padma MC, Vijaya PA (2009) Monothetic separation of Telugu, Hindi and English text lines from a multi script document. In: Proceedings of the IEEE international conference on systems, man, and cybernetics, San, Antonio, pp 4870–4875. doi:10.​1109/​ICSMC.​2009.​5346045
45.
Zurück zum Zitat Rajput GG, Anita HB (2011) Handwritten script identification from a bi-script document at line level using Gabor filters. In: Proceeding of SCAKD, pp 94–101 Rajput GG, Anita HB (2011) Handwritten script identification from a bi-script document at line level using Gabor filters. In: Proceeding of SCAKD, pp 94–101
46.
Zurück zum Zitat Jindal M, Hemrajani N (2013) Script identification for printed document images at text-line level using dct and pca. IOSR J Comput Eng 12:97–102CrossRef Jindal M, Hemrajani N (2013) Script identification for printed document images at text-line level using dct and pca. IOSR J Comput Eng 12:97–102CrossRef
47.
Zurück zum Zitat Obaidullah SM, Nibaran D, Roy K (2014) Gabor filter based technique for offline Indic script identification from handwritten document images. In: International conference on devices, circuits and communications, Ranchi, pp 1–5. doi:10.1109/ICDCCom.2014.7024723 Obaidullah SM, Nibaran D, Roy K (2014) Gabor filter based technique for offline Indic script identification from handwritten document images. In: International conference on devices, circuits and communications, Ranchi, pp 1–5. doi:10.​1109/​ICDCCom.​2014.​7024723
48.
Zurück zum Zitat Lu S, Li L, Tan CL (2007) Identification of Latin-based languages through character stroke categorization. In: 9th international conference on document analysis and recognition, Brazil, pp 352–356. doi:10.1109/ICDAR.2007.4378731 Lu S, Li L, Tan CL (2007) Identification of Latin-based languages through character stroke categorization. In: 9th international conference on document analysis and recognition, Brazil, pp 352–356. doi:10.​1109/​ICDAR.​2007.​4378731
50.
Zurück zum Zitat Das MS, Rani DS, Reddy CRK (2012) Heuristic based script identification from multilingual text documents. In: 1st international conference on recent advances in information technology, Dhanbad, pp 487–492. doi:10.1109/RAIT.2012.6194627 Das MS, Rani DS, Reddy CRK (2012) Heuristic based script identification from multilingual text documents. In: 1st international conference on recent advances in information technology, Dhanbad, pp 487–492. doi:10.​1109/​RAIT.​2012.​6194627
51.
Zurück zum Zitat Yeotikar PP, Deshmukh PR (2013) Script identification of text words from multilingual Indian document. Int J Comput Appl 1:22–29 Yeotikar PP, Deshmukh PR (2013) Script identification of text words from multilingual Indian document. Int J Comput Appl 1:22–29
52.
Zurück zum Zitat Dhandra BV, Hangarge M (2011) Morphological reconstruction for word level script identification. Int J Comput Sci Secur 1:41–51 Dhandra BV, Hangarge M (2011) Morphological reconstruction for word level script identification. Int J Comput Sci Secur 1:41–51
53.
Zurück zum Zitat Chanda S, Pal S, Franke K, Pal U (2009) Two-stage approach for word-wise script identification. In: 10th international conference on document analysis and recognition, Barcelona, pp 926–930. doi:10.1109/ICDAR.2009.239 Chanda S, Pal S, Franke K, Pal U (2009) Two-stage approach for word-wise script identification. In: 10th international conference on document analysis and recognition, Barcelona, pp 926–930. doi:10.​1109/​ICDAR.​2009.​239
54.
Zurück zum Zitat Chanda S, Pal U, Franke K, Kimura F (2010) Script identification—a Han and Roman script perspective. In: 20th international conference on pattern recognition, Istanbul, pp 2708–2711. doi:10.1109/ICPR.2010.1127 Chanda S, Pal U, Franke K, Kimura F (2010) Script identification—a Han and Roman script perspective. In: 20th international conference on pattern recognition, Istanbul, pp 2708–2711. doi:10.​1109/​ICPR.​2010.​1127
55.
Zurück zum Zitat Roy K, Alaei A, Pal U (2010) Word-wise handwritten Persian and Roman script identification. In: International conference on frontiers in handwriting recognition, Kolkata, pp 628–633. doi:10.1109/ICFHR.2010.103 Roy K, Alaei A, Pal U (2010) Word-wise handwritten Persian and Roman script identification. In: International conference on frontiers in handwriting recognition, Kolkata, pp 628–633. doi:10.​1109/​ICFHR.​2010.​103
56.
Zurück zum Zitat Roy K, Das SK, Obaidullah SM (2011) Script identification from handwritten document. In: 3rd national conference on computer vision, pattern recognition, image processing and graphics, Hubli, pp 66–69. doi:10.1109/NCVPRIPG.2011.22 Roy K, Das SK, Obaidullah SM (2011) Script identification from handwritten document. In: 3rd national conference on computer vision, pattern recognition, image processing and graphics, Hubli, pp 66–69. doi:10.​1109/​NCVPRIPG.​2011.​22
57.
Zurück zum Zitat Obaidullah SM, Roy K, Das N (2013) Comparison of different classifiers for script identification from handwritten document. In: IEEE international conference on signal processing, computing and control, Waknaghat, pp 1–6. doi:10.1109/ISPCC.2013.6663388 Obaidullah SM, Roy K, Das N (2013) Comparison of different classifiers for script identification from handwritten document. In: IEEE international conference on signal processing, computing and control, Waknaghat, pp 1–6. doi:10.​1109/​ISPCC.​2013.​6663388
58.
Zurück zum Zitat Piao M, Cui RR (2013) An approach to script identification in multi-language text image. In: 6th international conference on intelligent networks and intelligent systems, Shenyang, pp 248–251. doi:10.1109/ICINIS.2013.70 Piao M, Cui RR (2013) An approach to script identification in multi-language text image. In: 6th international conference on intelligent networks and intelligent systems, Shenyang, pp 248–251. doi:10.​1109/​ICINIS.​2013.​70
59.
Zurück zum Zitat Chanda S, Terrades OR, Pal U (2007) SVM based scheme for Thai and English script identification. In: 9th international conference on document analysis and recognition, Brazil, pp 551–555. doi:10.1109/ICDAR.2007.4378770 Chanda S, Terrades OR, Pal U (2007) SVM based scheme for Thai and English script identification. In: 9th international conference on document analysis and recognition, Brazil, pp 551–555. doi:10.​1109/​ICDAR.​2007.​4378770
60.
Zurück zum Zitat Chanda S, Pal U, Kimura F (2007) Identification of Japanese and English script from a single document page. In: 7th IEEE international conference on computer and information technology, Fukushima, pp 656–661. doi:10.1109/CIT.2007.109 Chanda S, Pal U, Kimura F (2007) Identification of Japanese and English script from a single document page. In: 7th IEEE international conference on computer and information technology, Fukushima, pp 656–661. doi:10.​1109/​CIT.​2007.​109
61.
Zurück zum Zitat Dhandra BV, Hangarge M (2007) Global and local features based handwritten text words and numerals script identification. In: International conference on conference on computational intelligence and multimedia applications, Sivakasi, pp 471–475. doi:10.1109/ICCIMA.2007.125 Dhandra BV, Hangarge M (2007) Global and local features based handwritten text words and numerals script identification. In: International conference on conference on computational intelligence and multimedia applications, Sivakasi, pp 471–475. doi:10.​1109/​ICCIMA.​2007.​125
62.
Zurück zum Zitat Singh S, Kumar A, Shaw DK, Ghosh D (2014) Script separation in machine printed bilingual (Devnagari and Gurumukhi) documents using morphological approach. In: 20th national conference on communications, Kanpur, pp 1–5. doi:10.1109/NCC.2014.6811361 Singh S, Kumar A, Shaw DK, Ghosh D (2014) Script separation in machine printed bilingual (Devnagari and Gurumukhi) documents using morphological approach. In: 20th national conference on communications, Kanpur, pp 1–5. doi:10.​1109/​NCC.​2014.​6811361
63.
Zurück zum Zitat Lin XR, Guo CY, Chang F (2011) Classifying textual components of bilingual documents with decision-tree support vector machines. In: International conference on document analysis and recognition, Beijing, pp 498–502. doi:10.1109/ICDAR.2011.106 Lin XR, Guo CY, Chang F (2011) Classifying textual components of bilingual documents with decision-tree support vector machines. In: International conference on document analysis and recognition, Beijing, pp 498–502. doi:10.​1109/​ICDAR.​2011.​106
64.
Zurück zum Zitat Echi AK, Saidani A, Belaid A (2014) How to separate between machine-printed/handwritten and Arabic/Latin Words? Electron Lett Comput Vis Image Anal 13:1–16. doi:10.5565/rev/elcvia.572 Echi AK, Saidani A, Belaid A (2014) How to separate between machine-printed/handwritten and Arabic/Latin Words? Electron Lett Comput Vis Image Anal 13:1–16. doi:10.​5565/​rev/​elcvia.​572
65.
Zurück zum Zitat Haboubi S, Maddouri SS, Amiri H (2011) Separation between Arabic and Latin scripts from bilingual text using structural features. In: 1st international conference innovative computing technology, Brazil, pp 132–143. doi:10.1007/978-3-642-22247-4_12 Haboubi S, Maddouri SS, Amiri H (2011) Separation between Arabic and Latin scripts from bilingual text using structural features. In: 1st international conference innovative computing technology, Brazil, pp 132–143. doi:10.​1007/​978-3-642-22247-4_​12
66.
Zurück zum Zitat Sharma N, Chanda S, Pal U, Blumenstein M (2013) Word-wise script identification from video frames. In: 12th international conference on document analysis and recognition, Washington DC, pp 867–871. doi:10.1109/ICDAR.2013.177 Sharma N, Chanda S, Pal U, Blumenstein M (2013) Word-wise script identification from video frames. In: 12th international conference on document analysis and recognition, Washington DC, pp 867–871. doi:10.​1109/​ICDAR.​2013.​177
67.
Zurück zum Zitat Ma H, Doermann D (2004) Word level script identification for scanned document images. In: Proceeding of international conference on document recognition and retrieval, San Jose, pp 178–191 Ma H, Doermann D (2004) Word level script identification for scanned document images. In: Proceeding of international conference on document recognition and retrieval, San Jose, pp 178–191
68.
Zurück zum Zitat Ferrer MA, Morales A, Rodríguez N, Pal U (2014) Multiple training—one test methodology for handwritten word-script identification. In: 14th international conference on frontiers in handwriting recognition, Greece, pp 754–759. doi:10.1109/ICFHR.2014.132 Ferrer MA, Morales A, Rodríguez N, Pal U (2014) Multiple training—one test methodology for handwritten word-script identification. In: 14th international conference on frontiers in handwriting recognition, Greece, pp 754–759. doi:10.​1109/​ICFHR.​2014.​132
69.
Zurück zum Zitat Singh PK, Khan A, Sarkar R, Nasipuri M (2014) A texture based approach to word-level script identification from multi-script handwritten documents. In: International conference on computational intelligence and communication networks, Udaipur, pp 228–232. doi:10.1109/CICN.2014.60 Singh PK, Khan A, Sarkar R, Nasipuri M (2014) A texture based approach to word-level script identification from multi-script handwritten documents. In: International conference on computational intelligence and communication networks, Udaipur, pp 228–232. doi:10.​1109/​CICN.​2014.​60
70.
Zurück zum Zitat Angadi SA, Kodabagi MM (2013) A fuzzy approach for word level script identification of text in low resolution display board images using wavelet features. In: International conference on advances in computing, communications and informatics, Mysore, pp 1804–1811. doi:10.1109/ICACCI.2013.6637455 Angadi SA, Kodabagi MM (2013) A fuzzy approach for word level script identification of text in low resolution display board images using wavelet features. In: International conference on advances in computing, communications and informatics, Mysore, pp 1804–1811. doi:10.​1109/​ICACCI.​2013.​6637455
71.
Zurück zum Zitat Pechwitz M, Maddouri SS, Märgner V, Ellouze N, Amiri H (2002) IFN/ENIT-database of handwritten ARABIC words. In: 7th colloque international francophone Sur l’Ecrit et le Document, Tunis, pp 129–136 Pechwitz M, Maddouri SS, Märgner V, Ellouze N, Amiri H (2002) IFN/ENIT-database of handwritten ARABIC words. In: 7th colloque international francophone Sur l’Ecrit et le Document, Tunis, pp 129–136
72.
Zurück zum Zitat Malemath VS, Kulkarni AH, Mallikarjun H (2014) Word-wise script identification in document images based on steerable Gaussian filtering technique. Int J Adv Res Comput Commun Eng 3:6844–6848 Malemath VS, Kulkarni AH, Mallikarjun H (2014) Word-wise script identification in document images based on steerable Gaussian filtering technique. Int J Adv Res Comput Commun Eng 3:6844–6848
73.
Zurück zum Zitat Rezaee H, Geravanchizadeh M, Razzazi F (2009) Automatic language identification of bilingual English and Farsi scripts. In: International conference on application of information and communication technologies, Baku, pp 1–4. doi:10.1109/ICAICT.2009.5372532 Rezaee H, Geravanchizadeh M, Razzazi F (2009) Automatic language identification of bilingual English and Farsi scripts. In: International conference on application of information and communication technologies, Baku, pp 1–4. doi:10.​1109/​ICAICT.​2009.​5372532
74.
Zurück zum Zitat Rani R, Dhir R, Lehal GS (2013) Script identification of pre-segmented multi-font characters and digits. In: 12th international conference on document analysis and recognition, Washington DC, pp 1150–154. doi:10.1109/ICDAR.2013.233 Rani R, Dhir R, Lehal GS (2013) Script identification of pre-segmented multi-font characters and digits. In: 12th international conference on document analysis and recognition, Washington DC, pp 1150–154. doi:10.​1109/​ICDAR.​2013.​233
75.
Zurück zum Zitat Pal S, Alireza A, Pal U, Blumenstein M (2012) Multi-script off-line signature identification. In: 12th international conference on hybrid intelligent systems, Pune, pp 236–240. doi:10.1109/HIS.2012.6421340 Pal S, Alireza A, Pal U, Blumenstein M (2012) Multi-script off-line signature identification. In: 12th international conference on hybrid intelligent systems, Pune, pp 236–240. doi:10.​1109/​HIS.​2012.​6421340
76.
Zurück zum Zitat Obaidullah SM, Halder C, Das N, Roy K (2015) Numeral script identification from handwritten document images. In: 11th international multi-conference on information processing, Bangalore, pp 585–594. doi:10.1016/j.procs.2015.06.067 Obaidullah SM, Halder C, Das N, Roy K (2015) Numeral script identification from handwritten document images. In: 11th international multi-conference on information processing, Bangalore, pp 585–594. doi:10.​1016/​j.​procs.​2015.​06.​067
77.
Zurück zum Zitat Hangarge M, Santosh KC, Pardeshi R (2013) Directional discrete cosine transform for handwritten script identification. In: 12th international conference on document analysis and recognition, Washington DC, pp 344–348. doi:10.1109/ICDAR.2013.76 Hangarge M, Santosh KC, Pardeshi R (2013) Directional discrete cosine transform for handwritten script identification. In: 12th international conference on document analysis and recognition, Washington DC, pp 344–348. doi:10.​1109/​ICDAR.​2013.​76
78.
Zurück zum Zitat Hangarge M, Santosh KC (2014) Word-level handwritten script identification from multi-script documents. In: Recent advances in information technology, advances in intelligent systems and computing, Dhanbad, pp 49–55. doi:10.1007/978-81-322-1856-2_6 Hangarge M, Santosh KC (2014) Word-level handwritten script identification from multi-script documents. In: Recent advances in information technology, advances in intelligent systems and computing, Dhanbad, pp 49–55. doi:10.​1007/​978-81-322-1856-2_​6
79.
Zurück zum Zitat Pardeshi R, Chaudhuri BB, Hangarge M, Santosh KC (2014) Automatic handwritten Indian scripts identification. In: 14th international conference on frontiers in handwriting recognition, Greece, pp 375–380. doi:10.1109/ICFHR.2014.69 Pardeshi R, Chaudhuri BB, Hangarge M, Santosh KC (2014) Automatic handwritten Indian scripts identification. In: 14th international conference on frontiers in handwriting recognition, Greece, pp 375–380. doi:10.​1109/​ICFHR.​2014.​69
80.
Zurück zum Zitat Marti U, Bunke H (1999) A full English sentence database for off-line handwriting recognition. In: Proceedings of the 5th international conference on document analysis and recognition, Bangalore, pp 705–708. doi:10.1109/ICDAR.1999.791885 Marti U, Bunke H (1999) A full English sentence database for off-line handwriting recognition. In: Proceedings of the 5th international conference on document analysis and recognition, Bangalore, pp 705–708. doi:10.​1109/​ICDAR.​1999.​791885
81.
Zurück zum Zitat Sarkar R, Das N, Basu S, Kundu M, Nasipuri M, Basu DK (2012) Cmaterdb1: a database of unconstrained handwritten Bangla and Bangla English mixed script document image. Int J Doc Anal Recogn 15:71–83. doi:10.1007/s10032-011-0148-6 CrossRef Sarkar R, Das N, Basu S, Kundu M, Nasipuri M, Basu DK (2012) Cmaterdb1: a database of unconstrained handwritten Bangla and Bangla English mixed script document image. Int J Doc Anal Recogn 15:71–83. doi:10.​1007/​s10032-011-0148-6 CrossRef
83.
Ng CC, Selamat A (2009) Improved letter weighting feature selection on Arabic script language identification. In: 1st Asian conference on intelligent information and database systems, Vietnam, pp 150–154. doi:10.1109/ACIIDS.2009.33
84.
Selamat A, Lee ZS (2008) Language identifications of Arabic script web documents using independent component analysis. In: 2nd Asia international conference on modeling and simulation, Kuala Lumpur, pp 427–432. doi:10.1109/AMS.2008.46
86.
Behrad A, Khoddami M, Salehpour M (2010) A novel framework for Farsi and Latin script identification and Farsi handwritten digit recognition. J Autom Control 20:17–25. doi:10.2298/JAC1001017B
87.
Rani R, Dhir R, Lehal GS (2011) Comparative analysis of Gabor and discriminating feature extraction techniques for script identification. In: International conference on information systems for Indian languages, Patiala, pp 174–179. doi:10.1007/978-3-642-19403-0_27
88.
Mezghani A, Slimane F, Kanoun S, Margner V (2014) Identification of Arabic/French–handwritten/printed words using Gmm-based system. In: Proceedings of CIFED, France, pp 371–374
89.
Abainia K, Ouamour S, Sayoud H (2014) Robust language identification of noisy texts: proposal of hybrid approaches. In: 25th international workshop on database and expert systems applications, Munich, pp 228–232. doi:10.1109/DEXA.2014.55
90.
Yadav P, Kaur S (2013) Language identification and correction in corrupted texts of regional Indian languages. In: International conference oriental held jointly with conference on Asian spoken language research and evaluation, Gurgaon, pp 1–5. doi:10.1109/ICSDA.2013.6709877
91.
Hebert D, Barlas P, Chatelain C, Adam S, Paquet T (2014) Writing type and language identification in heterogeneous and complex documents. In: 14th international conference on frontiers in handwriting recognition, Greece, pp 411–416. doi:10.1109/ICFHR.2014.75
92.
Ablavsky V, Stevens MR (2003) Automatic feature selection with applications to script identification of degraded documents. In: Proceedings of 7th international conference on document analysis and recognition, Edinburgh, pp 750–754. doi:10.1109/ICDAR.2003.1227762
93.
Obaidullah SM, Mondal A, Roy K (2014) Structural feature based approach for script identification from printed Indian document. In: International conference on signal processing and integrated networks, Noida, pp 120–124. doi:10.1109/SPIN.2014.6776933
94.
Obaidullah SM, Mondal A, Das N, Roy K (2014) Script identification from printed Indian document images and performance evaluation using different classifiers. Appl Comput Intell Soft Comput. doi:10.1155/2014/896128
96.
Singh PK, Mondal A, Bhowmik S, Sarkar R, Nasipuri M (2014) Word-level script identification from handwritten multi-script documents. In: Proceedings of the 3rd international conference on frontiers of intelligent computing: theory and applications, Bhubaneswar, pp 551–558. doi:10.1007/978-3-319-11933-5_62
97.
Shi B, Yao C, Zhang C, Guo X, Huang F, Bai X (2015) Automatic script identification in the wild. In: Proceedings of international conference on document analysis and recognition, Nancy
98.
Mezghani A, Kanoun S, Khemakhem M, El AH (2012) A database for Arabic handwritten text image recognition and writer identification. In: International conference on frontiers in handwriting recognition, Bari, pp 399–402. doi:10.1109/ICFHR.2012.155
99.
Grosicki E, Carré M, Brodin JM, Geoffrois E (2009) Results of the RIMES evaluation campaign for handwritten mail processing. In: International conference on document analysis and recognition, Barcelona, pp 941–945. doi:10.1109/ICDAR.2009.224
100.
Slimane F, Ingold R, Kanoun S, Alimi AM, Hennebert J (2009) A new Arabic printed text image database and evaluation protocols. In: International conference on document analysis and recognition, Barcelona, pp 946–950. doi:10.1109/ICDAR.2009.155
102.
Sharma N, Mandal R, Sharma R, Pal U, Blumenstein M (2015) ICDAR2015 competition on video script identification (CVSI 2015). In: IEEE 13th international conference on document analysis and recognition (ICDAR), 2015, Tunis, pp 1196–1200. doi:10.1109/ICDAR.2015.7333950
104.
106.
Obaidullah SM, Halder C, Santosh KC, Das N, Roy K (2017) PHDIndic_11: page-level handwritten document image dataset of 11 official Indic scripts for script identification. Multimed Tools Appl. doi:10.1007/s11042-017-4373-y
107.
Singh PK, Sarkar R, Das N, Basu S, Kundu M, Nasipuri M (2017) Benchmark databases of handwritten Bangla-Roman and Devanagari-Roman mixed-script document images. Multimed Tools Appl. doi:10.1007/s11042-017-4745-3
108.
Brodić D, Amelio A, Milivojević ZN (2016) Language discrimination by texture analysis of the image corresponding to the text. Neural Comput Appl. doi:10.1007/s00521-016-2527-x
109.
Metadata
Title
Script identification algorithms: a survey
Authors
Parul Sahare
Sanjay B. Dhok
Publication date
29 July 2017
Publisher
Springer London
Published in
International Journal of Multimedia Information Retrieval / Issue 3/2017
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI
https://doi.org/10.1007/s13735-017-0130-2
