research-article

Word Level Script Identification Using Convolutional Neural Network Enhancement for Scenic Images

Published: 04 March 2022

Abstract

Script identification from complex, colorful images is an integral part of text recognition and classification systems. Such images pose two kinds of challenges: (1) camera-related challenges, such as blurring, non-uniform illumination, and noisy backgrounds, and (2) text-related challenges, such as variation in text shape, orientation, and size. Existing work in this area focuses largely on non-Indian scripts, even though the Gurumukhi, Hindi, and English scripts play a vital role in communication among Indians and foreigners. In this article, we address the above challenges in script identification. We introduce a new dataset that contains Hindi, Gurumukhi, and English scripts from scenic images collected from different sources. We also propose a CNN-based model that is capable of distinguishing between the scripts with good accuracy. The performance of the method has been evaluated on our own dataset, NITJDATASET, and on other benchmark datasets available for Indian scripts, namely CVSI-2015 (Task-1 and Task-4) and ILST. This work is an extension of script identification from strict text backgrounds.

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 21, Issue 4
      July 2022
      464 pages
      ISSN:2375-4699
      EISSN:2375-4702
      DOI:10.1145/3511099

      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 March 2022
      • Accepted: 1 December 2021
      • Revised: 1 July 2021
      • Received: 1 December 2020

      Qualifiers

      • research-article
      • Refereed
