
2020 | OriginalPaper | Chapter

Urdu Spell Checker: A Scarce Resource Language

Authors: Romila Aziz, Muhammad Waqas Anwar

Published in: Intelligent Technologies and Applications

Publisher: Springer Singapore


Abstract

Many software applications have been developed to check the spelling of words. Spell-checking tools for English are far more advanced than those for other languages; Urdu in particular lags behind. We develop “Urdu Spell Checker”, which detects misspelled words and offers a list of candidate corrections. The spell checker relies on a predefined lexicon (corpus) of correctly spelled words: an input word that matches an entry in the lexicon is considered correct; otherwise it is treated as misspelled. Several techniques are applied individually and in combination to determine which set of methods gives the best output. For error correction, Jaro distance combined with soundex, shapex and n-gram gives the best results: 80.0% precision, 44.87% recall and 57.37% F-measure.
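The detection and ranking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the function names `jaro_similarity` and `check`, and the `top_k` parameter are assumptions made for illustration. The soundex, shapex and n-gram stages that the chapter combines with Jaro distance are not shown. A word is accepted if it appears in the lexicon; otherwise the lexicon entries are ranked by the standard Jaro similarity.

```python
def jaro_similarity(s: str, t: str) -> float:
    """Standard Jaro similarity between two strings, in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    # Characters match only if they lie within this distance of each other.
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_matched = [False] * len(s)
    t_matched = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        lo = max(0, i - window)
        hi = min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_matched[j] and t[j] == ch:
                s_matched[i] = t_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    s_seq = [c for c, m in zip(s, s_matched) if m]
    t_seq = [c for c, m in zip(t, t_matched) if m]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def check(word: str, lexicon: set[str], top_k: int = 3) -> tuple[bool, list[str]]:
    """Return (is_correct, suggestions); suggestions are ranked by Jaro similarity."""
    if word in lexicon:
        return True, []
    ranked = sorted(lexicon, key=lambda w: jaro_similarity(word, w), reverse=True)
    return False, ranked[:top_k]


# Toy lexicon (hypothetical); a real system would load a large Urdu word list.
LEXICON = {"کتاب", "کتب", "کاتب", "قلم"}

if __name__ == "__main__":
    print(check("کتاب", LEXICON))  # word is in the lexicon
    print(check("کتاپ", LEXICON))  # misspelling: last letter replaced
```

Ranking by Jaro similarity alone compares every lexicon entry with the input, which is slow for a large lexicon. One plausible role for phonetic or n-gram stages is to shrink the candidate set before ranking, though the abstract does not specify how the methods are combined.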


Metadata
Title
Urdu Spell Checker: A Scarce Resource Language
Authors
Romila Aziz
Muhammad Waqas Anwar
Copyright Year
2020
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-5232-8_40
