
14-11-2021

Part of Speech Tagging Using Part of Speech Sequence Graph

Authors: Pejman Gholami-Dastgerdi, Mohammad-Reza Feizi-Derakhshi

Published in: Annals of Data Science | Issue 5/2023

Abstract

The article introduces a new approach for part-of-speech tagging that utilizes a part-of-speech sequence graph to correct tags assigned by the MLE tagger. The method aims to enhance the accuracy of tagging both known and unknown words by leveraging the repetitive grammatical structures in sentences. The proposed graph-based approach addresses the limitations of traditional methods, such as the MLE tagger, which often struggles with unknown words and ambiguous tags. The article provides a detailed explanation of the methodology, including the construction of the graph and the process of traversing sentences to identify and correct tags. The results demonstrate that the proposed method outperforms existing methods, achieving a higher accuracy for both known and unknown words. The article concludes by highlighting the potential for further improvement using machine learning algorithms for selecting the best path in the graph.
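The abstract does not spell out how the graph is built or traversed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes the graph's nodes are tags and its edges are tag-to-tag transitions counted in a training corpus, and that an unknown word receives the tag best supported by the edges to its neighbours. The names `train`, `tag`, `START`, and `END`, the scoring rule, and the toy corpus are all hypothetical. The sketch fills in tags for unknown words only, whereas the article also corrects tags of known words.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary nodes


def train(tagged_sentences):
    """Build an MLE lexicon and a tag-sequence graph from tagged sentences."""
    word_tag_counts = defaultdict(Counter)
    edges = Counter()  # (previous tag, next tag) -> number of occurrences
    for sentence in tagged_sentences:
        prev = START
        for word, pos in sentence:
            word_tag_counts[word][pos] += 1
            edges[(prev, pos)] += 1
            prev = pos
        edges[(prev, END)] += 1
    # MLE tagger: each known word gets its most frequent training tag.
    mle = {w: c.most_common(1)[0][0] for w, c in word_tag_counts.items()}
    return mle, edges


def tag(words, mle, edges):
    """Tag known words by MLE, then fill unknown words from the graph."""
    tags = [mle.get(w) for w in words]  # None marks an unknown word
    candidates = sorted({t for (_, t) in edges if t != END})
    for i, current in enumerate(tags):
        if current is not None:
            continue
        prev = tags[i - 1] if i > 0 else START
        nxt = tags[i + 1] if i + 1 < len(tags) else END

        def score(candidate):
            # Assumed scoring rule: sum of edge counts into and out of the
            # candidate tag; the outgoing edge is skipped if the next word
            # is itself still untagged.
            total = edges[(prev, candidate)]
            if nxt is not None:
                total += edges[(candidate, nxt)]
            return total

        tags[i] = max(candidates, key=score)
    return tags


corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
mle, edges = train(corpus)
print(tag(["the", "fox", "sleeps"], mle, edges))
```

Here "fox" never appears in the toy corpus, so the MLE lexicon has no tag for it; the graph supplies NOUN because DET to NOUN and NOUN to VERB are the only transitions observed around that position. A real system would need a richer path-selection rule, which is the part the abstract suggests could be improved with machine learning.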


Footnotes
1. Maximum Likelihood Estimation.
2. A popular language in India.
Metadata
Title
Part of Speech Tagging Using Part of Speech Sequence Graph
Authors
Pejman Gholami-Dastgerdi
Mohammad-Reza Feizi-Derakhshi
Publication date
14-11-2021
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science / Issue 5/2023
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-021-00359-4
