2020 | OriginalPaper | Book chapter

Chinese Text Error Correction Suggestion Generation Based on SoundShape Code

Written by: Hanru Wang, Yangsen Zhang, Lipeng Yang, Congcong Wang

Published in: Chinese Lexical Semantics

Publisher: Springer International Publishing

Abstract

Text error correction is an essential part of text proofreading. This paper presents a method for generating text error correction suggestions based on SoundShape Code. The target words are converted into SoundShape Code, and an improved edit distance algorithm is used to fuzzy-match them against the words in the vocabulary, yielding a set of candidate words whose similarity exceeds a certain threshold. Each word in the candidate set is then scored with a contextual relevance model, and reasonable error correction suggestions are given according to the score. Four types of errors are marked in this paper: substitution errors in two-character words, missing-character errors in words of more than three characters, insertion errors in words of more than three characters, and substitution errors in three-character words. In total, 617 errors are tested and analyzed. Experiments show that the similarity calculation based on SoundShape Code can provide reasonable error correction suggestions.
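
The candidate-generation step described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the per-character codes in `TOY_CODES` are placeholders (pinyin with a tone digit plus one structure letter), not the paper's SoundShape Code; the distance is the plain Levenshtein edit distance rather than the improved variant the authors use; and the function names and the threshold of 0.7 are hypothetical. The contextual relevance scoring is not covered.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: toy per-character codes, NOT the paper's SoundShape Code.
# Each code is pinyin + tone digit + one letter for the glyph structure
# (L = left-right, T = top-bottom, S = single component).
TOY_CODES: Dict[str, str] = {
    "账": "zhang4L",
    "号": "hao4T",
    "好": "hao3L",
    "天": "tian1S",
}


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (the paper uses an improved variant)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical codes."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def encode(word: str, table: Dict[str, str]) -> str:
    """Concatenate the code of every character in the word."""
    return "".join(table[ch] for ch in word)


def candidates(target: str, vocabulary: List[str],
               table: Dict[str, str], threshold: float) -> List[Tuple[str, float]]:
    """Return vocabulary words whose code similarity exceeds the threshold,
    most similar first."""
    target_code = encode(target, table)
    scored = [(w, similarity(target_code, encode(w, table))) for w in vocabulary]
    kept = [(w, s) for w, s in scored if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # "账好" stands in for a mistyped "账号"; the threshold is arbitrary.
    for word, score in candidates("账好", ["账号", "天天"], TOY_CODES, 0.7):
        print(word, round(score, 3))
```

In a fuller pipeline, the words returned by `candidates` would then be ranked by a context model before being offered as suggestions; that step depends on the paper's contextual relevance model and is left out here.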

Metadata
Title
Chinese Text Error Correction Suggestion Generation Based on SoundShape Code
Written by
Hanru Wang
Yangsen Zhang
Lipeng Yang
Congcong Wang
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-38189-9_44