2013 | Original Paper | Book Chapter

Towards a Leaner Evaluation Process: Application to Error Correction Systems

Written by: Arnaud Renard, Sylvie Calabretto, Béatrice Rumpler

Published in: Enterprise Information Systems

Publisher: Springer Berlin Heidelberg

Abstract

Although they follow similar procedures, evaluations of state-of-the-art error correction systems always rely on different resources (document collections, evaluation metrics, dictionaries, ...). As a result, error correction approaches cannot be compared directly: each one has to be re-implemented from scratch whenever it is compared with a new one. In other domains, such as Information Retrieval, this problem is solved through Cranfield-like experiments such as the TREC evaluation campaign [5]. We propose a generic solution to these evaluation difficulties: a modular evaluation platform that formalizes the similarities between evaluation procedures and provides standard sets of instantiated resources for particular domains. Because error correction was the problem that first motivated this work, the set of resources presented in this article is dedicated to the evaluation of error correction systems. The idea is to provide the leanest way to evaluate an error correction system: implement only the core algorithm and rely on the platform for everything else.
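The division of labour described in the abstract can be pictured with a minimal sketch. It is not the authors' platform: every name below (`Corrector`, `accuracy`, `evaluate`, `TOY_COLLECTION`) is hypothetical, the three-word collection is invented for illustration, and plain accuracy stands in for whatever metrics the platform actually provides. The only part a contributor would write is the corrector function; the collection, the metric, and the comparison loop are shared.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Corrector = Callable[[str], str]        # the core algorithm a contributor writes
Collection = List[Tuple[str, str]]      # (erroneous word, expected word) pairs


def accuracy(corrector: Corrector, collection: Collection) -> float:
    """Fraction of items that the corrector maps to the expected word."""
    if not collection:
        return 0.0
    hits = sum(1 for noisy, expected in collection if corrector(noisy) == expected)
    return hits / len(collection)


def evaluate(
    correctors: Dict[str, Corrector],
    collection: Collection,
    metric: Callable[[Corrector, Collection], float] = accuracy,
) -> Dict[str, float]:
    """Score every corrector on the same collection with the same metric."""
    return {name: metric(fn, collection) for name, fn in correctors.items()}


# Toy shared resource (invented data, not from any real evaluation collection).
TOY_COLLECTION: Collection = [("teh", "the"), ("recieve", "receive"), ("cat", "cat")]


def identity(word: str) -> str:
    """Baseline that never changes its input."""
    return word


def lookup(word: str) -> str:
    """Baseline that only knows a single hard-coded correction."""
    return {"teh": "the"}.get(word, word)


scores = evaluate({"identity": identity, "lookup": lookup}, TOY_COLLECTION)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because both baselines are scored against the same collection with the same metric, their results are directly comparable without either one being re-implemented, which is the property the abstract argues for.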

References
2. Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
3. Hirst, G., Budanitsky, A.: Correcting real-word spelling errors by restoring lexical cohesion. Nat. Lang. Eng. 11(1), 87–111 (2005)
4. Hirst, G., St-Onge, D.: Lexical chains as representations of context for the detection and correction of malapropisms, Chapter 13. In: Fellbaum, C. (ed.) WordNet: An Electronic Lexical Database, vol. 305, pp. 305–332. MIT Press, Cambridge (1998)
5. Kantor, P.B., Voorhees, E.M.: The TREC-5 confusion track: comparing retrieval methods for scanned text. Inf. Retrieval 2(2), 165–176 (2000)
6. Kukich, K.: Techniques for automatically correcting words in text. ACM Comput. Surv. (CSUR) 24(4), 439 (1992)
7. Mays, E., Damerau, F.J., Mercer, R.L.: Context based spelling correction. Inf. Process. Manag. 27(5), 517–522 (1991)
8. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
9. Mitton, R.: Ordering the suggestions of a spellchecker without using context. Nat. Lang. Eng. 15(02), 173–192 (2008)
12. Pedler, J.: Computer correction of real-word spelling errors in dyslexic text. Ph.D. thesis, Birkbeck, London University (2007)
13. Rosnay, J., Revelli, C.: Pronetarian Revolution (2006)
14. Ruch, P.: Using contextual spelling correction to improve retrieval effectiveness in degraded text collections. In: Proceedings of the 19th International Conference on Computational Linguistics, vol. 1, p. 7. Association for Computational Linguistics (2002)
15. Shannon, C.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948)
16. Subramaniam, L.V., Roy, S., Faruquie, T.A., Negi, S.: A Survey of Types of Text Noise and Techniques to Handle Noisy Text. Language, pp. 115–122 (2009)
17. Varnhagen, C.K., McFall, G.P., Figueredo, L., Takach, B.S., Daniels, J., Cuthbertson, H.: Spelling and the web. J. Appl. Develop. Psychol. 30(4), 454–462 (2009)
18. Voorhees, E.M., Garofolo, J.: The TREC-6 spoken document retrieval track. Bull. Am. Soc. Inf. Sci. Technol. 26(5), 18–19 (2000)
21. Wilcox-O’Hearn, A., Hirst, G., Budanitsky, A.: Real-word spelling correction with trigrams: a reconsideration of the Mays, Damerau, and Mercer model. In: Gelbukh, A. (ed.) CICLing 2008. LNCS, vol. 4919, pp. 605–616. Springer, Heidelberg (2008)
22. Wong, W., Liu, W., Bennamoun, M.: Integrated scoring for spelling error correction, abbreviation expansion and case restoration in dirty text. In: 5th Australasian Conference on Data Mining and Analytics (AusDM’06), Sydney, Australia, pp. 83–89. Australian Computer Society (2006)
Metadata
Title
Towards a Leaner Evaluation Process: Application to Error Correction Systems
Written by
Arnaud Renard
Sylvie Calabretto
Béatrice Rumpler
Copyright Year
2013
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-40654-6_14
