2015 | Original Paper | Book Chapter

Correcting and Standardizing Crude Drug Names in Traditional Medicine Formulae by Ensemble of String Matching Techniques

Authors: Duangkamol Pakdeesattayapong, Verayuth Lertnattee

Published in: Intelligent Computing Theories and Methodologies

Publisher: Springer International Publishing

Abstract

Crude drug names in traditional herbal formulae commonly suffer from spelling errors, grammatical variants, synonyms, and inconsistent formats. To make these names unambiguous and useful, they should be corrected and standardized. In this work, crude drug names in various forms were corrected and standardized using string matching techniques. Experiments were conducted using crude drug names from the Thai Food and Drug Administration's database of registered traditional medicines as the test set. Two well-known algorithms, similar text and Levenshtein, were investigated. Each algorithm on its own matched the crude drug names in the test set to those in the standard set only moderately well. To improve on the single algorithms, an ensemble algorithm was proposed. The results show that the ensemble algorithm outperforms the single algorithms at matching crude drug names, especially names that carry modifiers with no significant meaning.
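The abstract does not say how the ensemble combines the two algorithms, so the following Python sketch is only an illustration of the general idea. It assumes that "similar text" refers to the common-substring measure found in PHP's `similar_text`, that Levenshtein distance is normalized by the longer string's length, and that the ensemble simply averages the two scores. The function names and the example names are hypothetical and do not come from the paper's data set.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def similar_chars(a: str, b: str) -> int:
    """Count matching characters: take the longest common substring,
    then recurse on the pieces to its left and to its right."""
    best_len = best_i = best_j = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best_len:
                best_len, best_i, best_j = k, i, j
    if best_len == 0:
        return 0
    return (best_len
            + similar_chars(a[:best_i], b[:best_j])
            + similar_chars(a[best_i + best_len:], b[best_j + best_len:]))


def similar_text_similarity(a: str, b: str) -> float:
    """Matching characters as a fraction of the combined length."""
    total = len(a) + len(b)
    return 1.0 if total == 0 else 2.0 * similar_chars(a, b) / total


def ensemble_match(name: str, standard_names: list[str]) -> tuple[str, float]:
    """Return the standard name with the highest averaged score.
    Averaging is an assumption, not the paper's stated rule."""
    query = name.strip().lower()

    def score(candidate: str) -> float:
        c = candidate.lower()
        return (levenshtein_similarity(query, c)
                + similar_text_similarity(query, c)) / 2.0

    best = max(standard_names, key=score)
    return best, score(best)


# Illustrative standard set and misspelled query (not from the paper).
standard = ["Zingiber officinale", "Curcuma longa", "Piper nigrum"]
match, confidence = ensemble_match("Zingiber oficinale", standard)
print(match, round(confidence, 3))
```

In practice a threshold on the combined score would likely be needed so that names with no plausible counterpart in the standard set are flagged for manual review rather than forced onto the nearest entry.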

Metadata
Title
Correcting and Standardizing Crude Drug Names in Traditional Medicine Formulae by Ensemble of String Matching Techniques
Authors
Duangkamol Pakdeesattayapong
Verayuth Lertnattee
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-22186-1_24
