2020 | Original Paper | Book chapter

Applicability of Machine Learning Methods to Multi-label Medical Text Classification

Authors: Iuliia Lenivtceva, Evgenia Slasten, Mariya Kashina, Georgy Kopanitsa

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

Structuring medical text according to international standards improves interoperability and the quality of predictive modelling. Medical text classification facilitates information extraction. In this work, we investigate the applicability of several machine learning models and classifier chains (CC) to the classification of unstructured medical text. The experimental study was performed on a corpus of 11,671 manually labeled Russian medical notes. The results showed that the CC strategy improves classification performance. An ensemble of classifier chains based on linear SVC showed the best result: 0.924 micro F-measure, 0.872 micro precision and 0.927 micro recall.
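The micro-averaged scores quoted in the abstract pool true positives, false positives and false negatives over all labels before computing the ratios. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes that "F-measure" means the balanced F1 score and that labels are given as binary indicator rows; the function name `micro_scores` and the toy label matrices are hypothetical.

```python
from typing import Sequence, Tuple


def micro_scores(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Tuple[float, float, float]:
    """Return micro-averaged (precision, recall, F1) for multi-label data.

    Each row is one document; each column is one label (1 = assigned).
    Counts are pooled over every document/label pair before dividing.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1  # label predicted and actually present
            elif p == 1 and t == 0:
                fp += 1  # label predicted but absent
            elif p == 0 and t == 1:
                fn += 1  # label present but missed

    # Guard against empty denominators (no predictions or no true labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy data: 3 notes, 3 labels (assumption, not from the paper).
    true_labels = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    predicted = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]
    p, r, f = micro_scores(true_labels, predicted)
    print(f"micro precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Because the counts are pooled, frequent labels weigh more heavily in micro-averaged scores than rare ones, which matters when label frequencies in a corpus are imbalanced.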

Metadata
Title
Applicability of Machine Learning Methods to Multi-label Medical Text Classification
Authors
Iuliia Lenivtceva
Evgenia Slasten
Mariya Kashina
Georgy Kopanitsa
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-50423-6_38