2020 | OriginalPaper | Chapter

Applicability of Machine Learning Methods to Multi-label Medical Text Classification

Authors: Iuliia Lenivtceva, Evgenia Slasten, Mariya Kashina, Georgy Kopanitsa

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

Structuring medical text according to international standards improves interoperability and the quality of predictive modelling. Medical text classification facilitates information extraction. In this work, we investigate the applicability of several machine learning models and classifier chains (CC) to the classification of unstructured medical text. The experimental study was performed on a corpus of 11,671 manually labeled Russian medical notes. The results show that the CC strategy improves classification performance. An ensemble of classifier chains based on linear SVC achieved the best result: 0.924 micro F-measure, 0.872 micro precision, and 0.927 micro recall.
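The classifier-chain strategy and the micro-averaged scores named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner `LookupClassifier` is a hypothetical toy stand-in for a linear SVC, the majority-vote ensemble is one common way to combine chains, and all class names, function names and data below are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class LookupClassifier:
    """Toy binary learner: majority label per exact feature tuple.

    Hypothetical stand-in for a real base model such as a linear SVC.
    """

    def fit(self, rows, labels):
        votes = defaultdict(Counter)
        for row, label in zip(rows, labels):
            votes[tuple(row)][label] += 1
        self.table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict_one(self, row):
        # Unseen feature tuples fall back to the overall majority label.
        return self.table.get(tuple(row), self.default)


class ClassifierChain:
    """One binary model per label; each model also sees the earlier labels."""

    def __init__(self, order):
        self.order = list(order)  # a permutation of the label indices

    def fit(self, X, Y):
        self.models = []
        for pos, label in enumerate(self.order):
            earlier = self.order[:pos]
            # Training appends the TRUE values of the earlier labels.
            X_aug = [list(x) + [y[e] for e in earlier] for x, y in zip(X, Y)]
            target = [y[label] for y in Y]
            self.models.append(LookupClassifier().fit(X_aug, target))
        return self

    def predict(self, X):
        out = []
        for x in X:
            pred = {}
            for pos, label in enumerate(self.order):
                # Prediction appends the chain's own earlier PREDICTIONS.
                x_aug = list(x) + [pred[e] for e in self.order[:pos]]
                pred[label] = self.models[pos].predict_one(x_aug)
            out.append([pred[j] for j in range(len(self.order))])
        return out


def ensemble_predict(chains, X):
    """Majority vote per label over chains with different label orders."""
    all_preds = [chain.predict(X) for chain in chains]
    n_labels = len(all_preds[0][0])
    result = []
    for i in range(len(X)):
        row = []
        for j in range(n_labels):
            votes = sum(preds[i][j] for preds in all_preds)
            row.append(1 if 2 * votes >= len(chains) else 0)  # ties count as 1
        result.append(row)
    return result


def micro_scores(Y_true, Y_pred):
    """Micro averaging: pool TP/FP/FN over all labels, then compute scores."""
    tp = fp = fn = 0
    for t_row, p_row in zip(Y_true, Y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (invented): two binary features, two labels per note.
X = [[1, 0], [0, 1], [1, 1], [0, 0]]
Y = [[1, 0], [0, 1], [1, 1], [0, 0]]
chains = [ClassifierChain(order).fit(X, Y) for order in ([0, 1], [1, 0])]
Y_hat = ensemble_predict(chains, X)
print(micro_scores(Y, Y_hat))
```

In practice the toy learner would be replaced by a real model over text features, and the chains would be evaluated on held-out notes rather than on the training data used here.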

Metadata
Title: Applicability of Machine Learning Methods to Multi-label Medical Text Classification
Authors: Iuliia Lenivtceva, Evgenia Slasten, Mariya Kashina, Georgy Kopanitsa
Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-50423-6_38
