2020 | OriginalPaper | Book chapter

Hybrid Text Feature Modeling for Disease Group Prediction Using Unstructured Physician Notes

Authors: Gokul S. Krishnan, S. Sowmya Kamath

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

Existing Clinical Decision Support Systems (CDSSs) largely depend on the availability of structured patient data and Electronic Health Records (EHRs) to aid caregivers. However, hospitals in developing countries have not widely adopted structured patient data formats, and medical professionals there still rely on clinical notes in the form of unstructured text. Such unstructured clinical notes recorded by medical personnel are also a potential source of rich patient-specific information that can be leveraged to build CDSSs, even for hospitals in developing countries. If such unstructured clinical text can be used, the manual and time-consuming process of EHR generation is no longer required, saving substantial person-hours and cost. In this article, we propose a generic ICD9 disease group prediction CDSS built on unstructured physician notes modeled using hybrid word embeddings. These word embeddings are used to train a deep neural network to predict ICD9 disease groups. Experimental evaluation showed that the proposed approach outperformed the state-of-the-art disease group prediction model built on structured EHRs by 15% in terms of AUROC and 40% in terms of AUPRC, thus supporting our hypothesis and eliminating the dependency on the availability of structured patient data.
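
The AUROC and AUPRC figures quoted in the abstract are standard ranking metrics for classifiers that output scores. As a rough illustration only, the following Python sketch computes both for a toy multi-label problem in which each note may belong to several disease groups. The toy labels and scores, the helper names, the use of average precision as the AUPRC estimate, and the macro-averaging over groups are all assumptions made for this example; the abstract does not state how the authors computed or averaged their scores.

```python
from typing import Callable, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a common step-wise estimate of AUPRC.

    Assumes distinct scores; tied scores keep their input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each retrieved positive
    return total / n_pos


def macro_average(
    metric: Callable[[Sequence[int], Sequence[float]], float],
    y_true: Sequence[Sequence[int]],
    y_score: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of a per-group metric over all label columns."""
    n_labels = len(y_true[0])
    per_label = [
        metric([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Hypothetical data: 4 notes, 2 disease groups (rows = notes, columns = groups).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]

print("macro AUROC:", macro_average(auroc, y_true, y_score))
print("macro AUPRC (AP):", macro_average(average_precision, y_true, y_score))
```

Macro-averaging weights every disease group equally regardless of how common it is; a micro-average, which pools all note-group pairs before scoring, would be an equally plausible choice and can give noticeably different numbers on imbalanced data.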

Footnotes
1. ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification.

Metadata
Title: Hybrid Text Feature Modeling for Disease Group Prediction Using Unstructured Physician Notes
Authors: Gokul S. Krishnan, S. Sowmya Kamath
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-50423-6_24