2019 | OriginalPaper | Book chapter

Multi-layered Learning for Information Extraction from Adverse Drug Event Narratives

Authors: Susmitha Wunnava, Xiao Qin, Tabassum Kakar, M. L. Tlachac, Xiangnan Kong, Elke A. Rundensteiner, Sanjay K. Sahoo, Suranjan De

Published in: Biomedical Engineering Systems and Technologies

Publisher: Springer International Publishing

Abstract

Recognizing named entities in adverse drug reaction narratives is a crucial step towards extracting valuable patient information from unstructured text and transforming it into an easily processable structured format. This motivates the use of advanced data analytics to support data-driven pharmacovigilance. Yet existing biomedical named entity recognition (NER) tools are limited in their ability to identify certain entity types in these domain-specific narratives, resulting in poor accuracy. To address this shortcoming, we propose a novel methodology, the Tiered Ensemble Learning System with Diversity (TELS-D), an ensemble approach that integrates a rich variety of named entity recognizers to produce the final result. Biomedical NER faces two specific challenges: the classes are imbalanced, and no single method performs best. The first challenge is addressed through a balanced, under-sampled bagging strategy that depends on the imbalance level to overcome the highly skewed data. To address the second challenge, we design an ensemble of heterogeneous entity recognizers that leverages a novel ensemble combiner. Our experimental results demonstrate that, for biomedical text datasets, (i) a balanced learning environment combined with an ensemble of heterogeneous classifiers consistently improves performance over individual base learners and (ii) stacking-based ensemble combiner methods outperform simple majority-voting-based solutions by 0.3 in F1-score.
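The two ideas named in the abstract, balanced under-sampled bagging and a combiner over several recognizers, can be illustrated with a minimal sketch. This is not the TELS-D implementation: the function names `balanced_bags` and `majority_vote`, the label strings, and the rule that the number of bags equals the rounded-up imbalance ratio are all assumptions made for illustration. The combiner shown is the simple majority-voting baseline; a stacking combiner would instead train a meta-learner on the base recognizers' predictions.

```python
import math
import random
from collections import Counter


def balanced_bags(labels, minority_label, seed=0):
    """Split example indices into class-balanced, under-sampled bags.

    Assumption (not taken from the paper): the number of bags is the
    imbalance ratio ceil(#majority / #minority), so a more skewed
    dataset yields more bags.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")
    n_bags = math.ceil(len(majority) / len(minority))
    bags = []
    for _ in range(n_bags):
        # Keep every minority example; draw an equally sized random
        # sample (without replacement) from the majority class.
        sampled = rng.sample(majority, min(len(minority), len(majority)))
        bags.append(sorted(minority + sampled))
    return bags


def majority_vote(predictions):
    """Combine per-recognizer label sequences by simple majority voting."""
    combined = []
    for token_votes in zip(*predictions):
        # Pick the label that most recognizers assigned to this token.
        combined.append(Counter(token_votes).most_common(1)[0][0])
    return combined


if __name__ == "__main__":
    # Nine non-entity tokens ("O") and three hypothetical "DRUG" tokens.
    labels = ["O"] * 9 + ["DRUG"] * 3
    for bag in balanced_bags(labels, minority_label="DRUG"):
        print(bag)
    # Three hypothetical recognizers labelling the same two tokens.
    print(majority_vote([["O", "DRUG"], ["O", "DRUG"], ["DRUG", "O"]]))
```

Each bag would be used to train one base learner on a balanced sample, and the learners' outputs would then be passed to the combiner.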

Metadata
Title
Multi-layered Learning for Information Extraction from Adverse Drug Event Narratives
Authors
Susmitha Wunnava
Xiao Qin
Tabassum Kakar
M. L. Tlachac
Xiangnan Kong
Elke A. Rundensteiner
Sanjay K. Sahoo
Suranjan De
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-29196-9_22