
2021 | OriginalPaper | Book chapter

Virus Causes Flu: Identifying Causality in the Biomedical Domain Using an Ensemble Approach with Target-Specific Semantic Embeddings

Authors: Raksha Sharma, Girish Palshikar

Published in: Natural Language Processing and Information Systems

Publisher: Springer International Publishing


Abstract

Identifying Cause-Effect (CE) relations is crucial for creating a scientific knowledge base and facilitating question answering in the biomedical domain. An example sentence with a CE relation in the biomedical domain (specifically, Leukemia) is: viability of THP-1 cells was inhibited by COR. Here, COR is the cause argument, viability of THP-1 cells is the effect argument, and inhibited is the trigger word that creates the causal scenario. Notably, a CE relation has a temporal order between its cause and effect arguments. In this paper, we harness this property and hypothesize that the temporal order of a CE relation can be captured well by a Long Short-Term Memory (LSTM) network with independently obtained semantic embeddings of words trained on data for the targeted disease. These focused semantic embeddings overcome the labeled data requirement of the LSTM network. We extensively validate our hypothesis using three types of word embeddings: GloVe, PubMed, and target-specific, where the target (focus) is Leukemia. With the LSTM using GloVe and target-specific embeddings, we obtain a statistically significant improvement in performance over other baseline models. Furthermore, we show that an ensemble of LSTM models gives a significant improvement (about 3%) over the individual models according to a t-test. Finally, applying a rule-based approach to the results of our CE relation classification system, we generate a knowledge base of 277,478 CE relation mentions.
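
The abstract does not say how the individual LSTM models are combined into an ensemble. As a minimal sketch only, assuming each model outputs a probability that a sentence expresses a CE relation, one common option is soft voting: average the probabilities across models and threshold the mean. The function name soft_vote, the 0.5 threshold, and the scores below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model 'causal' probabilities and threshold the mean.

    prob_lists holds one list per model; position i in every list is
    the score that model gave to sentence i. Soft voting is an assumed
    combination rule, not necessarily the one used in the paper.
    """
    if not prob_lists:
        raise ValueError("need at least one model's probabilities")
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all models must score the same sentences")
    # zip(*...) regroups the scores so each tuple belongs to one sentence.
    averaged = [mean(column) for column in zip(*prob_lists)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical scores for three sentences from three LSTM variants
# (GloVe, PubMed, and target-specific embeddings).
glove_lstm = [0.9, 0.4, 0.2]
pubmed_lstm = [0.7, 0.6, 0.1]
target_lstm = [0.8, 0.7, 0.3]

averaged, labels = soft_vote([glove_lstm, pubmed_lstm, target_lstm])
print(labels)
```

In this made-up example the second sentence is labeled causal by the ensemble even though the GloVe-based model alone scores it below the threshold, which illustrates how combining models can change individual decisions.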


Footnotes
1
Causal questions are frequently asked on the Web. Naver Knowledge iN (http://kin.naver.com) reported 130,000 causal questions in a sentence-sized database of 950,000 entries [18].
References
1. Ananiadou, S., Mcnaught, J.: Text mining for biology and biomedicine. Citeseer (2006)
2. Berry, K.J., Mielke, P.W., Jr.: A generalization of Cohen's kappa agreement measure to interval measurement and multiple raters. Educ. Psychol. Meas. 48(4), 921–933 (1988)
4. Cohen, K.B., Hunter, L.: Getting started in text mining. PLoS Comput. Biol. 4(1), e20 (2008)
5. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391 (1990)
6. Do, Q.X., Chan, Y.S., Roth, D.: Minimally supervised event causality identification. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 294–303. Association for Computational Linguistics (2011)
8. Girju, R.: Automatic detection of causal relations for question answering. In: Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering-Volume 12, pp. 76–83. Association for Computational Linguistics (2003)
9. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
10. Joskowicz, L., Ksiezyck, T., Grishman, R.: Deep domain models for discourse analysis. In: AI Systems in Government Conference, 1989, Proceedings of the Annual, pp. 195–200. IEEE (1989)
11. Kaplan, R.M., Berry-Rogghe, G.: Knowledge-based acquisition of causal relationships in text. Knowl. Acquisition 3(3), 317–337 (1991)
12. Khoo, C.S., Kornfilt, J., Oddy, R.N., Myaeng, S.H.: Automatic extraction of cause-effect information from newspaper text without knowledge-based inferencing. Literary Linguist. Comput. 13(4), 177–186 (1998)
13. Kim, H.D., et al.: Incatomi: integrative causal topic miner between textual and non-textual time series data. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 2689–2691. ACM (2012)
14. Lilleberg, J., Zhu, Y., Zhang, Y.: Support vector machines and word2vec for text classification with semantic features. In: 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), pp. 136–140. IEEE (2015)
15. Mihăilă, C., Ananiadou, S.: Recognising discourse causality triggers in the biomedical domain. J. Bioinform. Comput. Biol. 11(06), 1343008 (2013)
16. Mihăilă, C., Ohta, T., Pyysalo, S., Ananiadou, S.: Biocause: annotating and analysing causality in the biomedical domain. BMC Bioinform. 14(1), 2 (2013)
17. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
18. Moldovan, D., Paşca, M., Harabagiu, S., Surdeanu, M.: Performance issues and error analysis in an open-domain question answering system. ACM Trans. Inf. Syst. (TOIS) 21(2), 133–154 (2003)
19. Pedregosa, F., et al.: Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12(Oct), 2825–2830 (2011)
20. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. EMNLP 14, 1532–43 (2014)
21. Radinsky, K., Davidovich, S., Markovitch, S.: Learning causality from textual data. In: Proceedings of Learning by Reading for Intelligent Question Answering Conference (2011)
22.
23. Yin, Y., Jin, Z.: Document sentiment classification based on the word embedding. In: 4th International Conference on Mechatronics, Materials, Chemistry and Computer Engineering (2015)
Metadata
Title
Virus Causes Flu: Identifying Causality in the Biomedical Domain Using an Ensemble Approach with Target-Specific Semantic Embeddings
Authors
Raksha Sharma
Girish Palshikar
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-80599-9_9
