2021 | OriginalPaper | Chapter

Virus Causes Flu: Identifying Causality in the Biomedical Domain Using an Ensemble Approach with Target-Specific Semantic Embeddings

Authors: Raksha Sharma, Girish Palshikar

Published in: Natural Language Processing and Information Systems

Publisher: Springer International Publishing


Abstract

Identifying Cause-Effect (CE) relations is crucial for creating a scientific knowledge base and facilitating question answering in the biomedical domain. An example sentence with a CE relation in the biomedical domain (specifically, Leukemia) is: "viability of THP-1 cells was inhibited by COR". Here, "COR" is the cause argument, "viability of THP-1 cells" is the effect argument, and "inhibited" is the trigger word that creates the causal scenario. Notably, a CE relation has a temporal order between its cause and effect arguments. In this paper, we harness this property and hypothesize that the temporal order of a CE relation can be captured well by a Long Short-Term Memory (LSTM) network with independently obtained semantic embeddings of words trained on data for the targeted disease. These focused semantic embeddings overcome the labeled-data requirement of the LSTM network. We extensively validate our hypothesis using three types of word embeddings, viz., GloVe, PubMed, and target-specific, where the target (focus) is Leukemia. We obtain a statistically significant improvement in performance with the LSTM using GloVe and target-specific embeddings over other baseline models. Furthermore, we show that an ensemble of LSTM models gives a significant improvement (approximately 3%) over the individual models as per the t-test. Applying a rule-based approach to the results of our CE relation classification system generates a knowledge base of 277,478 CE relation mentions.
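The abstract does not say how the individual LSTM models are combined into an ensemble. As a rough illustration only, the sketch below assumes soft voting: each model outputs a probability that a sentence expresses a CE relation, and the ensemble averages those probabilities and applies a threshold. The function name `soft_vote`, the threshold of 0.5, and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import fmean
from typing import List, Sequence


def soft_vote(model_probs: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average each model's P(CE) per sentence and threshold the mean.

    model_probs[m][i] is the probability that model m assigns to
    sentence i expressing a cause-effect relation. Returns 1 (CE)
    or 0 (not CE) per sentence. Soft voting is an assumption here;
    the paper may combine its models differently.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_sentences = len(model_probs[0])
    if any(len(p) != n_sentences for p in model_probs):
        raise ValueError("all models must score the same sentences")
    # zip(*...) regroups the scores so each tuple holds one sentence's
    # probabilities across all models.
    return [int(fmean(scores) >= threshold) for scores in zip(*model_probs)]


# Hypothetical scores from three LSTM variants, one per embedding type.
glove_lstm = [0.9, 0.4, 0.2]
pubmed_lstm = [0.7, 0.3, 0.6]
target_lstm = [0.8, 0.7, 0.1]

labels = soft_vote([glove_lstm, pubmed_lstm, target_lstm])
print(labels)
```

Averaging probabilities lets a confident model outweigh two weakly opposed ones; a hard majority vote over per-model labels would be an equally plausible reading of "ensemble".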



Causal questions are common on the Web. Naver Knowledge iN (http://kin.naver.com) reported 130,000 causal questions out of a database of 950,000 sentence-sized entries [18].

https://en.wikipedia.org/wiki/Leukemia.

Download: https://nlp.stanford.edu/projects/glove/.

Available for download: http://evexdb.org/pmresources/vec-space-models/.

1. Ananiadou, S., Mcnaught, J.: Text mining for biology and biomedicine. Citeseer (2006)

2. Berry, K.J., Mielke, P.W., Jr.: A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters. Educ. Psychol. Meas. 48(4), 921–933 (1988)

3. Chang, D.-S., Choi, K.-S.: Causal relation extraction using cue phrase and lexical pair probabilities. In: Su, K.-Y., Tsujii, J., Lee, J.-H., Kwong, O.Y. (eds.) IJCNLP 2004. LNCS (LNAI), vol. 3248, pp. 61–70. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-30211-7_7

4. Cohen, K.B., Hunter, L.: Getting started in text mining. PLoS Comput. Biol. 4(1), e20 (2008)

5. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391 (1990)

6. Do, Q.X., Chan, Y.S., Roth, D.: Minimally supervised event causality identification. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 294–303. Association for Computational Linguistics (2011)

7. Garcia, D.: COATIS, an NLP system to locate expressions of actions connected by causality links. In: Plaza, E., Benjamins, R. (eds.) EKAW 1997. LNCS, vol. 1319, pp. 347–352. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0026799

8. Girju, R.: Automatic detection of causal relations for question answering. In: Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering-Volume 12, pp. 76–83. Association for Computational Linguistics (2003)

9. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

10. Joskowicz, L., Ksiezyck, T., Grishman, R.: Deep domain models for discourse analysis. In: AI Systems in Government Conference, 1989, Proceedings of the Annual, pp. 195–200. IEEE (1989)

11. Kaplan, R.M., Berry-Rogghe, G.: Knowledge-based acquisition of causal relationships in text. Knowl. Acquisition 3(3), 317–337 (1991)

12. Khoo, C.S., Kornfilt, J., Oddy, R.N., Myaeng, S.H.: Automatic extraction of cause-effect information from newspaper text without knowledge-based inferencing. Literary Linguist. Comput. 13(4), 177–186 (1998)

13. Kim, H.D., et al.: Incatomi: integrative causal topic miner between textual and non-textual time series data. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 2689–2691. ACM (2012)

14. Lilleberg, J., Zhu, Y., Zhang, Y.: Support vector machines and word2vec for text classification with semantic features. In: 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), pp. 136–140. IEEE (2015)

15. Mihăilă, C., Ananiadou, S.: Recognising discourse causality triggers in the biomedical domain. J. Bioinform. Comput. Biol. 11(06), 1343008 (2013)

16. Mihăilă, C., Ohta, T., Pyysalo, S., Ananiadou, S.: Biocause: annotating and analysing causality in the biomedical domain. BMC Bioinform. 14(1), 2 (2013)

17. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

18. Moldovan, D., Paşca, M., Harabagiu, S., Surdeanu, M.: Performance issues and error analysis in an open-domain question answering system. ACM Trans. Inf. Syst. (TOIS) 21(2), 133–154 (2003)

19. Pedregosa, F., et al.: Scikit-learn: Machine learning in python. J. Mach. Learn. Res. 12(Oct), 2825–2830 (2011)

20. Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. EMNLP 14, 1532–43 (2014)

21. Radinsky, K., Davidovich, S., Markovitch, S.: Learning causality from textual data. In: Proceedings of Learning by Reading for Intelligent Question Answering Conference (2011)

22. Sharma, R., Palshikar, G., Pawar, S.: An unsupervised approach for cause-effect relation extraction from biomedical text. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds.) NLDB 2018. LNCS, vol. 10859, pp. 419–427. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91947-8_43

23. Yin, Y., Jin, Z.: Document sentiment classification based on the word embedding. In: 4th International Conference on Mechatronics, Materials, Chemistry and Computer Engineering (2015)

Title: Virus Causes Flu: Identifying Causality in the Biomedical Domain Using an Ensemble Approach with Target-Specific Semantic Embeddings
Authors: Raksha Sharma, Girish Palshikar
Publisher: Springer International Publishing
Book: Natural Language Processing and Information Systems
Print ISBN: 978-3-030-80598-2

Electronic ISBN: 978-3-030-80599-9

Copyright Year: 2021
DOI: https://doi.org/10.1007/978-3-030-80599-9_9
