
Neural Processing Letters

Publication date: 18 June 2018

Multi-task Character-Level Attentional Networks for Medical Concept Normalization

Authors: Jinghao Niu, Yehui Yang, Siheng Zhang, Zhengya Sun, Wensheng Zhang

Published in: Neural Processing Letters | Issue 3/2019

Abstract

Recognizing standard medical concepts in colloquial text is important for applications such as medical question answering systems. Recently, word-level neural network methods, which can learn features of complex informal expressions, have achieved remarkable performance on this task. However, they have two main limitations: (1) Existing word-level methods cannot learn character structure features inside words and suffer from out-of-vocabulary (OOV) words, which are common in noisy colloquial text. (2) Since these methods treat normalization as a classification problem, concept phrases are represented by category labels, so the morphological information of the words inside the concept is lost. In this work, we present a multi-task character-level attentional network model for medical concept normalization. Specifically, the character-level encoding scheme of our model alleviates the OOV word problem, and the attention mechanism exploits word morphological information through multi-task training. The attention mechanism generates higher attention weights at domain-related positions in the text sequence, helping the downstream convolution focus on the characters that are related to medical concepts. To test our model, we first introduce a labeled Chinese dataset (314,991 records in total) for this task. Two other real-world English datasets are also used. Our model outperforms state-of-the-art methods on all three datasets. In addition, by adding four types of noise to the datasets, we validate the robustness of our model against common noise in colloquial text.
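The contrast between word-level and character-level encoding, and the idea of attention weights over character positions, can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' model: the helper names (`build_char_vocab`, `encode_chars`, `encode_words`, `softmax`), the two training sentences, and the attention scores are all hypothetical, and the scores are fixed by hand rather than learned through multi-task training.

```python
import math


def build_char_vocab(texts):
    """Map each character seen in the texts to an id; 0 is reserved for unknown."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_chars(text, char_vocab):
    """Character-level encoding: one id per character."""
    return [char_vocab.get(ch, 0) for ch in text]


def encode_words(text, word_vocab):
    """Word-level encoding: one id per whitespace token, 0 for OOV words."""
    return [word_vocab.get(word, 0) for word in text.split()]


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical training text (illustration only).
train = ["i have a headache", "my stomach hurts"]
char_vocab = build_char_vocab(train)
words = sorted({w for text in train for w in text.split()})
word_vocab = {w: i + 1 for i, w in enumerate(words)}

# A misspelling that never appeared as a whole word in the training text.
noisy = "hedache"
word_ids = encode_words(noisy, word_vocab)  # the whole word is OOV
char_ids = encode_chars(noisy, char_vocab)  # every character is still known

# Hand-picked scores standing in for learned attention (assumption):
# a higher score means the position is treated as more concept-related.
scores = [2.0, 2.0, 1.0, 1.0, 1.0, 0.5, 0.5]
weights = softmax(scores)

print(word_ids)
print(char_ids)
print([round(w, 3) for w in weights])
```

In this sketch the misspelled word maps to the unknown id at the word level, while every one of its characters still has a known id. In a model of the kind the abstract describes, weights like these would scale the character representations before the convolution; here they only show the normalization step.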

Title: Multi-task Character-Level Attentional Networks for Medical Concept Normalization
Authors: Jinghao Niu, Yehui Yang, Siheng Zhang, Zhengya Sun, Wensheng Zhang
Publication date: 18 June 2018
Publisher: Springer US
Published in: Neural Processing Letters / Issue 3/2019
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-018-9873-x
