Skip to main content
Top
Published in: The Journal of Supercomputing 3/2020

16-01-2018

WCP-RNN: a novel RNN-based approach for Bio-NER in Chinese EMRs

Paper ID: FC_17_25

Authors: Jianqiang Li, Shenhe Zhao, Jijiang Yang, Zhisheng Huang, Bo Liu, Shi Chen, Hui Pan, Qing Wang

Published in: The Journal of Supercomputing | Issue 3/2020

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Deep learning has achieved remarkable success in a wide range of domains. However, it has not been comprehensively evaluated as a solution for the task of Chinese biomedical named entity recognition (Bio-NER). The traditional deep-learning approach for the Bio-NER task is usually based on the structure of recurrent neural networks (RNN) and only takes word embeddings into consideration, ignoring the value of character-level embeddings to encode the morphological and shape information. We propose an RNN-based approach, WCP-RNN, for the Chinese Bio-NER problem. Our method combines word embeddings and character embeddings to capture orthographic and lexicosemantic features. In addition, POS tags are involved as a priori word information to improve the final performance. The experimental results show our proposed approach outperforms the baseline method; the highest F-scores for subject and lesion detection tasks reach 90.36 and 90.48% with an increase of 3.10 and 2.60% compared with the baseline methods, respectively.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Yang J-J, Li J, Mulder J, Wang Y, Chen S, Wu H, Wang Q, Pan H (2015) Emerging information technologies for enhanced healthcare. Comput Ind 69:3–11CrossRef Yang J-J, Li J, Mulder J, Wang Y, Chen S, Wu H, Wang Q, Pan H (2015) Emerging information technologies for enhanced healthcare. Comput Ind 69:3–11CrossRef
2.
go back to reference Zhang S, Elhadad N (2013) Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. J Biomed Inform 46(6):1088–1098CrossRef Zhang S, Elhadad N (2013) Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. J Biomed Inform 46(6):1088–1098CrossRef
3.
go back to reference Mao R, Xu H, Wu W, Li J, Li Y, Lu M (2015) Overcoming the challenge of variety: big data abstraction, the next evolution of data management for AAL communication systems. IEEE Commun Mag 53(1):42–47CrossRef Mao R, Xu H, Wu W, Li J, Li Y, Lu M (2015) Overcoming the challenge of variety: big data abstraction, the next evolution of data management for AAL communication systems. IEEE Commun Mag 53(1):42–47CrossRef
4.
go back to reference Lafferty J, McCallum A, Pereira FCN (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Enabling recognition of diseases in biomedical text with machine learning: corpus and benchmark Lafferty J, McCallum A, Pereira FCN (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Enabling recognition of diseases in biomedical text with machine learning: corpus and benchmark
5.
go back to reference Tsochantaridis I, Joachims T, Hofmann T, Altun Y (2005) Large margin methods for structured and interdependent output variables. J Mach Learn Res 6(Sep):1453–1484MathSciNetMATH Tsochantaridis I, Joachims T, Hofmann T, Altun Y (2005) Large margin methods for structured and interdependent output variables. J Mach Learn Res 6(Sep):1453–1484MathSciNetMATH
6.
go back to reference Mao R, Zhang P, Li X, Liu X, Lu M (2016) Pivot selection for metric-space indexing. Int J Mach Learn Cybern 7(2):311–323CrossRef Mao R, Zhang P, Li X, Liu X, Lu M (2016) Pivot selection for metric-space indexing. Int J Mach Learn Cybern 7(2):311–323CrossRef
7.
go back to reference LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444CrossRef LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444CrossRef
8.
go back to reference Tompson JJ, Jain A, LeCun Y, Bregler C (2014) Joint training of a convolutional network and a graphical model for human pose estimation. In: Advances in neural information processing systems. pp 1799–1807 Tompson JJ, Jain A, LeCun Y, Bregler C (2014) Joint training of a convolutional network and a graphical model for human pose estimation. In: Advances in neural information processing systems. pp 1799–1807
9.
go back to reference Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12(Aug):2493–2537MATH Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12(Aug):2493–2537MATH
10.
go back to reference Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems. pp 3111–3119 Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems. pp 3111–3119
11.
go back to reference Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng A, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. pp 1631–1642 Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng A, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. pp 1631–1642
12.
go back to reference Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:​1406.​1078
13.
go back to reference Seok M, Song H-J, Park C-Y, Kim J-D, Kim Y-S (2016) Named entity recognition using word embedding as a feature. Int J Softw Eng Appl 10(2):93–104 Seok M, Song H-J, Park C-Y, Kim J-D, Kim Y-S (2016) Named entity recognition using word embedding as a feature. Int J Softw Eng Appl 10(2):93–104
14.
go back to reference Sahu SK, Anand A (2016) Recurrent neural network models for disease name recognition using domain invariant features. arXiv preprint arXiv:1606.09371 Sahu SK, Anand A (2016) Recurrent neural network models for disease name recognition using domain invariant features. arXiv preprint arXiv:​1606.​09371
15.
go back to reference Tang B, Cao H, Wang X, Chen Q, Xu H (2014) Evaluating word representation features in biomedical named entity recognition tasks. BioMed Res Int 2014:240403 Tang B, Cao H, Wang X, Chen Q, Xu H (2014) Evaluating word representation features in biomedical named entity recognition tasks. BioMed Res Int 2014:240403
16.
go back to reference Li C, Song R, Liakata M, Vlachos A, Seneff S, Zhang X (2015) Using word embedding for bio-event extraction. In: Proceedings of the 2015 Workshop on Biomedical Natural Language Processing (BioNLP 2015). Association for Computational Linguistics, Stroudsburg, pp 121–126 Li C, Song R, Liakata M, Vlachos A, Seneff S, Zhang X (2015) Using word embedding for bio-event extraction. In: Proceedings of the 2015 Workshop on Biomedical Natural Language Processing (BioNLP 2015). Association for Computational Linguistics, Stroudsburg, pp 121–126
17.
go back to reference Nie Y, Rong W, Zhang Y, Ouyang Y, Xiong Z (2015) Embedding assisted prediction architecture for event trigger identification. J Bioinform Comput Biol 13(03):1541001CrossRef Nie Y, Rong W, Zhang Y, Ouyang Y, Xiong Z (2015) Embedding assisted prediction architecture for event trigger identification. J Bioinform Comput Biol 13(03):1541001CrossRef
18.
go back to reference Jagannatha AN, Yu H (2016) Bidirectional rnn for medical event detection in electronic health records. In: Proceedings of the Conference. Association for Computational Linguistics. North American Chapter. Meeting, 2016, 473. NIH Public Access Jagannatha AN, Yu H (2016) Bidirectional rnn for medical event detection in electronic health records. In: Proceedings of the Conference. Association for Computational Linguistics. North American Chapter. Meeting, 2016, 473. NIH Public Access
19.
go back to reference Jagannatha AN, Yu H (2016) Structured prediction models for rnn based sequence labeling in clinical text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2016, 856. NIH Public Access Jagannatha AN, Yu H (2016) Structured prediction models for rnn based sequence labeling in clinical text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2016, 856. NIH Public Access
20.
go back to reference Lei J, Tang B, Lu X, Gao K, Jiang M, Xu H (2013) A comprehensive study of named entity recognition in chinese clinical text. J Am Med Inform Assoc 21(5):808–814CrossRef Lei J, Tang B, Lu X, Gao K, Jiang M, Xu H (2013) A comprehensive study of named entity recognition in chinese clinical text. J Am Med Inform Assoc 21(5):808–814CrossRef
21.
go back to reference Yan Y, Wen D, Wang Y, Wang K (2014) Named entity recognition in chinese medical records based on cascaded conditional random field. J Jilin Univers Eng Technol Edn 6:048 Yan Y, Wen D, Wang Y, Wang K (2014) Named entity recognition in chinese medical records based on cascaded conditional random field. J Jilin Univers Eng Technol Edn 6:048
22.
go back to reference Dong X, Qian L, Guan Y, Huang L, Yu Q, Yang J (2016) A multiclass classification method based on deep learning for named entity recognition in electronic medical records. In: Scientific data summit (NYSDS). IEEE, New York, pp 1–10 Dong X, Qian L, Guan Y, Huang L, Yu Q, Yang J (2016) A multiclass classification method based on deep learning for named entity recognition in electronic medical records. In: Scientific data summit (NYSDS). IEEE, New York, pp 1–10
23.
go back to reference Wu Y, Jiang M, Lei J, Xu H (2015) Named entity recognition in chinese clinical text using deep neural network. Stud Health Technol Inform 216:624 Wu Y, Jiang M, Lei J, Xu H (2015) Named entity recognition in chinese clinical text using deep neural network. Stud Health Technol Inform 216:624
24.
go back to reference Botha J, Blunsom P (2014) Compositional morphology for word representations and language modelling. In: International Conference on Machine Learning. pp 1899–1907 Botha J, Blunsom P (2014) Compositional morphology for word representations and language modelling. In: International Conference on Machine Learning. pp 1899–1907
25.
go back to reference Chen X, Xu L, Liu Z, Sun M, Luan H-B (2015) Joint learning of character and word embeddings. In: IJCAI. pp 1236–1242 Chen X, Xu L, Liu Z, Sun M, Luan H-B (2015) Joint learning of character and word embeddings. In: IJCAI. pp 1236–1242
26.
go back to reference Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166CrossRef Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166CrossRef
27.
go back to reference Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780CrossRef Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780CrossRef
28.
go back to reference Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18(5):602–610CrossRef Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18(5):602–610CrossRef
29.
go back to reference Dyer C, Ballesteros M, Ling W, Matthews A, Smith NA (2015) Transition-based dependency parsing with stack long short-term memory. arXiv preprint arXiv:1505.08075 Dyer C, Ballesteros M, Ling W, Matthews A, Smith NA (2015) Transition-based dependency parsing with stack long short-term memory. arXiv preprint arXiv:​1505.​08075
30.
go back to reference Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259 Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:​1409.​1259
31.
go back to reference Kinga D, Adam JB (2015) A method for stochastic optimization. In: International Conference on Learning Representations (ICLR) Kinga D, Adam JB (2015) A method for stochastic optimization. In: International Conference on Learning Representations (ICLR)
Metadata
Title
WCP-RNN: a novel RNN-based approach for Bio-NER in Chinese EMRs
Paper ID: FC_17_25
Authors
Jianqiang Li
Shenhe Zhao
Jijiang Yang
Zhisheng Huang
Bo Liu
Shi Chen
Hui Pan
Qing Wang
Publication date
16-01-2018
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 3/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-017-2229-x

Other articles of this Issue 3/2020

The Journal of Supercomputing 3/2020 Go to the issue

Premium Partner