24.02.2020 | Original Article | Issue 9/2020

Fully-connected LSTM–CRF on medical concept extraction
Important Notes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Abstract
Patient symptoms, test results, and treatment information are recorded in extensive electronic records. Named entity recognition of these medical concepts has high application value in the clinical field. However, because of issues such as patient privacy, labeled data is expensive and difficult to obtain. In 2010, the i2b2/VA Natural Language Processing Challenge introduced a concept extraction task for electronic medical records. One of the task requirements is to classify natural language descriptions into their corresponding concept types. In this paper, we propose a new fully-connected LSTM network and use the LSTM + CRF model as the framework for testing the effects of various LSTM structures. Experiments on real data demonstrate that the proposed fully-connected LSTM outperforms many mainstream LSTM structures in quantitative evaluation. The multi-layer bidirectional fully-connected LSTM, combined with character-level word vectors and pre-trained word embeddings, achieves performance similar to that of state-of-the-art methods while avoiding the use of prior knowledge data and ultra-high-dimensional feature representations. Moreover, this end-to-end training method saves considerable feature engineering work and storage space.
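To make the role of the CRF layer in an LSTM + CRF framework concrete, the following is a minimal sketch of Viterbi decoding for a generic linear-chain CRF. It is not the authors' implementation: the BIO tag names, the example tokens, and all scores are hand-picked assumptions, and the emission scores simply stand in for the per-token outputs an LSTM layer would produce.

```python
# Illustrative sketch only: generic linear-chain CRF decoding (Viterbi).
# Tag names and all scores below are hypothetical, not taken from the paper.

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at token t (stand-in for LSTM outputs).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    # best[j] = best score of any path that ends in tag j at the current token
    best = list(emissions[0])
    backptrs = []
    for scores in emissions[1:]:
        step_best, step_ptr = [], []
        for j in range(n_tags):
            # pick the previous tag that maximises the path score into tag j
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            step_ptr.append(prev)
            step_best.append(best[prev] + transitions[prev][j] + scores[j])
        best = step_best
        backptrs.append(step_ptr)
    # start from the best final tag and follow the back-pointers
    path = [max(range(n_tags), key=lambda j: best[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


TAGS = ["O", "B-problem", "I-problem"]  # assumed BIO scheme for one concept type

# Rows are the source tag, columns the target tag.
# A large negative score forbids the invalid transition O -> I-problem.
transitions = [
    [0.0, 0.0, -1e4],  # from O
    [0.0, 0.0, 0.0],   # from B-problem
    [0.0, 0.0, 0.0],   # from I-problem
]

# Hypothetical per-token scores for the tokens "denies", "chest", "pain".
emissions = [
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 1.5],  # "chest" locally prefers I-problem
    [0.0, 0.0, 2.0],
]

decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)
```

Choosing the best tag for each token independently would label "chest" as I-problem directly after O, which is not a valid BIO sequence. Decoding jointly with transition scores yields O, B-problem, I-problem instead, which is the kind of sequence-level consistency a CRF layer contributes on top of per-token LSTM scores.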