01.05.2020 | Issue 3/2020

Automatic Control and Computer Sciences 3/2020

A Method for Identifying Local Drug Names in Xinjiang Based on BERT-BiLSTM-CRF

Journal:
Automatic Control and Computer Sciences > Issue 3/2020
Authors:
Yuhang Song, Shengwei Tian, Long Yu

Abstract

This paper proposes BERT-BiLSTM-CRF, a method for recognizing local drug names in Xinjiang that builds on the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers). BERT is pre-trained with a bidirectional Transformer structure using the masked language model (MaskLM) objective, in which randomly selected Chinese characters of the input sequence are replaced with special symbols. Word vectors are generated dynamically according to the position information of the Chinese characters in Xinjiang local drug names, and the resulting vector sequence is fed into a bidirectional LSTM layer, which is trained to capture the dependencies within the sequence. Finally, the CRF module outputs the joint probability distribution over the entire label sequence and yields the globally optimal labeling result. For named entity recognition on the Xinjiang local drug corpus, the model achieves a precision (reported as the accuracy rate) of 95.77%, a recall of 89.47%, and an F value of 92.52%. The experimental results show that BERT-BiLSTM-CRF effectively improves the evaluation metrics for Xinjiang local drug name recognition in practical applications.
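
To illustrate what a globally optimal labeling from the CRF module means, the following is a minimal sketch of Viterbi decoding over per-position tag scores. It is not the authors' implementation. The tag set (`O`, `B-DRUG`, `I-DRUG`), the emission scores (standing in for BiLSTM outputs), and the transition scores are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   score of tag j at position t (e.g. from a BiLSTM)
    transitions[i][j] score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    score = list(emissions[0])   # best path score ending in each tag
    backptr = []                 # backptr[t-1][j]: best previous tag for j at t
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
        score = new_score
        backptr.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores (assumptions, not taken from the paper).
TAGS = ["O", "B-DRUG", "I-DRUG"]
NEG = -1e4  # large penalty: I-DRUG must not directly follow O
transitions = [
    # to:  O    B-DRUG  I-DRUG
    [0.0, 0.0, NEG],   # from O
    [0.0, 0.0, 0.5],   # from B-DRUG
    [0.0, 0.0, 0.5],   # from I-DRUG
]
emissions = [
    [2.0, 0.5, 0.1],
    [0.4, 1.0, 1.2],   # taken alone, this position prefers I-DRUG
    [0.2, 0.3, 1.5],
    [1.8, 0.2, 0.4],
]

labels = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(labels)
```

In this toy input, choosing the best tag at each position independently would label the second character `I-DRUG` directly after `O`, which is not a valid entity start. Scoring the whole sequence jointly, as a CRF layer does, selects `B-DRUG` there instead.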
