Published in: Neural Processing Letters 3/2022

12.01.2022

LTP: A New Active Learning Strategy for CRF-Based Named Entity Recognition

Authors: Mingyi Liu, Zhiying Tu, Tong Zhang, Tonghua Su, Xiaofei Xu, Zhongjie Wang

Abstract

In recent years, deep learning has achieved great success in many natural language processing tasks, including named entity recognition. The shortcoming is that a large quantity of manually annotated data is usually required. Previous studies have demonstrated that active learning can considerably reduce the cost of data annotation, but there is still plenty of room for improvement. In real applications, we found that existing uncertainty-based active learning strategies have two shortcomings. First, these strategies prefer to choose long sequences, explicitly or implicitly, which increases the annotation burden on annotators. Second, some strategies require revising the model to generate additional information for sample selection, which increases the developer's workload and the model's training/prediction time. In this paper, we first examine traditional active learning strategies in the specific cases of Word2Vec-BiLSTM-CRF and BERT-CRF, which have been widely used for named entity recognition, on several typical datasets. Then, we propose an uncertainty-based active learning strategy called lowest token probability (LTP), which combines the input and output of the conditional random field (CRF) to select informative instances. LTP is a simple and powerful strategy that does not favor long sequences and does not require revising the model. We test LTP on multiple real-world datasets; the experimental results show that, compared with existing state-of-the-art selection strategies, LTP can reduce the number of annotated tokens by about 20% while maintaining competitive performance on both sentence-level accuracy and entity-level F1-score. Additionally, LTP significantly outperforms all other strategies in selecting valid samples, dramatically reducing the number of invalid annotations labelers must perform.
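As a rough illustration of the selection criterion described above (this is not the authors' code; the function names and the assumption that per-token probabilities of the CRF's predicted labels are available are ours), an LTP-style acquisition step might be sketched as follows. Scoring a sentence by the minimum token probability, rather than a sum or product over tokens, is what keeps the strategy from favoring long sequences:

```python
def ltp_score(token_label_probs):
    """Lowest Token Probability: score a sentence by the smallest
    probability the model assigns to any token's predicted label.
    Using min (not a sum/product over tokens) avoids a length bias."""
    return min(token_label_probs)


def select_batch(unlabeled, batch_size):
    """Pick the most informative sentences for annotation.

    unlabeled: list of (sentence, per_token_probs) pairs, where
    per_token_probs holds the model's probability for each token's
    predicted label. Lower LTP score = more uncertain = selected first.
    """
    ranked = sorted(unlabeled, key=lambda item: ltp_score(item[1]))
    return [sentence for sentence, _ in ranked[:batch_size]]
```

For example, a sentence whose least-confident token has probability 0.3 would be queried before one whose least-confident token has probability 0.5, regardless of their lengths.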
Metadata
Title
LTP: A New Active Learning Strategy for CRF-Based Named Entity Recognition
Authors
Mingyi Liu
Zhiying Tu
Tong Zhang
Tonghua Su
Xiaofei Xu
Zhongjie Wang
Publication date
12.01.2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 3/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10737-x