
2018 | OriginalPaper | Book chapter

Convolution Neural Network with Active Learning for Information Extraction of Enterprise Announcements

Written by: Lei Fu, Zhaoxia Yin, Yi Liu, Jun Zhang

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

We propose using a convolutional neural network (CNN) with active learning for information extraction from enterprise announcements. Training a supervised deep learning model usually requires a large amount of training data with high-quality reference labels. Producing such labels by hand is tedious and, because inter-labeler agreement is low, unreliable. Active learning alleviates this problem by automatically selecting a small subset of unlabeled samples for humans to correct by hand. The CNN is then trained iteratively on the labeled data until the target performance is reached. We propose three sample selection methods based on a certainty criterion. We also establish an enterprise announcements dataset for our experiments, containing 10,410 samples in total. Our experimental results show that the amount of labeled data needed to reach a given extraction accuracy can be reduced by more than 45.79% compared with training without active learning.
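The selection step described in the abstract can be illustrated with a small sketch. The abstract does not name the three certainty-based methods, so the scores below (least confidence, margin, and entropy) are commonly used uncertainty measures chosen here as an assumption, not the authors' definitions. The function names and the example probabilities are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Probs = Sequence[float]  # class probabilities for one sample, summing to 1


def least_confidence(p: Probs) -> float:
    """Higher when the top class probability is low."""
    return 1.0 - max(p)


def margin_uncertainty(p: Probs) -> float:
    """Higher when the two most likely classes are close together."""
    top = sorted(p, reverse=True)
    return 1.0 - (top[0] - top[1])


def entropy(p: Probs) -> float:
    """Shannon entropy of the predicted distribution (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def select_for_labeling(pool_probs: List[Probs], k: int,
                        score: Callable[[Probs], float]) -> List[int]:
    """Return indices of the k pool samples with the highest uncertainty score."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: score(pool_probs[i]), reverse=True)
    return ranked[:k]


# Hypothetical softmax outputs of a classifier on four unlabeled samples.
pool = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]
picked = select_for_labeling(pool, k=2, score=least_confidence)
```

In a full loop, the selected samples would be labeled by a human, added to the training set, and the model retrained before the next round of scoring. The three scores can rank the same pool differently, which is why the choice of criterion matters.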

Metadata
Title
Convolution Neural Network with Active Learning for Information Extraction of Enterprise Announcements
Written by
Lei Fu
Zhaoxia Yin
Yi Liu
Jun Zhang
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-99501-4_29