2019 | OriginalPaper | Chapter

6. Convolutional Neural Networks

Authors: Uday Kamath, John Liu, James Whitaker

Published in: Deep Learning for NLP and Speech Recognition

Publisher: Springer International Publishing

Abstract

In the last few years, convolutional neural networks (CNNs), along with recurrent neural networks (RNNs), have become a basic building block in constructing complex deep learning solutions for various NLP, speech, and time series tasks. LeCun first introduced certain basic parts of the CNN framework as a general neural network framework for solving high-dimensional data problems in computer vision, speech, and time series. Krizhevsky et al. applied convolutions to recognize objects in ImageNet images; by improving substantially on the state of the art, their network (AlexNet) revived interest in deep learning and CNNs. Collobert et al. pioneered the application of CNNs to NLP tasks such as POS tagging, chunking, named entity recognition, and semantic role labeling. Many aspects of CNNs, from input representation, number of layers, types of pooling, and optimization techniques to applications in various NLP tasks, have been active subjects of research in the last decade.

Literature
[AS17a]
Heike Adel and Hinrich Schütze. “Global Normalization of Convolutional Neural Networks for Joint Entity and Relation Classification”. In: EMNLP. Association for Computational Linguistics, 2017, pp. 1723–1729.
[Bro+94]
Jane Bromley et al. “Signature Verification using a “Siamese” Time Delay Neural Network”. In: Advances in Neural Information Processing Systems 6. Ed. by J. D. Cowan, G. Tesauro, and J. Alspector. Morgan-Kaufmann, 1994, pp. 737–744.
[Che+15]
Yubo Chen et al. “Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks”. In: ACL (1). The Association for Computational Linguistics, 2015, pp. 167–176.
[CL16]
Jianpeng Cheng and Mirella Lapata. “Neural Summarization by Extracting Sentences and Words”. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2016, pp. 484–494.
[Col+11]
R. Collobert et al. “Natural Language Processing (Almost) from Scratch”. In: Journal of Machine Learning Research 12 (2011), pp. 2493–2537.
[CW08b]
Ronan Collobert and Jason Weston. “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning”. In: Proceedings of the 25th International Conference on Machine Learning. ICML ’08. 2008.
[Con+16]
Alexis Conneau et al. “Very Deep Convolutional Networks for Natural Language Processing”. In: CoRR abs/1606.01781 (2016).
[Den+14]
Misha Denil et al. “Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network”. In: CoRR abs/1406.3830 (2014).
[Don+15b]
Li Dong et al. “Question Answering over Freebase with Multi-Column Convolutional Neural Networks”. In: Proceedings of the International Joint Conference on Natural Language Processing. Association for Computational Linguistics, 2015, pp. 260–269.
[DSZ14]
Cícero Nogueira Dos Santos and Bianca Zadrozny. “Learning Character-level Representations for Part-of-speech Tagging”. In: Proceedings of the 31st International Conference on Machine Learning - Volume 32. ICML’14. 2014, pp. II–1818–II–1826.
[Geh+17b]
Jonas Gehring et al. “Convolutional Sequence to Sequence Learning”. In: Proceedings of the 34th International Conference on Machine Learning. Ed. by Doina Precup and Yee Whye Teh. Vol. 70. Proceedings of Machine Learning Research. 2017, pp. 1243–1252.
[Hu+15]
Baotian Hu et al. “Context-Dependent Translation Selection Using Convolutional Neural Network”. In: ACL (2). The Association for Computational Linguistics, 2015, pp. 536–541.
[JZ15]
Rie Johnson and Tong Zhang. “Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding”. In: Advances in Neural Information Processing Systems 28. Ed. by C. Cortes et al. 2015, pp. 919–927.
[KGB14b]
Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. “A Convolutional Neural Network for Modelling Sentences”. In: CoRR abs/1404.2188 (2014).
[Kim14b]
Yoon Kim. “Convolutional Neural Networks for Sentence Classification”. In: CoRR abs/1408.5882 (2014).
[KSH12a]
Alex Krizhevsky, I Sutskever, and G. E Hinton. “ImageNet Classification with Deep Convolutional Neural Networks”. In: Advances in Neural Information Processing Systems (NIPS 2012). 2012, p. 4.
[LBC17]
Jey Han Lau, Timothy Baldwin, and Trevor Cohn. “Topically Driven Neural Language Model”. In: ACL (1). Association for Computational Linguistics, 2017, pp. 355–365.
[LG16]
Andrew Lavin and Scott Gray. “Fast Algorithms for Convolutional Neural Networks”. In: CVPR. IEEE Computer Society, 2016, pp. 4013–4021.
[LB95]
Y. LeCun and Y. Bengio. “Convolutional Networks for Images, Speech, and Time-Series”. In: The Handbook of Brain Theory and Neural Networks. 1995.
[LeC+98]
Yann LeCun et al. “Gradient-Based Learning Applied to Document Recognition”. In: Proceedings of the IEEE. Vol. 86. 1998, pp. 2278–2324.
[Li+15]
Yujia Li et al. “Gated Graph Sequence Neural Networks”. In: CoRR abs/1511.05493 (2015).
[LZ16]
Depeng Liang and Yongdong Zhang. “AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification”. In: CoRR abs/1611.01884 (2016).
[Ma+15]
Mingbo Ma et al. “Tree-based Convolution for Sentence Modeling”. In: CoRR abs/1507.01839 (2015).
[Men+15]
Fandong Meng et al. “Encoding Source Language with Convolutional Neural Network for Machine Translation”. In: ACL (1). The Association for Computational Linguistics, 2015, pp. 20–30.
[Mou+14]
Lili Mou et al. “TBCNN: A Tree-Based Convolutional Neural Network for Programming Language Processing”. In: CoRR abs/1409.5718 (2014).
[NG15b]
Thien Huu Nguyen and Ralph Grishman. “Relation Extraction: Perspective from Convolutional Neural Networks”. In: Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. Association for Computational Linguistics, 2015, pp. 39–48.
[RSA15]
Oren Rippel, Jasper Snoek, and Ryan P. Adams. “Spectral Representations for Convolutional Neural Networks”. In: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2. NIPS’15. 2015, pp. 2449–2457.
[SFH17]
Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. “Dynamic Routing Between Capsules”. In: 2017, pp. 3856–3866.
[SG14]
Cicero dos Santos and Maira Gatti. “Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts”. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
[SM15]
Aliaksei Severyn and Alessandro Moschitti. “Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks”. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’15. 2015, pp. 373–382.
[SZ14]
Karen Simonyan and Andrew Zisserman. “Very Deep Convolutional Networks for Large-Scale Image Recognition”. In: 2014.
[Sze+17]
Christian Szegedy et al. “Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning”. In: AAAI. AAAI Press, 2017, pp. 4278–4284.
[Wan+15a]
Peng Wang et al. “Semantic Clustering and Convolutional Neural Network for Short Text Categorization”. In: Proceedings of the 7th International Joint Conference on Natural Language Processing. 2015.
[XC16]
Yijun Xiao and Kyunghyun Cho. “Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers”. In: CoRR abs/1602.00367 (2016).
[Xu+17]
Jiaming Xu et al. “Self-Taught convolutional neural networks for short text clustering”. In: Neural Networks 88 (2017), pp. 22–31.
[YS16]
Wenpeng Yin and Hinrich Schütze. “Multichannel Variable-Size Convolution for Sentence Classification”. In: CoRR abs/1603.04513 (2016).
[Yin+16a]
Wenpeng Yin et al. “ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs”. In: Transactions of the Association for Computational Linguistics 4 (2016), pp. 259–272.
[Yin+16b]
Wenpeng Yin et al. “Simple Question Answering by Attentive Convolutional Neural Network”. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, 2016, pp. 1746–1756.
[YK15]
Fisher Yu and Vladlen Koltun. “Multi-Scale Context Aggregation by Dilated Convolutions”. In: CoRR abs/1511.07122 (2015).
[Yu+14]
Lei Yu et al. “Deep Learning for Answer Sentence Selection”. In: CoRR abs/1412.1632 (2014).
[ZF13b]
Matthew D. Zeiler and Rob Fergus. “Stochastic Pooling for Regularization of Deep Convolutional Neural Networks”. In: CoRR abs/1301.3557 (2013).
[ZZL15]
Xiang Zhang, Junbo Jake Zhao, and Yann LeCun. “Character level Convolutional Networks for Text Classification”. In: CoRR abs/1509.01626 (2015).
[ZRW16]
Ye Zhang, Stephen Roller, and Byron C. Wallace. “MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification”. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2016, pp. 1522–1527.
[ZW17]
Ye Zhang and Byron Wallace. “A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification”. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Asian Federation of Natural Language Processing, 2017, pp. 253–263.
[Zhe+15]
Xiaoqing Zheng et al. “Character-based Parsing with Convolutional Neural Network”. In: Proceedings of the 24th International Conference on Artificial Intelligence. IJCAI’15. 2015, pp. 1054–1060.
[Zho+15]
Chunting Zhou et al. In: CoRR abs/1511.08630 (2015).
[Zhu+15]
Chenxi Zhu et al. “A Re-ranking Model for Dependency Parser with Recursive Convolutional Neural Network”. In: Proceedings of International Joint Conference on Natural Language Processing. 2015, pp. 1159–1168.
Metadata
Title
Convolutional Neural Networks
Authors
Uday Kamath
John Liu
James Whitaker
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-14596-5_6
