Skip to main content

2021 | OriginalPaper | Buchkapitel

Evolving Character-Level DenseNet Architectures Using Genetic Programming

verfasst von : Trevor Londt, Xiaoying Gao, Peter Andreae

Erschienen in: Applications of Evolutionary Computation

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Densely Connected Convolutional Networks (DenseNet) have demonstrated impressive performance on image classification tasks, but limited research has been conducted on using character-level DenseNet (char-DenseNet) architectures for text classification tasks. It is not clear what DenseNet architectures are optimal for text classification tasks. The iterative task of designing, training and testing of char-DenseNets is a time consuming task that requires expert domain knowledge. Evolutionary deep learning (EDL) has been used to automatically design CNN architectures for the image classification domain, thereby mitigating the need for expert domain knowledge. This study demonstrates the first work on using EDL to evolve char-DenseNet architectures for text classification tasks. A novel genetic programming-based algorithm (GP-Dense) coupled with an indirect-encoding scheme, facilitates the evolution of performant char-DenseNet architectures. The algorithm is evaluated on two popular text datasets, and the best-evolved models are benchmarked against four current state-of-the-art character-level CNN and DenseNet models. Results indicate that the algorithm evolves performant models for both datasets that outperform two of the state-of-the-art models in terms of model accuracy and three of the state-of-the-art models in terms of parameter size.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2016)MATH Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2016)MATH
2.
Zurück zum Zitat Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS 2015, vol. 1, pp. 649–657. MIT Press, Cambridge (2015) Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS 2015, vol. 1, pp. 649–657. MIT Press, Cambridge (2015)
3.
Zurück zum Zitat Conneau, A., Schwenk, H., Barrault, L., Lecun, Y.: Very deep convolutional networks for text classification. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain, pp. 1107–1116. Association for Computational Linguistics (April 2017). https://www.aclweb.org/anthology/E17-1104 Conneau, A., Schwenk, H., Barrault, L., Lecun, Y.: Very deep convolutional networks for text classification. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Valencia, Spain, pp. 1107–1116. Association for Computational Linguistics (April 2017). https://​www.​aclweb.​org/​anthology/​E17-1104
6.
Zurück zum Zitat Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef
7.
Zurück zum Zitat Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein, J., Doran, C., Solorio, T. (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/n19-1423 Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein, J., Doran, C., Solorio, T. (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). https://​doi.​org/​10.​18653/​v1/​n19-1423
8.
Zurück zum Zitat Le, H.T., Cerisara, C., Denis, A.: Do Convolutional Networks need to be deep for text classification? In: AAAI Workshop on Affective Content Analysis. New Orleans, United States (February 2018) Le, H.T., Cerisara, C., Denis, A.: Do Convolutional Networks need to be deep for text classification? In: AAAI Workshop on Affective Content Analysis. New Orleans, United States (February 2018)
9.
10.
Zurück zum Zitat Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, pp. 1532–1543. Association for Computational Linguistics (October 2014). https://www.aclweb.org/anthology/D14-1162 Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, pp. 1532–1543. Association for Computational Linguistics (October 2014). https://​www.​aclweb.​org/​anthology/​D14-1162
11.
Zurück zum Zitat Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269 (2017) Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269 (2017)
12.
Zurück zum Zitat Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning Representations by Back-Propagating Errors, pp. 696–699. MIT Press, Cambridge (1988)MATH Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning Representations by Back-Propagating Errors, pp. 696–699. MIT Press, Cambridge (1988)MATH
13.
Zurück zum Zitat Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA (1992)MATH Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA (1992)MATH
14.
Zurück zum Zitat De Sa, C., Feldman, M., Ré, C., Olukotun, K.: Understanding and optimizing asynchronous low-precision stochastic gradient descent. In: 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), pp. 561–574 (2017) De Sa, C., Feldman, M., Ré, C., Olukotun, K.: Understanding and optimizing asynchronous low-precision stochastic gradient descent. In: 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), pp. 561–574 (2017)
15.
Zurück zum Zitat Hara, K., Saito, D., Shouno, H.: Analysis of function of rectified linear unit used in deep learning. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015) Hara, K., Saito, D., Shouno, H.: Analysis of function of rectified linear unit used in deep learning. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015)
16.
Zurück zum Zitat He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition, CoRR abs/1512.0 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition, CoRR abs/1512.0 (2015)
17.
Zurück zum Zitat Liang, J., Meyerson, E., Hodjat, B., Fink, D., Mutch, K., Miikkulainen, R.: Evolutionary neural automl for deep learning. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2019, pp. 401–409. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3321707.3321721 Liang, J., Meyerson, E., Hodjat, B., Fink, D., Mutch, K., Miikkulainen, R.: Evolutionary neural automl for deep learning. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2019, pp. 401–409. Association for Computing Machinery, New York (2019). https://​doi.​org/​10.​1145/​3321707.​3321721
18.
Zurück zum Zitat Miikkulainen, R., et al.: Evolving deep neural networks (2017) Miikkulainen, R., et al.: Evolving deep neural networks (2017)
20.
Zurück zum Zitat Gruau, F.: Neural Network Synthesis Using Cellular Encoding and the Genetic Algorithm. Ph.D. Thesis (1994) Gruau, F.: Neural Network Synthesis Using Cellular Encoding and the Genetic Algorithm. Ph.D. Thesis (1994)
21.
Zurück zum Zitat Kaiming, H., Xiangyu, Z., Shaoqing, R., Jian, S.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification Kaiming. Biochem. Biophys. Res. Commun. 498(1), 254–261 (2018)CrossRef Kaiming, H., Xiangyu, Z., Shaoqing, R., Jian, S.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification Kaiming. Biochem. Biophys. Res. Commun. 498(1), 254–261 (2018)CrossRef
Metadaten
Titel
Evolving Character-Level DenseNet Architectures Using Genetic Programming
verfasst von
Trevor Londt
Xiaoying Gao
Peter Andreae
Copyright-Jahr
2021
DOI
https://doi.org/10.1007/978-3-030-72699-7_42

Premium Partner