
2018 | Original Paper | Book Chapter

Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures

Authors: Abhijit Mahalunkar, John D. Kelleher

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing


Abstract

The presence of Long Distance Dependencies (LDDs) in sequential data poses significant challenges for computational models. Various recurrent neural architectures have been designed to mitigate this issue. In order to test these state-of-the-art architectures, there is a growing need for rich benchmarking datasets. However, one drawback of existing datasets is the lack of experimental control with regard to the presence and/or degree of LDDs. This lack of control limits the analysis of model performance in relation to the specific challenge posed by LDDs. One way to address this is to use synthetic data having the properties of subregular languages. The degree of LDDs within the generated data can be controlled through the k parameter, the length of the generated strings, and the choice of forbidden strings. In this paper, we explore the capacity of different RNN extensions to model LDDs by evaluating these models on a sequence of SPk synthesized datasets, where each subsequent dataset exhibits a greater degree of LDD. Even though SPk languages are simple, the presence of LDDs has a significant impact on the performance of recurrent neural architectures, making them prime candidates for benchmarking tasks.
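To make the SPk setup concrete: a string belongs to a strictly piecewise language of order k (SPk) iff none of its length-k subsequences is a forbidden subsequence. The following is a minimal sketch of such a membership test, not the authors' actual data generator; the function names and the example forbidden set are illustrative assumptions.

```python
from itertools import combinations

def subsequences(s, k):
    """All distinct length-k subsequences of s (order-preserving,
    not necessarily contiguous)."""
    return {"".join(c) for c in combinations(s, k)}

def in_sp_language(s, forbidden, k):
    """A string is in an SP-k language iff it contains none of the
    forbidden length-k subsequences."""
    return subsequences(s, k).isdisjoint(forbidden)

# Illustrative SP-2 language over {a, b} forbidding the subsequence "ab":
# any 'a' occurring anywhere before a 'b' causes rejection.
print(in_sp_language("bba", {"ab"}, 2))  # True: subsequences are {"bb", "ba"}
print(in_sp_language("aab", {"ab"}, 2))  # False: "ab" occurs as a subsequence
```

Because a forbidden subsequence can be violated by symbols arbitrarily far apart, string length and the chosen forbidden set directly control how long the dependencies in the generated data can be.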


Footnotes
1
Refer to Sect. 5.2, "Finding the shortest forbidden subsequences", in [16] for the method of computing forbidden subsequences for a particular SP language.
 
References
1.
Elman, J.L.: Finding structure in time. Cogn. Sci. 14, 179–211 (1990)
2.
Hochreiter, S.: Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, TU Munich (1991)
3.
Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
4.
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
5.
Graves, A., Wayne, G., Danihelka, I.: Neural Turing machines. CoRR (2014)
6.
Salton, G.D., Ross, R.J., Kelleher, J.D.: Attentive language models. In: Proceedings of the 8th International Joint Conference on Natural Language Processing, pp. 441–450 (2017)
7.
Merity, S., Xiong, C., Bradbury, J., Socher, R.: Pointer sentinel mixture models. In: ICLR (2016)
8.
Chang, S., et al.: Dilated recurrent neural networks. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 77–87. Curran Associates, Inc. (2017)
9.
Zilly, J.G., Srivastava, R.K., Koutník, J., Schmidhuber, J.: Recurrent highway networks. In: Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR, vol. 70 (2017)
10.
Vorontsov, E., Trabelsi, C., Kadoury, S., Pal, C.: On orthogonality and learning recurrent networks with long term dependencies. In: Proceedings of ICML (2017)
11.
Henaff, M., Szlam, A., LeCun, Y.: Recurrent orthogonal networks and long-memory tasks. In: Proceedings of the 33rd International Conference on Machine Learning, PMLR, vol. 48, pp. 2034–2042 (2016)
12.
Marcus, M.P., Marcinkiewicz, M.A., Santorini, B.: Building a large annotated corpus of English: the Penn Treebank. Comput. Linguist. 19(2), 313–330 (1993)
13.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
17.
Avcu, E., Shibata, C., Heinz, J.: Subregular complexity and deep learning. In: Proceedings of the Conference on Logic and Machine Learning in Natural Language (LaML 2017), vol. 1, pp. 20–33 (2017)
18.
Heinz, J., Rogers, J.: Estimating strictly piecewise distributions. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 886–896 (2010)
19.
Reber, A.S.: Implicit learning of artificial grammars. J. Verbal Learn. Verbal Behav. 6(6), 855–863 (1967)
20.
Tomita, M.: Learning of construction of finite automata from examples using hill-climbing. In: Proceedings of the Fourth International Cognitive Science Conference, pp. 105–108 (1982)
21.
Casey, M.: The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction. Neural Comput. 8(6), 1135–1178 (1996)
22.
Smith, A.W., Zipser, D.: Encoding sequential structure: experience with the real-time recurrent learning algorithm. Proc. IJCNN I, 645–648 (1989)
23.
Chomsky, N.: Three models for the description of language. IRE Trans. Inf. Theory 2, 113–124 (1956)
25.
Fitch, W.T., Friederici, A.D.: Artificial grammar learning meets formal language theory: an overview. Philos. Trans. R. Soc. B Biol. Sci. 367(1598), 1933–1955 (2012)
26.
Jäger, G., Rogers, J.: Formal language theory: refining the Chomsky hierarchy. Philos. Trans. R. Soc. B Biol. Sci. 367(1598), 1956–1970 (2012)
27.
Hulden, M.: Foma: a finite-state compiler and library. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 29–32 (2009)
28.
Zaremba, W., Sutskever, I., Vinyals, O.: Recurrent neural network regularization. In: Proceedings of ICLR (2015)
Metadata
Title
Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures
Authors
Abhijit Mahalunkar
John D. Kelleher
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01424-7_19
