
2018 | Original Paper | Book Chapter

Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures

Authors: Abhijit Mahalunkar, John D. Kelleher

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing


Abstract

The presence of Long Distance Dependencies (LDDs) in sequential data poses significant challenges for computational models. Various recurrent neural architectures have been designed to mitigate this issue. In order to test these state-of-the-art architectures, there is a growing need for rich benchmarking datasets. However, one drawback of existing datasets is the lack of experimental control with regard to the presence and/or degree of LDDs. This lack of control limits the analysis of model performance in relation to the specific challenge posed by LDDs. One way to address this is to use synthetic data having the properties of subregular languages. The degree of LDDs within the generated data can be controlled through the k parameter, the length of the generated strings, and the choice of forbidden strings. In this paper, we explore the capacity of different RNN extensions to model LDDs by evaluating these models on a sequence of SPk synthesized datasets, where each subsequent dataset exhibits a greater degree of LDD. Even though SPk languages are simple, the presence of LDDs has a significant impact on the performance of recurrent neural architectures, making them prime candidates for benchmarking tasks.
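To make the SPk setup concrete: a string belongs to a strictly piecewise language of order k (SPk) iff none of its length-k subsequences is a forbidden subsequence. The following is a minimal sketch of such a membership test, not the authors' actual data generator; the function names and the example forbidden set are illustrative assumptions.

```python
from itertools import combinations

def subsequences(s, k):
    """All distinct length-k subsequences of s (order-preserving,
    not necessarily contiguous)."""
    return {"".join(c) for c in combinations(s, k)}

def in_sp_language(s, forbidden, k):
    """A string is in an SP-k language iff it contains none of the
    forbidden length-k subsequences."""
    return subsequences(s, k).isdisjoint(forbidden)

# Illustrative SP-2 language over {a, b} forbidding the subsequence "ab":
# any 'a' occurring anywhere before a 'b' causes rejection.
print(in_sp_language("bba", {"ab"}, 2))  # True: subsequences are {"bb", "ba"}
print(in_sp_language("aab", {"ab"}, 2))  # False: "ab" occurs as a subsequence
```

Because a forbidden subsequence can be violated by symbols arbitrarily far apart, string length and the chosen forbidden set directly control how long the dependencies in the generated data can be.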


Footnotes
1
Refer to Sect. 5.2, "Finding the shortest forbidden subsequences", in [16] for the method of computing forbidden subsequences for a particular SP language.
 
References
1.
Elman, J.L.: Finding structure in time. Cogn. Sci. 14, 179–211 (1990)
2.
Hochreiter, S.: Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, TU Munich (1991)
3.
Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
4.
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
5.
Graves, A., Wayne, G., Danihelka, I.: Neural Turing machines. CoRR (2014)
6.
Salton, G.D., Ross, R.J., Kelleher, J.D.: Attentive language models. In: Proceedings of the 8th International Joint Conference on Natural Language Processing, pp. 441–450 (2017)
7.
Merity, S., Xiong, C., Bradbury, J., Socher, R.: Pointer sentinel mixture models. In: ICLR (2016)
8.
Chang, S., et al.: Dilated recurrent neural networks. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 77–87. Curran Associates, Inc. (2017)
9.
Zilly, J.G., Srivastava, R.K., Koutník, J., Schmidhuber, J.: Recurrent highway networks. In: Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR, vol. 70 (2017)
10.
Vorontsov, E., Trabelsi, C., Kadoury, S., Pal, C.: On orthogonality and learning recurrent networks with long term dependencies. In: Proceedings of ICML (2017)
11.
Henaff, M., Szlam, A., LeCun, Y.: Recurrent orthogonal networks and long-memory tasks. In: Proceedings of the 33rd International Conference on Machine Learning, PMLR, vol. 48, pp. 2034–2042 (2016)
12.
Marcus, M.P., Marcinkiewicz, M.A., Santorini, B.: Building a large annotated corpus of English: the Penn Treebank. Comput. Linguist. 19(2), 313–330 (1993)
13.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
17.
Avcu, E., Shibata, C., Heinz, J.: Subregular complexity and deep learning. In: Proceedings of the Conference on Logic and Machine Learning in Natural Language (LaML 2017), vol. 1, pp. 20–33 (2017)
18.
Heinz, J., Rogers, J.: Estimating strictly piecewise distributions. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 886–896 (2010)
19.
Reber, A.S.: Implicit learning of artificial grammars. J. Verbal Learn. Verbal Behav. 6(6), 855–863 (1967)
20.
Tomita, M.: Learning of construction of finite automata from examples using hill-climbing. In: Proceedings of the Fourth International Cognitive Science Conference, pp. 105–108 (1982)
21.
Casey, M.: The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction. Neural Comput. 8(6), 1135–1178 (1996)
22.
Smith, A.W., Zipser, D.: Encoding sequential structure: experience with the real-time recurrent learning algorithm. Proc. IJCNN I, 645–648 (1989)
23.
Chomsky, N.: Three models for the description of language. IRE Trans. Inf. Theory 2, 113–124 (1956)
25.
Fitch, W.T., Friederici, A.D.: Artificial grammar learning meets formal language theory: an overview. Philos. Trans. R. Soc. B Biol. Sci. 367(1598), 1933–1955 (2012)
26.
Jäger, G., Rogers, J.: Formal language theory: refining the Chomsky hierarchy. Philos. Trans. R. Soc. B Biol. Sci. 367(1598), 1956–1970 (2012)
27.
Hulden, M.: Foma: a finite-state compiler and library. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 29–32 (2009)
28.
Zaremba, W., Sutskever, I., Vinyals, O.: Recurrent neural network regularization. In: Proceedings of ICLR (2015)
Metadata
Title
Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures
Authors
Abhijit Mahalunkar
John D. Kelleher
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01424-7_19
