
2021 | OriginalPaper | Chapter

Deep Reinforcement Learning for IoT Interoperability

Authors: Sebastian Klöser, Sebastian Kotstein, Robin Reuben, Timo Zerrer, Christian Decker

Published in: Advances in Automotive Production Technology – Theory and Application

Publisher: Springer Berlin Heidelberg


Abstract

The Internet of Things (IoT) is shaped by many different standards, protocols, and data formats that are often not compatible with one another. Integrating heterogeneous IoT components into a uniform IoT setup can therefore be a time-consuming manual task. This lack of interoperability between IoT components has been addressed with different approaches in the past, but only very few of them rely on Machine Learning techniques. In this work, we present a new approach to IoT interoperability based on Deep Reinforcement Learning (DRL). Specifically, we demonstrate that DRL algorithms using network architectures inspired by Natural Language Processing (NLP) can learn to control an environment by taking nothing but raw JSON or XML structures, which reflect the current state of the environment, as input. Applied to IoT setups, where the current state of a component is typically embedded as features in JSON or XML structures and exchanged via messages, our NLP-based DRL approach eliminates the need for feature engineering and for manually written code for data pre-processing, feature extraction, and decision making.
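
To illustrate the idea, the following is a minimal sketch (not the authors' implementation) of how an NLP-style network could map a raw JSON state message to action values: token embeddings, e.g. initialized from GloVe vectors, feed an LSTM encoder whose final hidden state drives a Q-value head. The tokenizer, vocabulary, and action set below are invented for illustration only.

# Hedged sketch, assuming PyTorch: a small model that maps a raw JSON state
# string to Q-values over a fixed action set, in the spirit of the NLP-style
# encoder described in the abstract. Vocabulary and tokenizer are hypothetical.
import json
import torch
import torch.nn as nn

class JsonDQN(nn.Module):
    def __init__(self, vocab_size, embed_dim=50, hidden_dim=128, n_actions=2):
        super().__init__()
        # Token embeddings (could be initialized from 50-dimensional GloVe vectors).
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # LSTM encoder turns the token sequence into a fixed-size state representation.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Linear head produces one Q-value per action.
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.embed(token_ids)
        _, (h, _) = self.encoder(x)                # h: (1, batch, hidden_dim)
        return self.q_head(h.squeeze(0))           # (batch, n_actions)

def tokenize(raw_json, vocab):
    # Very naive tokenizer: strip JSON punctuation, split into tokens, map to ids.
    flat = json.dumps(json.loads(raw_json))
    for ch in '{}":,':
        flat = flat.replace(ch, ' ')
    return torch.tensor([[vocab.get(tok, 0) for tok in flat.split()]])

# Usage with a made-up IoT state message:
vocab = {'temperature': 1, 'unit': 2, 'celsius': 3, '21.5': 4}
state = '{"temperature": 21.5, "unit": "celsius"}'
model = JsonDQN(vocab_size=5)
q_values = model(tokenize(state, vocab))
action = int(q_values.argmax(dim=1))               # greedy action chosen from raw JSON
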


Footnotes
1
To make our results more reliable, we increase the maximum number of steps per episode so that, in our case, the maximum return is 500.
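
A hedged sketch of what such an episode cap could look like with the classic OpenAI Gym API; the footnote does not name the environment, so CartPole and a +1-per-step reward are assumptions made here for illustration:

import gym
from gym.wrappers import TimeLimit

# Rebuild the environment with a 500-step cap, so a +1 reward per step
# yields a maximum possible return of 500.
env = TimeLimit(gym.make('CartPole-v0').unwrapped, max_episode_steps=500)

obs, done, ret = env.reset(), False, 0.0
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())  # random policy
    ret += reward  # bounded above by 500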
 
2
50-dimensional GloVe 6B embeddings.
 
Metadata
Title
Deep Reinforcement Learning for IoT Interoperability
Authors
Sebastian Klöser
Sebastian Kotstein
Robin Reuben
Timo Zerrer
Christian Decker
Copyright Year
2021
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-62962-8_23
