
2018 | Original Paper | Book Chapter

Taking Gradients Through Experiments: LSTMs and Memory Proximal Policy Optimization for Black-Box Quantum Control

Authors: Moritz August, José Miguel Hernández-Lobato

Published in: High Performance Computing

Publisher: Springer International Publishing


Abstract

In this work we introduce a general method for solving quantum control tasks, framing them as a class of reinforcement learning problems not yet discussed in the machine learning community. We analyze the structure of the reinforcement learning problems that typically arise in quantum physics and argue that agents parameterized by long short-term memory (LSTM) networks and trained via stochastic policy gradients yield a versatile method for solving them. In this context we introduce a variant of the proximal policy optimization (PPO) algorithm, called memory proximal policy optimization (MPPO), which is based on this analysis. We argue that, by design, our method can easily be combined with numerical simulations as well as with real experiments providing the reward signal. We demonstrate how the method can incorporate physical domain knowledge, and we present results of numerical experiments showing that it achieves state-of-the-art performance on several quantum control learning tasks with discrete and continuous control parameters.
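The abstract describes MPPO as a variant of proximal policy optimization; the chapter's own formulation is not reproduced on this page, but as background, standard PPO maximizes a clipped surrogate objective over the probability ratio between the new and old policies. The sketch below is a minimal, generic illustration of that clipped objective (not the authors' MPPO variant); the function name and the NumPy-based formulation are assumptions for illustration only.

```python
import numpy as np

def ppo_clipped_objective(new_logp, old_logp, advantages, eps=0.2):
    """Generic PPO clipped surrogate objective (illustrative sketch).

    new_logp, old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    # Probability ratio r_t = pi_new(a_t | s_t) / pi_old(a_t | s_t)
    ratio = np.exp(new_logp - old_logp)
    # Clipped surrogate: min(r_t * A_t, clip(r_t, 1 - eps, 1 + eps) * A_t)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return np.minimum(unclipped, clipped).mean()
```

The clipping removes the incentive to move the policy far from the one that generated the data, which is what makes PPO-style methods attractive when each rollout corresponds to a costly quantum simulation or a real experiment.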


Metadata
Title
Taking Gradients Through Experiments: LSTMs and Memory Proximal Policy Optimization for Black-Box Quantum Control
Authors
Moritz August
José Miguel Hernández-Lobato
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-02465-9_43