DOI: 10.1145/3377930.3389847
research-article
Open Access
Best Paper

Neuroevolution of self-interpretable agents

Published: 26 June 2020

ABSTRACT

Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight. It is a consequence of selective attention in perception, which lets us remain focused on the important parts of our world without being distracted by irrelevant details. Motivated by selective attention, we study the properties of artificial agents that perceive the world through the lens of a self-attention bottleneck. By constraining access to only a small fraction of the visual input, we show that their policies are directly interpretable in pixel space. We find neuroevolution ideal for training self-attention architectures for vision-based reinforcement learning (RL) tasks, as it allows us to incorporate modules that include discrete, non-differentiable operations useful for our agent. We argue that self-attention has properties similar to indirect encoding, in the sense that large implicit weight matrices are generated from a small number of key-query parameters, enabling our agent to solve challenging vision-based tasks with at least 1000x fewer parameters than existing methods. Because our agents attend only to task-critical visual hints, they are able to generalize to environments where task-irrelevant elements are modified, while conventional methods fail.
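The bottleneck the abstract describes can be made concrete with a small sketch. The idea is that flattened image patches vote for each other through a key-query attention matrix, and only the top-K most-attended patch positions are passed on to the controller. All shapes and names below are illustrative assumptions, not the authors' exact implementation; note how the top-K step is the discrete, non-differentiable operation that neuroevolution (rather than backpropagation) makes trainable, and how a small pair of key/query projections induces a much larger implicit N x N attention matrix, echoing the indirect-encoding analogy.

```python
import numpy as np

def top_k_patch_indices(patches, w_k, w_q, k=10):
    """patches: (N, D) flattened image patches.
    w_k, w_q: (D, d) evolved key/query projections (the only learned parameters).
    Returns indices of the k patches receiving the most attention."""
    keys = patches @ w_k                                # (N, d)
    queries = patches @ w_q                             # (N, d)
    logits = keys @ queries.T / np.sqrt(w_k.shape[1])   # implicit (N, N) matrix
    att = np.exp(logits - logits.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)               # row-wise softmax
    votes = att.sum(axis=0)                             # total attention each patch receives
    return np.argsort(votes)[-k:][::-1]                 # top-k: discrete, non-differentiable

rng = np.random.default_rng(0)
patches = rng.standard_normal((49, 147))                # e.g. a 7x7 grid of 7x7x3 patches
w_k = rng.standard_normal((147, 4))
w_q = rng.standard_normal((147, 4))
idx = top_k_patch_indices(patches, w_k, w_q)
print(idx.shape)                                        # (10,)
```

Here the projections hold only 2 x 147 x 4 = 1176 parameters, yet they generate a 49 x 49 attention matrix; a downstream policy would see just the K selected patch locations, which is what makes the policy directly interpretable in pixel space.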


Supplemental Material

p414-tang-suppl.mp4 (mp4, 412.6 KB)


Published in

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
June 2020, 1349 pages
ISBN: 9781450371285
DOI: 10.1145/3377930

Copyright © 2020 Owner/Author. This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
