Abstract
Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query executions plans. Neo bootstraps its query optimization model from existing optimizers and continues to learn from incoming queries, building upon its successes and learning from its failures. Furthermore, Neo naturally adapts to underlying data patterns and is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers, and in some cases even surpass them.
- Embedding tools, https://github.com/parimarjan/db-embedding-tools.Google Scholar
- Ext-JOB queries, https://git.io/extended_job.Google Scholar
- PostgreSQL database, http://www.postgresql.org/.Google Scholar
- T. Anthony, Z. Tian, and D. Barber. Thinking Fast and Slow with Deep Learning and Tree Search. In Advances in Neural Information Processing Systems 30, NIPS '17, pages 5366--5376, 2017. Google ScholarDigital Library
- J. L. Ba, J. R. Kiros, and G. E. Hinton. Layer Normalization. arXiv:1607.06450 {cs, stat}, July 2016.Google Scholar
- B. Babcock and S. Chaudhuri. Towards a Robust Query Optimizer: A Principled and Practical Approach. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, SIGMOD '05, pages 119--130, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- R. Bellman. A Markovian Decision Process. Indiana University Mathematics Journal, 6(4):679--684, 1957.Google ScholarCross Ref
- Y. Bengio. Deep Learning of Representations for Unsupervised and Transfer Learning. In Proceedings of ICML Workshop on Unsupervised and Transfer Learning, ICML WUTL '12, pages 17--36, June 2012. Google ScholarDigital Library
- R. Bordawekar and O. Shmueli. Using Word Embedding to Enable Semantic Queries in Relational Databases. In Proceedings of the 1st Workshop on Data Management for End-to-End Machine Learning (DEEM), DEEM '17, pages 5:1--5:4, 2017. Google ScholarDigital Library
- P. P. Brahma, D. Wu, and Y. She. Why Deep Learning Works: A Manifold Disentanglement Perspective. IEEE Transactions on Neural Networks and Learning Systems, 27(10):1997--2008, Oct. 2016.Google ScholarCross Ref
- S. Chaudhuri. An Overview of Query Optimization in Relational Systems. In ACM SIGMOD Symposium on Principles of Database Systems, SIGMOD '98, pages 34--43, 1998. Google ScholarDigital Library
- R. Dechter and J. Pearl. Generalized Best-first Search Strategies and the Optimality of A*. J. ACM, 32(3):505--536, July 1985. Google ScholarDigital Library
- M. Faruqui, J. Dodge, S. K. Jauhar, C. Dyer, E. H. Hovy, and N. A. Smith. Retrofitting Word Vectors to Semantic Lexicons. In The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL '15, pages 1606--1615, 2015.Google ScholarCross Ref
- R. C. Fernandez and S. Madden. Termite: A System for Tunneling Through Heterogeneous Data. In AIDM @ SIGMOD 2019, aiDM '19, 2019. Google ScholarDigital Library
- L. Giakoumakis and C. A. Galindo-Legaria. Testing SQL Server's Query Optimizer: Challenges, Techniques and Experiences. IEEE Data Eng. Bull., 31:36--43, 2008.Google Scholar
- X. Glorot, A. Bordes, and Y. Bengio. Deep Sparse Rectifier Neural Networks. In G. Gordon, D. Dunson, and M. Dudík, editors, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, volume 15 of PMLR '11, pages 315--323, Fort Lauderdale, FL, USA, Apr. 2011. PMLR.Google Scholar
- A. Heidari, J. McGrath, I. F. Ilyas, and T. Rekatsinas. HoloDetect: Few-Shot Learning for Error Detection. arXiv:1904.02285 {cs}, Apr. 2019. Google ScholarDigital Library
- T. Hester, M. Vecerik, O. Pietquin, M. Lanctot, T. Schaul, B. Piot, D. Horgan, J. Quan, A. Sendonaris, G. Dulac-Arnold, I. Osband, J. Agapiou, J. Z. Leibo, and A. Gruslys. Deep Q-learning from Demonstrations. In Thirty-Second AAAI Conference on Artifical Intelligence, AAAI '18, New Orleans, Apr. 2017. IEEE.Google Scholar
- I. F. Ilyas, V. Markl, P. Haas, P. Brown, and A. Aboulnaga. CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies. In ACM SIGMOD International Conference on Management of Data, SIGMOD '04, pages 647--658, 2004. Google ScholarDigital Library
- T. Kaftan, M. Balazinska, A. Cheung, and J. Gehrke. Cuttlefish: A Lightweight Primitive for Adaptive Query Processing. arXiv preprint, Feb. 2018.Google Scholar
- D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. In 3rd International Conference for Learning Representations, ICLR '15, San Diego, CA, 2015.Google Scholar
- A. Kipf, T. Kipf, B. Radke, V. Leis, P. Boncz, and A. Kemper. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR '19, 2019.Google Scholar
- T. Kraska, M. Alizadeh, A. Beutel, Ed Chi, Ani Kristo, Guillaume Leclerc, Samuel Madden, Hongzi Mao, and Vikram Nathan. SageDB: A Learned Database System. In 9th Biennial Conference on Innovative Data Systems Research, CIDR '19, 2019.Google Scholar
- T. Kraska, A. Beutel, E. H. Chi, J. Dean, and N. Polyzotis. The Case for Learned Index Structures. In Proceedings of the 2018 International Conference on Management of Data, SIGMOD '18, pages 489--504, New York, NY, USA, 2018. ACM. Google ScholarDigital Library
- S. Krishnan, Z. Yang, K. Goldberg, J. Hellerstein, and I. Stoica. Learning to Optimize Join Queries With Deep Reinforcement Learning. arXiv:1808.03196 {cs}, Aug. 2018.Google Scholar
- V. Leis, A. Gubichev, A. Mirchev, P. Boncz, A. Kemper, and T. Neumann. How Good Are Query Optimizers, Really? PVLDB, 9(3):204--215, 2015. Google ScholarDigital Library
- V. Leis, B. Radke, A. Gubichev, A. Mirchev, P. Boncz, A. Kemper, and T. Neumann. Query optimization through the looking glass, and what we found running the Join Order Benchmark. The VLDB Journal, pages 1--26, Sept. 2017. Google ScholarDigital Library
- H. Liu, M. Xu, Z. Yu, V. Corvinelli, and C. Zuzarte. Cardinality Estimation Using Neural Networks. In Proceedings of the 25th Annual International Conference on Computer Science and Software Engineering, CASCON '15, pages 53--59, Riverton, NJ, USA, 2015. IBM Corp.Google ScholarDigital Library
- W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi. A survey of deep neural network architectures and their applications. Neurocomputing, 234:11--26, Apr. 2017.Google ScholarCross Ref
- G. Lohman. Is Query Optimization a '"Solved" Problem? In ACM SIGMOD Blog, ACM Blog '14, 2014.Google Scholar
- J. Long, E. Shelhamer, and T. Darrell. Fully Convolutional Networks for Semantic Segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), CVPR '15, June 2015.Google Scholar
- H. Mao, M. Schwarzkopf, S. B. Venkatakrishnan, Z. Meng, and M. Alizadeh. Learning Scheduling Algorithms for Data Processing Clusters. arXiv:1810.01963 {cs, stat}, 2018.Google Scholar
- G. Marcus. Innateness, AlphaZero, and Artificial Intelligence. arXiv:1801.05667 {cs}, Jan. 2018.Google Scholar
- R. Marcus, P. Negi, H. Mao, C. Zhang, M. Alizadeh, T. Kraska, O. Papaemmanouil, and N. Tatbul. Neo: Towards A Learned Query Optimizer. arXiv:1904.03711 {cs}, Apr. 2019.Google Scholar
- R. Marcus and O. Papaemmanouil. Deep Reinforcement Learning for Join Order Enumeration. In First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM '18, Houston, TX, 2018. Google ScholarDigital Library
- R. Marcus and O. Papaemmanouil. Towards a Hands-Free Query Optimizer through Deep Learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR '19, 2019.Google Scholar
- T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient Estimation of Word Representations in Vector Space. arXiv:13013781 {cs}, Jan. 2013.Google Scholar
- V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, and G. Ostrovski. Human-level control through deep reinforcement learning. Nature, 518(7540):529--533, 2015.Google ScholarCross Ref
- G. Moerkotte, T. Neumann, and G. Steidl. Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors. PVLDB, 2(1):982--993, 2009. Google ScholarDigital Library
- L. Mou, G. Li, L. Zhang, T. Wang, and Z. Jin. Convolutional Neural Networks over Tree Structures for Programming Language Processing. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI '16, pages 1287--1293, Phoenix, Arizona, 2016. AAAI Press. Google ScholarDigital Library
- S. Mudgal, H. Li, T. Rekatsinas, A. Doan, Y. Park, G. Krishnan, R. Deep, E. Arcaute, and V. Raghavendra. Deep Learning for Entity Matching: A Design Space Exploration. In Proceedings of the 2018 International Conference on Management of Data, SIGMOD '18, pages 19--34, New York, NY, USA, 2018. ACM. Google ScholarDigital Library
- B. Nevarez. Inside the SQL Server Query Optimizer. Red Gate books, Mar. 2011. Google ScholarDigital Library
- J. Ortiz, M. Balazinska, J. Gehrke, and S. S. Keerthi. Learning State Representations for Query Optimization with Deep Reinforcement Learning. In 2nd Workshop on Data Managmeent for End-to-End Machine Learning, DEEM '18, 2018. Google ScholarDigital Library
- Y. Park, S. Zhong, and B. Mozafari. QuickSel: Quick Selectivity Learning with Mixture Models. arXiv:1812.10568 {cs}, Dec. 2018.Google Scholar
- M. Poess and C. Floyd. New TPC Benchmarks for Decision Support and Web Commerce. SIGMOD Records, 29(4):64--71, Dec. 2000. Google ScholarDigital Library
- A. G. Read. DeWitt clauses: Can we protect purchasers without hurting Microsoft. Rev. Litig., 25:387, 2006.Google Scholar
- R. Rehůřek and P. Sojka. Software Framework for Topic Modelling with Large Corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, LREC '10, pages 45--50. ELRA, May 2010.Google Scholar
- S. Schaal. Learning from Demonstration. In Proceedings of the 9th International Conference on Neural Information Processing Systems, NIPS'96, pages 1040--1046, Cambridge, MA, USA, 1996. MIT Press. Google ScholarDigital Library
- M. Schaarschmidt, A. Kuhnle, B. Ellis, K. Fricke, F. Gessert, and E. Yoneki. LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations. arXiv:1808.07903 {cs, stat}, Aug. 2018.Google Scholar
- J. Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85--117, Jan. 2015. Google ScholarDigital Library
- P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access Path Selection in a Relational Database Management System. In J. Mylopolous and M. Brodie, editors, SIGMOD '89, SIGMOD '89, pages 511--522, San Francisco (CA), 1989. Morgan Kaufmann. Google ScholarDigital Library
- D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484--489, Jan. 2016.Google ScholarCross Ref
- M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's LEarning Optimizer. In VLDB, VLDB '01, pages 19--28, 2001. Google ScholarDigital Library
- R. S. Sutton and A. G. Barto. Introduction to Reinforcement Learning. MIT Press, Cambridge, MA, USA, 1st edition, 1998. Google ScholarDigital Library
- N. Tran, A. Lamb, L. Shrinivas, S. Bodagala, and J. Dave. The Vertica Query Optimizer: The case for specialized query optimizers. In 2014 IEEE 30th International Conference on Data Engineering, ICDE '14, pages 1108--1119, Mar. 2014.Google ScholarCross Ref
- I. Trummer, S. Moseley, D. Maram, S. Jo, and J. Antonakakis. SkinnerDB: Regret-bounded Query Evaluation via Reinforcement Learning. PVLDB, 11(12):2074--2077, 2018. Google ScholarDigital Library
- K. Tzoumas, T. Sellis, and C. Jensen. A Reinforcement Learning Approach for Adaptive Query Processing. Technical Reports, June 2008.Google Scholar
- L. van der Maaten and G. Hinton. Visualizing Data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579--2605, 2008.Google Scholar
- F. Waas and A. Pellenkoft. Join Order Selection (Good Enough Is Easy). In Advances in Databases, BNCD '00, pages 51--67. Springer, Berlin, Heidelberg, July 2000. Google ScholarDigital Library
- W. Wang, M. Zhang, G. Chen, H. V. Jagadish, B. C. Ooi, and K.-L. Tan. Database Meets Deep Learning: Challenges and Opportunities. SIGMOD Rec., 45(2):17--22, Sept. 2016. Google ScholarDigital Library
- L. Yu, J. Wang, K. R. Lai, and X. Zhang. Refining Word Embeddings Using Intensity Scores for Sentiment Analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26(3):671--681, Mar. 2018. Google ScholarDigital Library
- J.-Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, and E. Shechtman. Toward Multimodal Image-to-Image Translation. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, NIPS '17, pages 465--476. Curran Associates, Inc., 2017. Google ScholarDigital Library
- S. Zilberstein. Using Anytime Algorithms in Intelligent Systems. AI Magazine, 17(3):73--73, Mar. 1996.Google Scholar
Index Terms
- Neo: a learned query optimizer
Recommendations
Exploration in neo-Hebbian reinforcement learning: Computational approaches to the exploration–exploitation balance with bio-inspired neural networks
AbstractRecent theoretical and experimental works have connected Hebbian plasticity with the reinforcement learning (RL) paradigm, producing a class of trial-and-error learning in artificial neural networks known as neo-Hebbian plasticity. ...
Comments