research-article

Database Meets Deep Learning: Challenges and Opportunities

Authors:
Wei Wang (National University of Singapore)
Meihui Zhang (Singapore University of Technology and Design)
Gang Chen (Zhejiang University)
H. V. Jagadish (University of Michigan)
Beng Chin Ooi (National University of Singapore)
Kian-Lee Tan (National University of Singapore)

ACM SIGMOD Record, Volume 45, Issue 2, June 2016, pp. 17–22. https://doi.org/10.1145/3003665.3003669

Published: 28 September 2016

Abstract

Deep learning has recently become very popular on account of its incredible success in many complex data-driven applications, including image classification and speech recognition. The database community has worked on data-driven applications for many years, and therefore should be playing a lead role in supporting this new wave. However, databases and deep learning are different in terms of both techniques and applications. In this paper, we discuss research problems at the intersection of the two fields. In particular, we discuss possible improvements for deep learning systems from a database perspective, and analyze database applications that may benefit from deep learning techniques.


Published in
ACM SIGMOD Record, Volume 45, Issue 2
June 2016
66 pages
ISSN: 0163-5808
DOI: 10.1145/3003665
Editors:
Yanlei Diao (University of Massachusetts Amherst)
Vanessa Braganholo (Universidade Federal Fluminense)
Marco Brambilla (Politecnico di Milano)
Chee Yong Chan (National University of Singapore)
Rada Chirkova (North Carolina State University)
Zackary Ives (University of Pennsylvania)
Anastasios Kementsietsidis (Google Research)
Jeffrey Naughton (University of Wisconsin-Madison)
Frank Neven (Hasselt University)
Olga Papaemmanouil (Brandeis University)
Aditya Parameswaran (University of Illinois)
Anish Das Sarma (Google Research)
Alkis Simitsis (HP Labs)
Wang-Chiew Tan (University of California Santa Cruz)
Nesime Tatbul (MIT CSAIL)
Marianne Winslett (University of Illinois)
Jun Yang (Duke University)
Copyright © 2016 Authors
Publisher
Association for Computing Machinery
New York, NY, United States