research-article

Rafiki: machine learning as an analytics service system

Authors:
Wei Wang, National University of Singapore
Jinyang Gao, National University of Singapore
Meihui Zhang, University of Electronic Science and Technology of China
Sheng Wang, National University of Singapore
Gang Chen, Zhejiang University
Teck Khim Ng, National University of Singapore
Beng Chin Ooi, National University of Singapore
Jie Shao, Beijing Institute of Technology
Moaz Reyad, National University of Singapore

Proceedings of the VLDB Endowment, Volume 12, Issue 2, pp. 128–140. https://doi.org/10.14778/3282495.3282499

Published: 01 October 2018

Abstract

Big data analytics has gained massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or expectation for most analysis tasks, especially in high-stakes applications. Typical applications include sentiment analysis of reviews for online products, image classification in food-logging applications for monitoring users' daily intake, and stock movement prediction. Extending traditional database systems to support such analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in database engines. Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes a heavy burden on system users. In this paper, we develop and present a system, called Rafiki, that provides training and inference services for machine learning models. Rafiki provides distributed hyper-parameter tuning for the training service and online ensemble modeling for the inference service, which trades off between latency and accuracy. Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.
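
To make the latency/accuracy trade-off in ensemble inference concrete, the following is a minimal illustrative sketch, not Rafiki's actual algorithm or API. It assumes that ensemble members run in parallel (so the ensemble is as slow as its slowest member), that each model has a known latency and validation accuracy, and that predictions are combined by majority vote. The names `Model`, `select_models`, and `ensemble_predict` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Model:
    """Hypothetical description of one deployed model (all fields are assumptions)."""
    name: str
    latency_ms: float               # assumed per-request latency
    accuracy: float                 # assumed validation accuracy
    predict: Callable[[str], str]   # maps an input to a class label


def select_models(models: List[Model], budget_ms: float) -> List[Model]:
    """Keep every model that fits the latency budget.

    Assumes members run in parallel, so the ensemble latency is the
    latency of its slowest member.
    """
    return [m for m in models if m.latency_ms <= budget_ms]


def ensemble_predict(models: List[Model], x: str) -> str:
    """Majority vote; ties go to the label chosen by the most accurate model."""
    if not models:
        raise ValueError("no model fits the latency budget")
    # Rank by accuracy so that tie-breaking prefers the strongest model.
    ranked = sorted(models, key=lambda m: m.accuracy, reverse=True)
    preds = [m.predict(x) for m in ranked]
    votes = Counter(preds)
    top = max(votes.values())
    # First prediction (in accuracy order) whose label has the top vote count.
    return next(p for p in preds if votes[p] == top)
```

A tight budget leaves only the fast model and returns its answer alone, while a looser budget admits slower, more accurate models whose votes can override it. A real service would also need to account for queueing and batching, which this sketch ignores.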

Index Terms

1. Computing methodologies
  1. Machine learning


Published in

Proceedings of the VLDB Endowment, Volume 12, Issue 2
October 2018
98 pages
ISSN: 2150-8097
Editors: Lei Chen (HKUST), Fatma Özcan (IBM Research - Almaden)
Publisher: VLDB Endowment
Published: 1 October 2018