
Rafiki: machine learning as an analytics service system

Published: 01 October 2018

Abstract

Big data analytics has gained massive momentum in recent years. Applying machine learning models to big data has become an implicit requirement or expectation for most analysis tasks, especially in high-stakes applications. Typical applications include sentiment analysis over product reviews for analyzing online products, image classification in food-logging applications for monitoring users' daily intake, and stock movement prediction. Extending traditional database systems to support such analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in the database engine. Second, expert knowledge is required to optimize the training and inference procedures for efficiency and effectiveness, which imposes a heavy burden on system users. In this paper, we develop and present a system, called Rafiki, that provides training and inference services for machine learning models. Rafiki offers distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service, which trades off latency against accuracy. Experimental results confirm the efficiency, effectiveness, scalability, and usability of Rafiki.
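
The paper itself describes how these two services work; purely as an illustration, the Python sketch below shows the two ideas the abstract names: random hyper-parameter search distributed across worker processes for the training service, and a greedy, latency-budgeted model selection standing in for the inference service's latency/accuracy trade-off. Every name here (SEARCH_SPACE, train_and_evaluate, select_ensemble, the latency budget) is a hypothetical stand-in, not Rafiki's actual API.

    # Illustrative sketch only; not Rafiki's implementation or API.
    import math
    import random
    from concurrent.futures import ProcessPoolExecutor

    SEARCH_SPACE = {
        "lr": (1e-4, 1e-1),           # continuous, sampled log-uniformly
        "batch_size": [32, 64, 128],  # discrete choices
        "momentum": (0.5, 0.99),      # continuous, sampled uniformly
    }

    def sample_config(seed):
        """Draw one hyper-parameter configuration at random (plain random search)."""
        rng = random.Random(seed)
        lo, hi = SEARCH_SPACE["lr"]
        return {
            "lr": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
            "batch_size": rng.choice(SEARCH_SPACE["batch_size"]),
            "momentum": rng.uniform(*SEARCH_SPACE["momentum"]),
        }

    def train_and_evaluate(config):
        """Hypothetical trial: train a model with `config` and report its validation
        accuracy and single-request inference latency. Stubbed out with a
        deterministic pseudo-random result so the sketch runs end to end."""
        rng = random.Random(repr(sorted(config.items())))
        return {
            "config": config,
            "accuracy": 0.70 + 0.25 * rng.random(),
            "latency_ms": 20 + 80 * rng.random(),
        }

    def tune(num_trials=16, num_workers=4):
        """Distribute independent trials across worker processes."""
        configs = [sample_config(seed) for seed in range(num_trials)]
        with ProcessPoolExecutor(max_workers=num_workers) as pool:
            return list(pool.map(train_and_evaluate, configs))

    def select_ensemble(trials, latency_budget_ms=150.0):
        """Greedily pick the most accurate trained models whose combined latency
        (a pessimistic, sequential-serving estimate) stays within the budget."""
        chosen, used = [], 0.0
        for t in sorted(trials, key=lambda t: t["accuracy"], reverse=True):
            if used + t["latency_ms"] <= latency_budget_ms:
                chosen.append(t)
                used += t["latency_ms"]
        return chosen

    if __name__ == "__main__":
        ensemble = select_ensemble(tune())
        for t in ensemble:
            print("acc=%.3f latency=%.0fms %s" % (t["accuracy"], t["latency_ms"], t["config"]))

The latency budget and the sequential-latency estimate are deliberate simplifications; they only convey that the inference side selects among trained models under a serving constraint rather than always using the single most accurate one.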



Published in

Proceedings of the VLDB Endowment, Volume 12, Issue 2 (October 2018)
ISSN: 2150-8097
Publisher: VLDB Endowment
Article type: Research article