skip to main content
10.1145/2465351.2465354acmconferencesArticle/Chapter ViewAbstractPublication PageseurosysConference Proceedingsconference-collections
research-article

Optimus: a dynamic rewriting framework for data-parallel execution plans

Published:15 April 2013Publication History

ABSTRACT

In distributed data-parallel computing, a user program is compiled into an execution plan graph (EPG), typically a directed acyclic graph. This EPG is the core data structure used by modern distributed execution engines for task distribution, job management, and fault tolerance. Once submitted for execution, the EPG remains largely unchanged at runtime except for some limited modifications. This makes it difficult to employ dynamic optimization techniques that could substantially improve the distributed execution based on runtime information.

This paper presents Optimus, a framework for dynamically rewriting an EPG at runtime. Optimus extends dynamic rewrite mechanisms present in systems such as Dryad and CIEL by integrating rewrite policy with a high-level data-parallel language, in this case DryadLINQ. This integration enables optimizations that require knowledge of the semantics of the computation, such as language customizations for domain-specific computations including matrix algebra. We describe the design and implementation of Optimus, outline its interfaces, and detail a number of rewriting techniques that address problems arising in distributed execution including data skew, dynamic data re-partitioning, handling unbounded iterative computations, and protecting important intermediate data for fault tolerance. We evaluate Optimus with real applications and data and show significant performance gains compared to manual optimization or customized systems. We demonstrate the versatility of dynamic EPG rewriting for data-parallel computing, and argue that it is an essential feature of any general-purpose distributed dataflow execution engine.

References

  1. P. K. Agarwal, G. Cormode, Z. Huang, J. M. Phillips, Z. Wei, and K. Yi. Mergeable summaries. In PODS, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. S. Agarwal, S. Kandula, N. Bruno, M.-C. Wu, I. Stoica, and J. Zhou. Re-optimizing data parallel computing. In NSDI '12, April 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. G. Ananthanarayanan, S. Kandula, A. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris. Reining in the outliers in mapreduce clusters using mantri. In OSDI'10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Avnur and J. M. Hellerstein. Eddies: Continuously adaptive query processing. In ACM SIGMOD, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. Babu, P. Bizarro, and D. DeWitt. Proactive re-optimization with Rio. In ACM SIGMOD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Bennett and S. Lanning. The Netflix prize. In ACM SIGKDD 2007.Google ScholarGoogle Scholar
  7. K. Beyer, P. J. Haas, B. Reinwald, Y. Sismanis, and R. Gemulla. On synopses for distinct-value estimation under multiset operations. In SIGMOD 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Y. Bu, B. Howe, M. Balazinska, and M. Ernst. Haloop: Efficient iterative data processing on large clusters. In VLDB, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Chambers, A. Raniwala, F. Perry, S. Adams, R. Henry, R. Bradshaw, and N. Weizenbaum. FlumeJava: easy, efficient data-parallel pipelines. In PLDI'10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. S. Chaudhuri, R. Motwani, and V. Narasayya. On random sampling over joins. In SIGMOD 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Chaudhuri, R. Motwani, and V. R. Narasayya. Random sampling for histogram construction: How much is enough? In SIGMOD 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. G. Cormode and M. Hadjieleftheriou. Finding frequent items in data streams. In VLDB 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107--113, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. Deshpande, Z. Ives, and V. Raman. Adaptive query processing. Foundations and Trends in Databases, 1(1):1--140, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. D. J. DeWitt, J. F. Naughton, D. A. Schneider, and S. Seshadri. Practical skew handling in parallel joins. In VLDB 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. E. Gonina, A. Kannan, J. Shafer, and M. Budiu. Parallelizing large-scale data processing applications with data skew: a case study in product-offer matching. In Intl. Workshop on MapReduce and its Applications (MAPREDUCE), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. G. Graefe. Volcano - an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng., 6(1):120--135, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Gunda, L. Ravindranath, C. A. Thekkath, Y. Yu, and L. Zhuang. Nectar: Automatic Management of Data and Computation in Datacenters. In OSDI'10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. The Hadoop project. http://hadoop.apache.org/.Google ScholarGoogle Scholar
  20. Hadoop NextGen MapReduce (YARN). http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html.Google ScholarGoogle Scholar
  21. B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI '11, March 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In EuroSys 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. N. Kabra and D. J. DeWitt. Efficient mid-query re-optimization of sub-optimal query execution plans. In ACM SIGMOD, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. A. Kannan, I. E. Givoni, R. Agrawal, and A. Fuxman. Matching unstructured product offers to structured product specifications. In ACM SIGKDD 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Q. Ke, V. Prabhakaran, Y. Xie, Y. Yu, J. Wu, and J. Yang. Optimizing data partitioning for data-parallel computing. In HotOS '11, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. Y. Ko, I. Hoque, B. Cho, and I. Gupta. On availability of intermediate data in cloud computations. In HotOS '09, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Y. Kwon, M. Balazinska, B. Howe, and J. Rolia. SkewTune: Mitigating skew in MapReduce applications. In ACM SIGMOD, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Mahout project. http://mahout.apache.org/.Google ScholarGoogle Scholar
  29. V. Markl, V. Raman, D. E. Simmen, G. M. Lohman, and H. Pirahesh. Robust query processing through progressive optimization. In ACM SIGMOD, pages 659--670, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. P. Mishra and M. H. Eich. Join processing in relational databases. ACM Comput. Surv., 24(1):63--113, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. D. G. Murray, M. Schwarzkopf, C. Smowton, S. Smith, A. Madhavapeddy, and S. Hand. Ciel: a universal execution engine for distributed data-flow computing. In NSDI 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. A. Okcan and M. Riedewald. Processing theta-joins using mapreduce. In ACM SIGMOD, pages 949--960, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: a not-so-foreign language for data processing. In SIGMOD 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden, and M. Stonebraker. A comparison of approaches to large-scale data analysis. In ACM SIGMOD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. R. Power and J. Li. Piccolo: Building fast, distributed programs with partitioned tables. In OSDI, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Z. Qian, X. Chen, N. Kang, M. Chen, Y. Yu, T. Moscibroda, and Z. Zhang. MadLINQ: Large-scale distributed matrix computation for the cloud. In EuroSys 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: a warehousing solution over a Map-Reduce framework. Proc. VLDB Endow., 2(2), 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Y. Xie, F. Yu, and M. Abadi. De-anonymizing the internet using unreliable IDs. In SIGCOMM 2009, August 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Y. Yu, P. K. Gunda, and M. Isard. Distributed aggregation for data-parallel computing: interfaces and implementations. In SOSP, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Y. Yu, M. Isard, D. Fetterly, M. Budiu, Ú. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. In OSDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. J. Zhang, R. C. Hucheng Zhou, X. Fan, Z. Guo, H. Lin, J. Y. Li, W. Lin, J. Zhou, and L. Zhou. Optimizing data shuffling in data-parallel computation by understanding user-defined functions. In NSDI, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Optimus: a dynamic rewriting framework for data-parallel execution plans

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Conferences
              EuroSys '13: Proceedings of the 8th ACM European Conference on Computer Systems
              April 2013
              401 pages
              ISBN:9781450319942
              DOI:10.1145/2465351

              Copyright © 2013 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 15 April 2013

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article

              Acceptance Rates

              EuroSys '13 Paper Acceptance Rate28of143submissions,20%Overall Acceptance Rate241of1,308submissions,18%

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader