skip to main content
research-article

Chasing similarity: distribution-aware aggregation scheduling

Published:01 November 2018Publication History
Skip Abstract Section

Abstract

Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an optional local pre-aggregation step and then repartitions the intermediate result across the network. While local pre-aggregation works well for low-cardinality aggregations, the network communication cost remains significant for high-cardinality aggregations even after local pre-aggregation. The problem is that the repartition-based algorithm for high-cardinality aggregation does not fully utilize the network.

In this work, we first formulate a mathematical model that captures the performance of parallel aggregation. We prove that finding optimal aggregation plans from a known data distribution is NP-hard, assuming the Small Set Expansion conjecture. We propose GRASP, a GReedy Aggregation Scheduling Protocol that decomposes parallel aggregation into phases. GRASP is distribution-aware as it aggregates the most similar partitions in each phase to reduce the transmitted data size in subsequent phases. In addition, GRASP takes the available network bandwidth into account when scheduling aggregations in each phase to maximize network utilization. The experimental evaluation on real data shows that GRASP outperforms repartition-based aggregation by 3.5x and LOOM by 2.0x.

References

  1. M. Al-Fares, A. Loukissas, and A. Vahdat. A Scalable, Commodity Data Center Network Architecture. SIGCOMM Comput. Commun. Rev., 38(4):63--74, Aug. 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. P. Austrin, T. Pitassi, and Y. Wu. Inapproximability of Treewidth, One-shot Pebbling, and Related Layout Problems. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 13--24. Springer, 2012.Google ScholarGoogle Scholar
  3. A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Min-wise Independent Permutations (Extended Abstract). In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98, pages 327--336, New York, NY, USA, 1998. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. Cayley. A theorem on trees. Quarterly Journal of Pure Applied Mathematics, 23:376--378, 1889.Google ScholarGoogle Scholar
  5. M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica. Managing Data Transfers in Computer Clusters with Orchestra. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM '11, pages 98--109, New York, NY, USA, 2011. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Cieslewicz and K. A. Ross. Adaptive Aggregation on Chip Multiprocessors. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pages 339--350. VLDB Endowment, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. P. Costa, A. Donnelly, A. I. T. Rowstron, and G. O'Shea. Camdoop: Exploiting In-network Aggregation for Big Data Applications. In Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, San Jose, CA, USA, April 25--27, 2012, pages 29--42, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. W. Culhane, K. Kogan, C. Jayalath, and P. Eugster. LOOM: Optimal Aggregation Overlays for In-memory Big Data Processing. In Proceedings of the 6th USENIX Conference on Hot Topics in Cloud Computing, HotCloud'14, pages 13--13, Berkeley, CA, USA, 2014. USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. W. Culhane, K. Kogan, C. Jayalath, and P. Eugster. Optimal communication structures for big data aggregation. In 2015 IEEE Conference on Computer Communications, INFOCOM 2015, Kowloon, Hong Kong, April 26 - May 1, 2015, pages 1643--1651, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  10. D. J. DeWitt, J. F. Naughton, D. A. Schneider, and S. Seshadri. Practical Skew Handling in Parallel Joins. In Proceedings of the 18th International Conference on Very Large Data Bases, VLDB '92, pages 27--40, San Francisco, CA, USA, 1992. Morgan Kaufmann Publishers Inc. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Elseidy, A. Elguindy, A. Vitorovic, and C. Koch. Scalable and Adaptive Online Joins. PVLDB, 7(6):441--452, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. E. Gan, J. Ding, K. S. Tai, V. Sharan, and P. Bailis. Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries. CoRR, abs/1803.01969, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. Gionis, P. Indyk, and R. Motwani. Similarity Search in High Dimensions via Hashing. In Proceedings of the 25th International Conference on Very Large Data Bases, VLDB '99, pages 518--529, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. G. Graefe. Query Evaluation Techniques for Large Databases. ACM Comput. Surv., 25(2):73--170, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. GRASP. https://code.osu.edu/pythia/grasp.Google ScholarGoogle Scholar
  16. J. Gray, A. Bosworth, A. Layman, and H. Pirahesh. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. In Proceedings of the Twelfth International Conference on Data Engineering, February 26 - March 1, 1996, New Orleans, Louisiana, pages 152--159, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. A. G. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: A Scalable and Flexible Data Center Network. In Proceedings of the ACM SIGCOMM 2009 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Barcelona, Spain, August 16--21, 2009, pages 51--62, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. I. Gupta, R. v. Renesse, and K. P. Birman. Scalable Fault-Tolerant Aggregation in Large Process Groups. In Proceedings of the 2001 International Conference on Dependable Systems and Networks (Formerly: FTCS), DSN '01, pages 433--442, Washington, DC, USA, 2001. IEEE Computer Society. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. He and J. McAuley. Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering. In Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11 - 15, 2016, pages 507--517, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. https://htor.inf.ethz.ch/research/topologies/.Google ScholarGoogle Scholar
  21. P. Indyk and R. Motwani. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98, pages 604--613, New York, NY, USA, 1998. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. P. Jiang and G. Agrawal. Efficient SIMD and MIMD Parallelization of Hash-based Aggregation by Conflict Mitigation. In Proceedings of the International Conference on Supercomputing, ICS '17, pages 24:1--24:11, New York, NY, USA, 2017. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. P. Larson. Data Reduction by Partial Preaggregation. In Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26 - March 1, 2002, pages 706--715, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. V. Leis, B. Radke, A. Gubichev, A. Mirchev, P. A. Boncz, A. Kemper, and T. Neumann. Query Optimization Through the Looking Glass, and What We Found Running the Join Order Benchmark. VLDB J., 27(5):643--668, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. F. Liu, A. Salmasi, S. Blanas, and A. Sidiropoulos. Chasing Similarity: Distribution-aware Aggregation Scheduling (Extended Version). CoRR, abs/1810.00511, 2018.Google ScholarGoogle Scholar
  26. Y. Lu, A. Shanbhag, A. Jindal, and S. Madden. AdaptDB: Adaptive Partitioning for Distributed Joins. PVLDB, 10(5):589--600, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. L. Luo, J. Nelson, L. Ceze, A. Phanishayee, and A. Krishnamurthy. Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training. In Proceedings of the ACM Symposium on Cloud Computing, SoCC 2018, Carlsbad, CA, USA, October 11--13, 2018, pages 41--54, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TAG: A Tiny AGgregation Service for Ad-Hoc Sensor Networks. In 5th Symposium on Operating System Design and Implementation (OSDI 2002), Boston, Massachusetts, USA, December 9--11, 2002, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. The Design of an Acquisitional Query Processor for Sensor Networks. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, SIGMOD '03, pages 491--502, New York, NY, USA, 2003. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. S. Madden, R. Szewczyk, M. J. Franklin, and D. E. Culler. Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks. In 4th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA 2002), 20--21 June 2002, Callicoon, NY, USA, pages 49--58, 2002. Google ScholarGoogle Scholar
  31. L. Mai, L. Rupprecht, A. Alim, P. Costa, M. Migliavacca, P. Pietzuch, and A. L. Wolf. NetAgg: Using Middleboxes for Application-specific On-path Aggregation in Data Centres. In Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, CoNEXT '14, pages 249--262, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. S. Melnik, A. Gubarev, J. J. Long, G. Romer, S. Shivakumar, M. Tolton, and T. Vassilakis. Dremel: Interactive Analysis of Web-Scale Datasets. PVLDB, 3(1):330--339, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. I. Müller, A. Arteaga, T. Hoefler, and G. Alonso. Reproducible Floating-Point Aggregation in RDBMSs. CoRR, abs/1802.09883, 2018.Google ScholarGoogle Scholar
  34. I. Müller, P. Sanders, A. Lacurie, W. Lehner, and F. Färber. Cache-Efficient Aggregation: Hashing Is Sorting. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1123--1136, New York, NY, USA, 2015. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. O. Polychroniou, A. Raghavan, and K. A. Ross. Rethinking SIMD Vectorization for In-Memory Databases. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1493--1508, New York, NY, USA, 2015. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. O. Polychroniou, R. Sen, and K. A. Ross. Track Join: Distributed Joins with Minimal Network Traffic. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD '14, pages 1483--1494, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. P. Raghavendra and D. Steurer. Graph Expansion and the Unique Games Conjecture. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 755--764. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. P. Raghavendra, D. Steurer, and M. Tulsiani. Reductions Between Expansion Problems. In Computational Complexity (CCC), 2012 IEEE 27th Annual Conference on, pages 64--73. IEEE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. V. Raman, G. Attaluri, R. Barber, N. Chainani, D. Kalmuk, V. KulandaiSamy, J. Leenstra, S. Lightstone, S. Liu, G. M. Lohman, T. Malkemus, R. Mueller, I. Pandis, B. Schiefer, D. Sharpe, R. Sidle, A. Storm, and L. Zhang. DB2 with BLU Acceleration: So Much More Than Just a Column Store. PVLDB, 6(11):1080--1091, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. W. Rödiger, S. Idicula, A. Kemper, and T. Neumann. Flow-Join: Adaptive Skew Handling for Distributed Joins over High-speed Networks. In 32nd IEEE International Conference on Data Engineering, ICDE 2016, Helsinki, Finland, May 16-20, 2016, pages 1194--1205, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  41. W. Rödiger, T. Mühlbauer, A. Kemper, and T. Neumann. High-Speed Query Processing over High-Speed Networks. PVLDB, 9(4):228--239, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. W. Rödiger, T. Mühlbauer, P. Unterbrunner, A. Reiser, A. Kemper, and T. Neumann. Locality-sensitive Operators for Parallel Main-memory Database Clusters. In IEEE 30th International Conference on Data Engineering, Chicago, ICDE 2014, IL, USA, March 31 - April 4, 2014, pages 592--603, 2014.Google ScholarGoogle Scholar
  43. V. Satuluri and S. Parthasarathy. Bayesian Locality Sensitive Hashing for Fast Similarity Search. PVLDB, 5(5):430--441, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. M. A. Shah, J. M. Hellerstein, S. Chandrasekaran, and M. J. Franklin. Flux: An Adaptive Partitioning Operator for Continuous Query Systems. In Proceedings of the 19th International Conference on Data Engineering, March 5-8, 2003, Bangalore, India, pages 25--36, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  45. A. Shatdal and J. F. Naughton. Adaptive Parallel Aggregation Algorithms. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, SIGMOD '95, pages 104--114, New York, NY, USA, 1995. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. E. Vermote-NASA GSFC and MODAPS SIPS - NASA. (2015). MOD09 MODIS/Terra L2 Surface Reflectance, 5-Min Swath 250m, 500m, and 1km. NASA LP DAAC.Google ScholarGoogle Scholar
  47. E. Vermote-NASA GSFC and MODAPS SIPS - NASA. (2015). MOD09A1 MODIS/Surface Reflectance 8-Day L3 Global 500m SIN Grid. NASA LP DAAC.Google ScholarGoogle Scholar
  48. L. Wang, M. Zhou, Z. Zhang, M. Shan, and A. Zhou. NUMA-Aware Scalable and Efficient In-Memory Aggregation on Large Domains. IEEE Trans. Knowl. Data Eng., 27(4):1071--1084, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  49. J. L. Wolf, P. S. Yu, J. Turek, and D. M. Dias. A Parallel Hash Join Algorithm for Managing Data Skew. IEEE Trans. Parallel Distrib. Syst., 4(12):1355--1371, Dec. 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Y. Xu, P. Kostamaa, X. Zhou, and L. Chen. Handling Data Skew in Parallel Joins in Shared-nothing Systems. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD '08, pages 1043--1052, New York, NY, USA, 2008. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. P. Yalagandula and M. Dahlin. A Scalable Distributed Information Management System. In Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, SIGCOMM '04, pages 379--390, New York, NY, USA, 2004. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Y. Ye, K. A. Ross, and N. Vesdapunt. Scalable Aggregation on Multicore Processors. In Proceedings of the Seventh International Workshop on Data Management on New Hardware, DaMoN '11, pages 1--9, New York, NY, USA, 2011. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. https://www.yelp.com/dataset/documentation/json.Google ScholarGoogle Scholar
  54. Y. Yu, P. K. Gunda, and M. Isard. Distributed Aggregation for Data-parallel Computing: Interfaces and Implementations. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles, SOSP '09, pages 247--260, New York, NY, USA, 2009. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. E. Zamanian, C. Binnig, and A. Salama. Locality-aware Partitioning in Parallel Database Systems. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 17--30, New York, NY, USA, 2015. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 12, Issue 3
    November 2018
    138 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 November 2018
    Published in pvldb Volume 12, Issue 3

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader