Abstract
Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an optional local pre-aggregation step and then repartitions the intermediate result across the network. While local pre-aggregation works well for low-cardinality aggregations, the network communication cost remains significant for high-cardinality aggregations even after local pre-aggregation. The problem is that the repartition-based algorithm for high-cardinality aggregation does not fully utilize the network.
In this work, we first formulate a mathematical model that captures the performance of parallel aggregation. We prove that finding optimal aggregation plans from a known data distribution is NP-hard, assuming the Small Set Expansion conjecture. We propose GRASP, a GReedy Aggregation Scheduling Protocol that decomposes parallel aggregation into phases. GRASP is distribution-aware as it aggregates the most similar partitions in each phase to reduce the transmitted data size in subsequent phases. In addition, GRASP takes the available network bandwidth into account when scheduling aggregations in each phase to maximize network utilization. The experimental evaluation on real data shows that GRASP outperforms repartition-based aggregation by 3.5x and LOOM by 2.0x.
- M. Al-Fares, A. Loukissas, and A. Vahdat. A Scalable, Commodity Data Center Network Architecture. SIGCOMM Comput. Commun. Rev., 38(4):63--74, Aug. 2008. Google ScholarDigital Library
- P. Austrin, T. Pitassi, and Y. Wu. Inapproximability of Treewidth, One-shot Pebbling, and Related Layout Problems. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 13--24. Springer, 2012.Google Scholar
- A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Min-wise Independent Permutations (Extended Abstract). In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98, pages 327--336, New York, NY, USA, 1998. ACM. Google ScholarDigital Library
- A. Cayley. A theorem on trees. Quarterly Journal of Pure Applied Mathematics, 23:376--378, 1889.Google Scholar
- M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica. Managing Data Transfers in Computer Clusters with Orchestra. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM '11, pages 98--109, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
- J. Cieslewicz and K. A. Ross. Adaptive Aggregation on Chip Multiprocessors. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pages 339--350. VLDB Endowment, 2007. Google ScholarDigital Library
- P. Costa, A. Donnelly, A. I. T. Rowstron, and G. O'Shea. Camdoop: Exploiting In-network Aggregation for Big Data Applications. In Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, San Jose, CA, USA, April 25--27, 2012, pages 29--42, 2012. Google ScholarDigital Library
- W. Culhane, K. Kogan, C. Jayalath, and P. Eugster. LOOM: Optimal Aggregation Overlays for In-memory Big Data Processing. In Proceedings of the 6th USENIX Conference on Hot Topics in Cloud Computing, HotCloud'14, pages 13--13, Berkeley, CA, USA, 2014. USENIX Association. Google ScholarDigital Library
- W. Culhane, K. Kogan, C. Jayalath, and P. Eugster. Optimal communication structures for big data aggregation. In 2015 IEEE Conference on Computer Communications, INFOCOM 2015, Kowloon, Hong Kong, April 26 - May 1, 2015, pages 1643--1651, 2015.Google ScholarCross Ref
- D. J. DeWitt, J. F. Naughton, D. A. Schneider, and S. Seshadri. Practical Skew Handling in Parallel Joins. In Proceedings of the 18th International Conference on Very Large Data Bases, VLDB '92, pages 27--40, San Francisco, CA, USA, 1992. Morgan Kaufmann Publishers Inc. Google ScholarDigital Library
- M. Elseidy, A. Elguindy, A. Vitorovic, and C. Koch. Scalable and Adaptive Online Joins. PVLDB, 7(6):441--452, 2014. Google ScholarDigital Library
- E. Gan, J. Ding, K. S. Tai, V. Sharan, and P. Bailis. Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries. CoRR, abs/1803.01969, 2018. Google ScholarDigital Library
- A. Gionis, P. Indyk, and R. Motwani. Similarity Search in High Dimensions via Hashing. In Proceedings of the 25th International Conference on Very Large Data Bases, VLDB '99, pages 518--529, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc. Google ScholarDigital Library
- G. Graefe. Query Evaluation Techniques for Large Databases. ACM Comput. Surv., 25(2):73--170, 1993. Google ScholarDigital Library
- GRASP. https://code.osu.edu/pythia/grasp.Google Scholar
- J. Gray, A. Bosworth, A. Layman, and H. Pirahesh. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. In Proceedings of the Twelfth International Conference on Data Engineering, February 26 - March 1, 1996, New Orleans, Louisiana, pages 152--159, 1996. Google ScholarDigital Library
- A. G. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: A Scalable and Flexible Data Center Network. In Proceedings of the ACM SIGCOMM 2009 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Barcelona, Spain, August 16--21, 2009, pages 51--62, 2009. Google ScholarDigital Library
- I. Gupta, R. v. Renesse, and K. P. Birman. Scalable Fault-Tolerant Aggregation in Large Process Groups. In Proceedings of the 2001 International Conference on Dependable Systems and Networks (Formerly: FTCS), DSN '01, pages 433--442, Washington, DC, USA, 2001. IEEE Computer Society. Google ScholarDigital Library
- R. He and J. McAuley. Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering. In Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11 - 15, 2016, pages 507--517, 2016. Google ScholarDigital Library
- https://htor.inf.ethz.ch/research/topologies/.Google Scholar
- P. Indyk and R. Motwani. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98, pages 604--613, New York, NY, USA, 1998. ACM. Google ScholarDigital Library
- P. Jiang and G. Agrawal. Efficient SIMD and MIMD Parallelization of Hash-based Aggregation by Conflict Mitigation. In Proceedings of the International Conference on Supercomputing, ICS '17, pages 24:1--24:11, New York, NY, USA, 2017. ACM. Google ScholarDigital Library
- P. Larson. Data Reduction by Partial Preaggregation. In Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26 - March 1, 2002, pages 706--715, 2002. Google ScholarDigital Library
- V. Leis, B. Radke, A. Gubichev, A. Mirchev, P. A. Boncz, A. Kemper, and T. Neumann. Query Optimization Through the Looking Glass, and What We Found Running the Join Order Benchmark. VLDB J., 27(5):643--668, 2018. Google ScholarDigital Library
- F. Liu, A. Salmasi, S. Blanas, and A. Sidiropoulos. Chasing Similarity: Distribution-aware Aggregation Scheduling (Extended Version). CoRR, abs/1810.00511, 2018.Google Scholar
- Y. Lu, A. Shanbhag, A. Jindal, and S. Madden. AdaptDB: Adaptive Partitioning for Distributed Joins. PVLDB, 10(5):589--600, 2017. Google ScholarDigital Library
- L. Luo, J. Nelson, L. Ceze, A. Phanishayee, and A. Krishnamurthy. Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training. In Proceedings of the ACM Symposium on Cloud Computing, SoCC 2018, Carlsbad, CA, USA, October 11--13, 2018, pages 41--54, 2018. Google ScholarDigital Library
- S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TAG: A Tiny AGgregation Service for Ad-Hoc Sensor Networks. In 5th Symposium on Operating System Design and Implementation (OSDI 2002), Boston, Massachusetts, USA, December 9--11, 2002, 2002. Google ScholarDigital Library
- S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. The Design of an Acquisitional Query Processor for Sensor Networks. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, SIGMOD '03, pages 491--502, New York, NY, USA, 2003. ACM. Google ScholarDigital Library
- S. Madden, R. Szewczyk, M. J. Franklin, and D. E. Culler. Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks. In 4th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA 2002), 20--21 June 2002, Callicoon, NY, USA, pages 49--58, 2002. Google Scholar
- L. Mai, L. Rupprecht, A. Alim, P. Costa, M. Migliavacca, P. Pietzuch, and A. L. Wolf. NetAgg: Using Middleboxes for Application-specific On-path Aggregation in Data Centres. In Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, CoNEXT '14, pages 249--262, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- S. Melnik, A. Gubarev, J. J. Long, G. Romer, S. Shivakumar, M. Tolton, and T. Vassilakis. Dremel: Interactive Analysis of Web-Scale Datasets. PVLDB, 3(1):330--339, 2010. Google ScholarDigital Library
- I. Müller, A. Arteaga, T. Hoefler, and G. Alonso. Reproducible Floating-Point Aggregation in RDBMSs. CoRR, abs/1802.09883, 2018.Google Scholar
- I. Müller, P. Sanders, A. Lacurie, W. Lehner, and F. Färber. Cache-Efficient Aggregation: Hashing Is Sorting. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1123--1136, New York, NY, USA, 2015. ACM. Google ScholarDigital Library
- O. Polychroniou, A. Raghavan, and K. A. Ross. Rethinking SIMD Vectorization for In-Memory Databases. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1493--1508, New York, NY, USA, 2015. ACM. Google ScholarDigital Library
- O. Polychroniou, R. Sen, and K. A. Ross. Track Join: Distributed Joins with Minimal Network Traffic. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD '14, pages 1483--1494, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- P. Raghavendra and D. Steurer. Graph Expansion and the Unique Games Conjecture. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 755--764. ACM, 2010. Google ScholarDigital Library
- P. Raghavendra, D. Steurer, and M. Tulsiani. Reductions Between Expansion Problems. In Computational Complexity (CCC), 2012 IEEE 27th Annual Conference on, pages 64--73. IEEE, 2012. Google ScholarDigital Library
- V. Raman, G. Attaluri, R. Barber, N. Chainani, D. Kalmuk, V. KulandaiSamy, J. Leenstra, S. Lightstone, S. Liu, G. M. Lohman, T. Malkemus, R. Mueller, I. Pandis, B. Schiefer, D. Sharpe, R. Sidle, A. Storm, and L. Zhang. DB2 with BLU Acceleration: So Much More Than Just a Column Store. PVLDB, 6(11):1080--1091, 2013. Google ScholarDigital Library
- W. Rödiger, S. Idicula, A. Kemper, and T. Neumann. Flow-Join: Adaptive Skew Handling for Distributed Joins over High-speed Networks. In 32nd IEEE International Conference on Data Engineering, ICDE 2016, Helsinki, Finland, May 16-20, 2016, pages 1194--1205, 2016.Google ScholarCross Ref
- W. Rödiger, T. Mühlbauer, A. Kemper, and T. Neumann. High-Speed Query Processing over High-Speed Networks. PVLDB, 9(4):228--239, 2015. Google ScholarDigital Library
- W. Rödiger, T. Mühlbauer, P. Unterbrunner, A. Reiser, A. Kemper, and T. Neumann. Locality-sensitive Operators for Parallel Main-memory Database Clusters. In IEEE 30th International Conference on Data Engineering, Chicago, ICDE 2014, IL, USA, March 31 - April 4, 2014, pages 592--603, 2014.Google Scholar
- V. Satuluri and S. Parthasarathy. Bayesian Locality Sensitive Hashing for Fast Similarity Search. PVLDB, 5(5):430--441, 2012. Google ScholarDigital Library
- M. A. Shah, J. M. Hellerstein, S. Chandrasekaran, and M. J. Franklin. Flux: An Adaptive Partitioning Operator for Continuous Query Systems. In Proceedings of the 19th International Conference on Data Engineering, March 5-8, 2003, Bangalore, India, pages 25--36, 2003.Google ScholarCross Ref
- A. Shatdal and J. F. Naughton. Adaptive Parallel Aggregation Algorithms. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, SIGMOD '95, pages 104--114, New York, NY, USA, 1995. ACM. Google ScholarDigital Library
- E. Vermote-NASA GSFC and MODAPS SIPS - NASA. (2015). MOD09 MODIS/Terra L2 Surface Reflectance, 5-Min Swath 250m, 500m, and 1km. NASA LP DAAC.Google Scholar
- E. Vermote-NASA GSFC and MODAPS SIPS - NASA. (2015). MOD09A1 MODIS/Surface Reflectance 8-Day L3 Global 500m SIN Grid. NASA LP DAAC.Google Scholar
- L. Wang, M. Zhou, Z. Zhang, M. Shan, and A. Zhou. NUMA-Aware Scalable and Efficient In-Memory Aggregation on Large Domains. IEEE Trans. Knowl. Data Eng., 27(4):1071--1084, 2015.Google ScholarCross Ref
- J. L. Wolf, P. S. Yu, J. Turek, and D. M. Dias. A Parallel Hash Join Algorithm for Managing Data Skew. IEEE Trans. Parallel Distrib. Syst., 4(12):1355--1371, Dec. 1993. Google ScholarDigital Library
- Y. Xu, P. Kostamaa, X. Zhou, and L. Chen. Handling Data Skew in Parallel Joins in Shared-nothing Systems. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD '08, pages 1043--1052, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- P. Yalagandula and M. Dahlin. A Scalable Distributed Information Management System. In Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, SIGCOMM '04, pages 379--390, New York, NY, USA, 2004. ACM. Google ScholarDigital Library
- Y. Ye, K. A. Ross, and N. Vesdapunt. Scalable Aggregation on Multicore Processors. In Proceedings of the Seventh International Workshop on Data Management on New Hardware, DaMoN '11, pages 1--9, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
- https://www.yelp.com/dataset/documentation/json.Google Scholar
- Y. Yu, P. K. Gunda, and M. Isard. Distributed Aggregation for Data-parallel Computing: Interfaces and Implementations. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles, SOSP '09, pages 247--260, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- E. Zamanian, C. Binnig, and A. Salama. Locality-aware Partitioning in Parallel Database Systems. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 17--30, New York, NY, USA, 2015. ACM. Google ScholarDigital Library
Recommendations
Chasing: an efficient streaming mechanism for scalable and resilient video-on-demand service over peer-to-peer networks
NETWORKING'06: Proceedings of the 5th international IFIP-TC6 conference on Networking Technologies, Services, and Protocols; Performance of Computer and Communication Networks; Mobile and Wireless Communications SystemsProvisioning scalable and resilient Video-on-Demand (VoD) service is both challenging and interesting. Recently, peer-to-peer (P2P) networks are introduced to address the scalability of VoD service over Internet. Most of existing work follows the line ...
Flexible aggregate similarity search
SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of dataAggregate similarity search, a.k.a. aggregate nearest neighbor (Ann) query, finds many useful applications in spatial and multimedia databases. Given a group Q of M query objects, it retrieves the most (or top-k) similar object to Q from a database P, ...
A scalable approach to chasing multiple moving targets with multiple agents
IJCAI'17: Proceedings of the 26th International Joint Conference on Artificial IntelligenceChasing multiple mobile targets with multiple agents is important in several applications, such as computer games and police chasing scenarios. Existing approaches can compute optimal policies. However, they have a limited scalability, as they implement ...
Comments