ABSTRACT
This work explores fundamental modeling and algorithmic issues arising in the well-established MapReduce framework. First, we formally specify a computational model for MapReduce that captures the functional flavor of the paradigm by allowing for a flexible use of parallelism. Indeed, the model diverges from the traditional processor-centric view by featuring parameters that embody only global and local memory constraints, thus favoring a more data-centric view. Second, we apply the model to the fundamental task of matrix multiplication, presenting upper and lower bounds for both dense and sparse matrices that highlight interesting tradeoffs between space and round complexity. Finally, building on the matrix multiplication results, we derive further space-round tradeoffs for matrix inversion and matching.
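To make the round/space tradeoff concrete, the sketch below simulates a single MapReduce round that computes a dense product C = A·B. This is an illustrative one-round scheme, not the paper's algorithm: the helper names (`map_phase`, `reduce_phase`, `matmul_mapreduce`) and the sequential simulation of shuffling are assumptions for the example.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every (key, value) record and group outputs by key,
    simulating the shuffle step of one MapReduce round."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    return grouped

def reduce_phase(grouped, reducer):
    """Apply the reducer independently to each key's group of values."""
    results = []
    for key, values in grouped.items():
        results.extend(reducer(key, values))
    return results

def matmul_mapreduce(A, B):
    """One-round dense n x n matrix product: each A[i][k] is replicated to all
    output cells (i, j), each B[k][j] to all cells (i, j); the reducer for
    cell (i, j) sums the n pairwise products."""
    n = len(A)
    records = ([(('A', i, k), A[i][k]) for i in range(n) for k in range(n)]
             + [(('B', k, j), B[k][j]) for k in range(n) for j in range(n)])

    def mapper(key, value):
        tag, r, c = key
        if tag == 'A':          # A[r][c] feeds every cell (r, j), inner index c
            for j in range(n):
                yield (r, j), ('A', c, value)
        else:                   # B[r][c] feeds every cell (i, c), inner index r
            for i in range(n):
                yield (i, c), ('B', r, value)

    def reducer(key, values):
        a = {k: v for tag, k, v in values if tag == 'A'}
        b = {k: v for tag, k, v in values if tag == 'B'}
        yield key, sum(a[k] * b[k] for k in range(n))

    C = [[0] * n for _ in range(n)]
    for (i, j), v in reduce_phase(map_phase(records, mapper), reducer):
        C[i][j] = v
    return C
```

Note that this one-round scheme replicates every input entry n times, so its aggregate space is Θ(n³) for 2n² inputs; trading rounds against such replication is precisely the kind of space-round tradeoff the abstract refers to.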
Index Terms
- Space-round tradeoffs for MapReduce computations