DOI: 10.1145/1536414.1536445
research-article

Numerical linear algebra in the streaming model

Published: 31 May 2009

ABSTRACT

We give near-optimal space bounds in the streaming model for linear algebra problems that include estimation of matrix products, linear regression, low-rank approximation, and approximation of matrix rank. In the streaming model, sketches of input matrices are maintained under updates of matrix entries; we prove results for turnstile updates, given in an arbitrary order. We give the first lower bounds known for the space needed by the sketches, for a given estimation error $\epsilon$. We sharpen prior upper bounds, with respect to combinations of space, failure probability, and number of passes. The sketch we use for matrix $A$ is simply $S^T A$, where $S$ is a sign matrix.

Our results include the following upper and lower bounds on the bits of space needed for 1-pass algorithms. Here $A$ is an $n \times d$ matrix, $B$ is an $n \times d'$ matrix, and $c := d + d'$. These results are given for fixed failure probability; for failure probability $\delta > 0$, the upper bounds require a factor of $\log(1/\delta)$ more space. We assume the inputs have integer entries specified by $O(\log(nc))$ bits, or $O(\log(nd))$ bits.

(Matrix Product) Output matrix $C$ with $\|A^T B - C\|_F \le \epsilon \|A\|_F \|B\|_F$. We show that $\Theta(c\epsilon^{-2}\log(nc))$ space is needed.

(Linear Regression) For $d' = 1$, so that $B$ is a vector $b$, find $x$ so that $\|Ax - b\|_2 \le (1+\epsilon) \min_{x' \in \mathbb{R}^d} \|Ax' - b\|_2$. We show that $\Theta(d^2\epsilon^{-1}\log(nd))$ space is needed.

(Rank-k Approximation) Find matrix $\tilde{A}_k$ of rank no more than $k$, so that $\|A - \tilde{A}_k\|_F \le (1+\epsilon) \|A - A_k\|_F$, where $A_k$ is the best rank-$k$ approximation to $A$. Our lower bound is $\Omega(k\epsilon^{-1}(n+d)\log(nd))$ space, and we give a one-pass algorithm matching this when $A$ is given row-wise or column-wise. For general updates, we give a one-pass algorithm needing $O(k\epsilon^{-2}(n + d/\epsilon^2)\log(nd))$ space.

We also give upper and lower bounds for algorithms using multiple passes, and a sketching analog of the CUR decomposition.
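
To make the sketching idea concrete, the following is a minimal illustration of how a sketch S^T A can be maintained under turnstile updates and then used to estimate a matrix product. It is a sketch under stated assumptions, not the paper's algorithm: the class name `SignSketch`, the function `estimate_product`, the sketch width `m`, and the use of fully independent random signs stored explicitly are all choices made here for readability. A space-efficient implementation would generate the entries of S from a compact seed rather than storing the whole matrix.

```python
import random


class SignSketch:
    """Maintains S^T A for an n x d integer matrix A under turnstile updates.

    Assumption: S is an n x m matrix of independent +/-1 entries, stored
    explicitly for clarity (this does not achieve the abstract's space bounds).
    """

    def __init__(self, n, d, m, seed=0):
        rng = random.Random(seed)
        # Two sketches built with the same n, m, and seed share the same S.
        self.S = [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(n)]
        self.m = m
        self.d = d
        # The m x d sketch S^T A, initially zero because A starts at zero.
        self.sketch = [[0] * d for _ in range(m)]

    def update(self, i, j, delta):
        """Apply A[i][j] += delta; only column j of the sketch changes."""
        signs = self.S[i]
        for t in range(self.m):
            self.sketch[t][j] += signs[t] * delta


def estimate_product(sk_a, sk_b):
    """Return (S^T A)^T (S^T B) / m as an estimate of A^T B.

    Both sketches must have been built with the same S.
    """
    m = sk_a.m
    return [
        [sum(sk_a.sketch[t][p] * sk_b.sketch[t][q] for t in range(m)) / m
         for q in range(sk_b.d)]
        for p in range(sk_a.d)
    ]


# Usage: updates may arrive in any order and may be negative.
a = SignSketch(n=4, d=2, m=16, seed=7)
b = SignSketch(n=4, d=3, m=16, seed=7)
a.update(0, 0, 3)
b.update(0, 0, 2)
estimate = estimate_product(a, b)
```

Because the sketch is linear in A, an update followed by its negation leaves the sketch unchanged, which is what allows turnstile updates in arbitrary order. In the usage above both matrices are supported on a single shared row, so every sign is squared and the estimate of that entry is exact; in general the estimate is only unbiased, and its error shrinks as `m` grows.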


Published in

STOC '09: Proceedings of the forty-first annual ACM symposium on Theory of computing
May 2009
750 pages
ISBN: 9781605585062
DOI: 10.1145/1536414

Copyright © 2009 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
