ABSTRACT
We give near-optimal space bounds in the streaming model for linear algebra problems that include estimation of matrix products, linear regression, low-rank approximation, and approximation of matrix rank. In the streaming model, sketches of input matrices are maintained under updates of matrix entries; we prove results for turnstile updates, given in an arbitrary order. We give the first lower bounds known for the space needed by the sketches, for a given estimation error ε. We sharpen prior upper bounds, with respect to combinations of space, failure probability, and number of passes. The sketch we use for matrix A is simply S·A, where S is a sign matrix.

Our results include the following upper and lower bounds on the bits of space needed for 1-pass algorithms. Here A is an n × d matrix, B is an n × d' matrix, and c := d + d'. These results are given for fixed failure probability; for failure probability δ > 0, the upper bounds require a factor of log(1/δ) more space. We assume the inputs have integer entries specified by O(log(nc)) bits, or O(log(nd)) bits.

(Matrix Product) Output a matrix C with ||A^T B − C||_F ≤ ε ||A||_F ||B||_F. We show that Θ(c ε^{-2} log(nc)) space is needed.

(Linear Regression) For d' = 1, so that B is a vector b, find x so that ||Ax − b|| ≤ (1 + ε) min_{x' ∈ R^d} ||Ax' − b||. We show that Θ(d^2 ε^{-1} log(nd)) space is needed.

(Rank-k Approximation) Find a matrix Ã_k of rank at most k, so that ||A − Ã_k||_F ≤ (1 + ε) ||A − A_k||_F, where A_k is the best rank-k approximation to A. Our lower bound is Ω(k ε^{-1} (n + d) log(nd)) space, and we give a one-pass algorithm matching this bound when A is given row-wise or column-wise. For general updates, we give a one-pass algorithm needing O(k ε^{-2} (n + d/ε^2) log(nd)) space.

We also give upper and lower bounds for algorithms using multiple passes, and a sketching analog of the CUR decomposition.
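The matrix-product guarantee above can be illustrated numerically: if S is a random sign matrix with m = O(ε^{-2}) rows (scaled by 1/sqrt(m)), then (SA)^T (SB) estimates A^T B to within ε ||A||_F ||B||_F in Frobenius norm, and the sketches SA, SB can be maintained under streaming entry updates. The sketch below is a minimal numpy demonstration; the dimensions, the constant in m, and the fixed seed are illustrative choices, not the paper's tuned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, d2 = 2000, 30, 20
A = rng.standard_normal((n, d))
B = rng.standard_normal((n, d2))

eps = 0.1
m = int(10 / eps**2)  # sketch rows ~ eps^{-2}; constant is illustrative

# Random sign matrix, scaled so that S^T S is the identity in expectation.
S = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)

# Only the m x d and m x d2 sketches need be stored: a turnstile update
# (i, j, delta) to A changes S @ A by delta * outer(S[:, i], e_j).
SA = S @ A
SB = S @ B

C = SA.T @ SB  # estimate of A^T B

err = np.linalg.norm(A.T @ B - C, "fro")
bound = eps * np.linalg.norm(A, "fro") * np.linalg.norm(B, "fro")
print(err <= bound)
```

With these parameters the error is typically well inside the ε ||A||_F ||B||_F bound; the guarantee in the abstract holds with constant probability, boosted to 1 − δ at a log(1/δ) factor in space.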
Index Terms
- Numerical linear algebra in the streaming model