2006 | OriginalPaper | Buchkapitel
Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods
verfasst von : Petros Drineas, Michael W. Mahoney, S. Muthukrishnan
Erschienen in: Algorithms – ESA 2006
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
Much recent work in the theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an
m
×
n
matrix
A
, decompose it as a product of three matrices,
C
,
U
, and
R
, where
C
consists of a small number of columns of
A
,
R
consists of a small number of rows of
A
, and
U
is a small carefully constructed matrix that guarantees that the product
CUR
is “close” to
A
. Applications of such decompositions include the computation of matrix “sketches”, speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representation, and improved interpretability of data analysis methods. Our main result is a randomized, polynomial algorithm which, given as input an
m
×
n
matrix
A
, returns as output matrices
C
,
U
,
R
such that
$$\|{A-CUR}\|_F \leq (1+\epsilon)\|{A-A_k}\|_F$$
with probability at least 1–
δ
. Here,
A
k
is the “best” rank-
k
approximation (provided by truncating the Singular Value Decomposition of
A
), and ||
X
||
F
is the Frobenius norm of the matrix
X
. The number of columns in
C
and rows in
R
is a low-degree polynomial in
k
, 1/
ε
, and log(1/
δ
). Our main result is obtained by an extension of our recent relative error approximation algorithm for ℓ
2
regression from overconstrained problems to general ℓ
2
regression problems. Our algorithm is simple, and it takes time of the order of the time needed to compute the top
k
right singular vectors of
A
. In addition, it samples the columns and rows of
A
via the method of “subspace sampling,” so-named since the sampling probabilities depend on the lengths of the rows of the top singular vectors, and since they ensure that we capture entirely a certain subspace of interest.