4.5 Proof of PCA lower bound
Theorem 2 states that, for a set N of n-dimensional vectors, there exists a PCA matrix T whose first r rows (r is an arbitrary number between 1 and n), or all of whose rows, form a partial matrix \(T_0\) of T such that the inequality \(dist(T_{0}\cdot V_1, T_{0}\cdot V_2) \le dist(V_1, V_2)\) (i.e., \(dist_{PCA}(V_1, V_2) \le dist(V_1, V_2)\)) holds, where \(V_1\) and \(V_2\) are two arbitrary n-dimensional vectors. This is the theoretical basis for the pruning condition, which states that if \(dist_{PCA}(i, center(C_j)) > maxdknn(C_j)\), then \(dist(i, u) > dknn(u)\) must hold for an arbitrary user vector u in cluster \(C_j\).
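As a quick numerical illustration of the inequality (a sketch only, not part of the formal proof; the data, dimensions, and variable names below are illustrative), one can build a PCA matrix from the eigenvectors of a sample covariance matrix and check that projecting two vectors with any prefix of its rows never increases their distance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample data: 200 vectors in n = 6 dimensions (illustrative values).
n = 6
X = rng.normal(size=(200, n))

# PCA matrix: orthonormal eigenvectors of the covariance matrix, one per row.
cov = np.cov(X, rowvar=False)
_, eigvecs = np.linalg.eigh(cov)
T = eigvecs.T  # rows form an orthonormal basis

V1, V2 = rng.normal(size=n), rng.normal(size=n)
full = np.linalg.norm(V1 - V2)  # dist(V1, V2)

# dist(T0 @ V1, T0 @ V2) <= dist(V1, V2) for every prefix T0 of r rows.
for r in range(1, n + 1):
    T0 = T[:r]
    partial = np.linalg.norm(T0 @ V1 - T0 @ V2)
    assert partial <= full + 1e-12
```

With all n rows the projected distance equals the original one, which is exactly the equality case established in the proof below.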
Note that Theorem 2 is widely used in the literature [26, 27, 29, 35, 51, 53], where PCA is used to preserve the spatial features of high-dimensional data. All of these papers refer to a report [30] and state that the proof of Theorem 2 can be found there. However, the link to that report is no longer accessible. For the completeness of this paper, we present the proof of Theorem 2 as follows.
Let M be the covariance matrix of N, and let \(E_{i,1}, E_{i,2}, \ldots, E_{i,n_i}\) (\(1 \le i \le p\)) be a group of linearly independent eigenvectors of M corresponding to the eigenvalue \(e_i\). Applying Gram–Schmidt orthonormalization to each group transforms the eigenvector set \(M_1\) into an orthonormal set \(M_2\):
\(M_1 = \begin{Bmatrix} E_{1,1} \ldots E_{1,n_1} \ldots E_{i,1} \ldots E_{i,n_i} \ldots E_{p,1} \ldots E_{p,n_p} \end{Bmatrix} \Rightarrow M_2 = \begin{Bmatrix} F_{1,1} \ldots F_{1,n_1} \ldots F_{i,1} \ldots F_{i,n_i} \ldots F_{p,1} \ldots F_{p,n_p} \end{Bmatrix}\)
To prove that \(M_2^T\) is a PCA matrix of N:
Pick an arbitrary group of rows \(F^T_{i,1}\), \(F^T_{i,2}\), ..., \(F^T_{i,n_i}\) from \(M_2^T\) (\(1\le i\le p\)) and take an arbitrary row \(F^T_{i,j}\) from the group. \(F_{i,j}\) is produced by Gram–Schmidt orthonormalization from \(E_{i,1}\), \(E_{i,2}\), ..., \(E_{i,n_i}\), a group of linearly independent eigenvectors of M corresponding to the eigenvalue \(e_i\). As a result, \(|F_{i,j}|=1\ne 0\), and \(F_{i,j}\) is a linear combination of \(E_{i,1}\), \(E_{i,2}\), ..., \(E_{i,n_i}\). Therefore, \(F_{i,j}\) is an eigenvector of M corresponding to the eigenvalue \(e_i\). Moreover, because the whole vector group \(F_{i,1}\), \(F_{i,2}\), ..., \(F_{i,n_i}\) is produced by Gram–Schmidt orthonormalization from this linearly independent group, we have \(|F_{i,1}|=|F_{i,2}|=\cdots =|F_{i,n_i}|=1\ne 0\) and \(F_{i,x}\cdot F_{i,y}=0\ (x\ne y)\). Therefore, \(F_{i,1}\), \(F_{i,2}\), ..., \(F_{i,n_i}\) are linearly independent. Based on the deduction above, \(M_2^T\) is a PCA matrix of N.
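The key fact used in this step is that orthonormalizing a group of eigenvectors for one eigenvalue yields unit vectors that are still eigenvectors for that eigenvalue. A minimal numerical sketch (the matrix and variable names are our own illustrative choices):

```python
import numpy as np

# A real symmetric matrix with a repeated eigenvalue, so the eigenvalue
# e_i = 2 has a 2-dimensional eigenspace (illustrative example).
M = np.diag([2.0, 2.0, 5.0])

# Two linearly independent but non-orthogonal eigenvectors for e_i = 2.
E1 = np.array([1.0, 0.0, 0.0])
E2 = np.array([1.0, 1.0, 0.0])

# Gram-Schmidt orthonormalization of the group.
F1 = E1 / np.linalg.norm(E1)
F2 = E2 - (E2 @ F1) * F1
F2 /= np.linalg.norm(F2)

# F1 and F2 are orthonormal ...
assert abs(F1 @ F2) < 1e-12
assert np.isclose(np.linalg.norm(F1), 1.0) and np.isclose(np.linalg.norm(F2), 1.0)

# ... and, being linear combinations of E1 and E2, are still
# eigenvectors of M for the same eigenvalue 2.
assert np.allclose(M @ F1, 2 * F1)
assert np.allclose(M @ F2, 2 * F2)
```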
To prove that \(dist(M_0 \cdot V_1, M_0 \cdot V_2)\le dist(V_1, V_2)\) (i.e., \(dist_{PCA}(V_1, V_2)\le dist(V_1, V_2)\)), where \(M_0\) is the matrix composed of the first R rows (R is an arbitrary number between 1 and n) or all the rows of \(M_2^T\), and \(V_1\) and \(V_2\) are both n-dimensional:
For two arbitrary distinct vectors \(F_{i,x}\) and \(F_{j,y}\): if \(i \ne j\), then \(F_{i,x} \cdot F_{j,y}=0\) because \(F_{i,x}\) and \(F_{j,y}\) are eigenvectors corresponding to two different eigenvalues (\(e_i\) and \(e_j\)) of the real symmetric matrix M; otherwise (\(i=j\), \(x\ne y\)), \(F_{i,x} \cdot F_{j,y}=0\) because \(F_{i,x}\) and \(F_{j,y}\) are produced by Gram–Schmidt orthonormalization from the same linearly independent vector group. In addition, \(|F_{1,1}|=\cdots = |F_{p,n_p}|=1\). As a result, the rows of \(M_2^T\) form a standard orthonormal basis, so \(M_2 \cdot M_2^T = E\), where E is the identity matrix. Therefore, \(dist(M_2^T \cdot V_1, M_2^T \cdot V_2)\)=\(((M_2^T \cdot (V_1-V_2))^T \cdot (M_2^T \cdot (V_1-V_2)))^\frac{1}{2}\)=\(((V_1-V_2)^T\cdot M_2\cdot M_2^T\cdot (V_1-V_2))^\frac{1}{2}\)=\(((V_1-V_2)^{T} \cdot E \cdot (V_1-V_2))^\frac{1}{2} \)=\(((V_1-V_2)^T \cdot (V_1-V_2))^\frac{1}{2}\)=\(dist(V_1, V_2)\). Writing \((M_2^T \cdot V_1)^T=(a_1, a_2,\ldots , a_n)\) and \((M_2^T \cdot V_2)^T=(b_1, b_2,\ldots , b_n)\), and letting \(M_0\) be composed of the first R rows of \(M_2^T\), it holds that \(dist(M_0 \cdot V_1, M_0 \cdot V_2)= (\sum \limits _{i=1}\limits ^{R}{(a_i-b_i)^2})^\frac{1}{2}\le dist(M_2^T \cdot V_1, M_2^T \cdot V_2)=(\sum \limits _{i=1}\limits ^{n}{(a_i-b_i)^2})^\frac{1}{2}=dist(V_1, V_2).\)
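The algebraic pivot of this step is that a square matrix with orthonormal rows satisfies \(M_2 \cdot M_2^T = E\), so the full projection is an isometry and dropping coordinates can only shrink the sum of squares. A short numerical check under these assumptions (the orthonormal matrix here is generated by QR decomposition purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# A random square matrix with orthonormal rows (via QR decomposition),
# playing the role of M_2^T in the proof.
n = 5
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
M2T = Q.T

# M2 @ M2^T = E: this identity is what collapses the distance expression.
assert np.allclose(M2T @ M2T.T, np.eye(n))

V1, V2 = rng.normal(size=n), rng.normal(size=n)
a, b = M2T @ V1, M2T @ V2  # coordinates (a_1..a_n), (b_1..b_n)

# Full projection preserves the distance exactly ...
assert np.isclose(np.linalg.norm(a - b), np.linalg.norm(V1 - V2))

# ... and keeping only the first R coordinates drops non-negative
# terms (a_i - b_i)^2 from the sum, so the distance can only shrink.
for R in range(1, n + 1):
    assert np.linalg.norm(a[:R] - b[:R]) <= np.linalg.norm(V1 - V2) + 1e-12
```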
As a result, we can find \(M_2^T\), a PCA matrix of N whose first R rows (R is an arbitrary number between 1 and n) or all rows compose a partial matrix \(M_0\) of \(M_2^T\) such that \(dist(M_0\cdot V_1, M_0\cdot V_2)\le dist(V_1, V_2)\) (i.e., \(dist_{PCA}(V_1, V_2)\le dist(V_1,V_2)\)) holds. So the theorem is proved.
\(\square \)